Message-ID: <20130616155723.GA24652@brightrain.aerifal.cx>
Date: Sun, 16 Jun 2013 11:57:23 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Improving AIO implementation

The current AIO implementation in musl has some bugs (as Jens has
recently noticed) and limitations, and despite AIO being a rather ugly
and rarely-used set of interfaces, I think we should still aim a bit
higher in quality. The main issues I'm aware of are:

1. AIO is not synchronized with close. This bug is also present in
   glibc; see

   http://sourceware.org/bugzilla/show_bug.cgi?id=14942

   It is also very hard to fix, since close is required to be
   async-signal-safe, but protecting AIO against its file descriptors
   being closed and reopened is difficult to make async-signal-safe.

   unshare(CLONE_FILES) seems to offer an approach to a solution, but
   it's complicated by these issues:

   A. The application-visible thread, if SIGEV_THREAD is used, would
      still have to share file descriptors with the rest of the
      process, so two threads would be needed for SIGEV_THREAD
      delivery rather than one.

   B. The IO thread would have to find an inexpensive way of closing
      all other file descriptors after unsharing, so as not to keep
      files (mainly pipes, sockets, etc.) open after the application
      expected them to be closed. However, this could interfere with
      fcntl locks, unless unshare(CLONE_FILES) creates a new lock
      ownership context too.

   Another possible solution is to dup the file descriptor for AIO,
   but that also introduces an issue with fcntl locks: when the
   duplicate fd is closed, locks would be lost.

2. There is no ordering between AIO operations on a given file
   descriptor. Each AIO request is treated completely independently.
   Based on my reading of XSH 2.8.2 Asynchronous I/O, the current
   behavior seems at least borderline permissible (as long as we
   specify the implementation-defined circumstances as being "at all
   times"), but it's low-quality.

3. The aio_cancel function is not able to determine the correct
   return value if its aiocb pointer argument is a null pointer. This
   is because there is no index of outstanding AIO operations on a
   given file descriptor. Even the concept of such an index is
   difficult to reconcile with the close semantics in issue #1 above,
   and any solution based on unshare would not help with implementing
   such indexing.

On the other hand, some things about musl's AIO are more correct than
glibc's. For instance, aio_suspend is required to be
async-signal-safe, but glibc makes no effort to satisfy this
requirement (http://sourceware.org/bugzilla/show_bug.cgi?id=13172). I
cheated in musl by using a very inefficient form of waking for
aio_suspend, which is also something of a QoI issue.

Any thoughts on a direction for improving AIO? Based on the above
issues, I think we need to move to some model indexed by file
descriptor where close actually has to do the difficult work
(optimized out in static linking via a weak symbol) of cancelling
pending AIO. Ordering of writes should be preserved except when they
are non-overlapping (i.e. a new write can start immediately except
when it overlaps with a pending AIO operation), and reads should be
unordered (immediately runnable) except that they are queued for the
fd when they overlap with an unfinished write.

As for implementing async-signal-safety, any AIO operations that
access or modify the index can block all signals to prevent close from
being called from a signal handler in the same thread while they are
in progress. However, unconditionally blocking signals in close to
prevent AIO functions from running in a signal handler while close is
working with the AIO index would be unjustifiably costly. Instead I
would propose an atomic global counter of file descriptors in the AIO
index. Only if this count is nonzero would close need to block signals
and check the index.

Rich
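
For illustration only, here is a rough sketch of two pieces of the proposal above: the overlap-based ordering rule, and the counter check that would let close skip the expensive path. Every name in it (struct aio_op, must_wait, aio_fd_count, close_needs_aio_check, and so on) is hypothetical and invented for this sketch; none of it is musl's actual code, and the index itself, signal blocking, and cancellation are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical types and names, for illustration only. */
enum aio_kind { AIO_READ, AIO_WRITE };

struct aio_op {
    enum aio_kind kind;
    off_t off;   /* starting file offset */
    size_t len;  /* length in bytes */
};

/* Assumed global: number of file descriptors currently in the AIO index. */
static atomic_int aio_fd_count;

/* True if the byte ranges [off, off+len) of a and b intersect. */
static bool ranges_overlap(const struct aio_op *a, const struct aio_op *b)
{
    return a->off < b->off + (off_t)b->len
        && b->off < a->off + (off_t)a->len;
}

/* Ordering rule from the proposal: a new write queues behind any
 * overlapping pending operation; a new read queues only behind an
 * overlapping unfinished write. Non-overlapping requests never wait. */
static bool must_wait(const struct aio_op *pending,
                      const struct aio_op *incoming)
{
    if (!ranges_overlap(pending, incoming))
        return false;
    return incoming->kind == AIO_WRITE || pending->kind == AIO_WRITE;
}

/* Called when an fd gains its first pending operation / loses its last. */
static void aio_index_add_fd(void)    { atomic_fetch_add(&aio_fd_count, 1); }
static void aio_index_remove_fd(void) { atomic_fetch_sub(&aio_fd_count, 1); }

/* Fast-path test for close: only when this returns true would close
 * need to block signals and consult the index. */
static bool close_needs_aio_check(void)
{
    return atomic_load(&aio_fd_count) != 0;
}
```

In a process that never uses AIO the counter stays at zero, so under this sketch close would pay only for a single atomic load before making the syscall.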