Message-ID: <20181130160951.GS23599@brightrain.aerifal.cx>
Date: Fri, 30 Nov 2018 11:09:51 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: stdio glitch & questions

On Fri, Nov 30, 2018 at 09:51:39PM +1100, Xan Phung wrote:
> Hi,
>
> A few questions about stdio:
>
> (1) I notice __toread.c uses angular quotes for <stdio_impl.h> whereas all
> other source files use "stdio_impl.h". I assume the latter is correct and
> __toread.c's use of angular quotes was a glitch & it should really be
> double quotes... is that correct?

Yes, this doesn't make any difference but it's a style mistake.

> (2) I notice vfprintf first tries to call printf_core with f=0 (line 667)
> then calls printf_core again with f set to the actual file to receive
> output (line 682). Why is printf_core called twice? I struggle to
> understand the purpose of the first call with f=0.

To understand this you need to look inside printf_core. When called
with !f, it attempts to collect the %N$-form arguments if they're
used, or bails out early if it detects that normal % arguments are
used. Two passes are needed here because random access to a va_list is
not possible.

> (3) When I do a step thru the __fwritex function to understand how printf
> works, I notice the resulting writev system calls pass on the output data
> as a two element iovec array, with the 1st element comprising all line
> buffered text up to & including the last variable data item, and then the
> 2nd element comprising the residual format string trailing the last
> variable data item (more often than not just a single '\n').
>
> For example, printf("error: %s\n", msg) would put all text up to &
> including %s text in first iovec and the second iovec contains only '\n'.
> I understand the rationale of this is to avoid copying the final '\n' to
> the buffer at f->wpos. (There is actually guaranteed space in the buffer
> itself due to a check at line 10 of fwrite.c).
> The use of the array of 2x iovec's presumably then relies on Linux
> kernel scatter-gather I/O to then optimally handle the iovec array,
> ie: that the writev() of 2x iovec is more efficient than avoiding the
> copy of a few additional bytes (often a single '\n' byte) into
> f->wpos, and then using a single write() syscall.

Indeed, in the case where the new data is very short, it's almost
certainly faster to just copy it to the buffer and perform a single
write syscall. Likewise, for reading a single character it's almost
surely faster to perform a single read syscall then pull it out of the
buffer.

However, conversely, it's possible to see a call to the stdio write
backend (f->write) where the new data is too large to fit in the
buffer. In this case, a writev syscall is almost certainly faster
(fewer trips back and forth between user and kernel space, which are
the dominant cost), and moving data into the buffer is not helpful
because it can't reduce the number of syscalls.

Prior to commit e3cd6c5c265cd481db6e0c5b529855d99f0bda30, fwrite
contained heuristic logic for individual cases, but it couldn't
necessarily be optimal under all usage patterns. After the change, the
number of syscalls is always minimized.

> Isn't this a big assumption? With Linux itself, can we really know that
> Linux device drivers are smart enough to do writev() optimally? Also,
> there is a lot of interest in porting musl to non-Linux os's, many of which
> do not have writev(). (I am porting musl to WebAssembly and to Plan 9).
>
> I can prepare a patch of a version using write() instead of writev() if
> there is interest in this...

You can emulate readv and writev using the property that short reads
and writes are permissible, copying data through a fixed-size
intermediate buffer on the stack. This is of course suboptimal but
easy to do. Emulation of readv is really sensitive, because breaking
it up into multiple reads can cause inappropriate blocking.
Linux actually has a bug where this can happen anyway -- see commit
2cff36a84f268c09f4c9dc5a1340652c8e298dc0 -- so musl's __stdio_read
already reads the last character of the first (caller-requested) part
through the buffer, and collapses the readv to a read if this makes
the first iov empty.

It would probably be welcome to make __stdio_write make use of
SYS_write when it would be expected to be faster (len very small), but
I'm not sure what the exact cutoff should be. Switching away from
writev/readv would not be a welcome change though; use of them is very
intentional and it's how musl avoids some pathological slowness under
certain stdio usage patterns.

If you're porting to a system that lacks the underlying syscalls, I
think it probably makes sense to emulate them at the syscall() level
using a strategy like I described above. It's necessary for making the
public readv()/writev() functions work anyway.

Rich
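
A minimal sketch of the writev emulation strategy described in the
reply, for a system that only offers write(). It is an illustration
under stated assumptions, not musl code: the name emulated_writev and
the 256-byte buffer size are hypothetical. It gathers as much of the
iovec array as fits into a fixed-size stack buffer and issues one
write(); because short writes are permissible, the caller simply
retries with whatever was not written.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Assumed buffer size for illustration; a real port would tune this. */
#define EMU_BUF_SIZE 256

/* Hypothetical helper (not part of musl): emulate writev() using a
 * single write() call. At most EMU_BUF_SIZE bytes are written per
 * call, so the result may be a short write, which callers of writev()
 * must already be prepared to handle. */
ssize_t emulated_writev(int fd, const struct iovec *iov, int iovcnt)
{
    unsigned char buf[EMU_BUF_SIZE];
    size_t used = 0;

    if (iovcnt < 0) {
        errno = EINVAL;
        return -1;
    }

    /* Gather bytes from successive iovec elements until the stack
     * buffer is full or the array is exhausted. */
    for (int i = 0; i < iovcnt && used < sizeof buf; i++) {
        size_t n = iov[i].iov_len;
        if (n > sizeof buf - used)
            n = sizeof buf - used;
        if (n)
            memcpy(buf + used, iov[i].iov_base, n);
        used += n;
    }

    /* One syscall; the return value counts bytes across the whole
     * iovec array, exactly as writev() would report them. */
    return write(fd, buf, used);
}
```

For a short printf-style output (a formatted prefix plus a trailing
'\n') everything fits in the buffer and the cost is one copy plus one
syscall. For large writes this sketch degrades into many small
write() calls, which is the suboptimal behaviour the reply mentions.
The same approach does not carry over safely to readv, for the
blocking reason given in the reply.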