Message-ID: <5489BADF.8040604@skarnet.org>
Date: Thu, 11 Dec 2014 16:40:15 +0100
From: Laurent Bercot <ska-dietlibc@...rnet.org>
To: musl@...ts.openwall.com
Subject: Re: possible getopt stderr output changes

On 11/12/2014 07:44, Rich Felker wrote:
> Is there a reason behind this? On my build, the printf core is ~6.5k
> and the other parts of stdio you might be likely to pull in are under
> 2k. I'm happy to take your opinion into consideration but it would be
> nice to have some rationale.

6.5k, or even 8.5k, is not much in the grand scale of things, but it's about the ratio of useful pulled in code / total pulled in code, which I like to be as close as possible to 1. And stdio tanks that ratio, see below. The modest size of the printf code is a testimony to the efficiency of the musl implementation, not to the sanity of the interface.

> Personally I find stdio a lot more reasonable than getopt.

I dislike stdio for several reasons:

- The formatting engine is certainly convenient, but it is basically a runtime interpreter, which has to be entirely pulled in as soon as there's a format string, no matter how simple the formatting is. (Unless compilers perform specific static analysis on format strings to know which part of the interpreter they have to pull, but I doubt this is the case; gcc magically replaces printf(x) with puts(x) when x is devoid of format operations, and it is ugly enough as is.) That means I have to pull in the formatting code for floating point numbers, even if I only handle integers and strings; I have to pull in the code for edge cases of the specification, including the infamous "%n$" format, even if I never need it; I have to pull in varargs even if I only do very regular things with a fixed number of arguments. Most of the time I just want to print a string, a character, or an integer: being able to do this shouldn't add more than 2k to my executable, at most.

- The FILE interface is not by any measure suited to reliable I/O.
  When printf fails, there's no way to know how many bytes have been written to the descriptor. Same with fclose: if it fails, and the buffer was not empty, there's no way to know if everything was written. Having the same structure for buffered (stdout) and unbuffered (stderr) output is unnecessarily confusing; and don't get me started on buffered input, the details of which users have exactly zero control over. FILE is totally unusable for asynchronous I/O, which is 99% of what I do; it's just good enough to write error messages to stderr, where you don't need accurate reporting - in which case you can even do without stdio because stderr is unbuffered anyway.

stdio, like a lot of today's standards, is only there because it's historical, and interface designers didn't know better at the time. It being a widely used and established standard doesn't mean that it's a good standard, by far.

> [getopt]
> has ugly global state, including possibly hidden internal state with
> no standard way to reset it. It works well enough for most things
> (because you can pretend the global state is a sort of main-local
> state), but it's a problem if you want to handle multiple virtual
> command lines in the same process

I agree, it's ugly; but global state is a known problem and it's easy to fix. It's already been fixed for pwd/grp/netdb, for localtime, and a lot of other interfaces; it's only a matter of time before some kind of getopt_r() is standardized.

> For proper reporting of errors with long options (note: currently this
> is not done right), at least one component of the message, the option
> name, has unbounded size, so there's no simple way to generate the
> whole message in a buffer.

Ah, long options. I have no idea how feasible it is to keep getopt and getopt_long as separated as possible, but I wouldn't mind at all if getopt_long (but not getopt) relied on stdio.
Because programs using getopt_long are likely to already be using stdio anyway, and this is probably GNU so no one cares about code size. :)

> So this doesn't sound like much
> of a win over just doing the current multiple-write() approach.

Since it mostly happens in the interactive case, avoiding multiple writes is essentially an artistic consideration. I was just interested in learning why you hadn't suggested manual buffering.

-- 
Laurent
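
[Editorial illustration, not part of the original message.] The "manual buffering" alternative discussed in the thread could look roughly like the sketch below. It is a minimal example under stated assumptions, not musl's getopt code and not the author's own library: every name in it (write_all, fmt_ulong, struct obuf, obuf_put, obuf_flush, report_bad_option), the 64-byte buffer size, and the message wording are hypothetical. It shows three points raised above: a write loop that always reports how many bytes actually reached the descriptor, decimal integer formatting without the printf engine, and a small fixed buffer that is flushed whenever it fills, so an option name of unbounded length still goes out in as few write() calls as the buffer allows.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes to fd, retrying on EINTR and on
   short writes. Returns 0 on success, -1 on error. In both cases
   *written holds the number of bytes actually written. */
int write_all(int fd, const char *buf, size_t len, size_t *written)
{
    size_t done = 0;
    while (done < len) {
        ssize_t r = write(fd, buf + done, len - done);
        if (r < 0) {
            if (errno == EINTR) continue;
            *written = done;
            return -1;
        }
        done += (size_t)r;
    }
    *written = done;
    return 0;
}

/* Hypothetical helper: store n in decimal into buf without printf.
   Returns the number of characters stored; no terminating NUL is added.
   buf must have room for 3 * sizeof(unsigned long) bytes. */
size_t fmt_ulong(char *buf, unsigned long n)
{
    char tmp[3 * sizeof n];
    size_t len = 0, i;
    do {
        tmp[len++] = (char)('0' + n % 10);
        n /= 10;
    } while (n);
    for (i = 0; i < len; i++) buf[i] = tmp[len - 1 - i];
    return len;
}

/* Hypothetical fixed-size output buffer; the size is an arbitrary choice. */
struct obuf {
    int fd;
    size_t len;
    char data[64];
};

/* Write out the buffered bytes. On failure, the bytes that were not
   written stay in the buffer, so the caller can see what is pending. */
int obuf_flush(struct obuf *b)
{
    size_t w;
    int r = write_all(b->fd, b->data, b->len, &w);
    memmove(b->data, b->data + w, b->len - w);
    b->len -= w;
    return r;
}

/* Append n bytes, flushing whenever the buffer is full, so the input
   may be arbitrarily long. */
int obuf_put(struct obuf *b, const char *s, size_t n)
{
    while (n) {
        size_t room, k;
        assert(b->len <= sizeof b->data);
        room = sizeof b->data - b->len;
        if (!room) {
            if (obuf_flush(b) < 0) return -1;
            continue;
        }
        k = n < room ? n : room;
        memcpy(b->data + b->len, s, k);
        b->len += k;
        s += k;
        n -= k;
    }
    return 0;
}

/* Emit "<prog>: unrecognized option: <opt>\n" (wording is an assumption). */
int report_bad_option(int fd, const char *prog, const char *opt)
{
    static const char mid[] = ": unrecognized option: ";
    struct obuf b = { .fd = fd, .len = 0 };
    if (obuf_put(&b, prog, strlen(prog)) < 0
     || obuf_put(&b, mid, sizeof mid - 1) < 0
     || obuf_put(&b, opt, strlen(opt)) < 0
     || obuf_put(&b, "\n", 1) < 0)
        return -1;
    return obuf_flush(&b);
}
```

A caller would use something like report_bad_option(2, argv[0], argv[optind - 1]). Short messages fit in the buffer and go out in a single write(); only names longer than the buffer cause extra writes, which is the trade-off against the multiple-write() approach quoted above.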