musl - Re: <shadow.h> function: fgetspent

Message-ID: <20190117153830.GN23599@brightrain.aerifal.cx>
Date: Thu, 17 Jan 2019 10:38:30 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: <shadow.h> function: fgetspent_r

On Thu, Jan 17, 2019 at 06:31:47AM +0100, Markus Wichmann wrote:
> On Wed, Jan 16, 2019 at 06:44:10PM -0500, Rich Felker wrote:
> > On Wed, Jan 16, 2019 at 03:38:06PM -0600, A. Wilcox wrote:
> > > What do you mean by "messy char[] buffer idiom"?  The buffer that is
> > > meant to contain the strings is passed to the function (char *buf),
> > > *not* returned by it.
> > 
> > It's also necessarily used to contain the struct itself, despite C not
> > really allowing this and the buffer not being properly aligned for it
> > (requiring realignment which is inherently nonportable and not even
> > possible in something like a memory-safe implementation). A proper API
> > would have the caller pass both a pointer to the struct to fill and a
> > pointer to char[] space for strings.
> > 
> 
> Erm, no, there is a parameter where you can store the struct itself.
> It's the second parameter. You are supposed to set the target of the
> fifth parameter equal to the second on success, else to NULL.

Oh, okay. I misread it and thought it worked like the
gethostbyname_r family.

> Working on it, but that is even more messy. What to do on EOF? The only
> thing I can think of is return 0, but set *spbufp to 0 (wait, is that
> the only reason return value and return pointer are split? Dear lord...)

I think so. That kind of mess is common in these interfaces.

> What to do on "buffer too small"? Seek the file back to where it was?

That sounds best at first, but it depends on the stream being
seekable, which makes behavior inconsistent between seekable and
nonseekable streams.

> What to return on format error?

No idea.

> What to do on "buffer too large"? I wanted to use fgets() to read the
> next line, but the size parameter of fgets() is an int, so the size
> parameter to fgetspent_r() can't exceed INT_MAX.

You can use fgets with large buffers but have to call it again if the
buffer is filled without hitting a newline. This of course is a pain.
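
A minimal sketch of that retry loop follows. The helper name
fgets_large is hypothetical (it is not a musl or libc function), and
embedded NUL bytes in the input are not handled.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: like fgets, but takes a size_t size and calls
 * fgets repeatedly, at most INT_MAX bytes at a time, until a newline
 * is stored, the buffer is full, or input ends.
 * Returns buf if at least one byte was read, else NULL. */
char *fgets_large(char *buf, size_t size, FILE *f)
{
    size_t off = 0;

    while (size - off > 1) {
        size_t room = size - off;
        int chunk = room > INT_MAX ? INT_MAX : (int)room;

        if (!fgets(buf + off, chunk, f))
            break;                      /* EOF or read error */

        size_t n = strlen(buf + off);
        if (n == 0)
            break;                      /* nothing usable was stored */
        off += n;

        if (buf[off - 1] == '\n')
            break;                      /* complete line */
    }
    return off ? buf : NULL;
}
```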

However, the right behavior is to treat the error not as "buffer too
large" but as "record too large". Records never need to exceed the
max username length plus the max plausibly-reasonable hash length plus
some space for the numeric fields. The size argument can then just be
clipped to something reasonable, and hitting that limit simply means
the record is invalid.
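
As a rough illustration of the clipping idea (not the actual musl
code), the sketch below caps the caller's size before handing it to
fgets. RECORD_MAX and read_record are hypothetical names, and the
4096-byte cap is an assumed value chosen only for the example.

```c
#include <stdio.h>
#include <string.h>

/* Assumed upper bound on one record; the value is illustrative only. */
#define RECORD_MAX 4096

/* Hypothetical helper: read one line into buf, never using more than
 * RECORD_MAX bytes of it no matter how large size is.
 * Returns 1 for a usable line (trailing newline stripped),
 *         0 at EOF with nothing read,
 *        -1 if the line filled the clipped buffer (record too large). */
int read_record(FILE *f, char *buf, size_t size)
{
    if (size > RECORD_MAX)
        size = RECORD_MAX;          /* clip; the int cast below is now safe */
    if (size < 2)
        return -1;

    if (!fgets(buf, (int)size, f))
        return 0;

    size_t len = strlen(buf);
    if (len && buf[len - 1] == '\n') {
        buf[len - 1] = 0;
        return 1;
    }
    /* No newline stored. If the buffer is not full, input simply ended
     * without a final newline; otherwise the record is over-long. */
    return len + 1 < size ? 1 : -1;
}
```

After a -1 return the rest of the over-long line is still unread, so
a caller would need to discard it before reading the next record.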

FWIW the current implementation of getspnam_r has a bug in that it
silently ignores this issue if the caller passes a huge size. That
should be fixed too.

> > > I would personally leave the fp where it is (not rewind)
> > > since all the other *get*ent functions don't rewind on error either.
> > 
> > But should it finish consuming the line it's in the middle of?
> > Otherwise it could wrongly interpret the remainder of the line as a
> > complete line if called again. Note that the current version of the
> > non-_r function has that issue if getline OOMs.
> 
> Good point, that's a possible problem as well. We could rid ourselves of
> these problems by declaring a new call after error to be UB. But that's
> not QoI...

Note that getspnam_r already handles this correctly. After getting a
read with no final newline, it goes into a skip mode where it
continues to call fgets and discard input until a newline is hit.
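
The skip mode can be sketched roughly as below. This is an
illustration of the idea, not the code from getspnam_r; the helper
name skip_rest_of_line and the 256-byte scratch buffer are assumptions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: call after a read that ended without a newline.
 * Keeps calling fgets and discarding the data until a chunk ends in a
 * newline. Returns 0 once the newline has been consumed, or -1 if EOF
 * or a read error came first. */
int skip_rest_of_line(FILE *f)
{
    char tmp[256];

    while (fgets(tmp, sizeof tmp, f)) {
        size_t len = strlen(tmp);
        if (len && tmp[len - 1] == '\n')
            return 0;
    }
    return -1;
}
```

With this in place, the next read starts at the beginning of the
following record rather than in the middle of the over-long one.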

Rich