|
Message-ID: <20220512140835.GJ7074@brightrain.aerifal.cx> Date: Thu, 12 May 2022 10:08:37 -0400 From: Rich Felker <dalias@...c.org> To: Alyssa Ross <hi@...ssa.is> Cc: musl@...ts.openwall.com Subject: Re: [PATCH musl v2 3/3] mntent: fix parsing lines with optional fields On Thu, Jan 13, 2022 at 06:53:19PM +0000, Alyssa Ross wrote: > Rich Felker <dalias@...c.org> writes: > > > On Thu, Jan 13, 2022 at 04:30:18PM +0000, Alyssa Ross wrote: > >> Hi Rich, thanks for following up on this. > >> > >> Rich Felker <dalias@...c.org> writes: > >> > >> > On Mon, Sep 20, 2021 at 12:21:41AM -0400, Rich Felker wrote: > >> >> On Wed, Sep 15, 2021 at 10:11:55PM +0000, Alyssa Ross wrote: > >> >> > According to fstab(5), the last two fields are optional, but this > >> >> > wasn't accepted by Musl. After this change, only the first field is > >> >> > required, which matches Glibc's behaviour. > >> >> > > >> >> > Using sscanf as before, it would have been impossible to differentiate > >> >> > between 0 fields and 4 fields, because sscanf would have returned 0 in > >> >> > both cases due to the use of assignment suppression and %n for the > >> >> > string fields (which is important to avoid copying any strings). So > >> >> > instead, before calling sscanf, initialize every string to the empty > >> >> > string, and then we can check which strings are empty afterwards to > >> >> > know how many fields were matched. > >> >> > --- > >> >> > > >> >> > We could also be stricter about it, and enforce that the first four > >> >> > fields are present, since the man page says only the last two are > >> >> > optional. Doing that would be a simple change of checking for the > >> >> > presence of mnt_opts instead of mnt_fsname at the end of my patch. > >> >> > > >> >> > v2: don't change n from int to size_t > >> >> > > >> >> > src/misc/mntent.c | 18 +++++++++++++----- > >> >> > 1 file changed, 13 insertions(+), 5 deletions(-) > >> >> > > >> >> > diff --git a/src/misc/mntent.c b/src/misc/mntent.c > >> >> > index eabb8200..238a0efd 100644 > >> >> > --- a/src/misc/mntent.c > >> >> > +++ b/src/misc/mntent.c > >> >> > @@ -21,7 +21,8 @@ int endmntent(FILE *f) > >> >> > > >> >> > struct mntent *getmntent_r(FILE *f, struct mntent *mnt, char *linebuf, int buflen) > >> >> > { > >> >> > - int cnt, n[8], use_internal = (linebuf == SENTINEL); > >> >> > + int n[8], use_internal = (linebuf == SENTINEL); > >> >> > + size_t len, i; > >> >> > > >> >> > mnt->mnt_freq = 0; > >> >> > mnt->mnt_passno = 0; > >> >> > @@ -39,10 +40,14 @@ struct mntent *getmntent_r(FILE *f, struct mntent *mnt, char *linebuf, int bufle > >> >> > errno = ERANGE; > >> >> > return 0; > >> >> > } > >> >> > - cnt = sscanf(linebuf, " %n%*s%n %n%*s%n %n%*s%n %n%*s%n %d %d", > >> >> > - n, n+1, n+2, n+3, n+4, n+5, n+6, n+7, > >> >> > - &mnt->mnt_freq, &mnt->mnt_passno); > >> >> > - } while (cnt < 2 || linebuf[n[0]] == '#'); > >> >> > + > >> >> > + len = strlen(linebuf); > >> >> > + for (i = 0; i < sizeof n / sizeof *n; i++) n[i] = len; > >> >> > + if (sscanf(linebuf, " %n%*s%n %n%*s%n %n%*s%n %n%*s%n %d %d", > >> >> > + n, n+1, n+2, n+3, n+4, n+5, n+6, n+7, > >> >> > + &mnt->mnt_freq, &mnt->mnt_passno) == EOF && ferror(f)) > >> >> > + return 0; > >> >> > + } while (linebuf[n[0]] == '#'); > >> >> > > >> >> > linebuf[n[1]] = 0; > >> >> > linebuf[n[3]] = 0; > >> >> > @@ -54,6 +60,9 @@ struct mntent *getmntent_r(FILE *f, struct mntent *mnt, char *linebuf, int bufle > >> >> > mnt->mnt_type = linebuf+n[4]; > >> >> > mnt->mnt_opts = linebuf+n[6]; > >> >> > > >> >> > + if (!*mnt->mnt_fsname) > >> >> > + return 0; > >> >> > + > >> >> > return mnt; > >> >> > } > >> >> > >> >> It looks like your patch changes the behavior for malformed lines from > >> >> skipping them (and continuing to search for the next valid line) to > >> >> returning 0. Is that intentional? Maybe it's better; I'm not sure. But > >> >> won't it even cause blank lines to return 0? > >> > >> Because I only check for the first field being present, a lot of > >> nonsensical lines will be accepted. > >> > >> As I said in the patch commentary: > >> > >> > After this change, only the first field is > >> > required, which matches Glibc's behaviour. > >> > > >> > [snip] > >> > > >> > We could also be stricter about it, and enforce that the first four > >> > fields are present, since the man page says only the last two are > >> > optional. Doing that would be a simple change of checking for the > >> > presence of mnt_opts instead of mnt_fsname at the end of my patch. > >> > >> It probably would make more sense to check that the four fields the man > >> pages implies are required are all there, by making the change I > >> suggested in the commentary, at least until somebody complains about > >> their two-field fstab being accepted by Glibc and not Musl. > >> > >> > Indeed it also seems to be skipping empty lines, contrary to what you > >> > said in another message: > >> > > >> >> • Empty lines should be skipped. > >> > >> Yes, it looks like I was mistaken before when I thought that Musl didn't > >> properly handle comments and empty lines. Looking back, it seems that > >> the tests I was running were against mntent files with only four fields, > >> so the parsing failures I was seeing were because of that, not because > >> of an issue in the comment or empty line handling. > >> > >> My patch does (inadventently) change the behaviour of empty line > >> handling. We should leave the current behaviour of skipping over empty > >> lines as is. Suggested fix at the end of this message. > >> > >> > Do you have a preference on how to proceed? We could add back a > >> > condition to the while loop, something like linebuf[n[0]]=='#' || > >> > n[6]==len (i.e. skip lines with too few fields, possibly using a > >> > different number instead of 6 if more appropriate). Or we could do > >> > what I suggested before: > >> > > >> >> A less invasive change might be adding "%1[ \t\n\v\f\r]" and a dummy > >> >> char* argument to collect the value before the " %d %d". Then you can > >> >> check for cnt<1. But I'm not sure even the 4th field should be > >> >> mandatory. This same apprach could be used to make just 3 mandatory if > >> >> desired though. > >> > > >> > Thoughts? > >> > >> I think it would clearer to have an explicit check that the last > >> mandatory field is set, like I currently do with the mnt_fsname check at > >> the end of the function. I don't particularly mind how many fields are > >> mandatory, as long as its four or fewer so Musl's behaviour follows the > >> fstab format described in the man page. > >> > >> So overall my proposed revisions would be the following. There's a > >> change to move to the next line if the current one is empty, and a > >> change to ensure the first four fields are all present. (If you decide > >> you'd like the fourth field to also be optional, we can just change > >> mnt_opts to mnt_type in that check.) It's been a long time since I last > >> this code, btw, so I hope I'm not missing anything around the empty line > >> check. I'd be happy to put together a revised series, with these > >> changes and a corresponding change to my libc-test patch, if you'd like. > >> > >> diff --git i/src/misc/mntent.c w/src/misc/mntent.c > >> index 169e9789..7782cb10 100644 > >> --- i/src/misc/mntent.c > >> +++ w/src/misc/mntent.c > >> @@ -47,7 +47,7 @@ struct mntent *getmntent_r(FILE *f, struct mntent *mnt, char *linebuf, int bufle > >> n, n+1, n+2, n+3, n+4, n+5, n+6, n+7, > >> &mnt->mnt_freq, &mnt->mnt_passno) == EOF && ferror(f)) > >> return 0; > >> - } while (linebuf[n[0]] == '#'); > >> + } while (linebuf[0] == '\n' || linebuf[n[0]] == '#'); > >> > >> linebuf[n[1]] = 0; > >> linebuf[n[3]] = 0; > >> @@ -59,7 +59,7 @@ struct mntent *getmntent_r(FILE *f, struct mntent *mnt, char *linebuf, int bufle > >> mnt->mnt_type = linebuf+n[4]; > >> mnt->mnt_opts = linebuf+n[6]; > >> > >> - if (!*mnt->mnt_fsname) > >> + if (!*mnt->mnt_opts) > >> return 0; > >> > >> return mnt; > > > > What I still don't like here is that this changes the behavior on > > something that's not a valid record (in whatever sense we define that) > > from continuing the loop looking for the next one, to returning a null > > pointer with no indication of what the error was. Treating empty lines > > as an error (rather than continuing) was just one special case of > > that. For an analogy, see how the pwd/grp functions work. Something > > malformed in the file is just "not a record" and search continues for > > the next valid record (if any) rather than giving the caller a > > non-actionable (since this is assumed to be a sort of > > trusted/authoritative data outside the application's control) error. Just getting back to this now after the release.. Apologies for the long delay. > Okay, that makes sense. So it seems like our choices when faced with an > invalid record are: > > • Skip it and move on to the next one, as you've proposed here; or > > • Try to fill in as many fields of the mntent structure as possible, > and return successfully, like Glibc does. > > If we go with the first option, as you've proposed, do you think the > difference in behaviour from Glibc would be an issue? I think the latter is better. If I remember correctly what we'd found, glibc hardly has any condition on what's mandatory (just first field? or first and second?) so I think the condition would be something like: ... } while (linebuf[n[0]] == '#' || n[1]==len); or similar, possibly replacing 1 with 3 if 2 fields are mandatory, etc. > (I'm mostly thinking of the case where a record doesn't have enough > fields here. There are also the cases where a record has too many > fields, or when fields 5 and 6 are numeric. I haven't looked into how > Glibc handles those yet.) A few other changes I think should be made: - The two numeric fields should be read into temporaries initialized with default values so that it doesn't matter if they're read or not. - The n[] array should be changed to size_t[] and the %n's to %zn's. This should actually be done as a separate change, as it's a fix for a bug overlooked when 05973dc3bbc1aca9b3c8347de6879ed72147ab3b made the buffer length potentially longer than INT_MAX. - There's also an independent bug in hasmntent that was reported a long time ago then lost: it will return false positives when one mntopt name is a substring of another. strstr is just not the right operation here, at least not without added logic to ensure matching on a whole option boundary. This is a separate issue that calls for a separate patch though, not a blocker on the patch under discussion here. Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.