|
|
Message-ID: <20260412151721.GY1827@brightrain.aerifal.cx>
Date: Sun, 12 Apr 2026 11:17:22 -0400
From: Rich Felker <dalias@...c.org>
To: Luca Kellermann <mailto.luca.kellermann@...il.com>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH] scandir: fix qsort usage
On Sun, Apr 12, 2026 at 01:43:08PM +0200, Luca Kellermann wrote:
> On Fri, Apr 10, 2026 at 01:30:27PM -0400, Rich Felker wrote:
> > On Fri, Feb 27, 2026 at 09:17:15PM +0100, Luca Kellermann wrote:
> > > calling qsort() with a pointer to a function whose type is not
> > > compatible with int(const void *, const void *) results in UB because
> > > qsort() would call this function with an incompatible type.
> > >
> > > avoid this by using qsort_r(). this is similar to how qsort() is
> > > implemented on top of qsort_r().
> > >
> > > the casts to void * in wrapper_cmp() are required because scandir()
> > > takes int (*)(const struct dirent **, const struct dirent **) and not
> > > int (*)(const struct dirent *const *, const struct dirent *const *).
> > > ---
> > > Since you haven't posted a patch yet, I went ahead and fixed this. The
> > > patch is based on the scandir() patch series I submitted in July.
> > >
> > > src/dirent/scandir.c | 12 +++++++++---
> > > 1 file changed, 9 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/src/dirent/scandir.c b/src/dirent/scandir.c
> > > index 9e4e623f..993e0c8b 100644
> > > --- a/src/dirent/scandir.c
> > > +++ b/src/dirent/scandir.c
> > > @@ -7,9 +7,15 @@
> > > #include <errno.h>
> > > #include <stddef.h>
> > >
> > > +typedef int (*cmpfun)(const struct dirent **, const struct dirent **);
> > > +
> > > +static int wrapper_cmp(const void *de1, const void *de2, void *cmp)
> > > +{
> > > + return ((cmpfun)cmp)((void *)de1, (void *)de2);
> > > +}
> > > +
> >
> > I was going to ask you what the (void *) casts are for, but now I see
> > it. The scandir comparison function was specified wrong and takes as
> > const T ** rather than a T *const * like it should.
>
> I anticipated that this might raise some questions. That's why I tried
> to explain it in the commit message :)
>
> > Unfortunately this also reveals an aliasing violation elsewhere. The
> > lvalue type the caller's comparison function will be using to access
> > elements of the names[] array are of type const struct dirent *, but
> > the actual elements of the names[] array have type struct dirent *. I
> > can prepare a separate patch to address this.
>
> I'm curious what fix you have in mind for this. The only thing I can
> think of is to change the type of the names array to const struct
> dirent ** and after sorting copy it back to a new array of type struct
> dirent ** that can then be stored in *res.
Uhg, I missed that the array returned in *res mismatches the type of
the array to be sorted; I was thinking we could just adjust it to
const. Indeed copying would be gratuitously awful. Maybe this is
fixable just with some union punning. Or.. wrapper_cmp could just copy
the pointer values into local vars and pass the addresses of those
local vars to the cmp function. This is probably the most correct.
> > > int scandir(const char *path, struct dirent ***res,
> > > - int (*sel)(const struct dirent *),
> > > - int (*cmp)(const struct dirent **, const struct dirent **))
> > > + int (*sel)(const struct dirent *), cmpfun cmp)
> > > {
> >
> > I would avoid changing the public signature to be defined in terms of
> > an implementation-internal typedef. It just makes it less obvious that
> > it's correct.
>
> I was mimicking qsort_nr.c here, but sure. v2 attached (still based on
> my other scandir() patch series).
Ah, I see. That probably should not have been done either. With the
shorter file it's more immediately obvious what the type is, but I
think in general I'd like the signature in the definition of public
interfaces to match the public signature directly without having to
read typedefs to see that it matches.
> > My preference would be to avoid making the typedef entirely, but I'm
> > fairly indifferent if you like it in wrapper_cmp.
>
> I agree, the typedef ist no longer useful if it's only used in one
> place.
OK.
> > > DIR *d;
> > > struct dirent *de, **names=0, **tmp;
> > > @@ -70,7 +76,7 @@ int scandir(const char *path, struct dirent ***res,
> > > /* cmp() and caller must not observe that errno was set to 0. */
> > > errno = old_errno;
> > >
> > > - if (cmp) qsort(names, cnt, sizeof *names, (int (*)(const void *, const void *))cmp);
> > > + if (cmp) __qsort_r(names, cnt, sizeof *names, wrapper_cmp, (void *)cmp);
> > > *res = names;
> > > return cnt;
> > > }
> >
> > Use of the __ version probably isn't strictly needed here, but better
> > anyway.
>
> Another reason why I decided to go with the __ version was that I
> didn't need to define a feature test macro to use it. qsort_r() is
> currently only exposed if _GNU_SOURCE or _BSD_SOURCE are defined.
Ah, that's another thing in its favor, on top of generating slightly
more efficient code.
Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.