Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190724130034.GC1506@brightrain.aerifal.cx>
Date: Wed, 24 Jul 2019 09:00:34 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] glob: implement GLOB_TILDE

On Wed, Jul 24, 2019 at 09:15:22AM +0200, Ismael Luceno wrote:
> On 23/Jul/2019 16:07, Rich Felker wrote:
> > On Tue, Jul 23, 2019 at 08:33:54PM +0200, Ismael Luceno wrote:
> > > Signed-off-by: Ismael Luceno <ismael@...ev.co.uk>
> > > ---
> > >  include/glob.h   |  1 +
> > >  src/regex/glob.c | 26 ++++++++++++++++++++++++++
> > >  2 files changed, 27 insertions(+)
> > > 
> > > diff --git a/include/glob.h b/include/glob.h
> > > index 76f6c1c68a23..fc8106b20c5b 100644
> > > --- a/include/glob.h
> > > +++ b/include/glob.h
> > > @@ -30,6 +30,7 @@ void globfree(glob_t *);
> > >  #define GLOB_APPEND   0x20
> > >  #define GLOB_NOESCAPE 0x40
> > >  #define	GLOB_PERIOD   0x80
> > > +#define GLOB_TILDE    0x100
> > >  
> > >  #define GLOB_NOSPACE 1
> > >  #define GLOB_ABORTED 2
> > > diff --git a/src/regex/glob.c b/src/regex/glob.c
> > > index aa1c6a4482ee..77b81f707420 100644
> > > --- a/src/regex/glob.c
> > > +++ b/src/regex/glob.c
> > > @@ -8,6 +8,8 @@
> > >  #include <stdlib.h>
> > >  #include <errno.h>
> > >  #include <stddef.h>
> > > +#include <unistd.h>
> > > +#include "../passwd/pwf.h"
> > >  
> > >  struct match
> > >  {
> > > @@ -35,6 +37,30 @@ static int do_glob(char *buf, size_t pos, int type, char *pat, int flags, int (*
> > >  	/* If GLOB_MARK is unused, we don't care about type. */
> > >  	if (!type && !(flags & GLOB_MARK)) type = DT_REG;
> > >  
> > > +	if ((flags & GLOB_TILDE) && *pat == '~') {
> > > +		struct passwd pw, *res;
> > > +		char *buf = NULL;
> > > +		size_t len = 0;
> > > +		char *name_end = strchr(++pat, '/');
> > > +		if (name_end)
> > > +			*name_end = 0;
> > > +		uid_t uid = *pat ? 0 : getuid();
> > > +		int err = __getpw_a(pat, uid, &pw, &buf, &len, &res);
> > > +		if (!err && res) {
> > > +			while (pos < PATH_MAX - 1 && *pw.pw_dir)
> > > +				buf[pos++] = *pw.pw_dir++;
> > > +		}
> > > +		free(buf);
> > > +		if (err)
> > > +			return GLOB_NOSPACE;
> > > +		if (!res)
> > > +			return 0;
> > > +		if (name_end) {
> > > +			pat = name_end;
> > > +			*pat = '/';
> > > +		}
> > > +	}
> > 
> > This should almost surely go in the top-level glob function rather
> > than the recusive function, where the original strdup of pat happens.
> > As you've written it, I think it wrongly applies tilde expansion to
> > each path component into which recursion happens, but not to ones
> > where it doesn't, whereas it should only apply to the first component.
> > Doing this in the recursive function also has the problem of exploding
> > stack usage.  One thing I don't see is how the code is working at all
> > right now, because you've shadowed the argument buf with a locally
> > scoped variable by the same name that's the working space buffer
> > passed to __getpw_a. The result directory is never copied out into the
> > actual buf array.
> 
> Indeed it doesn't work XD.
> 
> I meant it more as a quick try and to see if it was interesting and the
> right kind of approach.

It is interesting. Adding this functionality is a tradeoff, since it
does pull in linking of more stuff when glob is used even without
GLOB_TILDE, but since glob already inherently depends on dynamic
allocation it's not a huge issue, and this has already been planned
functionality for a while (it's on the roadmap).

The approach is somewhat right, modulo points in my previous review I
think.

> > One thing to note at this point: it's generally preferable to use the
> > public interfaces (getpwnam_r) rather than the internal ones from a
> > different sybsystem of libc (__getpw_a) unless there's a really
> > compelling reason not to. Here I think it would actually be easier
> > using the public interface -- you could reuse buf for the buffer space
> > you pass to getpwnam_r, then memmove from pw_dir (which lies somewhere
> > inside buf, you don't know where) to the start of buf.
> 
> The reason I had to use the private interface is to avoid dealing with
> the buffer allocation on the side of the glob function.

As noted, I think you can just reuse buf[] and avoid any dynamic
allocation visible in glob (of course it happens internally in
getpwnam_r but that's hidden). The only way it's not going to fit is
if the user's homedir doesn't fit (or barely fits) in PATH_MAX,
meaning they wouldn't be able to access anything in it with absolute
paths to begin with.

> It would be better to keep it that way.

To clarify, the reason to avoid doing this unless there's a really
strong motivation is to avoid cross-component coupling. If for example
someone needed to change the signature of __getpw_a as part of changes
in the pwd implementation, they'd have to figure out how glob is using
it and how to adapt glob to the new change, rather than just focusing
on the subsystem they're working on and familiar with. By using public
interfaces, you avoid that.

> > There's also a matter of intended behavor: my understanding is that
> > plain ~ without a name is supposed to expand to $HOME, not lookup the
> > caller's passwd entry based on uid (which is broken on setups where
> > you have multiple login accounts with the same uid). The man page
> > doesn't document this at all. I looked just now in the glibc manual
> > and it documents the behavior in terms of getlogin+getpwnam, which is
> > utterly broken -- it doesn't work at all in the absence of a login
> > session associated with a terminal (except in musl where we wrongly
> > just return getenv("LOGNAME"), a known/open bug). and it's also
> > inconsistent with the standard shell behavior, which is to use $HOME:
> > 
> > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_01
> > 
> > So I feel like, if we implement GLOB_TILDE, we should do it the way
> > that's consistent with the standard meaning of ~, using $HOME, not the
> > glibc implementation. This is also much more user-friendly, in that it
> > honors manual override of $HOME.
> 
> I'm not entirely comfortable expanding HOME unconditionally, as it could
> easily be exploited into a vulnerability for any privileged (e.g.
> setuid) process.

I understand and appreciate your concern, but I don't think that can
affect any use that's not already horribly insecure for other reasons.
If glob is being used with GLOB_TILDE in a suid context, one of the
following holds:

- The input pattern is provided by the user, in which case the user
  already fully controls the output, regardless of whether it contains
  ~ or how ~ expands.

- The input pattern is fixed and does not start with plain ~, in which
  case how ~ is expanded is irrelevant.

- The input pattern is fixed and does start with plain ~, in which
  case the program is explicitly requesting output that's under the
  control of the invoking user by virtue of controlling the contents
  of their home directory.

Naturally the cases where the suid program is operating on data
controlled by the invoking user are usually a bad idea, and if done at
all need to be done under extreme care, but I don't see how honoring
the standard meaning of ~ makes that any moreso. The outputs of glob
are always literal pathname strings, and have to be treated as such,
not pasted into command lines or used as format strings or anything
like that.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.