|
|
Message-ID: <alpine.LNX.2.20.1603192015120.10468@monopod.intra.ispras.ru>
Date: Sat, 19 Mar 2016 21:11:16 +0300 (MSK)
From: Alexander Monakov <amonakov@...ras.ru>
To: musl@...ts.openwall.com
Subject: Re: [PATCH 1/3] overhaul environment functions
On Sat, 19 Mar 2016, Rich Felker wrote:
> > putenv:
> > * handle "=value" input via unsetenv too (will return -1/EINVAL);
>
> What happens now? It adds an env var with zero-length name?
Returns -1 without modifying errno.
> > Not changed:
> > Failure to extend allocation tracking array (previously __env_map, now
> > env_alloced) is ignored rather than causing to report -1/ENOMEM to the
> > caller; the worst-case consequence is leaking this allocation when it
> > is removed or replaced in a subsequent environment access.
>
> I'd like to fix this too but it can wait. I think it should be trivial
> to do as a small patch on top.
I thought about that, but decided to avoid doing that in the first cut
(__env_change return type changes to 'int', dummy definitions need to be
adjusted accordingly, and oom handling in __env_change is not beautiful
either)
> > --- a/src/env/putenv.c
> > +++ b/src/env/putenv.c
> > @@ -1,58 +1,47 @@
> > +#include <stdlib.h>
> > +#include <string.h>
> > +#include "libc.h"
> > +
> > +char *__strchrnul(const char *, int);
> > +
> > +static void dummy(char *p, char *r) {}
> > +weak_alias(dummy, __env_change);
> > +
> > +int __putenv(char *s, size_t l, char *r)
> > +{
> > + size_t i=0;
> > + if (__environ)
> > + for (; __environ[i]; i++)
> > + if (!strncmp(__environ[i], s, l+1)) {
> > + char *tmp = __environ[i];
> > + __environ[i] = s;
> > + __env_change(tmp, r);
> > + return 0;
> > + }
>
> As far as I can tell, this leaves multiple definitions in place. Am I
> missing something? Maybe it's only unset and not replacement that's
> safe against multiple-definition madness?
Removing multiple definitions is what patch 2/3 does. This patch just
keeps the status quo (only unsetenv looks at duplicate definitions).
>
> > + static char **oldenv;
> > + char **newenv;
> > + if (__environ == oldenv) {
> > + newenv = realloc(oldenv, sizeof *newenv * (i+2));
> > + if (!newenv) goto oom;
> > + } else {
> > + newenv = malloc(sizeof *newenv * (i+2));
> > + if (!newenv) goto oom;
> > + if (i) memcpy(newenv, __environ, sizeof *newenv * i);
> > + free(oldenv);
> > + }
>
> Rather than using malloc when __environ != oldenv, I think we should
> use realloc on oldenv, so that we don't leak internally-allocated
> environ arrays if the program repeatedly does environ=0 or calls your
> new clearenv.
How can we leak internally allocated environ here? If there's one, oldenv
points to it, and we free it right after memcpy.
I think realloc can be used if the program does not modify environ, but if
it does something funky like 'environ++[2] = 0;' then memcpy'ing after realloc
is not safe (unlike doing malloc-memcpy-free as above).
> Perhaps we should also store the allocated size and grow
> it exponentially rather than assuming it's only the filled size and
> calling realloc every time, but maybe it's actually better to resize
> up/down. Do you have an opinion?
Some fraction of times realloc will keep it in place due to binning, right?
I think taking 'simplicity' in efficiency-simplicity tradeoff here is
justified, on the basis that majority of software does not repeatedly change
environment, and for shells this libc facility is of little help anyway.
> > int setenv(const char *var, const char *value, int overwrite)
> > {
> > char *s;
> > - int l1, l2;
> > -
> > - if (!var || !*var || strchr(var, '=')) {
> > + size_t l1 = __strchrnul(var, '=') - var, l2;
> > + if (!l1 || var[l1]) {
> > errno = EINVAL;
> > return -1;
> > }
>
> As mentioned above we should probably keep the POSIX-old behavior of
> accepting a null pointer and treating it as a diagnosed error.
OK; in an old revision I had 'if (!var || !(l1 = __strchrnul(var, '=')) ...'
> > if (!overwrite && getenv(var)) return 0;
> >
> > - l1 = strlen(var);
> > l2 = strlen(value);
> > s = malloc(l1+l2+2);
> > - if (s) {
> > - memcpy(s, var, l1);
> > - s[l1] = '=';
> > - memcpy(s+l1+1, value, l2);
> > - s[l1+l2+1] = 0;
> > - if (!__putenv(s, 1)) return 0;
> > - }
> > - free(s);
> > - return -1;
> > + if (!s) return -1;
> > + memcpy(s, var, l1);
> > + s[l1] = '=';
> > + memcpy(s+l1+1, value, l2+1);
> > + return __putenv(s, l1, s);
> > }
>
> This leaks when __putenv fails. It's no worse than before but should
> probably be fixed.
Only when __env_change, specifically, "fails"; not when __putenv OOMs.
> > +#include <stdlib.h>
> > +#include <string.h>
> > +#include <errno.h>
> > +#include "libc.h"
> > +
> > +char *__strchrnul(const char *, int);
> > +
> > +static void dummy(char *p, char *r) {}
> > +weak_alias(dummy, __env_change);
> > +
> > +int unsetenv(const char *name)
> > +{
> > + size_t l = __strchrnul(name, '=') - name;
> > + if (!l || name[l]) {
> > + errno = EINVAL;
> > + return -1;
> > + }
> > + if (!__environ) return 0;
> > + for (char **e = __environ; *e; e++)
> > + while (*e && !strncmp(name, *e, l) && l[*e] == '=') {
> > + char **ee = e, *tmp = *e;
> > + do *ee = *(ee+1);
> > + while (*++ee);
> > + __env_change(tmp, 0);
> > + }
> > + return 0;
> > +}
>
> I think this looks ok.
I'd like to add a minor tweak here: instead of retesting '*e' in 'while' loop
header, do 'if (!*e) return 0;' after __env_change; this helps the compiler to
generate slightly cleaner code.
Thanks for the review!
Alexander
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.