|
Message-ID: <20160320002356.GX21636@brightrain.aerifal.cx> Date: Sat, 19 Mar 2016 20:23:56 -0400 From: Rich Felker <dalias@...c.org> To: musl@...ts.openwall.com Subject: Re: [PATCH 1/3] overhaul environment functions On Sat, Mar 19, 2016 at 09:11:16PM +0300, Alexander Monakov wrote: > On Sat, 19 Mar 2016, Rich Felker wrote: > > > putenv: > > > * handle "=value" input via unsetenv too (will return -1/EINVAL); > > > > What happens now? It adds an env var with zero-length name? > > Returns -1 without modifying errno. OK, so not that bad, but a good fix. > > > Not changed: > > > Failure to extend allocation tracking array (previously __env_map, now > > > env_alloced) is ignored rather than causing to report -1/ENOMEM to the > > > caller; the worst-case consequence is leaking this allocation when it > > > is removed or replaced in a subsequent environment access. > > > > I'd like to fix this too but it can wait. I think it should be trivial > > to do as a small patch on top. > > I thought about that, but decided to avoid doing that in the first cut > (__env_change return type changes to 'int', dummy definitions need to be > adjusted accordingly, and oom handling in __env_change is not beautiful > either) OK. > > > --- a/src/env/putenv.c > > > +++ b/src/env/putenv.c > > > @@ -1,58 +1,47 @@ > > > +#include <stdlib.h> > > > +#include <string.h> > > > +#include "libc.h" > > > + > > > +char *__strchrnul(const char *, int); > > > + > > > +static void dummy(char *p, char *r) {} > > > +weak_alias(dummy, __env_change); > > > + > > > +int __putenv(char *s, size_t l, char *r) > > > +{ > > > + size_t i=0; > > > + if (__environ) > > > + for (; __environ[i]; i++) > > > + if (!strncmp(__environ[i], s, l+1)) { > > > + char *tmp = __environ[i]; > > > + __environ[i] = s; > > > + __env_change(tmp, r); > > > + return 0; > > > + } > > > > As far as I can tell, this leaves multiple definitions in place. Am I > > missing something? Maybe it's only unset and not replacement that's > > safe against multiple-definition madness? > > Removing multiple definitions is what patch 2/3 does. This patch just > keeps the status quo (only unsetenv looks at duplicate definitions). Ah yes, somehow I missed that. > > > + static char **oldenv; > > > + char **newenv; > > > + if (__environ == oldenv) { > > > + newenv = realloc(oldenv, sizeof *newenv * (i+2)); > > > + if (!newenv) goto oom; > > > + } else { > > > + newenv = malloc(sizeof *newenv * (i+2)); > > > + if (!newenv) goto oom; > > > + if (i) memcpy(newenv, __environ, sizeof *newenv * i); > > > + free(oldenv); > > > + } > > > > Rather than using malloc when __environ != oldenv, I think we should > > use realloc on oldenv, so that we don't leak internally-allocated > > environ arrays if the program repeatedly does environ=0 or calls your > > new clearenv. > > How can we leak internally allocated environ here? If there's one, oldenv > points to it, and we free it right after memcpy. for (;;) { clearenv(); putenv("FOO=BAR"); } Since __environ!=oldenv at the time of every putenv call, each call allocates a new environment array and the old one is leaked. > I think realloc can be used if the program does not modify environ, but if > it does something funky like 'environ++[2] = 0;' then memcpy'ing after realloc > is not safe (unlike doing malloc-memcpy-free as above). I see. I don't think that should be permitted, but maybe it is, and it could be caught as a special case if we want: we can check if __environ points into the internally allocated array (either by invalid comparison of pointers into different arrays, or by an O(n) search through the internal array to see if any of its slots have address equal to __environ) and just leak in that case. Alternatively we could do malloc+memcpy+free in all cases instead of realloc. I probably have not yet thought enough about this to have a good idea on what's the best thing to do, so perhaps just leave the leak for now. > > Perhaps we should also store the allocated size and grow > > it exponentially rather than assuming it's only the filled size and > > calling realloc every time, but maybe it's actually better to resize > > up/down. Do you have an opinion? > > Some fraction of times realloc will keep it in place due to binning, right? > I think taking 'simplicity' in efficiency-simplicity tradeoff here is > justified, on the basis that majority of software does not repeatedly change > environment, and for shells this libc facility is of little help anyway. Increasingly doing realloc by 1 slot each time is classic "accidentally quadratic". I agree that it probably doesn't matter in practice though. > > > int setenv(const char *var, const char *value, int overwrite) > > > { > > > char *s; > > > - int l1, l2; > > > - > > > - if (!var || !*var || strchr(var, '=')) { > > > + size_t l1 = __strchrnul(var, '=') - var, l2; > > > + if (!l1 || var[l1]) { > > > errno = EINVAL; > > > return -1; > > > } > > > > As mentioned above we should probably keep the POSIX-old behavior of > > accepting a null pointer and treating it as a diagnosed error. > > OK; in an old revision I had 'if (!var || !(l1 = __strchrnul(var, '=')) ...' Yes that's probably best, albeit uglier. > > > if (!overwrite && getenv(var)) return 0; > > > > > > - l1 = strlen(var); > > > l2 = strlen(value); > > > s = malloc(l1+l2+2); > > > - if (s) { > > > - memcpy(s, var, l1); > > > - s[l1] = '='; > > > - memcpy(s+l1+1, value, l2); > > > - s[l1+l2+1] = 0; > > > - if (!__putenv(s, 1)) return 0; > > > - } > > > - free(s); > > > - return -1; > > > + if (!s) return -1; > > > + memcpy(s, var, l1); > > > + s[l1] = '='; > > > + memcpy(s+l1+1, value, l2+1); > > > + return __putenv(s, l1, s); > > > } > > > > This leaks when __putenv fails. It's no worse than before but should > > probably be fixed. > > Only when __env_change, specifically, "fails"; not when __putenv OOMs. Why? s points to memory allocated by malloc, and I don't see anywhere it's freed if __putenv fails. > > > > +#include <stdlib.h> > > > +#include <string.h> > > > +#include <errno.h> > > > +#include "libc.h" > > > + > > > +char *__strchrnul(const char *, int); > > > + > > > +static void dummy(char *p, char *r) {} > > > +weak_alias(dummy, __env_change); > > > + > > > +int unsetenv(const char *name) > > > +{ > > > + size_t l = __strchrnul(name, '=') - name; > > > + if (!l || name[l]) { > > > + errno = EINVAL; > > > + return -1; > > > + } > > > + if (!__environ) return 0; > > > + for (char **e = __environ; *e; e++) > > > + while (*e && !strncmp(name, *e, l) && l[*e] == '=') { > > > + char **ee = e, *tmp = *e; > > > + do *ee = *(ee+1); > > > + while (*++ee); > > > + __env_change(tmp, 0); > > > + } > > > + return 0; > > > +} > > > > I think this looks ok. > > I'd like to add a minor tweak here: instead of retesting '*e' in 'while' loop > header, do 'if (!*e) return 0;' after __env_change; this helps the compiler to > generate slightly cleaner code. OK. Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.