Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150305024426.GL23507@brightrain.aerifal.cx>
Date: Wed, 4 Mar 2015 21:44:26 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: getenv_r

On Wed, Mar 04, 2015 at 05:34:57PM -0800, William Ahern wrote:
> On Thu, Mar 05, 2015 at 01:41:33AM +0100, Szabolcs Nagy wrote:
> > * William Ahern <william@...handClement.com> [2015-03-04 15:09:20 -0800]:
> > > I noticed that getenv is not thread-safe. Would there be any interest in
> > > accepting a patch to implement getenv_r (a NetBSD function) and internal
> > > locking? Other than leaving getenv, setenv, and putenv unsafe in threaded
> > > environments, the only other alternative is the ugliness that glibc,
> > > Solaris, and some others implement, which is basically to leak environ
> > > memory.
> > 
> > getenv is thread-safe if the environ is not modified
> > 
> > in a multi-threaded application environ modification
> > is unsafe and problematic even if you do the locks:
> > different threads may want different value for an env
> > var and when you read an env var you cannot know if
> > it is up to date when you want to use it.
> 
> That criticism applies to almost any software, no matter the interface, with
> or without locks. Locks don't solve all TOCTTOU bugs, either.
> 
> Anyhow, any use of setenv is unsafe in a multi-threaded environment, even
> for different variables. In musl the __environ array is invalidated by a
> setenv call that needs to grow it. The system(3) implementation in musl
> passes __environ to posix_spawn.

Indeed. Multi-threaded programs cannot modify the environment.

> > does netbsd use getenv_r somewhere to solve some issue?
> 
> There are a few uses in NetBSD, such as in librump, but AFAICT getenv_r
> isn't widely used. It just seems a much nicer interface than the contortions
> other libc libraries go through.
> 
> One obvious use for a thread-safe setenv and getenv is when trying to
> generate a time_t timestamp from a UTC struct tm. timegm is not standard.
> The glibc manual page suggests to instead set the TZ environment variable to
> UTC, call tzset, call mktime, restore TZ, and call tzset again.

This is very bad advice. The idiom is completely unsafe. As you noted,
glibc and musl both provide timegm() which is analogous to mktime but
works in UTC. You're right that it's non-standard, but neither is
getenv_r, and timegm actually is present on existing systems whereas
getenv_r is not.

Alternatively, you can just use the formula in POSIX XBD 4.15 Seconds
Since the Epoch:

tm_sec + tm_min*60 + tm_hour*3600 + tm_yday*86400 +
    (tm_year-70)*31536000 + ((tm_year-69)/4)*86400 -
    ((tm_year-1)/100)*86400 + ((tm_year+299)/400)*86400

> You can't use that technique portably from multiple threads unless all code
> that might call setenv for any reason synchronizes on the same lock you use
> to implement timegm (including any locale code that needs to read LC_
> values). Whereas the only code _setting_ TZ is likely to be the timegm code.
> 
> But really the issue is most intractable in my position, where I'm trying to
> implement a bindings module. In Lua especially, where the Lua VM has no
> global state, it's simple to run Lua scripts in a multi-threaded environment
> and have them mostly just work without issue. But those scripts are
> exceedingly unlikely to have been concerned with thread-safety, and with
> setenv being unsafe to use. And they can't be expected to use
> synchronization primatives because module A might have no relationship to
> module B.

Why don't you just bind the Lua setenv function to work with your own
environment strings list that has nothing to do with the actual
process's? This is the correct way to write a shell too -- it should
never touch its own environment, but instead maintain shell variables
itself and export them as needed to external programs it invokes.

> I can't paper-over all thread issues, but I still think it worthwhile to
> make it as safe as possible. No amount of finger pointing would ever fix the
> problem.
> 
> Granted, at the end of the day this may be none of musl's concern. Adding
> getenv_r certainly won't solve all issues. That would require other
> measures. But it does seem like a worthwhile QoI issue to address. I mean,
> presumably the system implementation uses posix_spawn rather out of concern
> for QoI. musl could have punted and simply disclaimed any support for
> threaded environments.

I'm not sure I follow. If you're talking about system(), that function
itself is (specified to be) non-thread-safe and should not be used.
Use of posix_spawn in other places is a QoI consideration, but in
system it was more an exercise and an example of how code like this
should be written. The old version used vfork and was buggy, so it
needed to be replaced anyway, and using posix_spawn seemed natural.
Also, I suppose eventually some people might care about NOMMU systems.

In any case, I'm not sure what QoI issue you think should be addressed
here. The fact that modifying the environment is not thread-safe does
not seem to be fixable by something like getenv_r. And I'm mostly
convinced that the only right solution is to treat the environment as
immutable.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.