Message-ID: <5538BA11.90402@skarnet.org>
Date: Thu, 23 Apr 2015 11:23:29 +0200
From: Laurent Bercot <ska-dietlibc@...rnet.org>
To: musl@...ts.openwall.com
Subject: Re: setenv if value=NULL, what say standard? Bug?

On 23/04/2015 06:24, Jean-Marc Pigeon wrote:
> Think about this: you write an application that works perfectly well,
> but 1 time in 1000000 you hit something not trapped by the low level,
> and once in a while the application (in production for months) just
> stops working because of something "unexpected" within musl...

  And why do you think the problem exists in the first place?
Because other libcs were defensive and failed to fail early, so the
bug was never discovered until now. Your application is not working
perfectly right - it is buggy, and it *should* fail. musl is giving
developers a gift that other libcs do not: it helps them debug.


> (so someone will propose to set a cron to automatically restart this
> unreliable daemon, hmmm...)

  You want to be defensive? Well, yeah, this is the place to be
defensive: until the bug is found and fixed, at least the daemon is
kind of providing service.

  Raphael says this behaviour is wrong for the same reason that
silently failing is wrong, but I disagree. First, restarting crashing
daemons is not silent at all, a crash is always a loud warning and
can hardly be ignored; and second, restarting a process is not
continuing it. A process can always be restarted from a clean state
and work in a predictable way until it trips the bug again, whereas
silently ignoring UB makes the process unpredictable for the rest of
its lifetime.


> Far better to return "trouble" status, then it is to the application
> to decide what must be done in context, as ignore, override, bypass,
> crash, etc.

  What "trouble" status do you return when a function dereferences a
NULL pointer? This is exactly what's happening here. Passing NULL
to setenv is as incorrect as dereferencing NULL, and should result
in the same behaviour.
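
  To illustrate the point, here is a minimal sketch of how a caller
could keep NULL away from setenv() altogether. The helper name
set_or_unset and the policy "NULL means remove the variable" are
assumptions made for this example, not anything taken from hwclock or
musl; a given application may well want a different policy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: the caller decides what "no value" means
 * (here: remove the variable) instead of handing NULL to setenv(). */
int set_or_unset(const char *name, const char *value)
{
    assert(name != NULL);           /* a NULL name is a caller bug */
    if (value == NULL)
        return unsetenv(name);      /* explicit policy chosen by the caller */
    return setenv(name, value, 1);  /* 1 = overwrite any existing value */
}
```

  The decision about what a missing value means is made in the
application, where the context is known, and the libc function only
ever sees valid arguments.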


> A sensible policy in case of UB would be for such low level code to
> swallow the problem, (protect the hardware and keep the program
> running as much as possible).

  The language you want is JavaScript, not C.


> As reported, the crashing application is hwclock (util-linux-2.26);
> this is the kind of code that has been in the field for a very, very
> long time, so the libraries (glibc and old libc) used for Linux over
> the years defined an expected behavior for this "UB".

  And this is why musl is so much better. If glibc and uClibc devs
hadn't been so complacent, the bug wouldn't have lived for so long.


> Crashing is not an option for code pertaining to musl/libc layer.

  It definitely is. You don't want your program to crash? Don't
invoke UB.
  If you want to be "safe", you can ignore SIGSEGV at the start of
all your applications - it will be the exact same thing as what you
are asking. Your daemons will live longer, I guarantee it.


> (:-} why bother to return an error, just crash for all
> problems in open, close, write, etc. just bringing the crashing
> concept to the extreme :-}).

  Straw man. You know as well as we do the difference between a
programming error and a run-time error.
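
  A minimal sketch of that distinction, assuming a hypothetical
wrapper named open_readonly (it is not part of any libc): the NULL
argument is a programming error and is caught with assert(), while a
missing file is a run-time error and is reported through the return
value and errno.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns a file descriptor, or -1 with errno set. */
int open_readonly(const char *path)
{
    /* Programming error: a NULL path can only come from a caller bug,
     * so fail loudly and early instead of returning a status. */
    assert(path != NULL);

    /* Run-time error: the file may be missing or unreadable however
     * correct the program is, so report it and let the caller decide. */
    return open(path, O_RDONLY);
}
```

  The caller is expected to check for -1 and look at errno; nothing it
can do at run time makes the assert() case legitimate.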


> My experience (for a long time now) about writing complex daemon
> running for months/year, it is not that straightforward (may
> be for a simple application it is)

  And mine is that it is. We're even; now please let's stop bringing
up anecdotal evidence.

-- 
  Laurent
