|
Message-ID: <AM5P192MB00811E790FEDDEAFD163D694C7B09@AM5P192MB0081.EURP192.PROD.OUTLOOK.COM> Date: Wed, 06 Oct 2021 14:57:55 +0200 From: Pablo Correa Gomez <ablocorrea@...mail.com> To: Rich Felker <dalias@...c.org> Cc: musl@...ts.openwall.com Subject: Re: newlocale: Segmentation fault when locale input is NULL Thank you very much to both for the detailed explanation. I really appreciate you took the time to explain why musl behaviour should remain the way it is. I will proceed and fix the bug in GNOME. > In particular, here it seems to have found a bug -- what could the > application have possibly meant by passing a null pointer there? Did > it actually intend the behavior of ""? Or of "C"? Or if the intent > was > to have this mean "don't use a context-local locale", why pass the > pointer to newlocale and process the error (which could include a > number of other errors you'd certainly want to treat differently) > rather than checking for null and taking a different code path before > calling newlocale? Very valid point. Definitely something worth to ask the maintainers. Best and thank you again, Pablo. On Wed, 2021-10-06 at 08:29 -0400, Rich Felker wrote: > On Wed, Oct 06, 2021 at 11:31:29AM +0200, Pablo Correa Gomez wrote: > > Dear musl maintainers, > > > > While doing some work in GNOME control center for postmarketos, we > > bumped into a segmentation fault which is also present in GNOME in > > Alpine[1]. > > > > After doing some degugging, I figured out that the reason is that, > > through GNOME desktop[2], there is a call to newlocale, where they > > end > > up calling it with a NULL argument. > > > > newlocale(LC_CTYPE, NULL, (locale_t)0); > > > > In this case, "name" is passed to __get_locale in > > src/locale/newlocale.c:27 and then dereferenced in > > src/locale/locale_map.c:43, causing a segmentation fault. > > > > In the case of glibc, this is not an issue, as per the > > documentation[3] > > they consider it an error: > > > > EINVAL locale is NULL. > > > > Unfortunately, this is a difference in the implementation between > > glibc > > and musl, maybe due to the fact that the standard[4] in not clear > > in > > this point: > > > > > > The newlocale() function may fail if: > > > > [EINVAL] > > The locale argument is not a valid string pointer. > > This is specifically documented as a "may fail", not a "shall fail", > i.e. it's not guaranteed to happen. It comes from POSIX, and is an > instance of a weird pattern the committee tried to fix (but missed > some places it seems) where they wrote "may fail"s for conditions > that > already have undefined behavior (here, use of an invalid pointer) in > which case EINVAL would already be allowed as a side effect of the UB > without any further specification. (The same thing created a lot of > confusion in the past about use of pthread_t values past the end of > their lifetime.) > > > My personal believe is that adding a NULL pointer check in musl is > > very > > simple and might help not only GNOME desktop, but maybe also other > > projects in the future. This is the reason why I brought the issue > > here > > first instead of directly patching GNOME desktop. If you believe > > that > > musl behaviour should remain the way it is, please let me know and > > I > > will send MRs for upstream and Alpine's GNOME desktop. I am not > > subscribed to the mailing list, so I would appreciate if I am CC'ed > > in > > any response. > > The musl behavior should remain the way it is. My text on the > rationale actually made it into the glibc wiki some years back: > > https://sourceware.org/glibc/wiki/Style_and_Conventions#Invalid_pointers > > "The GNU C library considers it a QoI feature not to mask user > bugs by detecting invalid pointers and returning EINVAL (unless > the API is standardized and says it does that). If passing a bad > pointer has undefined behavior, it is far more useful in the long > run if it crashes quickly rather than diagnosing an error that is > probably ignored by the flaky caller. > > If you're going to check for NULL pointer arguments where you > have > not entered into a contract to accept and interpret them, do so > with an assert, not a conditional error return. This way the bugs > in the caller will be immediately detected and can be fixed, and > it makes it easy to disable the overhead in production builds. > The > assert can be valuable as code documentation. However, a segfault > from dereferencing the NULL pointer is just as effective for > debugging. If you return an error code to a caller which has > already proven itself buggy, the most likely result is that the > caller will ignore the error, and bad things will happen much > later down the line when the original cause of the error has > become difficult or impossible to track down. Why is it > reasonable > to assume the caller will ignore the error you return? Because > the > caller already ignored the error return of malloc or fopen or > some > other library-specific allocation function which returned NULL to > indicate an error." > > In particular, here it seems to have found a bug -- what could the > application have possibly meant by passing a null pointer there? Did > it actually intend the behavior of ""? Or of "C"? Or if the intent > was > to have this mean "don't use a context-local locale", why pass the > pointer to newlocale and process the error (which could include a > number of other errors you'd certainly want to treat differently) > rather than checking for null and taking a different code path before > calling newlocale? > > Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.