Message-ID: <20201113033728.GU534@brightrain.aerifal.cx>
Date: Thu, 12 Nov 2020 22:37:28 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH v3] MT-fork (series)

Thanks for the detailed review! Replies inline below:

On Fri, Nov 13, 2020 at 12:26:55AM +0100, Szabolcs Nagy wrote:
> > +static ssize_t read_loop(int fd, void *p, size_t n)
> > +{
> > +	for (size_t i=0; i<n; ) {
> > +		ssize_t l = read(fd, (char *)p+i, n-i);
> > +		if (l<0) {
> > +			if (errno==EINTR) continue;
> > +			else return -1;
> > +		}
> > +		if (l==0) return i;
> > +		i += l;
> > +	}
> > +	return n;
> > +}
> > +
> >  static void *mmap_fixed(void *p, size_t n, int prot, int flags, int fd, off_t off)
> >  {
> >  	static int no_map_fixed;
> > @@ -1060,13 +1073,17 @@ static struct dso *load_library(const char *name, struct dso *needed_by)
> >  			snprintf(etc_ldso_path, sizeof etc_ldso_path,
> >  				"%.*s/etc/ld-musl-" LDSO_ARCH ".path",
> >  				(int)prefix_len, prefix);
> > -			FILE *f = fopen(etc_ldso_path, "rbe");
> > -			if (f) {
> > -				if (getdelim(&sys_path, (size_t[1]){0}, 0, f) <= 0) {
> > +			fd = open(etc_ldso_path, O_RDONLY|O_CLOEXEC);
> > +			if (fd>=0) {
> > +				size_t n = 0;
> > +				if (!fstat(fd, &st)) n = st.st_size;
> > +				if ((sys_path = malloc(n+1)))
> > +					sys_path[n] = 0;
> > +				if (!sys_path || read_loop(fd, sys_path, n)<0) {
> 
> should this handle the short read case?
> 
> i assume we only want to support atomic updates to
> the path file so there should not be a short read,
> but i think rejecting read_loop(,,n)!=n is safer.

We could reject != n, but it's possible to have races where you read a
partial file even when the return value is n (because fstat only
returned the size of the partial file). In fact the only way you'd get
a return value <n is if the file were truncated after the fstat;
normally you'd see the opposite, file growing after fstat. I'm not
opposed to changing it but I don't think it gives any practical
improvement to safety.
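[Editor's note: a hedged sketch of what such an atomic update of the path
file would look like, not musl code — the helper name and paths are
hypothetical. The writer creates a temporary file and rename(2)s it over
the old name, so a concurrent reader sees either the complete old
contents or the complete new contents, never a partial file:]

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: atomically replace `path` with `len` bytes of
 * `data`. rename(2) is atomic on POSIX filesystems, so any reader that
 * opens `path` sees either the full old file or the full new one. */
static int replace_file(const char *path, const char *data, size_t len)
{
	char tmp[4096];
	if (snprintf(tmp, sizeof tmp, "%s.tmp", path) >= (int)sizeof tmp)
		return -1;
	int fd = open(tmp, O_WRONLY|O_CREAT|O_TRUNC, 0644);
	if (fd < 0) return -1;
	if (write(fd, data, len) != (ssize_t)len || fsync(fd) < 0) {
		close(fd);
		unlink(tmp);
		return -1;
	}
	close(fd);
	/* The atomic step: the old file, if any, is replaced in one shot. */
	return rename(tmp, path);
}
```

A writer following this pattern can never expose the fstat/read race
described above, which is why the reader side need not be any stricter.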
The only actually-safe way to change this file is by atomic replacement.

> > diff --git a/src/malloc/lite_malloc.c b/src/malloc/lite_malloc.c
> > index f8931ba5..0f461617 100644
> > --- a/src/malloc/lite_malloc.c
> > +++ b/src/malloc/lite_malloc.c
> > @@ -100,4 +100,16 @@ static void *__simple_malloc(size_t n)
> >  	return p;
> >  }
> >  
> > -weak_alias(__simple_malloc, malloc);
> > +weak_alias(__simple_malloc, __libc_malloc_impl);
> > +
> > +void *__libc_malloc(size_t n)
> > +{
> > +	return __libc_malloc_impl(n);
> > +}
> > +
> > +static void *default_malloc(size_t n)
> > +{
> > +	return __libc_malloc_impl(n);
> > +}
> > +
> > +weak_alias(default_malloc, malloc);
> 
> maybe i'm missing something but i thought it would be enough to do
> 
> weak_alias(__simple_malloc, __libc_malloc);
> static void *default_malloc(size_t n)
> {
> 	return __libc_malloc(n);
> }
> weak_alias(default_malloc, malloc);
> 
> here and have strong __libc_malloc symbol in the malloc implementation.

That's what I did at first, and it works, but it keeps the property of
depending on the order of files in the archive, which was undesirable.
Previously I left it alone because it avoided jumping thru an
additional entry point, but now it makes no difference for the public
malloc, and the libc-internal one is not at all performance-relevant.
So we're just wasting a few bytes of size here to avoid depending on
the link-order hack, and I think that's a reasonable trade.
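[Editor's note: the indirection under discussion can be illustrated
outside of musl with GCC/Clang alias attributes. This is a toy sketch,
not the actual lite_malloc.c: all names are stand-ins, and `my_malloc`
plays the role of the public `malloc`, which can't be redefined in an
ordinary program. The point is that `libc_malloc` is a strong symbol
reached through one extra call, so internal callers resolve correctly
regardless of archive member order:]

```c
#include <assert.h>
#include <stdlib.h>

/* Same shape as musl's weak_alias macro. */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

/* Stand-in for __simple_malloc: the minimal bootstrap allocator. */
static void *simple_malloc(size_t n)
{
	return calloc(1, n);
}

/* Weak alias: an object file providing a strong libc_malloc_impl
 * (i.e. a full allocator) would override this at link time. */
weak_alias(simple_malloc, libc_malloc_impl);

/* Strong entry point for "libc-internal" callers: always present,
 * with no dependence on which archive member the linker saw first. */
void *libc_malloc(size_t n)
{
	return libc_malloc_impl(n);
}

/* Weak public-facing entry point; a malloc replacement overrides it
 * simply by defining a strong symbol of the same name. */
static void *default_malloc(size_t n)
{
	return libc_malloc_impl(n);
}

weak_alias(default_malloc, my_malloc);
```

The extra call through `libc_malloc` is the "few bytes of size" being
traded for link-order independence.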
> > diff --git a/ldso/dynlink.c b/ldso/dynlink.c
> > index 61714f40..6b868c84 100644
> > --- a/ldso/dynlink.c
> > +++ b/ldso/dynlink.c
> > @@ -20,6 +20,7 @@
> >  #include <semaphore.h>
> >  #include <sys/membarrier.h>
> >  #include "pthread_impl.h"
> > +#include "fork_impl.h"
> >  #include "libc.h"
> >  #include "dynlink.h"
> >  
> > @@ -1426,6 +1427,17 @@ void __libc_exit_fini()
> >  	}
> >  }
> >  
> > +void __ldso_atfork(int who)
> > +{
> > +	if (who<0) {
> > +		pthread_rwlock_wrlock(&lock);
> > +		pthread_mutex_lock(&init_fini_lock);
> > +	} else {
> > +		pthread_mutex_unlock(&init_fini_lock);
> > +		pthread_rwlock_unlock(&lock);
> > +	}
> > +}
> > +
> >  static struct dso **queue_ctors(struct dso *dso)
> >  {
> >  	size_t cnt, qpos, spos, i;
> > @@ -1484,6 +1496,13 @@ static struct dso **queue_ctors(struct dso *dso)
> >  	}
> >  	queue[qpos] = 0;
> >  	for (i=0; i<qpos; i++) queue[i]->mark = 0;
> > +	for (i=0; i<qpos; i++)
> > +		if (queue[i]->ctor_visitor && queue[i]->ctor_visitor->tid < 0) {
> > +			error("State of %s is inconsistent due to multithreaded fork\n",
> > +				queue[i]->name);
> > +			free(queue);
> > +			if (runtime) longjmp(*rtld_fail, 1);
> > +		}
> 
> hm since fork takes the init_fini_lock i guess
> the ctors could be finished in the child if
> necessary. or is there some problem with that?
> 
> otherwise the patch looks good to me.

The init_fini_lock is not held across running of ctors/dtors, only to
protect access to the queue. Otherwise it would break concurrent
dlopen, recursive dlopen, etc. (which can be delayed arbitrarily long
by ctor execution). In fact if the lock worked like that and fork took
the lock, fork would also be delayed arbitrarily long and you could
not fork from ctors.

The underlying issue this code is addressing is that, if a ctor was
running in another thread at the time of fork, it will never finish,
but it also can't be restarted because then parts of it would be
executed twice.
We discussed this on IRC before, and the only conclusion I could come
to was that the DSO is permanently unusable. Maybe it would make sense
to strip the affected DSO, and all that depend on it, from the loaded
list, so they'd be reloaded from the filesystem the next time they're
dlopened (with the partly loaded old mapping just abandoned). I'm not
sure how bad that would be in terms of unwanted side effects.

Note however that, with the way I implemented it here, you can avoid
the "perma-dead DSO" problem just by taking a lock around your own
calls to dlopen, and installing an atfork handler that takes that
lock. This is basically "opting into" the "fork delayed arbitrarily
long" behavior I mentioned above.

Rich