Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <f8ae3cd5d1b61b9d9b55b63ace4dbde2@ispras.ru>
Date: Tue, 18 Oct 2022 20:07:34 +0300
From: Alexey Izbyshev <izbyshev@...ras.ru>
To: musl@...ts.openwall.com
Cc: Ben Hillis <Ben.Hillis@...rosoft.com>, Brian Perkins
 <Brian.Perkins@...rosoft.com>, Peter Martincic <pmartincic@...rosoft.com>,
 Pierre Boulay <Pierre.Boulay@...rosoft.com>
Subject: Re: Potential deadlock when fork() is called while
 __aio_get_queue() is allocating memory

On 2022-10-18 19:51, Pierre Boulay wrote:

> Hello,
> 
> I'm working on the Windows Subsystem Linux, which uses MUSL and I'm 
> currently working on solving WSL complete freeze · Issue #8824 · 
> microsoft/WSL (github.com).
> 
> I believe that this freeze is caused by a bug causing a deadlock in 
> MUSL.
> 
> If we look at this sample program:
> 
> #include <unistd.h>
> #include <pthread.h>
> #include <aio.h>
> #include <fcntl.h>
> #include <time.h>
> #include <stdio.h>
> #include <errno.h>
> 
> void* thread2(void* unused) // Thread 2
> {
> unsigned char buf[256] = {0};
> struct aiocb aiocb = {};
> aiocb.aio_fildes = open("/proc/self/cmdline", O_RDONLY, S_IRUSR | 
> S_IWUSR);
> aiocb.aio_buf = buf;
> aiocb.aio_nbytes = sizeof(buf) - 1;
> aiocb.aio_lio_opcode = LIO_READ;
> 
> if (aio_read(&aiocb) < 0)
> {
> printf("aio_read failed, %i\n", errno);
> return NULL;
> }
> 
> int error = -1;
> while(error = aio_error(&aiocb) == EINPROGRESS)
> {
> printf("In progress...\n");
> sleep(1);
> }
> 
> if (error != 0)
> {
> printf("aio_error: %i\n", error);
> }
> 
> printf("aio result: %s\n", buf);
> return NULL;
> }
> 
> int main()
> {
> pthread_t thread;
> pthread_create(&thread, NULL, &thread2, NULL);
> 
> if (fork() > 0) // Thread 1
> {
> pthread_join(thread, NULL);
> printf("aio complete");
> }
> 
> }
> 
> Here we have two threads. The main thread is scheduling an asynchronous 
> io and the other is calling fork().
> 
> If we look at what the main thread is doing:
> 
> * After calling pthread_create, the fork() call goes here, in fork.c.
> * Before calling _Fork(), this thread will acquire the __malloc_lock 
> (through __malloc_atfork()),
> * Then _Fork() is called, which calls __aio_atfork(), which waits for 
> the maplock
> 
> In the meantime, the second thread will:
> 
> * Call aio_read(), which calls submit(), in aio.c
> * submit() then calls __aio_get_queue(), which acquires the maplock
> * At this point the map structure needs to be allocated so submit() 
> calls calloc()
> * calloc() calls malloc(), in malloc.c
> * If the allocation overflows, malloc() then calls wrlock(), which 
> waits for the __malloc_lock
> 
> So to summarize: Thread1 holds __malloc_lock and waits for maplock, and 
> thread2 holds maplock and waits for ___malloc_lock. We have a deadlock.
> 
> This is my understanding of the issue, but I have a (very) limited 
> knowledge of MUSL's codebase, so let me know if it's correct.
> 

Yes. Please see https://www.openwall.com/lists/musl/2022/10/06/2 and the 
following discussion.

Alexey

> If it is, I think this could be solved by  moving __aio_atfork() from 
> _Fork() to fork(), before __malloc_atfork() so the locks are acquired 
> in the correct order to prevent the deadlock.
> 
> I do see that _Fork() calls __block_all_sigs before calling 
> __aio_atfork() so I wonder if the order of these calls is important 
> though (let me know if that's the case).
> 
> If this solution makes sense to you, I'm happy to submit a contribution 
> with the fix.
> 
> Thank you,
> 
> --
> 
> Pierre Boulay

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.