Message-ID: <20200630161256.GA10755@pi3.com.pl>
Date: Tue, 30 Jun 2020 18:12:56 +0200
From: Adam Zabrocki <pi3@....com.pl>
To: lkrg-users@...ts.openwall.com
Subject: Re: p_install_arch_jump_label_transform_hook and
 p_check_integrity lead to deadlock issue on unisoc SL8541E

Hi,

I've synced with Ethan offline about this problem. It looks like on very slow
devices the busy loop that locks and unlocks text_mutex is too tight, so
optimize_kprobe() can't win the race for text_mutex. I've just pushed a
simple patch which helps to solve that problem:

https://bitbucket.org/Adam_pi3/lkrg-main/commits/ec595f555bcb9b81a1782d9e2c9651a8abf45aab
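
As a rough userspace illustration of the idea (this is not the content of the
patch above), the sketch below uses pthread mutexes as stand-ins for
text_mutex and module_mutex. The names text_lock, module_lock, sleep_ms(),
lock_text_then_module(), optimizer_thread() and run_demo() are hypothetical
and exist only in this example. The point it tries to show is that sleeping
between failed trylock attempts leaves the other thread a window to take the
contended lock, which a tight retry loop may not do on a slow machine.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Userspace stand-ins (assumptions) for the kernel's text_mutex and module_mutex. */
static pthread_mutex_t text_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t module_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int holder_ready, retries, optimizer_done;

static void sleep_ms(long ms)
{
    struct timespec ts = { .tv_sec = ms / 1000, .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Take text_lock, then try module_lock. On failure, drop text_lock and
 * sleep before retrying so that the other thread can make progress. */
static void lock_text_then_module(long backoff_ms)
{
    for (;;) {
        pthread_mutex_lock(&text_lock);
        if (pthread_mutex_trylock(&module_lock) == 0)
            return;                      /* both locks are now held */
        pthread_mutex_unlock(&text_lock);
        atomic_fetch_add(&retries, 1);
        sleep_ms(backoff_ms);            /* back off instead of spinning */
    }
}

/* Plays the role of the kprobe optimizer: module_lock first, then text_lock. */
static void *optimizer_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&module_lock);
    atomic_store(&holder_ready, 1);
    while (atomic_load(&retries) == 0)   /* wait until the other side has failed once */
        sleep_ms(1);
    pthread_mutex_lock(&text_lock);
    atomic_store(&optimizer_done, 1);
    pthread_mutex_unlock(&text_lock);
    pthread_mutex_unlock(&module_lock);
    return NULL;
}

/* Returns 1 if the optimizer finished before the retry loop got both locks. */
int run_demo(void)
{
    pthread_t t;
    int rc = pthread_create(&t, NULL, optimizer_thread, NULL);
    assert(rc == 0);
    (void)rc;
    while (!atomic_load(&holder_ready))
        sleep_ms(1);
    lock_text_then_module(10);
    int done = atomic_load(&optimizer_done);
    pthread_mutex_unlock(&module_lock);
    pthread_mutex_unlock(&text_lock);
    pthread_join(t, NULL);
    return done;
}
```

To try it, call run_demo() from a main() and build with -pthread. The
opposite lock orders in the two threads are kept on purpose. The release
followed by a sleep is what prevents them from blocking each other forever.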

Thanks,
Adam

On Mon, Jun 29, 2020 at 10:12:50AM +0800, youyan wrote:
> Hi Adam,
> 
>    I have found the reason why LKRG blocks on my device.
> 
>    1: Hardware and software: unisoc SL8541E, Android Q, kernel version 4.14.
> 
>    2: The SL8541E is an arm64 platform, but for some reason we compile for 32-bit ARM.
> 
>    3: The function p_create_database(), in the file p_database.c, has the following execution flow:
> 
>       (1) if (p_register_arch_metadata() != P_LKRG_SUCCESS) 
> [  125.693391] c1 [<c01e379c>] (kick_kprobe_optimizer) from [<c01e4394>] (optimize_kprobe+0x108/0x118)
> [  125.702389] c1 [<c01e4394>] (optimize_kprobe) from [<c01e5fe0>] (register_kprobe+0x548/0x5b0)
> [  125.710871] c1 [<c01e5fe0>] (register_kprobe) from [<c01e63ac>] (register_kretprobe+0x114/0x178)
> [  125.719671] c1 [<c01e63ac>] (register_kretprobe) from [<bf2d9f60>] (p_install_arch_jump_label_transform_hook+0x38/0xc0 [p_lkrg])
> [  125.731224] c1 [<bf2d9f60>] (p_install_arch_jump_label_transform_hook [p_lkrg]) from [<bf2d9b98>] (p_register_arch_metadata+0x74/0xd0 [p_lkrg])
>      (2) kick_kprobe_optimizer()->schedule_delayed_work(&optimizing_work, OPTIMIZE_DELAY)->kprobe_optimizer()->mutex_lock(&module_mutex)->do_optimize_kprobes()->mutex_lock(&text_mutex)
> 
>      (3) p_register_module_notifier();
> 
>    4: Some notifier or timer can trigger p_check_integrity().
> 
>    5: p_module_event_notifier is executed after p_register_module_notifier():
> 
>    p_module_event_notifier_live_retry:
>          p_text_section_lock();
>          /* We are heavily consuming module list here - take 'module_mutex' */
>          //mutex_lock(&module_mutex);
>          while (!mutex_trylock(&module_mutex)) {
>             p_text_section_unlock();
>             goto p_module_event_notifier_live_retry;
>          }
> 
>    6: The loop above retries continuously and consumes a lot of CPU time. As a result, do_optimize_kprobes() can't get mutex_lock(&text_mutex).
> 
>    7: When p_check_integrity() executes, it first calls p_text_section_lock(), then mutex_lock(&module_mutex), and this leads to a deadlock.
> 
> 
>    kprobe.c                        p_integrity_timer.c
> 
>    mutex_lock(&module_mutex)
>                                    p_text_section_lock()->mutex_lock(P_SYM(p_text_mutex));
>                                    mutex_lock(&text_mutex)
>    mutex_lock(&text_mutex)
> 
> 
>    8: Maybe the SL8541E running slowly is what triggers this bug.
> 
>    9: I tried the following three solutions to fix this issue:
> 
>       Solution one:
> 
>         Before p_register_module_notifier(), add some delay using msleep().
> 
>       Solution two:
> 
>         (1) In p_module_event_notifier, add msleep(10) to the mutex_trylock(&module_mutex) retry loop:
> 
>             p_module_event_notifier_going_retry:
>                p_text_section_lock();
>                while (!mutex_trylock(&module_mutex)) {
>                   msleep(10);
>                   p_text_section_unlock();
>                   goto p_module_event_notifier_going_retry;
>                }
> 
>         (2) Change the order in which the mutexes are requested:
> 
>             p_text_section_lock();                        mutex_lock(&module_mutex);
>                                          change to
>             mutex_lock(&module_mutex);                    p_text_section_lock();
> 
>       Solution three:
> 
>             p_text_section_lock();                        p_check_integrity_mutex:
>                                          change to
>             mutex_lock(&module_mutex);                       p_text_section_lock();
>                                                              while (!mutex_trylock(&module_mutex)) {
>                                                                 p_text_section_unlock();
>                                                                 msleep(10);
>                                                                 goto p_check_integrity_mutex;
>                                                              }
> 
>    10: I will probably use solution one, because I am not familiar with LKRG. Which solution do you suggest? If you have a better idea, could you share it with me? Thanks!

-- 
pi3 (pi3ki31ny) - pi3 (at) itsec pl
http://pi3.com.pl
