Message-ID: <CAMrV8J57=Mrw4zdwWKTKwY7cRxgafEYqR1+8+pmh_6khEWdE6g@mail.gmail.com>
Date: Thu, 7 May 2026 14:27:50 -0400
From: Mohamed salem Eddah <medsalemeddah@...il.com>
To: Solar Designer <solar@...nwall.com>
Cc: oss-security@...ts.openwall.com, security@...nel.org, 
	"axboe@...nel.dk" <axboe@...nel.dk>
Subject: Re: CVE request: io_uring zcrx freelist OOB write

Alexander, Pavel,

First, an apology: I replied to Pavel's questions directly without CC'ing
oss-security. That was a mistake. Putting the reply on the
list now.


To Pavel's question: yes, I triggered it, without involving any kernel modules.

**Kernel:** 6.19.11+kali-amd64, built April 9 2026, pre-770594e
**Hardware:** mlx5 ConnectX-6 (real ZCRX NIC)
**Trigger:** pure userspace via `SIOCSIFFLAGS IFF_DOWN`, `CAP_NET_ADMIN`

The OOB fires during `page_pool_destroy()` when two paths both push to
the same freelist without a bounds check:

```
path A — ptr_ring drain:
  io_pp_zc_release_netmem() per queued niov → free_count++

path B — io_pp_zc_destroy() scrub:
  for each niov with uref_array[i] != 0:
    io_zcrx_return_niov() → free_count++   (no bounds check)
```

Niovs that land in both paths push `free_count` past `num_niovs`.
The write at `freelist[num_niovs]` goes 4 bytes past the end of the
kcalloc'd array into the adjacent slab object.
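
The counting argument above can be checked with a small userspace model. This is an illustrative sketch only, not the kernel code: `struct model_area`, `model_push_unchecked()` and `model_teardown()` are hypothetical names, and the model assumes the freelist starts out holding every niov that is neither queued in the ring nor referenced by the user. It records out-of-range pushes instead of performing them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the freelist bookkeeping; not the kernel's layout. */
struct model_area {
    uint32_t num_niovs;   /* capacity of the freelist array */
    uint32_t free_count;  /* entries currently on the freelist */
    uint32_t oob_pushes;  /* pushes whose slot index was >= num_niovs */
};

/* Unchecked push: bump the counter and note whether the slot that would
 * have been written lies past the end of the array. */
static void model_push_unchecked(struct model_area *a)
{
    uint32_t slot = a->free_count++;

    if (slot >= a->num_niovs)
        a->oob_pushes++;
}

/* in_ring[i] != 0: niov i is queued in the ptr_ring (path A pushes it).
 * uref[i]    != 0: niov i has a user reference (path B pushes it).
 * Returns how many pushes would have landed out of bounds. */
static uint32_t model_teardown(uint32_t num_niovs,
                               const uint8_t *in_ring,
                               const uint32_t *uref)
{
    struct model_area a = { .num_niovs = num_niovs };
    uint32_t outstanding = 0;
    uint32_t i;

    assert(in_ring != NULL && uref != NULL);

    /* Assumption: every niov not outstanding is already on the freelist. */
    for (i = 0; i < num_niovs; i++)
        if (in_ring[i] || uref[i])
            outstanding++;
    a.free_count = num_niovs - outstanding;

    for (i = 0; i < num_niovs; i++)      /* path A: ptr_ring drain */
        if (in_ring[i])
            model_push_unchecked(&a);

    for (i = 0; i < num_niovs; i++)      /* path B: destroy-time scrub */
        if (uref[i])
            model_push_unchecked(&a);

    return a.oob_pushes;
}
```

In this model the number of out-of-range pushes equals the number of niovs that appear in both paths, so a single overlapping niov is enough to reach `freelist[num_niovs]`.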

Disassembly from the live kernel via gdb+/proc/kcore confirms no bounds
check at either write site:

```asm
; io_zcrx_return_niov (pp==NULL / freelist path)
0xffffffffa7016890: mov eax, [rdx+0x44]    ; eax = free_count
0xffffffffa7016893: mov r8,  [rdx+0x48]    ; r8  = freelist ptr
0xffffffffa7016897: lea edi, [rax+1]
0xffffffffa701689a: mov [rdx+0x44], edi    ; free_count++ (unconditional)
0xffffffffa70168fb: mov [r8+rax*4], edi    ; freelist[old_count] = niov_idx
                                            ; OOB when rax == num_niovs

; io_pp_zc_release_netmem (ptr_ring drain callback)
0xffffffffa701795f: mov eax, [rbp+0x44]    ; eax = free_count
0xffffffffa7017962: mov rdx, [rbp+0x48]    ; rdx = freelist ptr
0xffffffffa701796a: lea ecx, [rax+1]
0xffffffffa701796d: mov [rbp+0x44], ecx    ; free_count++
0xffffffffa701797b: mov [rdx+rax*4], ebx   ; freelist[old_count] = niov_idx
```

770594e adds `WARN_ON_ONCE(free_count >= num_niovs)` + early return at
both sites. The OOB write is suppressed. The double-count condition still
occurs, but the second push is silently dropped.
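
A guard of that shape can be sketched in userspace as follows. This is a hypothetical model, not the text of 770594e: `struct guarded_freelist` and `guarded_push()` are illustrative names, and the `dropped` counter stands in for the point where the kernel would warn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical freelist model; field names are illustrative only. */
struct guarded_freelist {
    uint32_t *slots;      /* backing array with num_niovs entries */
    uint32_t num_niovs;   /* capacity */
    uint32_t free_count;  /* entries currently stored */
    uint32_t dropped;     /* pushes rejected by the bounds check */
};

/* Push with a bounds check and early return. Returns false when the
 * freelist is already full, so no write past the array ever happens. */
static bool guarded_push(struct guarded_freelist *fl, uint32_t niov_idx)
{
    assert(fl != NULL && fl->slots != NULL);

    if (fl->free_count >= fl->num_niovs) {
        fl->dropped++;    /* stand-in for the warning site */
        return false;
    }

    fl->slots[fl->free_count++] = niov_idx;
    return true;
}
```

The guard bounds the write but does not stop a niov from being returned twice; the duplicate only shows up in the `dropped` counter here, which is why preventing the double-count at its source is a separate question.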

The fix is not in any stable branch. Distributions shipping 6.15+ kernels
with `CONFIG_IO_URING_ZCRX=y` that predate 770594e (April 21, 2026) are
affected.

To Alexander's point on originality: I am not claiming to have found the
bug before 770594e was merged. My contribution is the documented trigger
path from pure userspace, the disassembly confirming the write sites, and
the stable backport request. The fix exists upstream but has not reached
any stable queue. That gap is what this report is about.

The blog post at ze3tar.github.io is mine.

Three open questions for Pavel if you have a moment:

1. Is the NIC-down path (SIOCSIFFLAGS -> page_pool_destroy) the scrub case
   you referred to in the original thread?

2. Does 770594e fully resolve the issue or is a follow-up planned to
   prevent the double-count at the source rather than at the write?

3. Does `IORING_REGISTER_ZCRX_IFQ` check `capable()` or `ns_capable()`?
   The latter would widen the attack surface on distributions with a
   permissive user namespace policy.

Backport request for 770594e to 6.15.y stands.

-- Mohamed


On Thu, May 7, 2026 at 1:48 PM Solar Designer <solar@...nwall.com> wrote:

> On Mon, May 04, 2026 at 07:02:30AM +0100, Pavel Begunkov wrote:
> > On 5/3/26 12:00, Mohamed salem Eddah wrote:
> > >I am reporting a security issue in the Linux kernel involving an
> > >out-of-bounds heap write in io_uring/zcrx.c.
> > >
> > >This issue appears to have been addressed in commit 770594e
> > >(“io_uring/zcrx: warn on freelist violations”, April 21, 2026),
> > >however it
> > >was not assigned a CVE and does not appear to have been included in a
> > >formal security advisory. As a result, multiple stable and downstream
> > >distribution kernels are still affected.
> > >------------------------------
> > >Vulnerability Summary
> > >
> > >*File:* io_uring/zcrx.c
> > >*Function:* io_zcrx_return_niov_freelist()
> > >*Introduced:* Linux 6.12 (initial ZCRX merge)
> >
> > FWIW, it was added IIRC in 6.15, but not 6.12
> >
> > >*Fixed upstream:* 770594e (Apr 21, 2026)
> > >*Status:* Fix not yet present in stable releases
> > Did you trigger the problem or the warning in a new kernel
> > without the attached modules? Which kernel version / hash
> > was it? There was a fix for the scrub case, but otherwise
> > don't immediately see how that can happen. I'll take a look.
>
> I only skimmed, but as far as I can tell Mohamed isn't the original
> finder of this issue and the report and PoCs are AI-generated, which
> could be why Mohamed is not communicating further.  It's becoming a
> trend - someone sends AI-generated report and doesn't communicate.
> Which doesn't mean the report is useless, but it does complicate its
> handling.
>
> Meanwhile, it looks like there's a blog post (by someone else? I am
> confused) on exploitation of this issue, with exploit files attached:
>
> https://ze3tar.github.io/post-zcrx.html
>
> Alexander
>
