Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20260513135627.GN1827@brightrain.aerifal.cx>
Date: Wed, 13 May 2026 09:56:27 -0400
From: Rich Felker <dalias@...c.org>
To: Florian Schmaus <florian.schmaus@...asip.com>, musl@...ts.openwall.com,
	tg@...bsd.de
Subject: Re: [PATCH v2] qsort: align 'tmp' buffer to preserve CHERI
 capabilities

On Wed, May 13, 2026 at 03:35:08PM +0200, Szabolcs Nagy wrote:
> * Florian Schmaus <florian.schmaus@...asip.com> [2026-05-13 10:21:46 +0200]:
> > The cycle() function in qsort.c uses a local stack buffer 'tmp'
> > to temporarily hold elements being permuted.
> > 
> > Under CHERI architectures, like CHERI RISC-V, pointers are represented
> > as capabilities, which include an out-of-band hardware tag bit used to
> > enforce memory safety. For this tag bit to be preserved when a
> > capability is written to memory, the target address must be strictly
> > aligned to the capability's size requirement (e.g., 16-byte aligned
> > for 128-bit capabilities).
> > 
> > Previously, 'tmp' was declared as an 'unsigned char' array, which only
> > guarantees a 1-byte alignment. If qsort is called to sort an array of
> > pointers, writing these capabilities into an unaligned 'tmp' buffer
> > may cause the processor to strip their validity tags (depending on
> > the actual alignment at run-time). When these untagged capabilities
> > are copied back out and later dereferenced, the hardware will throw a
> > capability fault, crashing the program.
> > 
> > By changing the buffer type to a union with a pointer member we force
> > the compiler to align the stack allocation to the architectural
> > pointer/capability alignment boundary. This ensures that capabilities
> > stored in the buffer retain their tags and remain valid.
> > ---
> > Changes in v2:
> > - Use union to guarantee proper alignment
> > 
> >  src/stdlib/qsort.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/stdlib/qsort.c b/src/stdlib/qsort.c
> > index 28607450b885..ffd523f7cd8c 100644
> > --- a/src/stdlib/qsort.c
> > +++ b/src/stdlib/qsort.c
> > @@ -44,7 +44,12 @@ static inline int pntz(size_t p[2]) {
> >  
> >  static void cycle(size_t width, unsigned char* ar[], int n)
> >  {
> > -	unsigned char tmp[256];
> > +	/* Union forces pointer alignment to preserve CHERI capability tags */
> > +	union {
> > +		unsigned char c[256];
> > +		void *p;
> > +	} tmp_u;
> > +	unsigned char *tmp = tmp_u.c;
> >  	size_t l;
> >  	int i;
> 
> i guess this can be part of a cheri target port patchset.
> 
> cheri c is incompatible with standard c that allows
> byte-by-byte copy of pointers. in cheri memcpy this can
> be worked around if dst,src,size are aligned, but musl
> does not have such a memcpy.
> 
> i think without cheri target support such changes need
> stronger justification: either the behaviour improves
> (e.g. memcpy may work better if tmp is aligned) or the
> code quality improves (e.g. makes the intention clearer).

Yes, without CHERI being a supported target (and unfortunately CHERI
can't really support the full C language which is part of being
in-scope for musl targets) I don't think this is a justified upstream
change on the basis of CHERI alone, at least not for very
musl-internal code.

However qsort is in principle "pure library" code you can copy and use
outside of musl, so I think it's more defensible here. And the
alignment probably helps codegen especially if we ever enable
__builtin_memcpy to be used.

I think I'd be happy with switching to the union for alignment
purposes, with the commit message documenting that this may improve
codegen in the future and facilitates reuse of the code on CHERI.
Despite my disappointments about it not being fully compatible with C,
CHERI is a cool project and it's nice that they're able to make this
work.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.