Message-ID: <20230411124822.GK3630668@port70.net>
Date: Tue, 11 Apr 2023 14:48:22 +0200
From: Szabolcs Nagy <nsz@...t70.net>
To: 张飞 <zhangfei@...iscas.ac.cn>
Cc: musl@...ts.openwall.com
Subject: Re: Re: [PATCH]Implementation of strlen function in riscv64
 architecture

* 张飞 <zhangfei@...iscas.ac.cn> [2023-04-10 13:59:22 +0800]:
> I have modified the riscv64 assembly implementation of strlen, mainly
> to handle address alignment so that vector memory accesses do not
> cross page boundaries.
> 
> I think the assembly implementation of strlen is necessary. In glibc, 

if the c definition is not correct then you have to explain why.
if it's very slow then please tell us so.

> x86_64, aarch64, alpha, and others all have assembly implementations of this function,
> while riscv64 has none.
> I have also analyzed the SPEC CPU2006 and SPEC CPU2017 test sets, and strlen is a hot spot there as well.

an asm implementation has significant maintenance cost so you should
provide some benchmark data or other evidence/reasoning for us to
decide if it's worth the cost.
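
A minimal sketch of the kind of benchmark data requested above, assuming a simple timing loop is enough for a first comparison. The helper name `bench_strlen_ns` and its parameters are hypothetical; a real evaluation would cover many lengths and alignments and run on actual riscv64 hardware.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: average CPU nanoseconds per strlen call on a
   string of `len` bytes, measured over `iters` calls.
   Returns a negative value on bad arguments or allocation failure. */
double bench_strlen_ns(size_t len, unsigned long iters)
{
	if (iters == 0) return -1.0;

	char *buf = malloc(len + 1);
	if (!buf) return -1.0;
	memset(buf, 'a', len);
	buf[len] = 0;

	/* Calling through a volatile function pointer keeps the compiler
	   from folding the call or hoisting it out of the loop. */
	size_t (*volatile fn)(const char *) = strlen;
	volatile size_t sink = 0;

	clock_t t0 = clock();
	for (unsigned long i = 0; i < iters; i++)
		sink += fn(buf);
	clock_t t1 = clock();

	free(buf);
	(void)sink;

	double ns = (double)(t1 - t0) / CLOCKS_PER_SEC * 1e9;
	return ns / (double)iters;
}
```

Running this for a range of lengths (for example 1, 16, 256, 4096) against both the existing C code and the proposed asm, with and without the vector extension, would give numbers that can be compared directly.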

it seems you replaced the c strlen code with a slower one except when
musl is built with the vector isa extension (the "#ifdef
__riscv_vector" case). what cpus does this affect? are linux distros
expected to use this as baseline?
do different riscv cpus have similar simd performance properties? who
will tweak the asm if not?
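
For reference, a sketch of the word-at-a-time technique commonly used by generic C strlen implementations, which is the baseline a scalar (non-vector) fallback would have to match. The name `sketch_strlen` is hypothetical and the code is not taken from the patch. It assumes a GNU-compatible compiler for the fast path, and the aligned word loads may read a few bytes past the terminator, which relies on memory protection being page-granular rather than on ISO C guarantees.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define ONES  ((size_t)-1 / UCHAR_MAX)       /* 0x0101...01 */
#define HIGHS (ONES * (UCHAR_MAX / 2 + 1))   /* 0x8080...80 */
/* Nonzero if and only if some byte of x is zero. */
#define HASZERO(x) (((x) - ONES) & ~(x) & HIGHS)

size_t sketch_strlen(const char *s)
{
	const char *a = s;
#ifdef __GNUC__
	typedef size_t __attribute__((__may_alias__)) word;
	const word *w;

	/* Step bytewise until s is word-aligned; an aligned word load
	   never crosses a page boundary. */
	for (; (uintptr_t)s % sizeof(size_t); s++)
		if (!*s) return (size_t)(s - a);

	/* Scan one word at a time until a word contains a zero byte. */
	for (w = (const void *)s; !HASZERO(*w); w++)
		;
	s = (const void *)w;
#endif
	/* Locate the exact terminator within the final word. */
	for (; *s; s++)
		;
	return (size_t)(s - a);
}
```

The alignment prologue here addresses the same page-crossing concern that the patch handles for vector loads, but in portable C.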

in principle what you did can be done by the compiler auto vectorizer
so maybe contributing to the compiler is more useful.

note that glibc has cpu specific implementations that it can select
at runtime, but musl uses one generic implementation for all cpus.
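
The difference between the two approaches can be illustrated with a simplified sketch. All names below (`len_generic`, `len_tuned`, `select_len`) are hypothetical, and a plain function pointer stands in for the runtime selection mechanism; this is not how either library is actually structured.

```c
#include <stddef.h>

typedef size_t (*len_fn)(const char *);

/* The single implementation every cpu would run in the
   one-generic-implementation model. */
size_t len_generic(const char *s)
{
	const char *a = s;
	while (*s) s++;
	return (size_t)(s - a);
}

/* Stand-in for a cpu-specific variant; a real one would use
   instructions only available on some cpus. */
size_t len_tuned(const char *s)
{
	const char *a = s;
	while (*s) s++;
	return (size_t)(s - a);
}

/* Runtime-selection model: choose an implementation once, based on a
   feature flag the caller obtained by probing the cpu. */
len_fn select_len(int cpu_has_feature)
{
	return cpu_has_feature ? len_tuned : len_generic;
}
```

With runtime selection a vector variant only has to be fast on cpus that report the feature. With a single generic implementation, the code chosen at build time runs on every cpu the build targets, which is why the questions above about baseline and performance across cpus matter.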
