Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200430235109.GJ21576@brightrain.aerifal.cx>
Date: Thu, 30 Apr 2020 19:51:09 -0400
From: Rich Felker <dalias@...c.org>
To: sidneym@...eaurora.org
Cc: musl@...ts.openwall.com
Subject: Re: Hexagon DSP support

On Thu, Apr 30, 2020 at 05:44:07PM -0500, sidneym@...eaurora.org wrote:
> > -----Original Message-----
> > From: 'Rich Felker' <dalias@...c.org>
> > Sent: Wednesday, April 15, 2020 2:30 PM
> > To: sidneym@...eaurora.org
> > Cc: musl@...ts.openwall.com
> > Subject: Re: [musl] Hexagon DSP support
> > 
> > On Wed, Apr 15, 2020 at 02:12:12PM -0500, sidneym@...eaurora.org wrote:
> > > src/api/ftw.c:44:1: error: switch condition has boolean value
> > > [-Werror,-Wswitch-bool]
> > 
> > Compiling libc-test with -Werror is invalid. It necessarily must produce
> some
> > warnings, and needs to be able to distinguish actual errors from them.
> > 
> > > 8 errors generated.
> > > src/functional/dlopen.c:39: dlsym main failed: Symbol not found: main
> > > FAIL src/functional/dlopen.exe [status 1]
> > > clang-11: warning: argument unused during compilation: '-rdynamic'
> > > [-Wunused-command-line-argument]
> > 
> > This is a clang deficiency that's preventing the test from working.
> > 
> > > src/functional/ipc_msg.c:62: qid_ds.msg_lspid == 0 failed: got 100,
> > > want 0
> > > src/functional/ipc_msg.c:64: (long long)qid_ds.msg_stime == 0 failed:
> > > got 6815993655711498240, want 0
> > > src/functional/ipc_msg.c:67: qid_ds.msg_ctime >= t failed: got
> > > 70368744177664, want >= 154502684032321054
> > > src/functional/ipc_msg.c:71: qid_ds.msg_qbytes > 0 failed: got 0, want
> > > > 0
> > > src/functional/ipc_msg.c:76: qid_ds.msg_qnum == 1 failed: got 0, want
> > > 1
> > > src/functional/ipc_msg.c:77: qid_ds.msg_lspid == getpid() failed: got
> > > 100, want 185819
> > > src/functional/ipc_msg.c:81: msg_stime is 6815993655711498240 want <=
> > > 154502684032321059
> > > src/functional/ipc_msg.c:128: child exit status: 256 FAIL
> > > src/functional/ipc_msg-static.exe [status 1]
> > 
> > This is the sysvipc bits headers issue I mentioned. Same with the sem/shm
> > ones.
> 
> Passing with updated headers and qemu.

Looking at https://github.com/quic/musl/commit/b710ae08dfa8c0841abed44d55eee5e042321bd6
there are new problems:

- Member mode in struct ipc_perm has wrong type. It must be mode_t.
  This can be fixed just by removing the padding. But is there a
  reason you're not using the generic arch-agnostic definition on both
  the kernel and libc side?

- The definitions in bits/msg.h, bits/sem.h, and bits/shm.h are not
  time64-compatible. In fact bits/sem.h is a regression; you had it
  right before and broke it.

- The definition in bits/stat.h is using the wrong types (you need to
  use libc type names for all the members; this is a policy
  requirement to ensure that none inadvertently mismatch). Note that
  there's not even really a need to use an arch-specific header here
  anymore; the kernel definition comes from arch/hexagon/kstat.h and
  the bits/stat.h can just be copied from aarch64 or something else
  with a clean layout.

Also math asm like sqrt.S is not acceptable for upstream.
Arch-specific math files should only be for use of fpu instructions
that perform the operation without the need for any high level flow
control.

> > > src/functional/pthread_mutex.c:145: PTHREAD_MUTEX_ERRORCHECK relock
> > > did not return EDEADLK, got deadlock
> > > src/functional/pthread_mutex.c:148: PTHREAD_MUTEX_RECURSIVE relock
> > did
> > > not succed, got deadlock FAIL src/functional/pthread_mutex-static.exe
> > > [status 1]
> > > src/functional/pthread_mutex.c:145: PTHREAD_MUTEX_ERRORCHECK relock
> > > did not return EDEADLK, got
> > > src/functional/pthread_mutex.c:148: PTHREAD_MUTEX_RECURSIVE relock
> > did
> > > not succed, got deadlock FAIL src/functional/pthread_mutex.exe [status
> > > 1]
> > 
> > This suggests broken atomics. It doesn't look like anything qemu-user
> could
> > cause (unless it's just failing to emulate the atomics at all).
> > 
> Updated QEMU to include time64 changes.  Passing now.

It should not be necessary to update qemu for time64 in order for this
to work. Any change you made to qemu to make this work is almost
certainly wrong. According to my reading of the upstream kernel, there
is an existing stable user-kernel syscall ABI for hexagon, so you
can't change any of the existing types or behavior of any of the
existing syscalls.

Looking at your port, I think I've found the real core problem. In:

https://github.com/quic/musl/blob/b710ae08dfa8c0841abed44d55eee5e042321bd6/arch/hexagon/bits/syscall.h.in

I don't see any of the time64 syscalls. Having them defined is
absolutely mandatory for an arch where the unadorned syscalls do not
use 64-bit time_t. If you add them and revert all the incorrect
changes, I think everything will work.

> > Looks like something is wrong here, possibly wrong struct kstat, or it
> could be a
> > qemu bug.
> 
> It seems like QEMU's TARGET_NR_utimensat needs to changes similar to those
> done for clock_get/settime to account for 64bit time_t.   I made some
> changes but the last 2 checks at 65 and 66 still fail.

No. It does not need any change, and any change to it is wrong. It is
implementing an existing ABI that's immutable and that's fine. What
needs to change is that your syscall.h.in needs to define the values
for the time64 syscalls. This will inform musl that the "plain"
syscalls use 32-bit time_t and get you a correct build. Right now you
just have a badly broken build where you've balanced brokenness in one
place with corresponding brokenness in other places until the right
thing comes out.

> > > src/math/ucb/sqrt.h:47: bad fp exception: RN
> > > sqrt(0x1.fffffffffffffp+1023)=0x1.fffffffffffffp+511, want INEXACT got
> > > 0 [...]
> > > src/math/special/sqrt.h:74: bad fp exception: RN
> > > sqrtl(0x1.9b294f88p-1015)=0x1.cad197e28e85bp-508, want INEXACT got 0
> > > FAIL src/math/sqrtl.exe [status 1]
> > 
> > These are likely all general (not arch-specific) clang floating point
> bugs. We
> > should see if they also happen compiling other archs without asm versions
> of
> > sqrt.
> 
> Assembly version of sqrt does pass.

Yes but writing a large complex function in asm is not reviewable or
maintainable.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.