Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140120074131.GA3529@brightrain.aerifal.cx>
Date: Mon, 20 Jan 2014 02:41:31 -0500
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Discussion of partly-invasive changes needed for x32 port

The x32 port is close to being at a point where it can be integrated,
but unfortunately (due in large part to laziness and lack of foresight
on the part of the kernel folks) there are several points of major
ugliness which we need to work out solutions to. There are pretty much
all "bikeshed" topics, so in a way I hate spending time on them, but
the decisions we make are going to have an impact on how well we can
keep the code from getting hideously ugly.

The basic issues:

1. System calls take 64-bit arguments, whereas the existing syscall
framework in musl (and on the kernel side for all other archs/abis)
uses long arguments.

The current internal syscall macros cast all arguments to long; this
allows accepting either pointers or integer types smaller than long
without the caller having to make any casts. However this does not
work for x32, where some syscalls actually need 64-bit arguments (e.g.
lseek). The following macro, which replaces the long cast on x32,
solves the problem:

#define __scc(X) sizeof(1?(X):0ULL) < 8 ? (unsigned long) (X) : (long long) (X)

On other archs that don't define their own __scc, __scc(X) would just
be defined as (long)(X), preserving the current behavior.

For normal syscalls that go via the inline functions, these inline
functions are defined in syscall_arch.h (arch-specific) where it's
easy to change them to use long long instead of long for their
argument types. However, for cancellable syscalls, they go through
__syscall_cp in src/thread/cancel_impl.c. The proposal so far is to
expose these internals from the internal syscall.h and syscall_arch.h
to cancel_impl.c. I'm not opposed to this, but if we go this way, the
type name should be self-documenting (e.g. klong, klong_t,
syscall_long_t, or something similar) rather than the current proposal
(__sca) that's non-obvious unless you go digging for it. Alernatively,
src/internal/syscall.h could pull in register_t from alltypes.h
(having internal headers use alltypes.h directly is not something we
do yet, but it could be done) and we could simply always use
register_t for these arguments.

2. The kernel interprets the tv_nsec field of struct timespec as a
kernel long (64-bit) rather than a userspace long (32-bit). glibc
handles this completely wrong (bug #16437), making it easy for them.
Doing it correctly is much harder; for all syscalls that take timespec
arguments, musl will have to copy the timespec to a temp structure
with the padding correctly sign-extended before passing it to the
kernel. The affected functions include at least: sigtimedwait, ppoll,
nanosleep, clock_settime, mq_timedsend, mq_timedreceive, pselect,
futimesns, utimesnsat, and everything using timed futex waits
(thankfully these all go through __timedwait in musl).

Adding the workaround code at every syscall point is ugly, and
possibly error-prone. An alternate possible solution is hacking up the
syscall macros in syscall_arch.h to detect "const struct timespec *"
arguments and auto-wrap them with compound literals that would fix-up
the padding. However this probably requires either C11 or GNU C
extensions. On the positive side, it would only affect x32; no changes
at all would be made to the source files or other archs'
syscall_arch.h logic.

An alternate version of this approach is to make a dedicated KTS()
(kernel timespec) macro that takes a struct timespec pointer and
returns a pointer to either the original argument (on sane archs) or a
compound literal copy of it with the sign-extension performed (on
x32). Then all callers would have to use this macro. But I don't like
this from a standpoint that it obfuscates the general code for the
sake of a single broken arch.

3. A number of other structures in the public headers differ from
their userspace C type layout on other archs because they're setup to
match the 64-bit kernel structs. This requires either changing some
member types to match the 64-bit kernel ones (this is usually a bad
idea, and sometimes even non-conforming like with tv_nsec) or adding
adjacent padding members for the unused "high" part of the 64-bit
value. In many cases, this requires either moving structures from the
shared headers to bits/*, or coming up with clever ways to avoid doing
so.

Most of the issues are in the sysv ipc headers:

- ipc.h: There's actually no need for bits/ipc.h now; the structure is
  identical on all archs. But it would be different on x32 unless the
  type of the padding members is changed to be something that's 64-bit
  on x32 but retains its size on all other archs (right now it's
  long). I'd like to remove this bits header.

- sem.h: The time_t members are currently misdeclared as long; this
  needs to be fixed regardless of anything else. Changing all the
  unused/padding to be time_t-sized would work everywhere, including
  x32, but it's semantically ugly, especially if we ever add targets
  where time_t is 64-bit but the 64-bit-padded fields aren't needed
  elsewhere.

- shm.h: The structs shminfo and shm_info are made up of syscall-long
  sized members, at least from a kernel standpoint. These should
  probably be userspace-long plus adjacent padding on the userspace
  side. Either way it looks like we'd need to move them to bits, which
  would be mildly ugly. But it might be possible to write the generic
  header in a way that computes the padding sizes based on sizes of
  other types, or just gets the padding from a macro in the bits
  header; this might even make it possible to unify the definition of
  shmid_ds into the shared header and nearly eliminate the bits
  headers.

-  msg.h: Similar to shm.h, extra padding members are needed.

All of these issues raise the question of whether it would be
beneficial to have available "syscall_long_t" or similar in the public
headers. This would make it very possible to avoid moving anything to
bits (and perhaps even helo move some things back) with some mild to
moderately ugly tricks for computed padding. But the cons are that
exposing the type invites abuse (application code trying to use it)
and the actual struct layouts perhaps become less explicit...

OK, this email is already long enough, so I'll wrap it up for now.
There still are some other x32 issues to discuss I think, but we can
follow up on those later.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.