Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 1 Sep 2023 09:57:34 -0400
From: Rich Felker <dalias@...c.org>
To: Natanael Copa <ncopa@...inelinux.org>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH] add close_range() syscall wrapper

On Fri, Sep 01, 2023 at 10:02:00AM +0200, Natanael Copa wrote:
> close_range() is a syscall present in FreeBSD 8.0 and Linux 5.9. glibc
> 2.34 added a wrapper.
> 
> Expose it under _GNU_SOURCE similar to what GNU libc does. Also expose
> it under _BSD_SOURCE since it is also a FreeBSD function.
> ---
> 
> This is a re-take of close_range() syscall wrapper. It was previously
> discussed in mailing list: https://www.openwall.com/lists/musl/2022/08/18/5
> 
> Difference from previous submission:
> 
> - Use correct values for CLOSE_RANGE_UNSHARE and CLOSE_RANGE_CLOEXEC.
> - Set errno on errors.
> - Drop the (unsigned int) cast for flags, as it raised questions last
>   time.
> 
> I think it is a good idea to add this to musl because it is difficult to
> the close before exec properly without it. 
> 
> Most workaounrds currently out there are either parsing /proc or try to
> close everything to maxfd reported by getrlimit(), sysconf() or
> getdtablesize().
> 
> opendir("/proc/self/fd") is problematic becase 1) /proc may not be
> mounted and 2) some versions of musl hangs on malloc between fork and 
> exec.
> 
> Trying to close everything between a maxfd is also problematic. On some
> systems (under docker for example) the maxfd can be 1G, which
> effectively results in a hang. (See https://github.com/k0sproject/k0s/pull/3436)
> 
> I think its better to encourage the use of close_range().
> 
> See also: https://github.com/OpenRC/openrc/pull/645
> 
>  include/unistd.h        | 3 +++
>  src/linux/close_range.c | 8 ++++++++
>  2 files changed, 11 insertions(+)
>  create mode 100644 src/linux/close_range.c
> 
> diff --git a/include/unistd.h b/include/unistd.h
> index 5bc7f798..d89e3d4c 100644
> --- a/include/unistd.h
> +++ b/include/unistd.h
> @@ -161,6 +161,9 @@ unsigned ualarm(unsigned, unsigned);
>  #define L_INCR 1
>  #define L_XTND 2
>  int brk(void *);
> +#define CLOSE_RANGE_UNSHARE	(1U << 1)
> +#define CLOSE_RANGE_CLOEXEC	(1U << 2)
> +int close_range(unsigned int, unsigned int, int);
>  void *sbrk(intptr_t);
>  pid_t vfork(void);
>  int vhangup(void);
> diff --git a/src/linux/close_range.c b/src/linux/close_range.c
> new file mode 100644
> index 00000000..258ba8bd
> --- /dev/null
> +++ b/src/linux/close_range.c
> @@ -0,0 +1,8 @@
> +#define _GNU_SOURCE
> +#include <unistd.h>
> +#include "syscall.h"
> +
> +int close_range(unsigned int first, unsigned int last, int flags)
> +{
> +	return __syscall_ret(syscall(SYS_close_range, first, last, flags));
> +}
> -- 
> 2.42.0

This is double-processing errno. You need either return
__syscall_ret(__syscall(...)) (note the second __) or just return
syscall(...) (the syscall macro without __ automatically does the
__syscall_ret).

Aside from that, I think there's a question whether, if we support
this as a function rather than leaving it to the application to use
the syscall, we should provide a fallback for ENOSYS. I'm not sure,
but it's something that should be considered before adding it.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.