Message-ID: <20200722141404.jfzfl3alpyw7o7dw@steredhat>
Date: Wed, 22 Jul 2020 16:14:04 +0200
From: Stefano Garzarella <sgarzare@...hat.com>
To: Daurnimator <quae@...rnimator.com>
Cc: Jens Axboe <axboe@...nel.dk>, Alexander Viro <viro@...iv.linux.org.uk>,
	Kernel Hardening <kernel-hardening@...ts.openwall.com>,
	Kees Cook <keescook@...omium.org>, Aleksa Sarai <asarai@...e.de>,
	Stefan Hajnoczi <stefanha@...hat.com>,
	Christian Brauner <christian.brauner@...ntu.com>,
	Sargun Dhillon <sargun@...gun.me>, Jann Horn <jannh@...gle.com>,
	io-uring <io-uring@...r.kernel.org>, linux-fsdevel@...r.kernel.org,
	Jeff Moyer <jmoyer@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC v2 2/3] io_uring: add IOURING_REGISTER_RESTRICTIONS opcode

On Wed, Jul 22, 2020 at 12:35:15PM +1000, Daurnimator wrote:
> On Wed, 22 Jul 2020 at 03:11, Jens Axboe <axboe@...nel.dk> wrote:
> >
> > On 7/21/20 4:40 AM, Stefano Garzarella wrote:
> > > On Thu, Jul 16, 2020 at 03:26:51PM -0600, Jens Axboe wrote:
> > >> On 7/16/20 6:48 AM, Stefano Garzarella wrote:
> > >>> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> > >>> index efc50bd0af34..0774d5382c65 100644
> > >>> --- a/include/uapi/linux/io_uring.h
> > >>> +++ b/include/uapi/linux/io_uring.h
> > >>> @@ -265,6 +265,7 @@ enum {
> > >>>  	IORING_REGISTER_PROBE,
> > >>>  	IORING_REGISTER_PERSONALITY,
> > >>>  	IORING_UNREGISTER_PERSONALITY,
> > >>> +	IORING_REGISTER_RESTRICTIONS,
> > >>>
> > >>>  	/* this goes last */
> > >>>  	IORING_REGISTER_LAST
> > >>> @@ -293,4 +294,30 @@ struct io_uring_probe {
> > >>>  	struct io_uring_probe_op ops[0];
> > >>>  };
> > >>>
> > >>> +struct io_uring_restriction {
> > >>> +	__u16 opcode;
> > >>> +	union {
> > >>> +		__u8 register_op; /* IORING_RESTRICTION_REGISTER_OP */
> > >>> +		__u8 sqe_op;      /* IORING_RESTRICTION_SQE_OP */
> > >>> +	};
> > >>> +	__u8 resv;
> > >>> +	__u32 resv2[3];
> > >>> +};
> > >>> +
> > >>> +/*
> > >>> + * io_uring_restriction->opcode values
> > >>> + */
> > >>> +enum {
> > >>> +	/* Allow an io_uring_register(2) opcode */
> > >>> +	IORING_RESTRICTION_REGISTER_OP,
> > >>> +
> > >>> +	/* Allow an sqe opcode */
> > >>> +	IORING_RESTRICTION_SQE_OP,
> > >>> +
> > >>> +	/* Only allow fixed files */
> > >>> +	IORING_RESTRICTION_FIXED_FILES_ONLY,
> > >>> +
> > >>> +	IORING_RESTRICTION_LAST
> > >>> +};
> > >>> +
> > >>
> > >> Not sure I totally love this API. Maybe it'd be cleaner to have separate
> > >> ops for this, instead of muxing it like this. One for registering op
> > >> code restrictions, and one for disallowing other parts (like fixed
> > >> files, etc).
> > >>
> > >> I think that would look a lot cleaner than the above.
> > >>
> > >
> > > Talking with Stefan, an alternative, maybe more near to your suggestion,
> > > would be to remove the 'struct io_uring_restriction' and add the
> > > following register ops:
> > >
> > >     /* Allow an sqe opcode */
> > >     IORING_REGISTER_RESTRICTION_SQE_OP
> > >
> > >     /* Allow an io_uring_register(2) opcode */
> > >     IORING_REGISTER_RESTRICTION_REG_OP
> > >
> > >     /* Register IORING_RESTRICTION_* */
> > >     IORING_REGISTER_RESTRICTION_OP
> > >
> > >
> > >     enum {
> > >         /* Only allow fixed files */
> > >         IORING_RESTRICTION_FIXED_FILES_ONLY,
> > >
> > >         IORING_RESTRICTION_LAST
> > >     }
> > >
> > >
> > > We can also enable restriction only when the rings started, to avoid to
> > > register IORING_REGISTER_ENABLE_RINGS opcode. Once rings are started,
> > > the restrictions cannot be changed or disabled.
> >
> > My concerns are largely:
> >
> > 1) An API that's straight forward to use
> > 2) Something that'll work with future changes
> >
> > The "allow these opcodes" is straightforward, and ditto for the register
> > opcodes. The fixed file I guess is the odd one out. So if we need to
> > disallow things in the future, we'll need to add a new restriction
> > sub-op. Should this perhaps be "these flags must be set", and that could
> > easily be augmented with "these flags must not be set"?
> >
> > --
> > Jens Axboe
> >
>
> This is starting to sound a lot like seccomp filtering.
> Perhaps we should go straight to adding a BPF hook that fires when
> reading off the submission queue?
>

You're right. I e-mailed with Kees Cook about that [1] and he agreed that
the restrictions in io_uring should allow us to address some issues that
are a bit difficult to handle with seccomp. For example:
- different restrictions for different io_uring instances in the same
  process
- limit SQEs to use only registered fds and buffers

Maybe seccomp could take advantage of the restrictions to filter SQE opcodes.

Thanks,
Stefano

[1] https://lore.kernel.org/io-uring/202007160751.ED56C55@keescook/
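
For illustration only, the per-ring checks discussed in this thread (an allow-list of SQE opcodes plus "these flags must be set" and "these flags must not be set" masks) could be modelled in userspace roughly as below. This is a sketch under assumptions, not the patch: `struct ring_restrictions`, `allow_sqe_op()`, `sqe_allowed()`, `example_restrictions()` and every `MODEL_*` name and value are hypothetical and are not part of the io_uring UAPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: at most 64 opcodes, tracked in one 64-bit bitmap. */
#define MODEL_MAX_OPS 64

/* Illustrative opcode and flag values; the real ones live in <linux/io_uring.h>. */
enum { MODEL_OP_NOP = 0, MODEL_OP_READV = 1, MODEL_OP_WRITEV = 2 };
#define MODEL_SQE_FIXED_FILE (1u << 0)

/* One instance per ring, so two rings in one process can differ. */
struct ring_restrictions {
	uint64_t allowed_sqe_ops;     /* bit N set => SQE opcode N is allowed */
	uint8_t  required_sqe_flags;  /* flags every SQE must carry */
	uint8_t  forbidden_sqe_flags; /* flags no SQE may carry */
};

/* Add one opcode to the allow-list. */
static void allow_sqe_op(struct ring_restrictions *r, unsigned op)
{
	assert(op < MODEL_MAX_OPS);
	r->allowed_sqe_ops |= UINT64_C(1) << op;
}

/* Return true only if the opcode is allow-listed and the flag masks are satisfied. */
static bool sqe_allowed(const struct ring_restrictions *r, unsigned op,
			uint8_t flags)
{
	if (op >= MODEL_MAX_OPS || !(r->allowed_sqe_ops & (UINT64_C(1) << op)))
		return false;
	if ((flags & r->required_sqe_flags) != r->required_sqe_flags)
		return false;
	if (flags & r->forbidden_sqe_flags)
		return false;
	return true;
}

/* Example policy: only READV/WRITEV, and only on fixed (registered) files. */
static struct ring_restrictions example_restrictions(void)
{
	struct ring_restrictions r = {0};

	allow_sqe_op(&r, MODEL_OP_READV);
	allow_sqe_op(&r, MODEL_OP_WRITEV);
	r.required_sqe_flags = MODEL_SQE_FIXED_FILE;
	return r;
}
```

With this example policy, `sqe_allowed(&r, MODEL_OP_READV, MODEL_SQE_FIXED_FILE)` passes, while a READV without the fixed-file flag or a NOP is rejected. Expressing "fixed files only" as a required-flags mask is one way to avoid a dedicated sub-op for each future restriction.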