kernel-hardening - Re: Linux guest kernel threat model for Confidential Computing

Message-ID: <20230202145154.GA10621@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
Date: Thu, 2 Feb 2023 06:51:54 -0800
From: Jeremi Piotrowski <jpiotrowski@...ux.microsoft.com>
To: "Reshetova, Elena" <elena.reshetova@...el.com>
Cc: "jejb@...ux.ibm.com" <jejb@...ux.ibm.com>,
	Leon Romanovsky <leon@...nel.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	"Shishkin, Alexander" <alexander.shishkin@...el.com>,
	"Shutemov, Kirill" <kirill.shutemov@...el.com>,
	"Kuppuswamy, Sathyanarayanan" <sathyanarayanan.kuppuswamy@...el.com>,
	"Kleen, Andi" <andi.kleen@...el.com>,
	"Hansen, Dave" <dave.hansen@...el.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	"Wunner, Lukas" <lukas.wunner@...el.com>,
	Mika Westerberg <mika.westerberg@...ux.intel.com>,
	"Michael S. Tsirkin" <mst@...hat.com>,
	Jason Wang <jasowang@...hat.com>,
	"Poimboe, Josh" <jpoimboe@...hat.com>,
	"aarcange@...hat.com" <aarcange@...hat.com>,
	Cfir Cohen <cfir@...gle.com>, Marc Orr <marcorr@...gle.com>,
	"jbachmann@...gle.com" <jbachmann@...gle.com>,
	"pgonda@...gle.com" <pgonda@...gle.com>,
	"keescook@...omium.org" <keescook@...omium.org>,
	James Morris <jmorris@...ei.org>,
	Michael Kelley <mikelley@...rosoft.com>,
	"Lange, Jon" <jlange@...rosoft.com>,
	"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Kernel Hardening <kernel-hardening@...ts.openwall.com>
Subject: Re: Linux guest kernel threat model for Confidential Computing

On Tue, Jan 31, 2023 at 11:31:28AM +0000, Reshetova, Elena wrote:
> > On Mon, 2023-01-30 at 07:42 +0000, Reshetova, Elena wrote:
> > [...]
> > > > The big threat from most devices (including the thunderbolt
> > > > classes) is that they can DMA all over memory.  However, this isn't
> > > > really a threat in CC (well until PCI becomes able to do encrypted
> > > > DMA) because the device has specific unencrypted buffers set aside
> > > > for the expected DMA. If it writes outside that CC integrity will
> > > > detect it and if it reads outside that it gets unintelligible
> > > > ciphertext.  So we're left with the device trying to trick secrets
> > > > out of us by returning unexpected data.
> > >
> > > Yes, by supplying the input that hasn’t been expected. This is
> > > exactly the case we were trying to fix here for example:
> > > https://lore.kernel.org/all/20230119170633.40944-2-alexander.shishkin@...ux.intel.com/
> > > I do agree that this case is less severe than others where memory
> > > corruption/buffer overrun can happen, like here:
> > > https://lore.kernel.org/all/20230119135721.83345-6-alexander.shishkin@...ux.intel.com/
> > > But we are trying to fix all issues we see now (prioritizing the
> > > second ones though).
> > 
> > I don't see how MSI table sizing is a bug in the category we've
> > defined.  The very text of the changelog says "resulting in a kernel
> > page fault in pci_write_msg_msix()."  which is a crash, which I thought
> > we were agreeing was out of scope for CC attacks?
> 
> As I said, this is an example of a crash, and at first look it
> might not lead to an exploitable condition (albeit attackers are creative).
> But we noticed this one while fuzzing, and it was common enough
> that it prevented the fuzzer from going deeper into the virtio device drivers.
> The core PCI/MSI doesn’t seem to have that many easily triggerable 
> Other examples in virtio patchset are more severe. 
> 
> > 
> > > >
> > > > If I set this as the problem, verifying device correct operation is
> > > > a possible solution (albeit hugely expensive) but there are likely
> > > > many other cheaper ways to defeat or detect a device trying to
> > > > trick us into revealing something.
> > >
> > > What do you have in mind here for the actual devices we need to
> > > enable for CC cases?
> > 
> > Well, the most dangerous devices seem to be the virtio set a CC system
> > will rely on to boot up.  After that, there are other ways (like SPDM)
> > to verify a real PCI device is on the other end of the transaction.
> 
> Yes, in the future, but not yet. Other vendors will not necessarily be
> using virtio devices at this point, so we will have non-virtio and
> non-CC-enabled devices that we want to securely add to the guest.
> 
> > 
> > > We have been using here a combination of extensive fuzzing and static
> > > code analysis.
> > 
> > by fuzzing, I assume you mean fuzzing from the PCI configuration space?
> > Firstly I'm not so sure how useful a tool fuzzing is if we take Oopses
> > off the table because fuzzing primarily triggers those
> 
> If you enable memory sanitizers you can detect more severe conditions like
> out-of-bounds accesses and such. I think, given that we have a way to
> verify that fuzzing is reaching the code locations we want it to reach, it
> can be a pretty effective method to find at least low-hanging bugs. And these
> will be the bugs that most attackers will go after in the first place.
> But of course it is not a formal verification of any kind.
> 
> > so it's hard to
> > see what else it could detect given the signal will be smothered by
> > oopses and secondly I think the PCI interface is likely the wrong place
> > to begin and you should probably begin on the virtio bus and the
> > hypervisor generated configuration space.
> 
> This is exactly what we do. We don’t fuzz from the PCI config space;
> we supply inputs from the host/VMM via the legitimate interfaces through
> which it can inject them into the guest: whenever the guest requests a PCI
> config space read operation (the config space is controlled by the
> host/hypervisor, as you said), it gets input injected by the kAFL fuzzer.
> The same goes for other interfaces that are under control of the host/VMM
> (MSRs, port IO, MMIO, anything that goes via the #VE handler in our case).
> When it comes to virtio, we employ two different fuzzing techniques:
> directly injecting kAFL fuzz input when the virtio core or virtio drivers
> get data received from the host (via injecting input in functions
> virtio16/32/64_to_cpu and others) and directly fuzzing DMA memory pages
> using the kfx fuzzer.
> More information can be found in https://intel.github.io/ccc-linux-guest-hardening-docs/tdx-guest-hardening.html#td-guest-fuzzing
> 
> Best Regards,
> Elena.

Hi Elena,

I think it might be a good idea to narrow down a configuration that *can*
reasonably be hardened to be suitable for confidential computing, before
proceeding with fuzzing. E.g., a lot of time was spent discussing PCI devices
in the context of virtualization, but what about taking PCI out of scope
completely by switching to virtio-mmio devices?

Jeremi
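
The input-injection technique described in the quoted message (substituting
fuzzer-chosen bytes for values the guest reads from the host, inside helpers
such as virtio16_to_cpu) can be sketched roughly as below. This is a
user-space illustration only, not the kAFL harness and not kernel code:
`struct fuzz_buf`, `fuzz_take()`, `swap16()`, `host_is_little_endian()` and
`sketch_virtio16_to_cpu()` are hypothetical names introduced here, and the
helper takes a plain boolean for the device byte order as a simplifying
assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fuzz input source: a byte buffer supplied by the fuzzer. */
struct fuzz_buf {
    const uint8_t *data;
    size_t len;
    size_t pos;
};

/* Copy n bytes of fuzz input into out; false if no buffer or input exhausted. */
static bool fuzz_take(struct fuzz_buf *fb, void *out, size_t n)
{
    if (fb == NULL || fb->len - fb->pos < n)
        return false;
    memcpy(out, fb->data + fb->pos, n);
    fb->pos += n;
    return true;
}

/* Swap the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Detect the byte order of the machine running this sketch. */
static bool host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}

/*
 * Sketch of a 16-bit conversion helper with an injection point (assumption:
 * modelled loosely on the virtio16_to_cpu idea named in the thread).
 * If fuzz input is available, the host-supplied value is replaced by it
 * before the usual byte-order conversion; otherwise val passes through.
 */
static uint16_t sketch_virtio16_to_cpu(struct fuzz_buf *fb,
                                       bool little_endian, uint16_t val)
{
    uint16_t injected;

    if (fuzz_take(fb, &injected, sizeof(injected)))
        val = injected; /* host-controlled value replaced by fuzz input */

    if (little_endian != host_is_little_endian())
        val = swap16(val);
    return val;
}
```

Hooking at the conversion helper means every caller that consumes a
host-provided 16-bit field sees fuzzed data without per-driver changes; once
the buffer is exhausted (or when `fb` is NULL) the helper behaves like a plain
byte-order conversion.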