[svsm-devel] EDK2 CAA Page Fragmented Allocation
Gerd Hoffmann
kraxel at redhat.com
Mon May 19 17:30:17 CEST 2025
Hi,
> I've been debugging a boot failure on VMs running under Coconut-SVSM
> with 120 or more vCPUs. The guest kernel (on several distros) is
> hanging while trying to allocate a 256MiB memblock out of "low" memory
> for the SWIOTLB.
>
> While debugging this, I noticed that the memory map reported by EDK2
> has a ton of entries. There are the expected reserved regions, but
> then there's a section of about 2x #vcpus where there is 1 reserved
> and 1 usable page alternating. It turns out the reserved pages are
> allocated as CAA pages and then usable pages are allocated (and then
> freed) as VMSA pages at this[1] point in EDK2. Note that while these
> pages are allocated many times over the various UEFI phases, it's only
> the CAA pages from the first allocation that are getting leaked.
Doesn't match what I'm seeing here, using latest upstream edk2.
edk2 allocates three pages (when running under svsm). Usually it
will free the first, use the second as vmsa and the third as caa. In
case the vmsa page happens to be 2M aligned it'll instead use the first
as vmsa, second as caa and free the third to work around a processor bug.
Due to top-down allocation this packs the pages next to each other, so I
see a single reserved memory block with two pages per processor (except
boot processor).
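To make the selection rule above concrete, here is a minimal sketch in C.
It is not the edk2 code: the names (ap_page_layout, pick_ap_pages,
SKETCH_PAGE_SIZE, SKETCH_SIZE_2MB) are hypothetical, and it assumes the
three pages are contiguous, 4K in size, and start at a page-aligned
address "base".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for this sketch (4K pages, 2M alignment check). */
#define SKETCH_PAGE_SIZE 0x1000ULL
#define SKETCH_SIZE_2MB  0x200000ULL

/* Which of the three allocated pages ends up in which role. */
typedef struct {
  uint64_t vmsa;   /* page used as VMSA */
  uint64_t caa;    /* page used as CAA */
  uint64_t freed;  /* page handed back to the allocator */
} ap_page_layout;

/*
 * Given the base address of three contiguous pages, pick the roles as
 * described above: normally free the first page and use the second as
 * VMSA and the third as CAA. If the second page (the default VMSA
 * candidate) is 2M aligned, use the first as VMSA, the second as CAA
 * and free the third instead.
 */
ap_page_layout pick_ap_pages(uint64_t base)
{
  ap_page_layout layout;
  uint64_t first  = base;
  uint64_t second = base + SKETCH_PAGE_SIZE;
  uint64_t third  = base + 2 * SKETCH_PAGE_SIZE;

  /* Assumption of this sketch: the allocation is page aligned. */
  assert((base % SKETCH_PAGE_SIZE) == 0);

  if ((second % SKETCH_SIZE_2MB) == 0) {
    /* Default VMSA page would be 2M aligned: shift everything down. */
    layout.vmsa  = first;
    layout.caa   = second;
    layout.freed = third;
  } else {
    layout.freed = first;
    layout.vmsa  = second;
    layout.caa   = third;
  }
  return layout;
}
```

In both branches the vmsa and caa pages are adjacent, which fits the
two-pages-per-processor block I see in the memory map.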
> I had a few questions that I thought someone here might know the answer to:
> 1. Is UEFI supposed to keep these CAA pages allocated? I believe that
> UEFI is supposed to be able to talk to the TPM post-ExitBootServices
> and that would likely require CAA pages, but I might be missing
> something.
Well. Linux relocates the CAA page, and I don't think there is any way
for UEFI runtime services to use that. First because UEFI can't figure
out the address of the Linux CAA page. And second because linux wouldn't
map the CAA page into the efi runtime service sandbox anyway.
UEFI runtime services having their own CAA page doesn't work either
because the CAA page address is registered in SVSM, and once the linux
kernel has started using its own CAA page the UEFI CAA page stops working.
For the TPM this should not be an issue. There are no TPM runtime
services, the linux kernel talks to the (v)TPM using its own driver.
For the UEFI variable service I'm working on this /is/ a problem though.
I simply can't do SVSM protocol calls after ExitBootServices. I guess my
options are:
(a) Find some other way to call into svsm.
- Is it possible to emulate MMIO devices by letting page access
trap into SVSM?
(b) Write a linux driver for svsm efi variable access.
Hints or opinions anyone?
> 2. How are the other CAA pages being freed? The commit that adds the
> allocation [2] does not add any corresponding frees
No clue. Apparently the pages are allocated twice, once in PEI phase,
once in DXE phase, and only the PEI phase allocations stick. I had
expected both to stay because they are allocated as Reserved (so they are
not freed automatically at ExitBootServices) and -- as you already
figured -- there is no explicit free call.
take care,
Gerd