Bug 1838633 - UEFI HTTP out of memory error when booting larger LiveCD
Summary: UEFI HTTP out of memory error when booting larger LiveCD
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: grub2
Version: 32
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Peter Jones
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-05-21 13:22 UTC by Lukas Zapletal
Modified: 2020-12-21 17:18 UTC

Fixed In Version: grub2-2.04-19.fc32
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-29 04:09:24 UTC
Type: Bug
Embargoed:


Attachments
OOM grub error (2.64 KB, image/png)
2020-05-21 13:22 UTC, Lukas Zapletal
OOM kernel error (30.62 KB, image/png)
2020-05-21 13:23 UTC, Lukas Zapletal
Screenshot from MEN server (Intel Atom) (104.56 KB, image/jpeg)
2020-05-25 10:43 UTC, Lukas Zapletal
Screen from libvirt with debug=all (16.31 KB, image/png)
2020-05-25 10:49 UTC, Lukas Zapletal

Description Lukas Zapletal 2020-05-21 13:22:36 UTC
Created attachment 1690644
OOM grub error

Hello,

We have a customer who would like to do UEFI HTTP Boot over a tagged VLAN. Unfortunately, this does not work. GRUB prints an "Out of memory" error for a moment (see the attachment), and then the kernel prints an error about not being able to open the root device.

For the record, this is booting a LiveCD over UEFI HTTP Boot in a libvirt VM, with VLAN ID 13 set in the EFI firmware.

  menuentry 'Foreman Discovery Image EFI' --id discovery {
    linuxefi boot/fdi-image/vmlinuz0 rootflags=loop root=live:/fdi.iso rootfstype=auto ro rd.live.image acpi=force rd.luks=0 rd.md=0 rd.dm=0 rd.lvm=0 rd.bootif=0 rd.neednet=0 nokaslr nomodeset proxy.url=https://sat68.nat.lan proxy.type=foreman BOOTIF=01-$net_default_mac fdi.vlan=13
    initrdefi boot/fdi-image/initrd0.img
  }

I am using the latest grub2 from Fedora:

https://koji.fedoraproject.org/koji/buildinfo?buildID=1509008

Comment 1 Lukas Zapletal 2020-05-21 13:23:30 UTC
Created attachment 1690645
OOM kernel error

Comment 2 Lukas Zapletal 2020-05-21 13:30:29 UTC
I am seeing the same behaviour on a native (untagged) network as well.

Comment 3 Javier Martinez Canillas 2020-05-21 14:19:20 UTC
Hello Lukas,

Could you please set debug=all to get more information on where this out-of-memory error is happening?

Comment 4 Lukas Zapletal 2020-05-25 10:32:36 UTC
Hello Javier,

I was on a call with a customer who tried the same on their hardware, with the same result. We tried with

set debug="http,efinet,net"

Unfortunately, not much is logged. I am attaching the screenshot from their hardware, but it reads the same. When we tried debug=all, the output never ended: it kept scrolling for an hour before the error appeared. The "Press a key to continue" prompt actually just waits a few seconds and then GRUB tries to boot the system, so we are not able to reliably capture anything for you. Please advise how to capture the debug output; maybe I could record a video and then seek through the recording. Alternatively, is there an option to prevent GRUB from booting when this error appears?

You can probably reproduce this in libvirt too; you just need to boot a big enough live CD. You can use Fedora or, in our case, the Discovery Image, a 300 MB RHEL7/CentOS7 image created with livecd-creator: http://downloads.theforeman.org/discovery/nightly/fdi-image-latest.tar

Comment 5 Lukas Zapletal 2020-05-25 10:43:29 UTC
Created attachment 1691896
Screenshot from MEN server (Intel Atom)

Comment 6 Lukas Zapletal 2020-05-25 10:49:24 UTC
Created attachment 1691898
Screen from libvirt with debug=all

Comment 7 Javier Martinez Canillas 2020-05-25 16:28:30 UTC
I'm changing this to Fedora 32 since it is a regression in GRUB 2.04. It doesn't affect RHEL7 or RHEL8.

Comment 8 Javier Martinez Canillas 2020-05-25 16:35:34 UTC
The memory allocation failure happens in the verifiers framework, because it reads each file to be verified as a single chunk before passing it to the verifier modules (e.g. the tpm module). So the issue occurs when an initrd image is large (in Lukas' test the initrd size is 237 MiB) and GRUB tries to verify it.

By default GRUB requests a quarter of the system's available memory for its heap, which is enough for most cases but not when using large initrd images together with the verifiers framework. One option could be to change GRUB's default so that it requests a larger share of the available memory for the heap.
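
To make the numbers above concrete, here is a rough back-of-the-envelope sketch in Python. It only uses the two figures from this comment (a heap of one quarter of the available memory, and a 237 MiB initrd). It assumes the heap holds nothing else and ignores fragmentation, so the real threshold will be higher. The function names are made up for illustration and are not GRUB APIs.

```python
MIB = 1024 * 1024
HEAP_DIVISOR = 4  # from the comment: heap = available memory / 4


def heap_size(available_memory):
    """Heap size under the 'one quarter of available memory' default."""
    return available_memory // HEAP_DIVISOR


def min_memory_for_file(file_size):
    """Smallest available memory whose quarter-size heap can hold the file
    as one chunk (assumes an otherwise empty, unfragmented heap)."""
    return file_size * HEAP_DIVISOR


initrd_size = 237 * MIB
print(min_memory_for_file(initrd_size) // MIB)   # 948
print(heap_size(512 * MIB) >= initrd_size)       # False
print(heap_size(1024 * MIB) >= initrd_size)      # True
```

Under these simplifying assumptions, a single whole-file buffer for this initrd already needs close to 1 GiB of available memory.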

Comment 9 Javier Martinez Canillas 2020-05-25 18:17:33 UTC
I found the issue: now that the tpm module is built into the EFI binary, GRUB allocates two buffers to read the initrd image.

One buffer is allocated in the linux EFI loader and used as a bounce buffer because some machines aren't able to DMA above 4 GB during EFI. The other buffer is allocated by the verifiers framework as mentioned in Comment 8, to read the file and pass it as a single chunk to the tpm module and other modules using the verifiers API.

Since the initrd image is quite big, there isn't enough memory in the heap to allocate two buffers of that size. But when using the verifiers framework, the read operation is just a memory copy from the buffer that was used to read the file in the verifiers open handler, so there is no need for a bounce buffer in the linux EFI loader anymore.
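
The effect of the second buffer can be illustrated with a toy model in Python. This is not GRUB code: `ToyHeap` and `load_initrd` are hypothetical names, and the 256 MiB heap is an assumption (a quarter of a hypothetical 1 GiB VM). Only the 237 MiB initrd size and the "two full-size buffers" behaviour come from the comments above.

```python
MIB = 1024 * 1024


class ToyHeap:
    """Fixed-size heap that only tracks how many bytes are in use."""

    def __init__(self, size):
        self.size = size
        self.used = 0

    def alloc(self, nbytes):
        if self.used + nbytes > self.size:
            raise MemoryError("out of memory")
        self.used += nbytes


def load_initrd(heap_size, initrd_size, use_bounce_buffer):
    """Return True if every buffer needed to load the initrd fits in the heap."""
    heap = ToyHeap(heap_size)
    try:
        # Verifiers open handler: the whole file is read into one buffer.
        heap.alloc(initrd_size)
        if use_bounce_buffer:
            # Loader: a second full-size buffer for the same file.
            heap.alloc(initrd_size)
    except MemoryError:
        return False
    return True


heap_size = 256 * MIB      # assumed heap size, not taken from the bug
initrd_size = 237 * MIB    # initrd size reported in Comment 8
print(load_initrd(heap_size, initrd_size, use_bounce_buffer=True))   # False
print(load_initrd(heap_size, initrd_size, use_bounce_buffer=False))  # True
```

In this model, dropping the bounce buffer halves the peak heap usage for the initrd, which is why the load succeeds without changing the heap size.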

Comment 10 Fedora Update System 2020-05-26 16:44:27 UTC
FEDORA-2020-193b04db8e has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-193b04db8e

Comment 11 Fedora Update System 2020-05-27 02:21:27 UTC
FEDORA-2020-193b04db8e has been pushed to the Fedora 32 testing repository.
In a short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-193b04db8e`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-193b04db8e

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 12 Fedora Update System 2020-05-29 04:09:24 UTC
FEDORA-2020-193b04db8e has been pushed to the Fedora 32 stable repository.
If the problem still persists, please make a note of it in this bug report.

