Bug 1572126 - allocation errors with secureboot grub version
Summary: allocation errors with secureboot grub version
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: grub2
Version: 28
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Peter Jones
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-26 09:02 UTC by Ferry
Modified: 2019-05-28 22:25 UTC (History)

Fixed In Version: grub2-2.02-51.fc29 grub2-2.02-57.fc29
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-05-28 22:25:02 UTC
Type: Bug


Attachments
Pictures of the 4k/15" screen when running lsmmap (670.19 KB, application/zip)
2018-06-18 15:02 UTC, Ferry
Proposed patch that may help the allocation problem (2.60 KB, patch)
2018-06-19 14:29 UTC, Peter Jones

Description Ferry 2018-04-26 09:02:02 UTC
Description of problem:

We use a rather large initrd, around 300-500MB for stateless systems. This is based on Fedora 27. Also tried with the current version in the repositories for 28, it gives the same error.

Versions tested (grub2-efi-x64):
2.02-22.fc27               
2.02-26.fc28

On a Dell XPS 15 9550 grub will state when loading initrd:

error: can't allocate initrd.

This is the first system we see this issue on, whilst we run this on a *lot* of different hardware configurations. One of the most notable differences is that this system actually has much more memory than usual. Most of the hardware has 4GB (this is our minimum requirement), occasionally there's hardware with 8GB. This system has 16GB provided by 2 * 8GB modules.


How reproducible:

Make a 300MB+ initrd (doubt the compression method matters, ours has this size after xz -9 compression, extracted it's ~1200MB), grab a stock Fedora kernel and try loading it from USB stick with grub2 (it does run below shim, for secureboot).

For example, make a FAT32 formatted USB stick, mount it at /mnt/t and then:

mkdir -p /mnt/t/EFI/BOOT
cp /boot/efi/EFI/fedora/shimx64.efi /mnt/t/EFI/BOOT/bootx64.efi
cp /boot/efi/EFI/fedora/grubx64.efi /mnt/t/EFI/BOOT/bootx64.efi
cp /boot/vmlinuz-4.15.17-300.fc27.x86_64 /mnt/t/vmlinuz
cp <300MB+ initrd> /mnt/t/initrd
cat > /mnt/t/EFI/BOOT/grub.cfg << EOF
echo Loading kernel to RAM
linuxefi /vmlinuz
echo Loading initrd to RAM
initrdefi /initrd
echo Attempting boot
boot
EOF

And attempt booting from the stick. The error will appear immediately after the 'Loading initrd to RAM' message. The kernel will start, but fails with a kernel panic about not being able to find the root file system, which is to be expected as it cannot find the initrd.

Please let me know if I can be of further assistance.

I've filed this against 28. Not sure what the best approach is here. It's applicable to 27 as well. If I remember correctly the issue also occurred with the grub2 from 25 or 26. That also gave an allocation error, but did have something with HIGH or HIGH mem in the message.

Comment 1 Ferry 2018-04-26 09:03:49 UTC
Correction:

cp /boot/efi/EFI/fedora/grubx64.efi /mnt/t/EFI/BOOT/bootx64.efi

should be

cp /boot/efi/EFI/fedora/grubx64.efi /mnt/t/EFI/BOOT/
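
Put together, the stick preparation from the description with this correction applied looks roughly like the sketch below. The function name prepare_stick and its argument order are made up for illustration; the file names and the grub.cfg contents are the ones from the description.

```shell
#!/bin/sh
# Sketch: populate a mounted FAT32 stick for the reproduction.
# prepare_stick is a hypothetical helper; all four arguments are paths.
prepare_stick() {
    local efi_src kernel initrd stick boot_dir
    efi_src=$1; kernel=$2; initrd=$3; stick=$4
    boot_dir="$stick/EFI/BOOT"

    mkdir -p "$boot_dir" || return 1
    # shim is installed as the default loader name, grub keeps its own name
    cp "$efi_src/shimx64.efi" "$boot_dir/bootx64.efi" || return 1
    cp "$efi_src/grubx64.efi" "$boot_dir/" || return 1
    cp "$kernel" "$stick/vmlinuz" || return 1
    cp "$initrd" "$stick/initrd" || return 1

    # Same grub.cfg as in the description
    cat > "$boot_dir/grub.cfg" << 'EOF'
echo Loading kernel to RAM
linuxefi /vmlinuz
echo Loading initrd to RAM
initrdefi /initrd
echo Attempting boot
boot
EOF
}

# Example with the paths from the description (initrd path is a placeholder):
# prepare_stick /boot/efi/EFI/fedora /boot/vmlinuz-4.15.17-300.fc27.x86_64 /path/to/initrd /mnt/t
```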

Comment 2 Ferry 2018-04-26 09:09:16 UTC
FYI: Just tested with shim removed and secure boot disabled. Assuming the example above

mv /mnt/t/bootx64.efi /mnt/t/shimx64.efi
mv /mnt/t/grubx64.efi /mnt/t/bootx64.efi

and attempted boot.

Same error.

Comment 3 Ferry 2018-04-26 10:52:11 UTC
Just tested with a DIMM removed (tried twice, DIMM-A removed @1, DIMM-B removed @2), makes no difference.

Comment 4 Peter Jones 2018-06-12 16:12:56 UTC
Can you add a dump of the memory map before and after trying to load that initramfs?

Comment 5 Ferry 2018-06-13 10:59:22 UTC
Sure, but do you have some pointers for me on how to provide this?

The secure boot version is stripped of quite a few functions.

First things I found were

displaymem -> not found, seems replaced with lsmmap in grub2
lsmmap -> not found
dump -> requires me to specify an address (range?). Don't know what address(range) you require for this. Dumping 16GB byte for byte and typing it over seems quite painful :)

Thanks :)

Comment 6 Ferry 2018-06-18 15:01:04 UTC
Hi Peter,

created an efi image with grub-mkstandalone on my Gentoo system, as the secure boot version doesn't have lsmmap.

This is based on Gentoo's grub 2.02-r1 package. Messages differ a bit (says out of memory instead of allocation error).

What's also notable is that all grub versions I've tried showed me a 2TB disk. There is *no* 2TB disk in the system. The only thing I can imagine it's some RAMdisk, perhaps created by the firmware (or grub?). It's gone by the time linux has booted (doesn't show up in lsblk anyways).

FYI: The system has a 250GB M.2 SSD and a 500GB SATA SSD. Both from Samsung. There's also a USB stick plugged in, from which I boot. The 2TB disk remains surprising. Maybe it's used for firmware flashing or something, but 2TB seems rather insane for that purpose, especially since it's backed by only 16GB of RAM.

I had to type this over as this machine doesn't have serial and I don't have a magic XHCI cable :). Checked it several times, quite sure there are no typos. Will provide a photo too. 4k screen on 15" though, it's rather small :).

One thing I do find surprising is in the output of lsmmap. The addresses keep increasing, like they are in order, until:
base_addr = 0x100000000, length = 0x3be000000, available RAM

after which it jumps back to

base_addr = 0xa0000, length = 0x60000, reserved RAM

Not sure if this is normal behaviour from grub.


grub> insmod lsmmap
grub> lsmmap
base_addr = 0x0, length = 0x58000, available RAM
base_addr = 0x58000, length = 0x1000, reserved RAM
base_addr = 0x59000, length = 0x45000, available RAM
base_addr = 0x9e000, length = 0x2000, reserved RAM
base_addr = 0x100000, length = 0xb0cf000, available RAM
base_addr = 0xb1cf000, length = 0x40000, available RAM
base_addr = 0xb20f000, length = 0x13a89000, available RAM
base_addr = 0x1ec98000, length = 0xa465000, available RAM
base_addr = 0x290fd000, length = 0x525000, available RAM
base_addr = 0x29622000, length = 0x1af2000, available RAM
base_addr = 0x2b114000, length = 0x1000, ACPI non-volatile storage RAM
base_addr = 0x2b115000, length = 0x4a000, reserved RAM
base_addr = 0x2b15f000, length = 0x60000, available RAM
base_addr = 0x2b1bf000, length = 0x8262000, available RAM
base_addr = 0x33421000, length = 0x2f2f000, available RAM
base_addr = 0x36350000, length = 0x1d0000, available RAM
base_addr = 0x36520000, length = 0x741000, available RAM
base_addr = 0x36c61000, length = 0x3bb000, reserved RAM
base_addr = 0x3701c000, length = 0x3e000, ACPI reclaimable RAM
base_addr = 0x3705a000, length = 0x655000, ACPI non-volatile storage RAM
base_addr = 0x376af000, length = 0x2d7d000, reserved RAM
base_addr = 0x3a42c000, length = 0xd3000, RAM holding firmware code
base_addr = 0x3a4ff000, length = 0x1000, available RAM
base_addr = 0x100000000, length = 0x3be000000, available RAM
base_addr = 0xa0000, length = 0x60000, reserved RAM
base_addr = 0x3a500000, length = 0x5b00000, reserved RAM
base_addr = 0xe0000000, length = 0x10000000, reserved RAM
base_addr = 0xfe000000, length = 0x110000, reserved RAM
base_addr = 0xfec00000, length = 0x1000, reserved RAM
base_addr = 0xfee00000, length = 0x1000, reserved RAM
base_addr = 0xff000000, length = 0x1000000, reserved RAM
grub> ls
(memdisk) (hd0) (hd1) (hd2) (hd3)
grub> insmod part_msdos
grub> ls
(memdisk) (hd0) (hd0,msdos1) (hd1) (hd2) (hd3)
grub> ls (hd0,msdos1)/
efi/ vmlinuz initrd <bunch of others>
grub> insmod linux
grub> linux (hd0,msdos1)/vmlinuz
grub> initrd (hd0,msdos1)/initrd
error: out of memory.
grub> lsmmap
base_addr = 0x0, length = 0x58000, available RAM
base_addr = 0x58000, length = 0x1000, reserved RAM
base_addr = 0x59000, length = 0x45000, available RAM
base_addr = 0x9e000, length = 0x2000, reserved RAM
base_addr = 0x100000, length = 0x1e60000, available RAM
base_addr = 0x1f60000, length = 0xa0000, available RAM
base_addr = 0x2000000, length = 0x1e60000, available RAM
base_addr = 0x3e60000, length = 0x736f000, available RAM

base_addr = 0xb1cf000, length = 0x40000, available RAM
base_addr = 0xb20f000, length = 0x13a89000, available RAM
base_addr = 0x1ec98000, length = 0xa465000, available RAM
base_addr = 0x290fd000, length = 0x525000, available RAM
base_addr = 0x29622000, length = 0x1af2000, available RAM
base_addr = 0x2b114000, length = 0x1000, ACPI non-volatile storage RAM
base_addr = 0x2b115000, length = 0x4a000, reserved RAM
base_addr = 0x2b15f000, length = 0x60000, available RAM
base_addr = 0x2b1bf000, length = 0x8262000, available RAM
base_addr = 0x33421000, length = 0x2f2f000, available RAM
base_addr = 0x36350000, length = 0x1d0000, available RAM
base_addr = 0x36520000, length = 0x741000, available RAM
base_addr = 0x36c61000, length = 0x3bb000, reserved RAM
base_addr = 0x3701c000, length = 0x3e000, ACPI reclaimable RAM
base_addr = 0x3705a000, length = 0x655000, ACPI non-volatile storage RAM
base_addr = 0x376af000, length = 0x2d7d000, reserved RAM
base_addr = 0x3a42c000, length = 0xd3000, RAM holding firmware code
base_addr = 0x3a4ff000, length = 0x1000, available RAM
base_addr = 0x100000000, length = 0x3be000000, available RAM
base_addr = 0xa0000, length = 0x60000, reserved RAM
base_addr = 0x3a500000, length = 0x5b00000, reserved RAM
base_addr = 0xe0000000, length = 0x10000000, reserved RAM
base_addr = 0xfe000000, length = 0x110000, reserved RAM
base_addr = 0xfec00000, length = 0x1000, reserved RAM
base_addr = 0xfee00000, length = 0x1000, reserved RAM
base_addr = 0xff000000, length = 0x1000000, reserved RAM
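
A rough way to read these dumps: the sketch below walks lsmmap-style lines and reports the largest contiguous run of "available RAM" below a given address limit. The helper name largest_avail_run is hypothetical. It assumes the available entries appear in ascending address order (as they do in the dumps above), and it only reflects what lsmmap prints, not necessarily what the loader's allocator is able to use.

```shell
#!/bin/sh
# Sketch: largest contiguous run of "available RAM" below a limit,
# read from lsmmap-style lines on stdin. The limit is given as $1
# (hex or decimal). Prints the run size in bytes.
largest_avail_run() {
    local limit best run_start run_end base len end line
    limit=$(( $1 )); best=0; run_start=0; run_end=0
    while IFS= read -r line; do
        # Only look at entries typed "available RAM"
        case $line in
            "base_addr = 0x"*", length = 0x"*", available RAM"*) ;;
            *) continue ;;
        esac
        base=${line#base_addr = }; base=${base%%,*}
        len=${line#*length = };    len=${len%%,*}
        base=$(( $base )); end=$(( $base + $len ))
        if [ "$base" -ge "$limit" ]; then continue; fi
        if [ "$end" -gt "$limit" ]; then end=$limit; fi   # clip at the limit
        if [ "$base" -eq "$run_end" ] && [ "$run_end" -gt "$run_start" ]; then
            run_end=$end                      # adjacent entry: extend the run
        else
            run_start=$base; run_end=$end     # gap: start a new run
        fi
        if [ $(( $run_end - $run_start )) -gt "$best" ]; then
            best=$(( $run_end - $run_start ))
        fi
    done
    echo "$best"
}
```

For example, `largest_avail_run 0x40000000 < mmap.txt` would report the figure for memory below 1 GiB, where mmap.txt stands for a saved copy of the output above (the file name is just a placeholder).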


Concerning the 2TB disk that isn't in the system, ls gives some info on it (got the size from here). Not sure if there are commands to find out more about the phantom device.

grub> ls (hd3)
Device hd3: No known filesystem detected - Sector size 512B - Total size 2147483648KiB
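
As a quick sanity check on that number, the sketch below converts the reported size into TiB and into 512-byte sectors. The helper names kib_to_tib and kib_to_sectors are made up for this example; the input values are the ones printed by ls (hd3).

```shell
#!/bin/sh
# Sketch: unit conversions for the size reported by `ls (hd3)`.

# KiB -> whole TiB (1 TiB = 1024 * 1024 * 1024 KiB)
kib_to_tib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

# KiB -> number of sectors of the given size in bytes
kib_to_sectors() {
    echo $(( $1 * 1024 / $2 ))
}

# kib_to_tib 2147483648          # size in TiB
# kib_to_sectors 2147483648 512  # sector count at 512B per sector
```

The size works out to exactly 2 TiB, or 2^32 sectors of 512 bytes. That is a suspiciously round figure, so it may be a placeholder or limit value reported by the firmware rather than a real capacity, but that is only a guess.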

As discussed on chat, the video is horribly slow as well. Normally we don't really need speed there, so not really an issue for us. It took almost 17 minutes for lsefi to complete (return prompt / done with output) on this machine. If you'd like to look into this and need more information please let me know what's desired and I'll see if I can dig it up :).

Comment 7 Ferry 2018-06-18 15:02:28 UTC
Created attachment 1452660 [details]
Pictures of the 4k/15" screen when running lsmmap

Comment 8 Peter Jones 2018-06-19 14:29:16 UTC
Created attachment 1452956 [details]
Proposed patch that may help the allocation problem

Possibly something like this patch will help - can you give it a shot, or do you need me to do a test build?

Comment 9 Ferry 2018-06-21 09:30:15 UTC
Hi Peter,

Thanks a lot for your work :).

I've attempted to test with your patch. Cloned the current git, built it without the patch to first determine it still occurred with the git version.

It didn't occur, it loads fine now.

Checked the patch and sources, but it seems resolved in a whole different way as many of the lines of original code in the patch file can not be found in the current git clone version. Didn't check them all, but in short:

First file, grub-core/kern/efi/mm.c. Line 58 should read
  grub_efi_boot_services_t *b;	
According to patch. File I have from git clone has the following on line 58 (less -N):
     58         grub_efi_physical_address_t address;

That's not expected. I find the line on 68, but don't see the to-be-patched if statement either:

     68   grub_efi_boot_services_t *b;
     69   struct efi_allocation *alloc;
     70   grub_efi_status_t status;
     71 
     72   b = grub_efi_system_table->boot_services;
     73   status = efi_call_3 (b->allocate_pool, GRUB_EFI_LOADER_DATA,
     74                            sizeof(*alloc), (void**)&alloc);
     75 
     76   if (status == GRUB_EFI_SUCCESS)
     77     {
     78       alloc->next = efi_allocated_memory;
     79       alloc->address = address;
     80       alloc->pages = pages;
     81       efi_allocated_memory = alloc;
     82     }


This seems rather different already. The to-be-patched if statement doesn't seem to exist at all any more and I suspect this to have been rewritten/restructured.

Didn't check all the others. Most important part is the 0x3fffffff address that is changed to GRUB_EFI_MAX_USABLE_ADDRESS in the patch on multiple lines of grub-core/loader/i386/linux.c. Read through the linux.c file, but to keep it short the only occurrence of 0x3fffffff now is in a comment in the file:

[ferry@bcld-builder3 i386]$ pwd
/tmp/gs/grub/grub-core/loader/i386
[ferry@bcld-builder3 i386]$ grep -inr '0x3fff' linux.c
1082:	 0x3fffffff.  */

Full comment is:

   1080       /* XXX in reality, Linux specifies a bogus value, so
   1081          it is necessary to make sure that ADDR_MAX does not exceed
   1082          0x3fffffff.  */




It appears to me that there have been quite some changes to the memory allocation stuff already which seem to have resolved the issue.

It works with current git version. I presume we'll just have to wait for it to be officially released (we only use the secure boot version as some people have secure boot enabled and can't sign with anything the default EFI key stack trusts ;)).


For completeness, below is how I built it (nothing special).


mkdir /tmp/gs; cd /tmp/gs
git clone git://git.savannah.gnu.org/grub.git
cd grub
./configure --prefix=/opt/grub2-test --target=x86_64 --with-platform=efi
make
sudo -i
cd /tmp/gs/grub
make install
^D
cd /opt/grub2-test/bin
[ferry@bcld-builder3 bin]$ ./grub-mkstandalone --version
./grub-mkstandalone (GRUB) 2.03
./grub-mkstandalone --compress=xz -d /opt/grub2-test/lib/grub/x86_64-efi --modules="part_msdos linux lsmmap" -o /tmp/grub-test.efi -O x86_64-efi
[ferry@bcld-builder3 bin]$ ls -lh /tmp/grub-test.efi 
-rw-rw-r--. 1 ferry ferry 1.2M Jun 21 10:55 /tmp/grub-test.efi
[ferry@bcld-builder3 bin]$ cd /run/media/ferry/BCLD-USB/EFI/BOOT
[ferry@bcld-builder3 BOOT]$ cp /tmp/grub-test.efi bootx64.efi
[ferry@bcld-builder3 BOOT]$ cd /
[ferry@bcld-builder3 /]$ umount /run/media/ferry/BCLD-USB


---


Is it still desirable to test with this patch? Any specific source tree version you want me to apply it to if it is?

Comment 10 Fedora Update System 2018-08-30 17:06:34 UTC
grub2-2.02-52.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2018-2756e3a374

Comment 11 Fedora Update System 2018-08-31 16:22:23 UTC
grub2-2.02-52.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-2756e3a374

Comment 12 Fedora Update System 2018-09-01 01:49:36 UTC
grub2-2.02-52.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2018-2756e3a374

Comment 13 Fedora Update System 2018-09-02 02:56:41 UTC
grub2-2.02-52.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-2756e3a374

Comment 14 Ferry 2018-09-05 10:56:44 UTC
Hi,

it doesn't seem to work here.

I was only able to obtain the said version via the https://bodhi.fedoraproject.org/updates/FEDORA-2018-2756e3a374 link, wasn't able to find it in the repositories.

Downloaded that version, extracted boot/efi/EFI/fedora/grubx64.efi from that and replaced /EFI/BOOT/grubx64.efi on the stick with it (bootx64.efi in that folder is shimx64.efi from fc28 - I didn't update that).

It loads and drops me to grub prompt (as expected).

linuxefi (hd0,msdos1)/vmlinuz

throws error:

error: ../../grub-core/loader/i386/efi/linux.c:217:cannot allocate kernel parameters

Whilst looking in the repositories for this version I also found grub2-2.02-53.fc30. That gives me the same error.

Downloaded an older package, grub2-efi-x64-2.02-22.fc27.x86_64.rpm, and did the same with that. It does load the kernel, but doesn't load the large initrd (error: can't allocate initrd.), as expected with that version.

I have tried specifying some bogus parameters, but this didn't help.

Comment 15 Nicolas Mailhot 2018-09-07 08:25:53 UTC
Does not work either with grub2-2.02-53.fc30

Comment 16 Fedora Update System 2018-09-08 04:07:23 UTC
grub2-2.02-54.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2018-2756e3a374

Comment 17 Fedora Update System 2018-09-08 16:11:17 UTC
grub2-2.02-54.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-2756e3a374

Comment 18 Fedora Update System 2018-09-11 22:57:14 UTC
grub2-2.02-54.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2018-2756e3a374

Comment 19 Fedora Update System 2018-09-12 02:53:58 UTC
grub2-2.02-57.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.

Comment 20 Ferry 2018-09-13 09:26:55 UTC
Hi, still seem to have some issues with grub2-efi-x64-2.02-57.fc29.x86_64.rpm

Both kernel (didn't load on the grub2-2.02-53.fc30 and grub2-2.02-52.fc29 versions) and initrd (had allocation issues with earlier versions) seem to load fine now. The prompt returns w/o any output after both
linuxefi (hd0,msdos1)/vmlinuz
initrdefi (hd0,msdos1)/initrd

Then issued
boot

But it's been stuck there for quite some time now. No more output has appeared. Been waiting for ~10 minutes now. The initrd is xz compressed, should be unpacked in <30 secs on this system.

Comment 21 Ben Cotton 2019-05-02 20:56:25 UTC
This message is a reminder that Fedora 28 is nearing its end of life.
On 2019-May-28 Fedora will stop maintaining and issuing updates for
Fedora 28. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 28 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 22 Ben Cotton 2019-05-28 22:25:02 UTC
Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

