Bug 1033354 - BootLoaderError: failed to remove old efi boot entry
Summary: BootLoaderError: failed to remove old efi boot entry
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: anaconda
Version: 20
Hardware: x86_64
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Anaconda Maintenance Team
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: abrt_hash:55cf9399562c4d739015b39a2d5...
: 1033359 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2013-11-21 22:54 UTC by Chris Murphy
Modified: 2015-06-29 13:06 UTC (History)
12 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-06-29 13:06:13 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
File: anaconda-tb (1.20 MB, text/plain)
2013-11-21 22:54 UTC, Chris Murphy
no flags Details
File: anaconda.log (43.94 KB, text/plain)
2013-11-21 22:54 UTC, Chris Murphy
no flags Details
File: environ (459 bytes, text/plain)
2013-11-21 22:54 UTC, Chris Murphy
no flags Details
File: journalctl (594.54 KB, text/plain)
2013-11-21 22:55 UTC, Chris Murphy
no flags Details
File: lsblk_output (2.97 KB, text/plain)
2013-11-21 22:55 UTC, Chris Murphy
no flags Details
File: nmcli_dev_list (4.34 KB, text/plain)
2013-11-21 22:55 UTC, Chris Murphy
no flags Details
File: os_info (375 bytes, text/plain)
2013-11-21 22:55 UTC, Chris Murphy
no flags Details
File: program.log (63.37 KB, text/plain)
2013-11-21 22:55 UTC, Chris Murphy
no flags Details
File: storage.log (445.22 KB, text/plain)
2013-11-21 22:55 UTC, Chris Murphy
no flags Details
File: ifcfg.log (1.93 KB, text/plain)
2013-11-21 22:55 UTC, Chris Murphy
no flags Details
File: packaging.log (209 bytes, text/plain)
2013-11-21 22:55 UTC, Chris Murphy
no flags Details
efivars (925 bytes, text/plain)
2013-11-22 04:33 UTC, Chris Murphy
no flags Details
efivars after crash (871 bytes, text/plain)
2013-11-22 04:35 UTC, Chris Murphy
no flags Details

Description Chris Murphy 2013-11-21 22:54:43 UTC
Version-Release number of selected component:
anaconda-20.25.8-1.fc20.x86_64

The following was filed automatically by anaconda:
anaconda 20.25.8-1 exception report
Traceback (most recent call first):
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 1650, in remove_efi_boot_target
    raise BootLoaderError("failed to remove old efi boot entry")
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 1678, in install
    self.remove_efi_boot_target()
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 1713, in install
    super(MacEFIGRUB, self).install()
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 1698, in write
    self.install()
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 2327, in writeBootLoader
    storage.bootloader.write()
  File "/usr/lib64/python2.7/site-packages/pyanaconda/install.py", line 173, in doInstall
    writeBootLoader(storage, payload, instClass, ksdata)
  File "/usr/lib64/python2.7/threading.py", line 764, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 168, in run
    threading.Thread.run(self, *args, **kwargs)
BootLoaderError: failed to remove old efi boot entry

Additional info:
cmdline:        /usr/bin/python  /sbin/anaconda --liveinst --method=livecd:///dev/mapper/live-base --lang en_US.UTF-8
cmdline_file:   BOOT_IMAGE=/syslinux/vmlinuz0 root=live:UUID=7F1B-74BF ro rd.live.image quiet rhgb rd.live.check
executable:     /sbin/anaconda
hashmarkername: anaconda
kernel:         3.11.8-300.fc20.x86_64
other involved packages: python-libs-2.7.5-9.fc20.x86_64
product:        Fedora
release:        Fedora release 20 (Heisenbug)
type:           anaconda
version:        20

Comment 1 Chris Murphy 2013-11-21 22:54:50 UTC
Created attachment 827488 [details]
File: anaconda-tb

Comment 2 Chris Murphy 2013-11-21 22:54:53 UTC
Created attachment 827489 [details]
File: anaconda.log

Comment 3 Chris Murphy 2013-11-21 22:54:56 UTC
Created attachment 827490 [details]
File: environ

Comment 4 Chris Murphy 2013-11-21 22:55:01 UTC
Created attachment 827491 [details]
File: journalctl

Comment 5 Chris Murphy 2013-11-21 22:55:04 UTC
Created attachment 827492 [details]
File: lsblk_output

Comment 6 Chris Murphy 2013-11-21 22:55:08 UTC
Created attachment 827493 [details]
File: nmcli_dev_list

Comment 7 Chris Murphy 2013-11-21 22:55:11 UTC
Created attachment 827494 [details]
File: os_info

Comment 8 Chris Murphy 2013-11-21 22:55:14 UTC
Created attachment 827495 [details]
File: program.log

Comment 9 Chris Murphy 2013-11-21 22:55:17 UTC
Created attachment 827496 [details]
File: storage.log

Comment 10 Chris Murphy 2013-11-21 22:55:20 UTC
Created attachment 827497 [details]
File: ifcfg.log

Comment 11 Chris Murphy 2013-11-21 22:55:23 UTC
Created attachment 827498 [details]
File: packaging.log

Comment 12 Chris Murphy 2013-11-21 23:05:24 UTC
[ 1219.426123] efivars: DataSize & Attributes must be valid!
This appears to be a regression. Setting to kernel.

Comment 13 Chris Murphy 2013-11-21 23:10:07 UTC
Appears to be bug 975537 from F19. This wasn't happening with 3.11.7 or older.

Comment 14 Josh Boyer 2013-11-21 23:14:48 UTC
(In reply to Chris Murphy from comment #13)
> Appears to be bug 975537 from F19. This wasn't happening with 3.11.7 or
> older.

Try booting that again and removing things.  The code in this area hasn't changed for the entire 3.11 release.  It's likely an artifact of your UEFI firmware state, not code changes.

Comment 15 Chris Murphy 2013-11-21 23:20:21 UTC
Removing what things?

Comment 16 Chris Murphy 2013-11-21 23:38:12 UTC
*** Bug 1033359 has been marked as a duplicate of this bug. ***

Comment 17 Chris Murphy 2013-11-21 23:41:13 UTC
Reattempting install works. But subsequent install fails again. In ~30 installs to this same hardware during the F20 test cycle, TC2 is the first to exhibit this behavior.

Zapped PRAM with command-option-P-R, and installed F20 live desktop TC2. Succeeds. Reboot to install media and reinstall. Fails.

Comment 18 Josh Boyer 2013-11-22 02:19:05 UTC
Yes, that sounds like the variable space in the firmware is going through GC cycles on variable creation and deletion and hitting windows where there isn't enough space at runtime.  A reboot through firmware changes the state back and forth as GC is performed.  Matthew, do you agree?

Please provide the output of 'ls /sys/firmware/efi/efivars/'

Comment 19 Chris Murphy 2013-11-22 04:33:32 UTC
Created attachment 827562 [details]
efivars

This is after several reboots, then booting off install media to capture this output immediately prior to running the installer.

Comment 20 Chris Murphy 2013-11-22 04:35:59 UTC
Created attachment 827567 [details]
efivars after crash

This is efivars right after anaconda crashes. The crash is 100% reproducible every other install attempt.

Comment 21 Josh Boyer 2013-11-22 13:45:34 UTC
(In reply to Chris Murphy from comment #12)
> [ 1219.426123] efivars: DataSize & Attributes must be valid!
> This appears to be a regression. Setting to kernel.

Ok, so my original theory doesn't sound correct.  I don't think this is a regression in the kernel either.

Looking in the kernel, the only place that message is printed out is in the efivar_store_raw variable. It's complaining that either DataSize <= 0 or Attributes == 0.  The command that is failing is "efibootmgr -b 0000 -B".  That actually does 2 things.

First, it deletes the Boot0000 variable using the kernel's efivar sysfs interface.  That is done by writing the name of the variable to delete into /sys/firmware/efi/efivars/del_var, which then calls the efivar_delete function.  That succeeds just fine.  It notably does not call efivar_store_raw, so the message you see in dmesg isn't from the actual delte.

After efibootmgr deletes the variable, it then goes to modify the BootOrder variable to remove the Boot0000 entry from the order list.  That makes sense because if the variable is gone, it shouldn't be in BootOrder.  It does this by reading the current BootOrder variable, allocating a new list with the same starting DataSize, and looping over the existing entries, copying them if they aren't equal to the number being removed.  This is done in the 'remove_from_boot_order' function in efibootmgr.c.  It then calculates the new DataSize of whatever Boot entries should exist in BootOrder based on the number of entries it copied.  Then it calls edit_variable, which eventually gets us to writing the newly adjusted BootOrder variable into /sys/firmware/efi/efivars/raw_var.  That calls the efivar_store_raw function mentioned above and here we see that DataSize is 0.

Why is it 0?  Because efibootmgr doesn't account for someone trying to remove the only BootXXXX variable listed in BootOrder.  In the program.log you attached, we can see you have two BootXXXX vars:

15:52:44,687 INFO program: Running... efibootmgr
15:52:44,734 INFO program: BootCurrent: 0000
15:52:44,735 INFO program: BootOrder: 0000
15:52:44,735 INFO program: Boot0000* Fedora
15:52:44,735 INFO program: BootFFFF* 
15:52:44,735 DEBUG program: Return code: 0

So Boot0000 is the existing Fedora entry.  BootFFFF is something else, but only 0000 is listed in BootOrder.  So when efibootmgr does it's copy, it doesn't copy any items and this calculation:

	/* Now new_data has what we need */
	new_data_size = new_i * sizeof(uint16_t);

winds up being "new_data_size = 0 * sizeof(uint16_t);", which would clearly be 0.  That hits the check in the kernel, and the overall efibootmgr command fails.

I'm about 85% sure this is what is happening on your machine, because nothing else really makes sense.  How you wound up with BootOrder in this situation is unknown to me, but it isn't because of something the kernel did.

I'm not sure whether efibootmgr should disallow removing the only Boot entry listed in BootOrder or if anaconda could read BootOrder and see if the entry it's removing for the old entry is the only one and not do that.  Reassigning back to ananconda for now.

Comment 22 Chris Murphy 2013-11-22 18:56:24 UTC
If I zap PRAM to clear all NVRAM entries and then boot from the install media (a USB stick created with litd), I get this result prior to installing:

#efibootmgr -v
BootCurrent: 0000
BootFFFF* 	ACPI(a0341d0,0)PCI(1a,0)USB(1,0)HD(1,800,3ba800,faa7a92e-336d-4094-af6e-199897ad5f8b)File(\EFI\BOOT\BOOTX64.efi)

BootFFFF is the USB stick.

After installing twice, with crash on the 2nd attempt, this is the result of efibootmgr -v

[root@f20s liveuser]# efibootmgr -v
BootCurrent: 0000
BootOrder: 0000,0000
BootFFFF* 	ACPI(a0341d0,0)PCI(1a,0)USB(1,0)HD(1,800,3ba800,faa7a92e-336d-4094-af6e-199897ad5f8b)File(\EFI\BOOT\BOOTX64.efi)

It's not successfully removing BootOrder entries, If I insert a new boot entry:
efibootmgr -c -w -L Fedora -d /dev/sda -p 4 -l \EFI\fedora\shim.efi
Then remove it:
efibootmgr -b 0000 -B

I now have:
# efibootmgr -v
BootCurrent: 0000
BootOrder: 0000,0000,0000
BootFFFF* 	ACPI(a0341d0,0)PCI(1a,0)USB(1,0)HD(1,800,3ba800,faa7a92e-336d-4094-af6e-199897ad5f8b)File(\EFI\BOOT\BOOTX64.efi)

So either this is a firmware bug, or efibootmgr/kernel should remove the bootorder first, and then the entry. Because the entry is removed, the firmware doesn't seem to understand the idea of removing the bootorder entry for an entry that doesn't exist anymore.

But anaconda shouldn't crash. If it merely ignored the error, and created a new entry anyway, the system would at least be bootable. Because of this failure, numerous post-install events don't complete making the system unbootable: mactel-boot, grub.cfg, and initramfs.

Comment 23 jfceklosky 2013-12-24 05:09:32 UTC
Here is another failure with a kernel EFI error message.  There is something really wrong when removing or setting the boot partition.

From Anaconda logs:

anaconda 20.25.15-1 exception report
Traceback (most recent call first):
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 1649, in remove_efi_boot_target
    raise BootLoaderError("failed to remove old efi boot entry")
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 1677, in install
    self.remove_efi_boot_target()
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 1697, in write
    self.install()
  File "/usr/lib64/python2.7/site-packages/pyanaconda/bootloader.py", line 2328, in writeBootLoader
    storage.bootloader.write()
  File "/usr/lib64/python2.7/site-packages/pyanaconda/install.py", line 173, in doInstall
    writeBootLoader(storage, payload, instClass, ksdata)
  File "/usr/lib64/python2.7/threading.py", line 764, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 192, in run
    threading.Thread.run(self, *args, **kwargs)
BootLoaderError: failed to remove old efi boot entry

Local variables in innermost frame:
slot: Boot0000*
self: <pyanaconda.bootloader.EFIGRUB object at 0x7f0ac4b0ec90>
slot_id: 0000
_product: Fedora
rc: 1
line: Boot0000* Fedora
buf: ** Warning ** : Boot000a is not EFI 1.10 compliant (lowercase hex in name)
** Warning ** : please recreate these using efibootmgr to remove this warning.
BootCurrent: 000A
Timeout: 1 seconds
BootOrder: 0002,0000,0001,0003,0004,0005,0006,000A
Boot0000* Fedora
Boot0001* Hard Drive
Boot0002* Fedora
Boot0003* Network Card
Boot0004* CD/DVD Drive
Boot0005* UEFI: INTEL SSDSA2
Boot0006* UEFI: INTEL SSDSA2
Boot000a* UEFI: HL-DT-ST DVDRAM GSA-H62N



From dmesg:

05:12:27,252 WARNING kernel:[ 1053.934502] efivars: set_variable() failed: status=-28

Comment 24 Fedora End Of Life 2015-05-29 09:49:34 UTC
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 20 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged  change the 'version' to a later Fedora 
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 25 Fedora End Of Life 2015-06-29 13:06:13 UTC
Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.


Note You need to log in before you can comment on or make changes to this bug.