Bug 1804953
Summary: | UEFI installs from live images fail since around Fedora-Rawhide-20200214.n.1 (boot loader entry creation fails) | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Adam Williamson <awilliam>
Component: | efivar | Assignee: | Peter Jones <pjones>
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | urgent | Docs Contact: |
Priority: | unspecified | |
Version: | 32 | CC: | airlied, browseria, bskeggs, bugzilla, fzatlouk, hdegoede, ichavero, itamar, jarodwilson, jeremy, jglisse, john.j5live, jonathan, josef, kernel-maint, linville, masami256, mchehab, mjg59, pbrobinson, pjones, robatino, steved
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | openqa AcceptedBlocker | |
Fixed In Version: | efivar-37-6.fc32 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2020-02-27 15:11:53 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1705303 | |
Attachments: |
|
Description
Adam Williamson
2020-02-19 23:57:04 UTC
*** Bug 1804956 has been marked as a duplicate of this bug. ***
Chris has more details about the crash in the dupe. Chris, could you generate the *full* traceback? That'd be useful, I think.
F31 with kernel 5.6.0-0.rc2.git0.1.fc32.x86_64, the problem doesn't happen. F32 with kernel 5.5.3, the problem does happen. So I don't think it's the kernel. My suspect is glibc-2.31-1.fc32.x86_64 (2020-02-04).
Created attachment 1664216: gdb stack trace efibootmgr
I wonder if this is related to bug 1773175, which I'm still seeing on both F31 and F32.
Confirmed this is affecting F32 as well as Rawhide (so blocker nomination stands).
> Non-live UEFI installs are working OK.
Huh. So what's different between live and non-live? The non-live is actually still "LiveOS" based in terms of assembly, it just doesn't have a complex desktop environment. Right?
strlen.S is part of glibc; "parse_acpi_root" is found in efivar/linux-acpi-root.c.
I dunno, ping pjones?
well, there are various things... I'm not sure it's worth trying to pin down from that angle, probably best just to work the crash from the trace and once we figure out the cause it'll probably become clear why it's not happening on the installer images...
See also: bug 1804862
strace efibootmgr in the two environments is not that interesting.
netinstall:
    openat(AT_FDCWD, "/dev/vda", O_RDONLY) = 3
    fstat(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(0xfc, 0), ...}) = 0
    readlink("/sys/dev/block/252:0", "../../devices/pci0000:00/0000:00"..., 4096) = 68
    readlink("/sys/block/vda/device", "../../../virtio2", 4096) = 16
    readlink("/sys/block/vda/device/driver", "../../../../../bus/virtio/driver"..., 4096) = 44
    openat(AT_FDCWD, "/sys/devices/pci0000:00/firmware_node/path", O_RDONLY) = 4
live:
    openat(AT_FDCWD, "/dev/vda", O_RDONLY) = 3
    fstat(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(0xfc, 0), ...}) = 0
    readlink("/sys/dev/block/252:0", "../../devices/pci0000:00/0000:00"..., 4096) = 68
    readlink("/sys/block/vda/device", "../../../virtio2", 4096) = 16
    readlink("/sys/block/vda/device/driver", "../../../../../bus/virtio/driver"..., 4096) = 44
    --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x564e2bc42d7d} ---
    +++ killed by SIGSEGV (core dumped) +++
    # coredumpctl
    TIME PID UID GID SIG COREFILE EXE
    Sat 2020-02-22 03:05:36 EST 2390 0 0 11 present /usr/sbin/efibootmgr
    Sat 2020-02-22 03:18:59 EST 16906 0 0 11 present /mnt/sysroot/usr/sbin/efibootmgr
In between those crashes, I ran
    # rpm -Uvh --oldpackage https://kojipkgs.fedoraproject.org//packages/efivar/37/4.fc32/x86_64/efivar-libs-37-4.fc32.x86_64.rpm
And now the first command no longer crashes (using GNOME Terminal), but the subsequent installation attempt fails. My guess: the live-base image contains efivar-libs-37-5, that's what gets installed, and that's what the installer runs. But I think downgrading install media to efivar-libs-37-4 will fix this bug.
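For anyone wanting to repeat the netinstall vs. live strace comparison on longer logs, here is a rough sketch that reports the first line where two traces differ. The helper name `first_divergence` is made up for illustration, and the trace lines are copied from the output quoted in this bug (including the `...` elisions strace printed). It assumes the logs are already trimmed to the relevant window and compares lines literally.

```python
def first_divergence(trace_a, trace_b):
    """Return (index, line_a, line_b) at the first differing line, else None.

    A missing line (one trace ended early) is reported as None.
    """
    longest = max(len(trace_a), len(trace_b))
    for i in range(longest):
        a = trace_a[i] if i < len(trace_a) else None
        b = trace_b[i] if i < len(trace_b) else None
        if a != b:
            return i, a, b
    return None


# Syscall lines shared by both environments, as quoted in this bug.
COMMON = [
    'openat(AT_FDCWD, "/dev/vda", O_RDONLY) = 3',
    'fstat(3, {st_mode=S_IFBLK|0660, st_rdev=makedev(0xfc, 0), ...}) = 0',
    'readlink("/sys/dev/block/252:0", "../../devices/pci0000:00/0000:00"..., 4096) = 68',
    'readlink("/sys/block/vda/device", "../../../virtio2", 4096) = 16',
    'readlink("/sys/block/vda/device/driver", "../../../../../bus/virtio/driver"..., 4096) = 44',
]
# netinstall goes on to open the firmware_node path.
NETINSTALL = COMMON + [
    'openat(AT_FDCWD, "/sys/devices/pci0000:00/firmware_node/path", O_RDONLY) = 4',
]
# live dies before that openat is ever issued.
LIVE = COMMON + [
    '--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x564e2bc42d7d} ---',
    '+++ killed by SIGSEGV (core dumped) +++',
]

if __name__ == "__main__":
    index, net_line, live_line = first_divergence(NETINSTALL, LIVE)
    print(f"traces agree for {index} lines")
    print(f"netinstall: {net_line}")
    print(f"live:       {live_line}")
```

On the quoted traces this points at the step right after the driver readlink: netinstall opens firmware_node/path, live segfaults first. On real logs, PIDs, addresses and timestamps would need normalising before a literal comparison like this is useful.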
yes, 'downgrading' the package only affects the live overlay, it won't affect what the installer installs, and the command is run out of the installed system root so it'll be -5.
The most obvious difference between -4 and -5 is that -5 will have been built with GCC 10, -4 with GCC 9. I checked the build logs but to a quick eyeball check they're identical, no juicy error or warning messages...
Still, your theory is strange for one reason - efivar -5 was built on 20200128, but we had five composes (0131.n.0 through 0204.n.0) where install succeeded, *after* that date. The tests ultimately failed because of anaconda failing to reboot after install, but the bootloader install phase worked.
Problem doesn't happen on baremetal, 1 for 1 attempt.
Discussed during the 2020-02-24 blocker review meeting: [1]
The decision to classify this bug as an AcceptedBlocker was made: "The installer must be able to complete an installation to a single disk using automatic partitioning" for x86_64 UEFI
[1] https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2020-02-24/f32-blocker-review.2020-02-24-17.00.log.txt
FEDORA-2020-8ef75170b3 has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-8ef75170b3
openQA testing confirms the fix for this; tests run with the updated efivar are all passing, tests run without it are all failing.
efivar-37-6.fc32 has been pushed to the Fedora 32 stable repository. If problems still persist, please make note of it in this bug report. |
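The compose-date objection raised above (efivar -5 built on 20200128, installs still succeeding through 0204.n.0) can be written down as a small check. This is only a sketch of that reasoning. The function names are hypothetical, the year 2020 for the short compose stamp is assumed from context, and the first bad compose is taken from the bug summary, which only says "around".

```python
from datetime import date, datetime


def compose_date(compose_id):
    """Extract the date from a compose ID such as 'Fedora-Rawhide-20200214.n.1'.

    Also accepts the bare form '20200204.n.0'.
    """
    stamp = compose_id.rsplit("-", 1)[-1].split(".")[0]
    return datetime.strptime(stamp, "%Y%m%d").date()


def could_explain_regression(change_day, last_good, first_bad):
    """True if a change landed after the last good compose and no later
    than the first bad one, i.e. inside the regression window."""
    return compose_date(last_good) < change_day <= compose_date(first_bad)


# Values taken from the comments in this bug; year 2020 is assumed
# for the short compose stamp "0204.n.0".
EFIVAR_37_5_BUILT = date(2020, 1, 28)
LAST_GOOD = "20200204.n.0"
FIRST_BAD = "Fedora-Rawhide-20200214.n.1"

if __name__ == "__main__":
    in_window = could_explain_regression(EFIVAR_37_5_BUILT, LAST_GOOD, FIRST_BAD)
    print(f"efivar -5 build date inside regression window: {in_window}")
```

A build date is only a proxy for when a package first showed up in a compose, so a negative result here does not clear the package by itself; it only says the build alone does not line up with the window.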