Bug 2101573

Summary: Regression when booting rhel8, but only on a rhel7 host
Product: [Fedora] Fedora
Reporter: Colin Walters <walters>
Component: edk2
Assignee: Paolo Bonzini <pbonzini>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 36
CC: berrange, crobinso, kraxel, miabbott, pbonzini, philmd, virt-maint
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-05-25 18:19:58 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Colin Walters 2022-06-27 20:52:30 UTC
This is a crazy obscure bug, feel free to keep it low priority for now.  Basically, in some of the RHCOS flows we end up using:

- a rhel7 (openshift 3.11) bare metal host system
- a fedora 36 userspace run as a container (with qemu-kvm and edk2-ovmf notably) in there called 
- a rhel8 guest

When we updated coreos-assembler to f36, this new version of edk2-ovmf started triggering kernel panics; for more information see 
https://github.com/openshift/os/issues/862

Abbreviated trace is:

```
Jun 22 14:51:51.587132 kernel: Call Trace:
Jun 22 14:51:51.587140 kernel:  apic_bsp_setup+0x62/0x80
Jun 22 14:51:51.587145 kernel:  x86_late_time_init+0x29/0x39
Jun 22 14:51:51.587150 kernel:  start_kernel+0x486/0x542
Jun 22 14:51:51.587155 kernel:  secondary_startup_64_no_verify+0xc2/0xcb
Jun 22 14:51:51.587160 kernel: ---[ end trace 02abcd4e6b23d7dc ]---
```
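
For anyone comparing traces across host/guest combinations, the frames above can be pulled out of the journal output mechanically. The following is only a minimal sketch: `FRAME_RE`, `parse_frames`, and `TRACE` are hypothetical names introduced here for illustration, and the regular expression assumes the journal line layout shown in this report (timestamp, `kernel:`, then `symbol+offset/size`).

```python
import re

# Hypothetical helper, not part of any tool mentioned in this report.
# Matches a journal-style frame such as
#   "Jun 22 14:51:51.587140 kernel:  apic_bsp_setup+0x62/0x80"
FRAME_RE = re.compile(
    r"kernel:\s+(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)\s*$"
)


def parse_frames(log_text):
    """Return (symbol, offset, size) tuples for each call-trace frame."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:  # header and "end trace" lines do not match and are skipped
            frames.append(
                (match["symbol"], int(match["offset"], 16), int(match["size"], 16))
            )
    return frames


# The abbreviated trace from this report, copied verbatim.
TRACE = """\
Jun 22 14:51:51.587132 kernel: Call Trace:
Jun 22 14:51:51.587140 kernel:  apic_bsp_setup+0x62/0x80
Jun 22 14:51:51.587145 kernel:  x86_late_time_init+0x29/0x39
Jun 22 14:51:51.587150 kernel:  start_kernel+0x486/0x542
Jun 22 14:51:51.587155 kernel:  secondary_startup_64_no_verify+0xc2/0xcb
Jun 22 14:51:51.587160 kernel: ---[ end trace 02abcd4e6b23d7dc ]---
"""

if __name__ == "__main__":
    for symbol, offset, size in parse_frames(TRACE):
        print(f"{symbol} +{offset:#x} of {size:#x}")
```

The symbol names alone (ignoring offsets) are usually enough to check whether two boots failed along the same path.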

Downgrading to f35 edk2-ovmf fixes it.

Now, we know running rhel7 as a host system doesn't make sense; we're going to fix that.

I can't reproduce this when the host is rhel8 or current fedora.  It also doesn't reproduce when the guest is fedora.  Haven't tried rhel9 in this yet.

Comment 1 Colin Walters 2022-06-27 20:53:04 UTC
(Really wish BZ let me edit comments)

>  in there called ... https://github.com/coreos/coreos-assembler

Comment 2 Daniel Berrangé 2022-06-28 07:39:35 UTC
Probably not a directly related cause, but please note that QEMU and libvirt and other related virt packages have all explicitly dropped support for RHEL-7. Mostly this impacts userspace package build deps, but also means we no longer do any level of testing, and may assume we have newer host kernel features either at build time or runtime (though offhand I don't recall us doing that yet).

Comment 3 Gerd Hoffmann 2022-06-29 08:40:59 UTC
> Abbreviated trace is:
> 
> ```

Jun 22 14:51:51.587041 kernel: WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1547 setup_local_APIC+0x269/0x370

> Jun 22 14:51:51.587132 kernel: Call Trace:
> Jun 22 14:51:51.587140 kernel:  apic_bsp_setup+0x62/0x80
> Jun 22 14:51:51.587145 kernel:  x86_late_time_init+0x29/0x39
> Jun 22 14:51:51.587150 kernel:  start_kernel+0x486/0x542
> Jun 22 14:51:51.587155 kernel:  secondary_startup_64_no_verify+0xc2/0xcb
> Jun 22 14:51:51.587160 kernel: ---[ end trace 02abcd4e6b23d7dc ]---
> ```

OVMF recently switched from the ACPI timer to the local APIC for timekeeping,
so the local APIC is now in a different state at kernel boot time.

Given this happens with RHEL-7 host kernels only, my first guess would be
that this change triggers a LAPIC emulation bug in the host kernel.

Comment 4 Ben Cotton 2023-04-25 17:30:34 UTC
This message is a reminder that Fedora Linux 36 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 36 on 2023-05-16.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '36'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 36 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 5 Ludek Smid 2023-05-25 18:19:58 UTC
Fedora Linux 36 entered end-of-life (EOL) status on 2023-05-16.

Fedora Linux 36 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.