Bug 2035158 - [ppc64le][RHEL9 guest]qemu-kvm quits silently while booting rhel9 guest with "max-cpu-compat=power8" parameter
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: Documentation
Version: 8.6
Hardware: ppc64le
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jiri Herrmann
QA Contact: Min Deng
Docs Contact: Jiri Herrmann
URL:
Whiteboard:
Duplicates: 2060522
Depends On:
Blocks: 2010412
Reported: 2021-12-23 06:03 UTC by Min Deng
Modified: 2023-02-06 10:37 UTC (History)

Fixed In Version:
Doc Type: Known Issue
Doc Text:
.RHEL 9 virtual machines fail to boot in POWER8 compatibility mode

Currently, booting a virtual machine (VM) that runs RHEL 9 as its guest operating system fails if the VM also uses CPU configuration similar to the following:

----
<cpu mode="host-model">
  <model>power8</model>
</cpu>
----

To work around this problem, do not use POWER8 compatibility mode in RHEL 9 VMs. In addition, note that running RHEL 9 VMs is not possible on POWER8 hosts.
Clone Of:
Environment:
Last Closed: 2022-07-19 13:26:28 UTC
Type: ---
Target Upstream Version:
Embargoed:
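The CPU configuration quoted in the Doc Text field can be checked for mechanically. The following is a minimal Python sketch, not something taken from this report: it assumes the `<cpu>` element (or a whole domain definition containing it) is available as an XML string, and the function name `uses_power8_compat` is hypothetical.

```python
import xml.etree.ElementTree as ET


def uses_power8_compat(xml_text: str) -> bool:
    """Return True if the XML requests host-model CPU mode with model power8.

    Accepts either a bare <cpu> element or a document that contains one.
    """
    root = ET.fromstring(xml_text)
    cpu = root if root.tag == "cpu" else root.find(".//cpu")
    if cpu is None:
        return False
    # findtext returns the default when no <model> child exists
    model = cpu.findtext("model", default="").strip().lower()
    return cpu.get("mode") == "host-model" and model == "power8"


snippet = '<cpu mode="host-model"><model>power8</model></cpu>'
print(uses_power8_compat(snippet))  # True
```

A check like this only recognizes the exact pattern shown in the Doc Text; other ways of requesting POWER8 compatibility are not covered by the sketch.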




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 2040657 | 1 | unspecified | CLOSED | glibc: More reliable CPU compatibility diagnostics | 2023-07-18 14:29:19 UTC
Red Hat Issue Tracker | RHELPLAN-106475 | 0 | None | None | None | 2021-12-23 06:10:30 UTC

Internal Links: 2040657

Description Min Deng 2021-12-23 06:03:05 UTC
Description of problem:
[ppc64le] qemu-kvm quits directly while booting a RHEL 9 guest with the "max-cpu-compat=power8" parameter
Version-Release number of selected component (if applicable):
kernel-5.14.0-37.el9.ppc64le
qemu-kvm-6.2.0-1.module+el8.6.0+13725+61ae1949.ppc64le
SLOF-20210217-1.module+el8.6.0+12861+13975d62.noarch

How reproducible:
3/3

Steps to Reproduce:
1. /usr/libexec/qemu-kvm \
     -S \
     -name avocado-vt-vm1 \
     -sandbox on \
     -machine pseries,max-cpu-compat=power8,memory-backend=mem-machine_mem \
     -nodefaults \
     -device VGA,bus=pci.0,addr=0x2 \
     -m 8192 \
     -object memory-backend-ram,size=8192M,id=mem-machine_mem \
     -smp 8,maxcpus=8,cores=4,threads=1,sockets=2 \
     -cpu host \
     -chardev socket,wait=off,id=chardev_serial0,path=/tmp/tt,server=on \
     -device spapr-vty,id=serial0,reg=0x30000000,chardev=chardev_serial0 \
     -device qemu-xhci,id=usb1,bus=pci.0,addr=0x3 \
     -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
     -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x4 \
     -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/rhel900-ppc64le-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
     -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
     -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
     -vnc :0 \
     -rtc base=utc,clock=host \
     -boot menu=off,order=cdn,once=c,strict=off \
     -enable-kvm \
     -monitor stdio
2. In the QEMU monitor, run: cont
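
The relevant part of the command line above is the `max-cpu-compat` property of the `-machine` option. As a rough illustration only, the Python sketch below extracts that property from a command-line string so that a test harness could flag it before starting a RHEL 9 guest. The function name `max_cpu_compat` is hypothetical, and the shortened example command is an assumption rather than the reporter's full invocation.

```python
import shlex


def max_cpu_compat(qemu_cmdline: str):
    """Return the max-cpu-compat value from -machine (or -M), or None if absent."""
    args = shlex.split(qemu_cmdline)
    # Walk over (option, value) pairs and inspect only the machine option
    for flag, value in zip(args, args[1:]):
        if flag in ("-machine", "-M"):
            for prop in value.split(","):
                key, sep, val = prop.partition("=")
                if sep and key == "max-cpu-compat":
                    return val
    return None


cmd = "/usr/libexec/qemu-kvm -machine pseries,max-cpu-compat=power8 -cpu host -enable-kvm"
print(max_cpu_compat(cmd))  # power8
```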

Actual results:
QEMU quits directly...
OF stdout device is: /vdevice/vty@30000000
Preparing to boot Linux version 5.14.0-37.el9.ppc64le (mockbuild.eng.bos.redhat.com) (gcc (GCC) 11.2.1 20211019 (Red Hat 11.2.1-6), GNU ld version 2.35.2-13.el9) #1 SMP Wed Dec 22 13:55:19 EST 2021
Detected machine type: 0000000000000101
command line: BOOT_IMAGE=(ieee1275/disk,msdos2)/vmlinuz-5.14.0-37.el9.ppc64le root=/dev/mapper/rhel_dhcp16--215--154-root ro console=ttyS0,115200 crashkernel=2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G rd.lvm.lv=rhel_dhcp16-215-154/root rd.lvm.lv=rhel_dhcp16-215-154/swap biosdevname=0 net.ifnames=0 rhgb quiet console=tty0 movable_node
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000007ca0000
  alloc_top    : 0000000030000000
  alloc_top_hi : 0000000200000000
  rmo_top      : 0000000030000000
  ram_top      : 0000000200000000
found display   : /pci@800000020000000/vga@2, opening... done
instantiating rtas at 0x000000002fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000007cb0000 -> 0x0000000007cb0b5d
Device tree struct  0x0000000007cc0000 -> 0x0000000007cd0000
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x0000000000440000 ...
Linux ppc64le

Expected results:
After removing max-cpu-compat=power8, there are no problems.
Additional info:

Comment 1 Min Deng 2021-12-23 06:14:15 UTC
In my opinion, we should provide a proper way to inform users whether this configuration is supported, rather than having qemu-kvm terminate directly. Thanks a lot.

Comment 3 David Gibson 2022-01-05 00:48:18 UTC
AIUI RHEL9 has dropped POWER8 CPU support, so it's not expected for this combination to work.  The only issue here is that it's not usefully reported.  I'm not immediately sure how to fix that, but I think we can drop the priority a notch.

Comment 4 David Gibson 2022-01-06 05:18:20 UTC
Min, could you provide your RHEL9 disk image?  If I'm able to use that, it should save a lot of time over installing one myself.

Comment 5 Min Deng 2022-01-06 06:15:07 UTC
Yes, of course. I'm doing it now.

Comment 7 Greg Kurz 2022-01-06 09:24:43 UTC
If the rhel9 guest was installed without max-cpu-compat=power8, then it is expected that the OS
installed in a POWER9 environment will crash. At least that's what I get when I try with my
own image:

[    0.760021] init[1]: illegal instruction (4) at 7fff9650261c nip 7fff9650261c lr 7fff964d3108 code 1 in ld64.so.2[7fff964d0000+50000]
[    0.761377] init[1]: code: 394a0008 38a00000 38c01000 e8e28140 f9446dd0 3c82ffff 2c290000 f8a46d70 
[    0.762383] init[1]: code: 3ca2ffff f8e10060 f8c56d88 41820438 <f3e002d0> fa410078 fa610080 38e00000 
[    0.763420] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004

It is expected as well for QEMU to exit in this case.

Downgrading the CPU version doesn't make sense in the first place and QEMU has no way to know
what's going on until the OS crashes. For these reasons, I'd rather close this as NOTABUG. Maybe
it could be documented somewhere that rhel9 isn't to be run with max-cpu-compat=power8 to make
this clear.

Comment 8 Min Deng 2022-01-06 10:05:00 UTC
...
> it could be documented somewhere that rhel9 isn't to be run with
> max-cpu-compat=power8 to make
> this clear.
Thanks for your info.
If so, in my opinion we should at least document this issue somewhere. We can use this bug to track the documentation change and also drop the HPT test cases from the test plan for RHEL 9.0 guests. Thanks.

Comment 10 Florian Weimer 2022-01-11 09:48:26 UTC
ld64.so.2 is built with POWER9 optimizations because everything in RHEL 9 is. We no longer have a POWER9 multilib for glibc in RHEL 9.

GCC now uses floating point and vector instructions for compiling the early startup code in glibc. I'll ask our colleagues at IBM what's up with that. Due to that, we can't reach the code that prints the error message which says that POWER9 is required. Previously, GCC only used integer instructions, which is why we got the error message.
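
The kernel "code:" dump in comment 7 marks the faulting instruction with angle brackets (`<f3e002d0>`). As a small illustration, the Python sketch below pulls that word out of such a line and computes its primary opcode, which on Power is the six most significant bits of the 32-bit instruction. The function name `faulting_primary_opcode` is hypothetical. Primary opcode 60 belongs to the VSX instruction space, which is consistent with the floating point and vector instructions described here; deciding whether a specific instruction requires ISA 3.00 would need the extended opcode and the ISA tables, which this sketch does not attempt.

```python
import re


def faulting_primary_opcode(code_line: str):
    """Return the 6-bit primary opcode of the instruction marked <...>, or None."""
    match = re.search(r"<([0-9a-fA-F]{8})>", code_line)
    if match is None:
        return None
    word = int(match.group(1), 16)
    # Power instructions are 32 bits wide; the primary opcode is the top six bits
    return word >> 26


line = "init[1]: code: 3ca2ffff f8e10060 f8c56d88 41820438 <f3e002d0> fa410078 fa610080 38e00000"
print(faulting_primary_opcode(line))  # 60
```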

Comment 11 Greg Kurz 2022-01-11 10:49:16 UTC
Ok then I guess it is already documented somewhere that a RHEL9 guest
cannot run on a POWER8 host. It should be documented that it cannot
run in POWER8 compat mode on any system as well.

Upstream kernel code doesn't allow restricting the available compat modes
in CAS, but this could be done downstream. Even with that, I've just
double-checked: QEMU would print "Couldn't negotiate a suitable PVR during
CAS" and return H_HARDWARE to the guest, which is then ignored by the Linux
kernel. This means that the CPU will continue to run in raw mode: the
guest will crash on a POWER8 host and it should run on a POWER9 host. This
looks worse than the consistent SIGILL crash we're getting now.


I'll wait for some more feedback from the discussion between Florian and IBM.
Unless there's an easy way to print something like "POWER9 is required", I'll
close this as WONTFIX.

Comment 12 Florian Weimer 2022-01-14 11:39:52 UTC
I haven't heard back from IBM, but I filed glibc bug 2040657 to fix this properly.

Comment 13 Min Deng 2022-02-14 07:13:35 UTC
QE will re-test this bug according to comment 12 above.

Comment 14 Min Deng 2022-03-03 02:10:26 UTC
Tried the build. (In reply to Florian Weimer from comment #12)
> I haven't heard back from IBM, but I filed glibc bug 2040657 to fix this
> properly.

Regarding this fix, I did not see it resolve the issue, since glibc-2.34-25.el9 is only available for RHEL 9 and not for RHEL 8, and this is a RHEL 8 host.
[    0.928191] Freeing unused kernel image (initmem) memory: 5440K
[    1.016863] Run /init as init process
Fatal glibc error: CPU lacks ISA 3.00 support (POWER9 or later required)
[    1.017642] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
[    1.018075] CPU: 1 PID: 1 Comm: init Not tainted 5.14.0-69.el9.ppc64le #1
[    1.018468] Call Trace:
[    1.018819] [c00000002c61fbf0] [c000000000811240] dump_stack_lvl+0x74/0xa8 (unreliable)
[    1.019245] [c00000002c61fc30] [c00000000015a588] panic+0x174/0x40c
[    1.019639] [c00000002c61fcd0] [c000000000164050] do_exit+0x490/0x520
[    1.020024] [c00000002c61fd50] [c0000000001641b0] do_group_exit+0x60/0xd0
[    1.020395] [c00000002c61fd90] [c000000000164244] sys_exit_group+0x24/0x30
[    1.020758] [c00000002c61fdb0] [c0000000000305d0] system_call_exception+0x160/0x300
[    1.021137] [c00000002c61fe10] [c00000000000c6cc] system_call_common+0xec/0x250
[    1.021514] --- interrupt: c00 at 0x7fff7fe3eb00
[    1.021864] NIP:  00007fff7fe3eb00 LR: 00007fff7fe1db60 CTR: 0000000000000000
[    1.022249] REGS: c00000002c61fe80 TRAP: 0c00   Not tainted  (5.14.0-69.el9.ppc64le)
[    1.022635] MSR:  800000000000f033 <SF,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 24000422  XER: 00000000
[    1.023047] IRQMASK: 0 
[    1.023047] GPR00: 00000000000000ea 00007fffd3dfc130 00007fff7fe67e00 000000000000007f 
[    1.023047] GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[    1.023047] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[    1.023047] GPR12: 0000000000000000 0000000000000000 00007fff7fe5fdf8 000000000000fff1 
[    1.023047] GPR16: 00007fff7fe01d38 00007fff7fe003d0 000000000005fdf8 00007fff7fe00000 
[    1.023047] GPR20: 0000000000000000 0000000000000001 0000000080001000 000000007fff9000 
[    1.023047] GPR24: 0000000000000001 00000000ffffffff 0000000000010000 00007fffd3dfc2b0 
[    1.023047] GPR28: 00007fffd3dfc870 00007fff7fe611c0 fffffffffffff000 000000000000007f 
[    1.027816] NIP [00007fff7fe3eb00] 0x7fff7fe3eb00
[    1.028346] LR [00007fff7fe1db60] 0x7fff7fe1db60
[    1.028880] --- interrupt: c00

[root@ibm-p9wr-13 home]# uname -r
4.18.0-367.el8.ppc64le
[root@ibm-p9wr-13 home]# rpm -qa|grep qemu-kvm
qemu-kvm-common-6.2.0-8.module+el8.6.0+14326+31a26998.ppc64le
qemu-kvm-block-ssh-6.2.0-8.module+el8.6.0+14326+31a26998.ppc64le
qemu-kvm-6.2.0-8.module+el8.6.0+14326+31a26998.ppc64le
qemu-kvm-block-curl-6.2.0-8.module+el8.6.0+14326+31a26998.ppc64le
qemu-kvm-block-rbd-6.2.0-8.module+el8.6.0+14326+31a26998.ppc64le
qemu-kvm-docs-6.2.0-8.module+el8.6.0+14326+31a26998.ppc64le
qemu-kvm-core-6.2.0-8.module+el8.6.0+14326+31a26998.ppc64le
qemu-kvm-block-iscsi-6.2.0-8.module+el8.6.0+14326+31a26998.ppc64le

Comment 15 Florian Weimer 2022-03-03 04:53:13 UTC
(In reply to Min Deng from comment #14)
> Tired the build with (In reply to Florian Weimer from comment #12)
> > I haven't heard back from IBM, but I filed glibc bug 2040657 to fix this
> > properly.
> 
> About this fix, I didn't see it fixed this issue since we only have
> glibc-2.34-25.el9 for rhel9 but not rhel8, it's rhel 8 host.
>     0.928191] Freeing unused kernel image (initmem) memory: 5440K
> [    1.016863] Run /init as init process
> Fatal glibc error: CPU lacks ISA 3.00 support (POWER9 or later required)

The fix I mentioned results in printing this line. Before the change, this line would not show up.

We are not going to make Red Hat Enterprise Linux 9 compatible with POWER8, so there is nothing else we can do.
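
The difference between the two console logs in this bug (the illegal-instruction crash in comment 7 and the explicit glibc message in comment 14) can be told apart automatically. The following Python sketch is an illustration under assumptions: the function name `classify_boot_failure` and its return labels are hypothetical, and the matching relies only on strings that appear in the logs quoted in this report.

```python
GLIBC_DIAGNOSTIC = "Fatal glibc error: CPU lacks ISA 3.00 support"


def classify_boot_failure(console_log: str) -> str:
    """Classify a guest console log by how the early boot failure was reported."""
    lines = console_log.splitlines()
    # With the glibc change, the dynamic loader prints an explicit message
    if any(GLIBC_DIAGNOSTIC in line for line in lines):
        return "clear-diagnostic"
    # Without it, init dies with an illegal instruction inside ld64.so.2
    if any("illegal instruction" in line and "ld64.so.2" in line for line in lines):
        return "early-sigill"
    if any("Kernel panic" in line for line in lines):
        return "other-panic"
    return "no-failure-found"


log = (
    "[    1.016863] Run /init as init process\n"
    "Fatal glibc error: CPU lacks ISA 3.00 support (POWER9 or later required)\n"
    "[    1.017642] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00\n"
)
print(classify_boot_failure(log))  # clear-diagnostic
```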

Comment 16 Greg Kurz 2022-03-03 12:10:21 UTC
(In reply to Florian Weimer from comment #15)
> (In reply to Min Deng from comment #14)
> > Tired the build with (In reply to Florian Weimer from comment #12)
> > > I haven't heard back from IBM, but I filed glibc bug 2040657 to fix this
> > > properly.
> > 
> > About this fix, I didn't see it fixed this issue since we only have
> > glibc-2.34-25.el9 for rhel9 but not rhel8, it's rhel 8 host.
> >     0.928191] Freeing unused kernel image (initmem) memory: 5440K
> > [    1.016863] Run /init as init process
> > Fatal glibc error: CPU lacks ISA 3.00 support (POWER9 or later required)
> 
> The fix I mentioned results in printing this line. Before the change, this
> line would not show up.
> 
> We are not going to make Red Hat Enterprise Linux 9 compatible with POWER8,
> so there is nothing else we can do.

As said in another comment, installing a guest and then downgrading the CPU
version is unsound. Any piece of code in the guest that assumes it is running
with the same CPU version that was used during deployment would break all
the same. Imagine a customer workload that doesn't even use glibc: it would
crash with SIGILL and someone would need to spend time understanding what's
going on.

Fortunately, doing so with a RHEL9 guest crashes /init early. This gives us
the opportunity to report a clear error in the console. This is really the
best we can do to handle this case.

Comment 17 Greg Kurz 2022-03-03 12:13:45 UTC
Just to clarify, I fully agree with Florian's statement. The previous comment
is addressed to Min.

Comment 18 Min Deng 2022-03-08 03:31:45 UTC
Moving this to the Documentation component. Someone else also hit the same problem, so let us document it.

Comment 19 Nitesh Narayan Lal 2022-03-09 14:55:56 UTC
*** Bug 2060522 has been marked as a duplicate of this bug. ***

Comment 33 Dan Zheng 2023-02-06 10:37:03 UTC
Hi Jiri, I also confirmed with mdeng that there is no bug fix, so you can keep the existing note. Thanks.

Dan

