Description of problem:
Boot a guest with "-prom-env 'auto-boot?=false'". The guest stops at the SLOF phase with the prompt "0 > ". After typing "boot", the guest starts to boot, but it fails to enter the boot entry and returns to SLOF again.

Version-Release number of selected component (if applicable):
Host:
4.18.0-122.el8.ppc64le
qemu-kvm-4.1.0-0.el8.patchwork201907241645.ppc64le
SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with the qemu cli option "-prom-env 'auto-boot?=false'".
2. The guest stops booting at the SLOF phase, as follows:
Ready!
0 >
3. Type "boot" via VNC:
Ready!
0 > boot

Actual results:
The guest returns to SLOF and stops booting again:
Ready!
0 >

Expected results:
The guest enters the boot entry and boots the kernel.

Additional info:
If the guest is booted without "-prom-env 'auto-boot?=false'" and "s" is pressed at the early stage of boot, the guest also stops at "0 > ". After typing "boot", it returns to SLOF again and then continues to boot automatically. The result is as follows:

Ready!
0 > boot
Trying to load: from: /pci@800000020000000/pci@6/scsi@7/disk@106000300000000 ... Successfully loaded

SLOF **********************************************************************
QEMU Starting
Build Date = Jul 23 2019 04:40:55
FW Version = mockbuild@ release 20190703
Press "s" to enter Open Firmware.
Press F12 for boot menu.
Populating /vdevice methods
Populating /vdevice/vty@30000000
Populating /vdevice/nvram@71000000
Populating /pci@800000020000000
00 0000 (D) : 1234 1111 qemu vga
00 0800 (D) : 8086 25ab system-periphal*
00 1800 (D) : 1af4 1003 virtio [ serial ]
00 2800 (D) : 1033 0194 serial bus [ usb-xhci ]
00 3000 (B) : 1b36 0001 pci*
01 3800 (D) : 1af4 1004 virtio [ scsi ]
Populating /pci@800000020000000/pci@6/scsi@7
SCSI: Looking for devices
106000300000000 DISK : "QEMU QEMU HARDDISK 2.5+"
00 4800 (D) : 1af4 1004 virtio [ scsi ]
Populating /pci@800000020000000/scsi@9
SCSI: Looking for devices
00 5000 (D) : 1af4 1000 virtio [ net ]
00 6000 (D) : 1af4 1002 unknown-legacy-device*
Installing QEMU fb
Scanning USB
XHCI: Initializing
USB Keyboard
USB mouse
No console specified using screen & keyboard

Welcome to Open Firmware

Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
This program and the accompanying materials are made available
under the terms of the BSD License available at
http://www.opensource.org/licenses/bsd-license.php

Trying to load: from: /pci@800000020000000/pci@6/scsi@7/disk@106000300000000 ... Successfully loaded
Linux ppc64le #1 SMP Wed Jun 2
[ 1.495800] vio vio: uevent: failed to send synthetic uevent
[ 4.166135] dracut-pre-pivot[648]: Jul 29 06:51:57 | /etc/multipath.conf does not exist, blacklisting all devices.
[ 4.166222] dracut-pre-pivot[648]: Jul 29 06:51:57 | You can run "/sbin/mpathconf --enable" to create
[ 4.166267] dracut-pre-pivot[648]: Jul 29 06:51:57 | /etc/multipath.conf. See man mpathconf(8) for more details
[ OK ] Stopped Journal Service.
Starting Journal Service...
[ OK ] Listening on Device-mapper event daemon FIFOs.
[ OK ] Stopped target Switch Root.
.....
I only hit this issue on POWER9 with qemu 4.1; POWER8 is OK. I also tried some other builds, and I guess this may be a qemu 4.1 regression. The details are as follows:

Hit this issue:
qemu-kvm-4.1.0-0.el8.patchwork201907241645.ppc64le
SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch

Hit this issue:
qemu-kvm-4.1.0-0.el8.patchwork201907241645.ppc64le
SLOF-20190114-2.gita5b428e.module+el8.1.0+3554+1a3a94a6

Not hit:
qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3
SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch

Not hit:
qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3
SLOF-20190114-2.gita5b428e.module+el8.1.0+3554+1a3a94a6
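The four combinations above can be read as a small test matrix. The following is a minimal sketch in Python that uses only the package versions listed in this comment; the names RESULTS, outcomes_by and determines_outcome are illustrative and not part of any tool. It checks which of the two packages alone predicts whether the issue is hit:

```python
# Test matrix from the comment above: (qemu-kvm build, SLOF build, issue hit?)
RESULTS = [
    ("qemu-kvm-4.1.0-0.el8.patchwork201907241645.ppc64le",
     "SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127", True),
    ("qemu-kvm-4.1.0-0.el8.patchwork201907241645.ppc64le",
     "SLOF-20190114-2.gita5b428e.module+el8.1.0+3554+1a3a94a6", True),
    ("qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3",
     "SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127", False),
    ("qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3",
     "SLOF-20190114-2.gita5b428e.module+el8.1.0+3554+1a3a94a6", False),
]


def outcomes_by(column):
    """Group the observed outcomes by one column (0 = qemu-kvm, 1 = SLOF)."""
    grouped = {}
    for row in RESULTS:
        grouped.setdefault(row[column], set()).add(row[2])
    return grouped


def determines_outcome(column):
    """True if each value in the column always produced the same outcome."""
    return all(len(seen) == 1 for seen in outcomes_by(column).values())


print("qemu-kvm version alone predicts the outcome:", determines_outcome(0))
print("SLOF version alone predicts the outcome:", determines_outcome(1))
```

With only four data points this is weak evidence, but it is consistent with the guess that the change came in with qemu 4.1 and not with SLOF.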
P9 host hardware information:
# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 128
On-line CPU(s) list: 0-127
Thread(s) per core: 4
Core(s) per socket: 16
Socket(s): 2
NUMA node(s): 2
Model: 2.2 (pvr 004e 1202)
Model name: POWER9, altivec supported
CPU max MHz: 3800.0000
CPU min MHz: 2166.0000
L1d cache: 32K
L1i cache: 32K
L2 cache: 512K
L3 cache: 10240K
NUMA node0 CPU(s): 0-63
NUMA node8 CPU(s): 64-127
I think the problem is related to the dual mode of the interrupt controller. To negotiate the IRQ mode, qemu forces a reset, and on this second reset SLOF stops again because the auto-boot parameter is still set to false. If you issue the boot command again, it should boot as expected.

I experienced the same problem with the qemu parameter "-no-reboot" and bisected it to:

bd94bc06479a ("spapr: change default interrupt mode to 'dual'")

You may work around the problem by defaulting to XIVE on a P9 machine:

qemu-kvm ... -M ic-mode=xive ...
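For test scripts that launch the guest, the workaround can be kept in one place. The following is a minimal sketch in Python, assuming a hypothetical helper named build_qemu_args; the only options it emits are the two quoted in this report ("-M ic-mode=xive" and "-prom-env 'auto-boot?=false'"), and the default binary name is the one used in a later comment. Disk, network and display options are left to the caller.

```python
import shlex


def build_qemu_args(binary="qemu-system-ppc64", ic_mode=None,
                    auto_boot=None, extra=()):
    """Assemble a qemu argument list (hypothetical helper, not a QEMU API).

    ic_mode:   value for "-M ic-mode=...", for example "xive"; None omits it.
    auto_boot: True or False sets the SLOF "auto-boot?" variable; None omits it.
    extra:     any further arguments, passed through unchanged.
    """
    args = [binary]
    if ic_mode is not None:
        args += ["-M", f"ic-mode={ic_mode}"]
    if auto_boot is not None:
        value = "true" if auto_boot else "false"
        args += ["-prom-env", f"auto-boot?={value}"]
    args += list(extra)
    return args


# Reproducer from the description combined with the XIVE workaround.
args = build_qemu_args(ic_mode="xive", auto_boot=False)
print(shlex.join(args))  # shell-quoted form, suitable for pasting into a terminal
```

Passing the list directly to subprocess.run avoids shell quoting problems with the "?" in "auto-boot?".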
I also hit this issue when I type "reboot" in the guest: the SLOF phase is executed twice, as follows:

Host:
4.18.0-122.el8.ppc64le
qemu-kvm-4.1.0-0.el8.patchwork201907241645.ppc64le
SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch
Guest:
4.18.0-124.el8.ppc64le

Type "reboot" in the guest; the console output is then:

SLOF **********************************************************************
QEMU Starting
Build Date = Jul 23 2019 04:40:55
FW Version = mockbuild@ release 20190703
Press "s" to enter Open Firmware.
Press F12 for boot menu.
Populating /vdevice methods
Populating /vdevice/vty@30000000
Populating /vdevice/nvram@71000000
Populating /pci@800000020000000
00 0000 (D) : 1234 1111 qemu vga
00 0800 (D) : 8086 25ab system-periphal*
00 1800 (D) : 1af4 1003 virtio [ serial ]
00 2800 (D) : 1033 0194 serial bus [ usb-xhci ]
00 3000 (B) : 1b36 0001 pci*
01 3800 (D) : 1af4 1004 virtio [ scsi ]
Populating /pci@800000020000000/pci@6/scsi@7
SCSI: Looking for devices
106000300000000 DISK : "QEMU QEMU HARDDISK 2.5+"
00 4800 (D) : 1af4 1004 virtio [ scsi ]
Populating /pci@800000020000000/scsi@9
SCSI: Looking for devices
00 5000 (D) : 1af4 1000 virtio [ net ]
00 6000 (D) : 1af4 1002 unknown-legacy-device*
Installing QEMU fb
Scanning USB
XHCI: Initializing
USB Keyboard
USB mouse
No console specified using screen & keyboard

Welcome to Open Firmware

Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
This program and the accompanying materials are made available
under the terms of the BSD License available at
http://www.opensource.org/licenses/bsd-license.php

Trying to load: from: /pci@800000020000000/pci@6/scsi@7/disk@106000300000000 ... Successfully loaded

SLOF **********************************************************************
QEMU Starting
Build Date = Jul 23 2019 04:40:55
FW Version = mockbuild@ release 20190703
Press "s" to enter Open Firmware.
Press F12 for boot menu.
Populating /vdevice methods
Populating /vdevice/vty@30000000
Populating /vdevice/nvram@71000000
Populating /pci@800000020000000
00 0000 (D) : 1234 1111 qemu vga
00 0800 (D) : 8086 25ab system-periphal*
00 1800 (D) : 1af4 1003 virtio [ serial ]
00 2800 (D) : 1033 0194 serial bus [ usb-xhci ]
00 3000 (B) : 1b36 0001 pci*
01 3800 (D) : 1af4 1004 virtio [ scsi ]
Populating /pci@800000020000000/pci@6/scsi@7
SCSI: Looking for devices
106000300000000 DISK : "QEMU QEMU HARDDISK 2.5+"
00 4800 (D) : 1af4 1004 virtio [ scsi ]
Populating /pci@800000020000000/scsi@9
SCSI: Looking for devices
00 5000 (D) : 1af4 1000 virtio [ net ]
00 6000 (D) : 1af4 1002 unknown-legacy-device*
Installing QEMU fb
Scanning USB
XHCI: Initializing
USB Keyboard
USB mouse
No console specified using screen & keyboard

Welcome to Open Firmware

Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
This program and the accompanying materials are made available
under the terms of the BSD License available at
http://www.opensource.org/licenses/bsd-license.php

Trying to load: from: /pci@800000020000000/pci@6/scsi@7/disk@106000300000000 ...
Successfully loaded
Linux ppc64le #1 SMP Mon Jul 2

Red Hat Enterprise Linux 8.1 Beta (Ootpa)
Kernel 4.18.0-124.el8.ppc64le on an ppc64le

Activate the web console with: systemctl enable --now cockpit.socket

dhcp19-129-188 login:

If a watchdog device is added and the watchdog action is "reset", QMP reports two "RESET" events:

{"timestamp": {"seconds": 1564466881, "microseconds": 246834}, "event": "WATCHDOG", "data": {"action": "reset"}}
{"timestamp": {"seconds": 1564466881, "microseconds": 428248}, "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}
{"timestamp": {"seconds": 1564466899, "microseconds": 365639}, "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}
{"timestamp": {"seconds": 1564466941, "microseconds": 782158}, "event": "RTC_CHANGE", "data": {"offset": 14402}}

This issue does not exist on qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3.ppc64le.
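The double reset can also be checked mechanically from the QMP event stream. The following is a minimal sketch in Python that parses the four events quoted above, one JSON object per line; the function name reset_times is illustrative. It reports how many RESET events were seen and how far apart they are:

```python
import json

# The four QMP events quoted in this comment, one JSON object per line.
QMP_LOG = """\
{"timestamp": {"seconds": 1564466881, "microseconds": 246834}, "event": "WATCHDOG", "data": {"action": "reset"}}
{"timestamp": {"seconds": 1564466881, "microseconds": 428248}, "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}
{"timestamp": {"seconds": 1564466899, "microseconds": 365639}, "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}
{"timestamp": {"seconds": 1564466941, "microseconds": 782158}, "event": "RTC_CHANGE", "data": {"offset": 14402}}
"""


def reset_times(log_text):
    """Return the timestamps (in seconds) of every RESET event in the log."""
    times = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("event") == "RESET":
            stamp = event["timestamp"]
            times.append(stamp["seconds"] + stamp["microseconds"] / 1_000_000)
    return times


resets = reset_times(QMP_LOG)
print(len(resets), "RESET events")
print(f"gap between the two resets: {resets[1] - resets[0]:.1f} s")
```

A single watchdog action is expected to produce one RESET event, so a count of two for one WATCHDOG event is the symptom described here.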
I downloaded and compiled upstream qemu and hit this issue too.

# /usr/bin/qemu-system-ppc64 -version
QEMU emulator version 4.0.92 (qemu-kvm-4.1.0)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers

SLOF **********************************************************************
QEMU Starting
Build Date = Jul 3 2019 12:26:14
FW Version = git-ba1ab360eebe6338
Press "s" to enter Open Firmware.
Populating /vdevice methods
Populating /vdevice/v-scsi@2000
SCSI: Looking for devices
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /pci@800000020000000
00 0000 (D) : 1234 1111 qemu vga
00 0800 (D) : 1033 0194 serial bus [ usb-xhci ]
00 1000 (D) : 1af4 1000 virtio [ net ]
00 2800 (D) : 1af4 1001 virtio [ block ]
No NVRAM common partition, re-initializing...
Installing QEMU fb
Scanning USB
XHCI: Initializing
USB Keyboard
No console specified using screen & keyboard

Welcome to Open Firmware

Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
This program and the accompanying materials are made available
under the terms of the BSD License available at
http://www.opensource.org/licenses/bsd-license.php

Trying to load: from: /pci@800000020000000/scsi@5 ... Successfully loaded

SLOF **********************************************************************
QEMU Starting
Build Date = Jul 3 2019 12:26:14
FW Version = git-ba1ab360eebe6338
Press "s" to enter Open Firmware.
Populating /vdevice methods
Populating /vdevice/v-scsi@2000
SCSI: Looking for devices
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /pci@800000020000000
00 0000 (D) : 1234 1111 qemu vga
00 0800 (D) : 1033 0194 serial bus [ usb-xhci ]
00 1000 (D) : 1af4 1000 virtio [ net ]
00 2800 (D) : 1af4 1001 virtio [ block ]
Installing QEMU fb
Scanning USB
XHCI: Initializing
USB Keyboard
No console specified using screen & keyboard

Welcome to Open Firmware

Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
This program and the accompanying materials are made available
under the terms of the BSD License available at
http://www.opensource.org/licenses/bsd-license.php

Trying to load: from: /pci@800000020000000/scsi@5 ... Successfully loaded
Linux ppc64le #1 SMP Wed Jul 2
eth0: flags=4098<BROADCAST,MULTICAST> mtu 1500
        ether 40:f2:e9:5d:9c:07 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1000 (Local Loopback)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

Red Hat Enterprise Linux 8.1 Beta (Ootpa)
Kernel 4.18.0-122.el8.ppc64le on an ppc64le

Activate the web console with: systemctl enable --now cockpit.socket

dhcp19-129-145 login:
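In the console logs in this report, each firmware start prints the line 'Press "s" to enter Open Firmware.' once, so counting that line in a captured console log gives the number of times SLOF ran. The following is a minimal sketch in Python; the function names are illustrative, and the sample text is an abbreviated excerpt of the log above, not a complete capture.

```python
# Line that SLOF prints once per firmware start in the logs in this report.
BANNER = 'Press "s" to enter Open Firmware.'


def count_firmware_starts(console_text):
    """Count how many times the SLOF banner line appears in the console text."""
    return console_text.count(BANNER)


def has_extra_reset(console_text, expected_starts=1):
    """True if SLOF started more often than expected for a single boot."""
    return count_firmware_starts(console_text) > expected_starts


# Abbreviated excerpt of the console output above (most lines omitted).
sample = "\n".join([
    "QEMU Starting",
    'Press "s" to enter Open Firmware.',
    "Trying to load: from: /pci@800000020000000/scsi@5 ... Successfully loaded",
    "QEMU Starting",
    'Press "s" to enter Open Firmware.',
    "Trying to load: from: /pci@800000020000000/scsi@5 ... Successfully loaded",
    "Linux ppc64le",
])

print("firmware starts:", count_firmware_starts(sample))
print("extra reset seen:", has_extra_reset(sample))
```

The same check could be run against a serial console capture in an automated test, on the assumption that the banner text does not change between SLOF builds.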
This is important enough that I want to get this in for RHEL-AV-8.2 timeframe.
This should be fixed upstream by:

spapr: Use SHUTDOWN_CAUSE_SUBSYSTEM_RESET for CAS reboots

That will be backported for BZ 1743477
> This should be fixed upstream by:
>
> spapr: Use SHUTDOWN_CAUSE_SUBSYSTEM_RESET for CAS reboots
>
> That will be backported for BZ 1743477

No, it's not. That change fixes the -no-reboot case by finessing qemu's internal logic. However, from SLOF's perspective the CAS reboot still looks like a new boot and triggers this problem. Fixing this will be trickier.
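To make the sequence concrete, here is a toy model in Python of the behaviour described in this thread. It is not QEMU or SLOF code; the function simulate, its parameters and the trace strings are all invented for illustration. The model assumes only what the comments above state: with the 'dual' interrupt mode, the first kernel load triggers a CAS reset, and after that reset SLOF starts from scratch and honours auto-boot?=false again.

```python
def simulate(auto_boot, typed_commands, cas_needs_reset=True):
    """Toy model of the boot sequence (illustrative, not real firmware logic).

    auto_boot:        value of the SLOF "auto-boot?" variable.
    typed_commands:   "boot" commands the user types at the "0 >" prompt.
    cas_needs_reset:  True models the 'dual' interrupt mode, where the first
                      kernel load causes one CAS reset; False models a setup
                      where no CAS reset happens.
    Returns the list of console states that were reached.
    """
    commands = list(typed_commands)
    cas_done = not cas_needs_reset
    trace = []
    while True:
        trace.append("SLOF start")
        if not auto_boot:
            trace.append("prompt 0 >")
            if not commands:
                return trace          # stuck at the prompt, waiting for input
            commands.pop(0)           # the user types "boot"
        trace.append("load kernel")
        if not cas_done:
            cas_done = True
            trace.append("CAS reset")  # to SLOF this looks like a new boot
            continue
        trace.append("kernel running")
        return trace


print(simulate(auto_boot=False, typed_commands=["boot"]))
print(simulate(auto_boot=False, typed_commands=["boot", "boot"]))
print(simulate(auto_boot=False, typed_commands=["boot"], cas_needs_reset=False))
```

In this model one "boot" command ends back at the prompt, two "boot" commands reach the kernel, and without the CAS reset a single "boot" is enough, which matches the reported behaviour and the ic-mode=xive workaround.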
I've posted a change which should address this problem upstream.
Trying a draft downstream backport at: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=25572325
Verified the bug on the following software versions, according to the steps in the bug Description:

Host kernel: 4.18.0-169.el8.ppc64le
Guest kernel: 4.18.0-170.el8.ppc64le
qemu-kvm-4.2.0-6.module+el8.2.0+5453+31b2b136.ppc64le
SLOF-20191022-3.git899d9883.module+el8.2.0+5449+efc036dd.noarch
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2017