Bug 1733893 - Boot a guest with "-prom-env 'auto-boot?=false'", SLOF failed to enter the boot entry after input "boot" followed by "0 > " on VNC
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: ppc64le
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 8.2
Assignee: David Gibson
QA Contact: xianwang
URL:
Whiteboard:
Depends On:
Blocks: 1711971
Reported: 2019-07-29 07:06 UTC by xianwang
Modified: 2020-05-05 09:46 UTC (History)

Fixed In Version: qemu-kvm-4.2.0-6.module+el8.2.0+5451+991cea0d
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-05 09:46:33 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
IBM Linux Technology Center 183212 0 None None None 2020-01-15 11:04:36 UTC

Description xianwang 2019-07-29 07:06:09 UTC
Description of problem:
When a guest is booted with "-prom-env 'auto-boot?=false'", it stops at the SLOF phase with the "0 > " prompt. After typing "boot", the guest continues to boot but fails to enter the boot entry and returns to the SLOF prompt again.

Version-Release number of selected component (if applicable):
Host:
4.18.0-122.el8.ppc64le
qemu-kvm-4.1.0-0.el8.patchwork201907241645.ppc64le
SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with the qemu command-line option "-prom-env 'auto-boot?=false'".
2. The guest stops booting at the SLOF phase, as follows:

Ready! 
0 >

3. Then type "boot" via VNC:
Ready! 
0 > boot

Actual results:
The guest returns to SLOF and stops booting again:
Ready! 
0 >

Expected results:
The guest enters the boot entry and starts the kernel.

Additional info:
If the guest is booted without "-prom-env 'auto-boot?=false'" and "s" is pressed at the early stage of boot, the guest also stops at "0 > ". After typing "boot", it returns to SLOF again and then continues to boot automatically. The output is as follows:

Ready! 
0 > boot  
Trying to load:  from: /pci@800000020000000/pci@6/scsi@7/disk@106000300000000 ...   Successfully loaded


SLOF **********************************************************************
QEMU Starting
 Build Date = Jul 23 2019 04:40:55
 FW Version = mockbuild@ release 20190703
 Press "s" to enter Open Firmware.

Press F12 for boot menu.

Populating /vdevice methods
Populating /vdevice/vty@30000000
Populating /vdevice/nvram@71000000
Populating /pci@800000020000000
                     00 0000 (D) : 1234 1111    qemu vga
                     00 0800 (D) : 8086 25ab    system-periphal*
                     00 1800 (D) : 1af4 1003    virtio [ serial ]
                     00 2800 (D) : 1033 0194    serial bus [ usb-xhci ]
                     00 3000 (B) : 1b36 0001    pci*
                     01 3800 (D) : 1af4 1004    virtio [ scsi ]
Populating /pci@800000020000000/pci@6/scsi@7
       SCSI: Looking for devices
          106000300000000 DISK     : "QEMU     QEMU HARDDISK    2.5+"
                     00 4800 (D) : 1af4 1004    virtio [ scsi ]
Populating /pci@800000020000000/scsi@9
       SCSI: Looking for devices
                     00 5000 (D) : 1af4 1000    virtio [ net ]
                     00 6000 (D) : 1af4 1002    unknown-legacy-device*
Installing QEMU fb



Scanning USB 
  XHCI: Initializing
    USB Keyboard 
    USB mouse 
No console specified using screen & keyboard
     




  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php


Trying to load:  from: /pci@800000020000000/pci@6/scsi@7/disk@106000300000000 ...   Successfully loaded
Linux ppc64le
#1 SMP Wed Jun 2[    1.495800] vio vio: uevent: failed to send synthetic uevent
[    4.166135] dracut-pre-pivot[648]: Jul 29 06:51:57 | /etc/multipath.conf does not exist, blacklisting all devices.
[    4.166222] dracut-pre-pivot[648]: Jul 29 06:51:57 | You can run "/sbin/mpathconf --enable" to create
[    4.166267] dracut-pre-pivot[648]: Jul 29 06:51:57 | /etc/multipath.conf. See man mpathconf(8) for more details
[  OK  ] Stopped Journal Service.
         Starting Journal Service...
[  OK  ] Listening on Device-mapper event daemon FIFOs.
[  OK  ] Stopped target Switch Root.
.....

Comment 1 xianwang 2019-07-29 07:12:41 UTC
I only hit this issue on POWER9 with qemu 4.1; POWER8 is fine. I have also tried some other builds, and I guess this is a qemu 4.1 regression. The details are as follows:

Hit this issue:
qemu-kvm-4.1.0-0.el8.patchwork201907241645.ppc64le
SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch

Hit this issue:
qemu-kvm-4.1.0-0.el8.patchwork201907241645.ppc64le
SLOF-20190114-2.gita5b428e.module+el8.1.0+3554+1a3a94a6

Not hit:
qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3
SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch

Not hit:
qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3
SLOF-20190114-2.gita5b428e.module+el8.1.0+3554+1a3a94a6
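
The four combinations above can be checked mechanically to see which component tracks the failure. The following is a minimal sketch, not part of the original test setup: the helper names (`outcomes_by`, `decides_outcome`) are hypothetical, and the package strings are copied verbatim from this comment.

```python
# Test matrix copied from the comment above: (qemu-kvm build, SLOF build, issue hit?)
RESULTS = [
    ("qemu-kvm-4.1.0-0.el8.patchwork201907241645.ppc64le",
     "SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch", True),
    ("qemu-kvm-4.1.0-0.el8.patchwork201907241645.ppc64le",
     "SLOF-20190114-2.gita5b428e.module+el8.1.0+3554+1a3a94a6", True),
    ("qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3",
     "SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch", False),
    ("qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3",
     "SLOF-20190114-2.gita5b428e.module+el8.1.0+3554+1a3a94a6", False),
]


def outcomes_by(column, results):
    """Group the hit/not-hit outcomes by one column (0 = qemu, 1 = SLOF)."""
    grouped = {}
    for row in results:
        grouped.setdefault(row[column], set()).add(row[2])
    return grouped


def decides_outcome(column, results):
    """True if each value in the column always gives one outcome,
    and different values give different outcomes."""
    grouped = outcomes_by(column, results)
    consistent = all(len(seen) == 1 for seen in grouped.values())
    distinct = len({next(iter(seen)) for seen in grouped.values()}) > 1
    return consistent and distinct


print("qemu-kvm build decides outcome:", decides_outcome(0, RESULTS))
print("SLOF build decides outcome:", decides_outcome(1, RESULTS))
```

In this matrix only the qemu-kvm column separates "hit" from "not hit", which is consistent with the regression guess above.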

Comment 2 xianwang 2019-07-29 09:22:13 UTC
P9 host hardware information:
# lscpu
Architecture:        ppc64le
Byte Order:          Little Endian
CPU(s):              128
On-line CPU(s) list: 0-127
Thread(s) per core:  4
Core(s) per socket:  16
Socket(s):           2
NUMA node(s):        2
Model:               2.2 (pvr 004e 1202)
Model name:          POWER9, altivec supported
CPU max MHz:         3800.0000
CPU min MHz:         2166.0000
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            10240K
NUMA node0 CPU(s):   0-63
NUMA node8 CPU(s):   64-127

Comment 4 Laurent Vivier 2019-07-29 20:03:59 UTC
I think the problem is related to the dual mode of the interrupt controller.

To negotiate the IRQ mode, qemu forces a reset. On this second reset the auto-boot parameter is still false, so SLOF stops again.

If you issue the boot command again it should boot as expected.

I experienced the same problem with the qemu parameter "-no-reboot" and I bisected to:

    bd94bc06479a ("spapr: change default interrupt mode to 'dual'")

You can work around the problem by defaulting to XIVE on a P9 machine:

  qemu-kvm ... -M ic-mode=xive ...
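
For scripted reproduction, the two options mentioned in this report can be appended to an existing argument list. This is only a sketch: the helper names are hypothetical, `base` stands in for the rest of the guest definition (elided as "..." above), and if the real command line already carries a `-M`/`-machine` option, the `ic-mode=xive` property would presumably need to be merged into it rather than added separately.

```python
import shlex


def with_manual_boot(base_args):
    """Return a copy of base_args that stops at the SLOF prompt (option from the bug description)."""
    return list(base_args) + ["-prom-env", "auto-boot?=false"]


def with_xive_workaround(base_args):
    """Return a copy of base_args with the interrupt mode forced to XIVE (workaround from this comment)."""
    return list(base_args) + ["-M", "ic-mode=xive"]


# Placeholder only: disks, memory, display and so on are not shown in the report.
base = ["qemu-kvm"]
argv = with_xive_workaround(with_manual_boot(base))

# shlex.join quotes the '?' so the line can be pasted into a shell safely.
print(shlex.join(argv))
```

Passing `argv` as a list (for example to `subprocess.run`) avoids shell quoting problems with `auto-boot?=false` altogether.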

Comment 5 xianwang 2019-07-30 06:41:00 UTC
I also hit this issue when I type "reboot" in the guest: the SLOF phase is executed twice, as follows:
Host:
4.18.0-122.el8.ppc64le
qemu-kvm-4.1.0-0.el8.patchwork201907241645.ppc64le
SLOF-20190703-1.gitba1ab360.module+el8.1.0+3730+7d905127.noarch

Guest:
4.18.0-124.el8.ppc64le

Type "reboot" in the guest; the console output is then:

SLOF **********************************************************************
QEMU Starting
 Build Date = Jul 23 2019 04:40:55
 FW Version = mockbuild@ release 20190703
 Press "s" to enter Open Firmware.

Press F12 for boot menu.

Populating /vdevice methods
Populating /vdevice/vty@30000000
Populating /vdevice/nvram@71000000
Populating /pci@800000020000000
                     00 0000 (D) : 1234 1111    qemu vga
                     00 0800 (D) : 8086 25ab    system-periphal*
                     00 1800 (D) : 1af4 1003    virtio [ serial ]
                     00 2800 (D) : 1033 0194    serial bus [ usb-xhci ]
                     00 3000 (B) : 1b36 0001    pci*
                     01 3800 (D) : 1af4 1004    virtio [ scsi ]
Populating /pci@800000020000000/pci@6/scsi@7
       SCSI: Looking for devices
          106000300000000 DISK     : "QEMU     QEMU HARDDISK    2.5+"
                     00 4800 (D) : 1af4 1004    virtio [ scsi ]
Populating /pci@800000020000000/scsi@9
       SCSI: Looking for devices
                     00 5000 (D) : 1af4 1000    virtio [ net ]
                     00 6000 (D) : 1af4 1002    unknown-legacy-device*
Installing QEMU fb



Scanning USB 
  XHCI: Initializing
    USB Keyboard 
    USB mouse 
No console specified using screen & keyboard
     




  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php


Trying to load:  from: /pci@800000020000000/pci@6/scsi@7/disk@106000300000000 ...   Successfully loaded


SLOF **********************************************************************
QEMU Starting
 Build Date = Jul 23 2019 04:40:55
 FW Version = mockbuild@ release 20190703
 Press "s" to enter Open Firmware.

Press F12 for boot menu.

Populating /vdevice methods
Populating /vdevice/vty@30000000
Populating /vdevice/nvram@71000000
Populating /pci@800000020000000
                     00 0000 (D) : 1234 1111    qemu vga
                     00 0800 (D) : 8086 25ab    system-periphal*
                     00 1800 (D) : 1af4 1003    virtio [ serial ]
                     00 2800 (D) : 1033 0194    serial bus [ usb-xhci ]
                     00 3000 (B) : 1b36 0001    pci*
                     01 3800 (D) : 1af4 1004    virtio [ scsi ]
Populating /pci@800000020000000/pci@6/scsi@7
       SCSI: Looking for devices
          106000300000000 DISK     : "QEMU     QEMU HARDDISK    2.5+"
                     00 4800 (D) : 1af4 1004    virtio [ scsi ]
Populating /pci@800000020000000/scsi@9
       SCSI: Looking for devices
                     00 5000 (D) : 1af4 1000    virtio [ net ]
                     00 6000 (D) : 1af4 1002    unknown-legacy-device*
Installing QEMU fb



Scanning USB 
  XHCI: Initializing
    USB Keyboard 
    USB mouse 
No console specified using screen & keyboard
     




  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php


Trying to load:  from: /pci@800000020000000/pci@6/scsi@7/disk@106000300000000 ...   Successfully loaded
Linux ppc64le
#1 SMP Mon Jul 2
Red Hat Enterprise Linux 8.1 Beta (Ootpa)
Kernel 4.18.0-124.el8.ppc64le on an ppc64le

Activate the web console with: systemctl enable --now cockpit.socket

dhcp19-129-188 login:

If a watchdog device is added with the watchdog action set to "reset", QMP reports two "RESET" events:
{"timestamp": {"seconds": 1564466881, "microseconds": 246834}, "event": "WATCHDOG", "data": {"action": "reset"}}
{"timestamp": {"seconds": 1564466881, "microseconds": 428248}, "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}
{"timestamp": {"seconds": 1564466899, "microseconds": 365639}, "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}
{"timestamp": {"seconds": 1564466941, "microseconds": 782158}, "event": "RTC_CHANGE", "data": {"offset": 14402}}

This issue does not exist on qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3.ppc64le
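
The duplicate reset can be detected from the QMP event stream above. The sketch below parses the four event lines exactly as pasted in this comment; the function names are hypothetical, and it assumes one JSON object per line.

```python
import json

# QMP events copied verbatim from this comment, one JSON object per line.
QMP_LINES = [
    '{"timestamp": {"seconds": 1564466881, "microseconds": 246834}, "event": "WATCHDOG", "data": {"action": "reset"}}',
    '{"timestamp": {"seconds": 1564466881, "microseconds": 428248}, "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}',
    '{"timestamp": {"seconds": 1564466899, "microseconds": 365639}, "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}',
    '{"timestamp": {"seconds": 1564466941, "microseconds": 782158}, "event": "RTC_CHANGE", "data": {"offset": 14402}}',
]


def resets_after_watchdog(lines):
    """Count RESET events seen after a WATCHDOG event whose action is 'reset'."""
    armed = False
    count = 0
    for line in lines:
        event = json.loads(line)
        if event.get("event") == "WATCHDOG" and event.get("data", {}).get("action") == "reset":
            armed = True
        elif armed and event.get("event") == "RESET":
            count += 1
    return count


def reset_gap_microseconds(lines):
    """Time between the first and last RESET event, in microseconds (None if fewer than two)."""
    stamps = []
    for line in lines:
        event = json.loads(line)
        if event.get("event") == "RESET":
            ts = event["timestamp"]
            stamps.append(ts["seconds"] * 1_000_000 + ts["microseconds"])
    if len(stamps) < 2:
        return None
    return stamps[-1] - stamps[0]


print("RESET events after watchdog:", resets_after_watchdog(QMP_LINES))
print("gap between resets (us):", reset_gap_microseconds(QMP_LINES))
```

A single watchdog reset would be expected to produce one RESET event; a count of two matches the double SLOF pass shown in the console output.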

Comment 6 Zhenyu Zhang 2019-08-02 08:03:28 UTC
Downloaded and compiled upstream qemu; it hits this issue too.
# /usr/bin/qemu-system-ppc64 -version
QEMU emulator version 4.0.92 (qemu-kvm-4.1.0)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers

SLOF **********************************************************************
QEMU Starting
 Build Date = Jul  3 2019 12:26:14
 FW Version = git-ba1ab360eebe6338
 Press "s" to enter Open Firmware.

Populating /vdevice methods
Populating /vdevice/v-scsi@2000
       SCSI: Looking for devices
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /pci@800000020000000
                     00 0000 (D) : 1234 1111    qemu vga
                     00 0800 (D) : 1033 0194    serial bus [ usb-xhci ]
                     00 1000 (D) : 1af4 1000    virtio [ net ]
                     00 2800 (D) : 1af4 1001    virtio [ block ]
No NVRAM common partition, re-initializing...
Installing QEMU fb



Scanning USB 
  XHCI: Initializing
    USB Keyboard 
No console specified using screen & keyboard
     
  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php


Trying to load:  from: /pci@800000020000000/scsi@5 ...   Successfully loaded


SLOF **********************************************************************
QEMU Starting
 Build Date = Jul  3 2019 12:26:14
 FW Version = git-ba1ab360eebe6338
 Press "s" to enter Open Firmware.

Populating /vdevice methods
Populating /vdevice/v-scsi@2000
       SCSI: Looking for devices
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /pci@800000020000000
                     00 0000 (D) : 1234 1111    qemu vga
                     00 0800 (D) : 1033 0194    serial bus [ usb-xhci ]
                     00 1000 (D) : 1af4 1000    virtio [ net ]
                     00 2800 (D) : 1af4 1001    virtio [ block ]
Installing QEMU fb



Scanning USB 
  XHCI: Initializing
    USB Keyboard 
No console specified using screen & keyboard
     
  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php


Trying to load:  from: /pci@800000020000000/scsi@5 ...   Successfully loaded
Linux ppc64le
#1 SMP Wed Jul 2eth0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 40:f2:e9:5d:9c:07  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


Red Hat Enterprise Linux 8.1 Beta (Ootpa)
Kernel 4.18.0-122.el8.ppc64le on an ppc64le

Activate the web console with: systemctl enable --now cockpit.socket

dhcp19-129-145 login:
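
The double firmware pass in the console logs above can be spotted automatically by counting SLOF banners. This is a rough sketch: `count_firmware_passes` is a hypothetical helper, and `SAMPLE` is a heavily abbreviated excerpt of the log in this comment, keeping only the lines that matter for the check.

```python
BANNER_PREFIX = "SLOF **"  # start of the banner line SLOF prints on every start


def count_firmware_passes(console_text):
    """Count how many times the SLOF banner appears in a captured console log."""
    return sum(1 for line in console_text.splitlines() if line.startswith(BANNER_PREFIX))


# Abbreviated from the console output in this comment (device probing lines omitted).
SAMPLE = """\
SLOF **********************************************************************
QEMU Starting
Trying to load:  from: /pci@800000020000000/scsi@5 ...   Successfully loaded

SLOF **********************************************************************
QEMU Starting
Trying to load:  from: /pci@800000020000000/scsi@5 ...   Successfully loaded
Linux ppc64le
"""

passes = count_firmware_passes(SAMPLE)
print("firmware passes:", passes)
if passes > 1:
    print("SLOF ran more than once before the kernel started")
```

On a build without the problem, a single boot or reboot would be expected to show the banner once.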

Comment 7 David Gibson 2019-09-02 03:57:36 UTC
This is important enough that I want to get this in for the RHEL-AV-8.2 timeframe.

Comment 8 Laurent Vivier 2019-09-02 10:19:09 UTC
This should be fixed upstream by:

  spapr: Use SHUTDOWN_CAUSE_SUBSYSTEM_RESET for CAS reboots

That will be backported for BZ 1743477

Comment 9 David Gibson 2019-09-03 00:10:16 UTC
> This should be fixed upstream by:
> 
>   spapr: Use SHUTDOWN_CAUSE_SUBSYSTEM_RESET for CAS reboots
> 
> That will be backported for BZ 1743477

No, it's not.  That change fixes the -no-reboot case by finessing qemu's internal logic.  However, from SLOF's perspective the CAS reboot still looks like a new boot and triggers this problem.  Fixing this will be trickier.
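
The behaviour described in this thread can be summarised as a toy model. This is not QEMU or SLOF code, only a sketch of the reasoning in comments 4 and 9, under the assumption that the CAS reset makes firmware start over with the same auto-boot? setting; the function name and parameters are hypothetical.

```python
def boot_commands_needed(auto_boot, cas_reset_needed):
    """Toy model: how many times "boot" must be typed before the kernel runs.

    auto_boot        -- value of the auto-boot? firmware variable
    cas_reset_needed -- True when the machine resets during CAS
                        (for example to switch the interrupt mode)
    """
    typed = 0
    # First firmware pass: with auto-boot? false, SLOF waits at "0 > ".
    if not auto_boot:
        typed += 1
    # The CAS reset looks like a fresh boot to SLOF, so it reads
    # auto-boot? again and stops at the prompt a second time.
    if cas_reset_needed and not auto_boot:
        typed += 1
    return typed


# Reported failure: auto-boot? false with the default 'dual' interrupt mode.
print(boot_commands_needed(auto_boot=False, cas_reset_needed=True))
# Workaround from comment 4: no reset is needed when the mode is fixed to XIVE.
print(boot_commands_needed(auto_boot=False, cas_reset_needed=False))
```

In this model the -no-reboot fix does not change the count, because the second stop comes from SLOF rereading auto-boot? rather than from how qemu classifies the reset.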

Comment 10 David Gibson 2019-11-29 05:34:48 UTC
I've posted a change which should address this problem upstream.

Comment 11 David Gibson 2020-01-02 05:33:42 UTC
Trying a draft downstream backport at:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=25572325

Comment 15 Gu Nini 2020-01-17 02:30:34 UTC
Verified the bug on the following software versions, following the steps in the bug description:

Host kernel: 4.18.0-169.el8.ppc64le
Guest kernel: 4.18.0-170.el8.ppc64le
qemu-kvm-4.2.0-6.module+el8.2.0+5453+31b2b136.ppc64le
SLOF-20191022-3.git899d9883.module+el8.2.0+5449+efc036dd.noarch

Comment 16 Ademar Reis 2020-02-05 23:01:36 UTC
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 18 errata-xmlrpc 2020-05-05 09:46:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2017

