Bug 1778704 - Guest failed to boot up with device under pci-bridge
Summary: Guest failed to boot up with device under pci-bridge
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: SLOF
Version: 8.2
Hardware: ppc64le
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.2
Assignee: David Gibson
QA Contact: Gu Nini
URL:
Whiteboard:
Depends On: 1804038
Blocks: 1711971
 
Reported: 2019-12-02 10:50 UTC by Gu Nini
Modified: 2020-05-05 09:51 UTC
CC List: 18 users

Fixed In Version: SLOF-20191022-3.git899d9883.module+el8.2.0+5449+efc036dd
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-05 09:51:23 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
guest_serial_log-12022019 (12.38 KB, text/plain)
2019-12-02 10:50 UTC, Gu Nini
serial_log_when_used_pci_bridge_as_system_disk-01162020 (42.90 KB, text/plain)
2020-01-16 09:54 UTC, Gu Nini


Links
System ID Private Priority Status Summary Last Updated
IBM Linux Technology Center 183129 0 None None None 2020-01-08 23:46:15 UTC
Red Hat Product Errata RHBA-2020:2017 0 None None None 2020-05-05 09:51:57 UTC

Description Gu Nini 2019-12-02 10:50:13 UTC
Created attachment 1641335 [details]
guest_serial_log-12022019

Description of problem:
When booting a guest with a device under a pci-bridge, such as a virtio-balloon device, virtio-scsi disk, virtio-blk disk, or virtio-net device, the guest fails to boot up and hangs at the following point:

OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 4.18.0-159.el8.ppc64le (mockbuild.eng.bos.redhat.com) (gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC)) #1 SMP Sat Nov 30 13:59:46 UTC 2019
Detected machine type: 0000000000000101
command line: BOOT_IMAGE=/vmlinuz-4.18.0-159.el8.ppc64le root=/dev/mapper/rhel_dhcp16--213--204-root ro console=ttyS0,115200 crashkernel=auto rd.lvm.lv=rhel_dhcp16-213-204/root rd.lvm.lv=rhel_dhcp16-213-204/swap biosdevname=0 net.ifnames=0 console=tty0 biosdevname=0 net.ifnames=0 console=hvc0,38400
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support...

For the full guest serial output log, please check the attachment 'guest_serial_log-12022019'.


Version-Release number of selected component (if applicable):
Host kernel: 4.18.0-157.el8.ppc64le
Qemu: qemu-kvm-4.2.0-1.module+el8.2.0+4793+b09dd2fb.ppc64le
SLOF: SLOF-20191022-1.git899d9883.module+el8.2.0+4793+b09dd2fb.noarch

How reproducible:
100%

Steps to Reproduce:
1. Boot up a guest with a device under a pci-bridge, such as the virtio-blk disk in the following command line:

/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -machine pseries \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2 \
    -m 8192  \
    -smp 8,maxcpus=8,cores=4,threads=1,sockets=2  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado_1,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -device qemu-xhci,id=usb1,bus=pci.0,addr=0x4 \
    -device pci-bridge,id=pci_bridge1,bus=pci.0,addr=0x3,chassis_nr=1 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/ngu/$1 \
    -device virtio-blk-pci,addr=0x7,id=image1,drive=drive_image1,bootindex=0,bus=pci_bridge1 \
    -device virtio-net-pci,mac=9a:9a:cb:b0:29:31,id=idmp8SrA,netdev=idlfo6Y1,bus=pci.0  \
    -netdev tap,id=idlfo6Y1,vhost=on \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :10  \
    -rtc base=utc,clock=host  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -serial unix:/tmp/test1,server,nowait \
    -monitor stdio

2. Connect guest serial and check the output:
# nc -U /tmp/test1
......


Actual results:
The boot process hangs.

Expected results:
The guest should boot up successfully.

Additional info:
I suspect this is a regression caused by the new SLOF-20191022-1.git899d9883.module+el8.2.0+4793+b09dd2fb; I will check later.

Comment 2 David Gibson 2019-12-03 01:02:16 UTC
Chances are very high this is ppc64 specific.

I don't know that the problem will be in the new SLOF as such, but it could well be due to the changes in how we generate the device tree, which involved co-ordinated changes in both qemu and SLOF.

Can you confirm this is a regression from the RHEL-AV-8.1 versions of qemu and SLOF, taken together?

Comment 3 Gu Nini 2019-12-03 06:42:32 UTC
(In reply to David Gibson from comment #2)
> Chances are very high this is ppc64 specific.
> 
> I don't know that the problem will be in the new SLOF as such, but it could
> well be due to the changes in how we generate the device tree, which
> involved co-ordinated changes in both qemu and SLOF.
> 
> Can you confirm this is a regression from the RHEL-AV-8.1 versions of qemu
> and SLOF, taken together?

I switched qemu to the following RHELAV 8.1.1 and RHELAV 8.1.0 builds while keeping SLOF-20191022-1.git899d9883.module+el8.2.0+4793+b09dd2fb.noarch unchanged, then ran the same test as in the steps of the bug description. There was no problem, i.e. the guest booted up successfully. So the bug is a regression in qemu 4.2.

RHELAV 8.1.1: qemu-kvm-4.1.0-17.module+el8.1.1+5019+2d64ad78.ppc64le
RHELAV 8.1.0: qemu-kvm-4.1.0-14.module+el8.1.0+4754+8d38b36b
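
For reference, a rough sketch of how the check above can be repeated (module/stream handling on the test host may require extra steps beyond this):

# downgrade qemu-kvm only, keep SLOF installed as-is
dnf downgrade qemu-kvm-4.1.0-17.module+el8.1.1+5019+2d64ad78
rpm -q qemu-kvm SLOF    # confirm qemu-kvm is 4.1.0 and SLOF is still SLOF-20191022-1
# then re-run the qemu-kvm command line from the description and watch the serial console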

Comment 4 Gu Nini 2019-12-03 06:57:27 UTC
Reset Target Release and Internal Target Release.

Comment 5 Laurent Vivier 2019-12-03 15:59:07 UTC
It's a regression in QEMU, since v4.2.0-rc0.
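
Roughly, a bisect like the sketch below (tag endpoints assumed from the version range above, with the qemu-kvm command line from the description as the reproducer) lands on the commit quoted next:

git bisect start
git bisect bad v4.2.0-rc0
git bisect good v4.1.0
# at each step: rebuild qemu, re-run the reproducer,
# then mark the step with "git bisect good" or "git bisect bad"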

The regression has been introduced by:

commit e68cd0cb5cf49d334abe17231a1d2c28b846afa2
Author: Alexey Kardashevskiy <aik>
Date:   Mon Sep 2 15:41:16 2019 +1000

    spapr: Render full FDT on ibm,client-architecture-support

    The ibm,client-architecture-support call is a way for the guest to
    negotiate capabilities with a hypervisor. It is implemented as:
    - the guest calls SLOF via client interface;
    - SLOF calls QEMU (H_CAS hypercall) with an options vector from the guest;
    - QEMU returns a device tree diff (which uses FDT format with
    an additional header before it);
    - SLOF walks through the partial diff tree and updates its internal tree
    with the values from the diff.

    This changes QEMU to simply re-render the entire tree and send it as
    an update. SLOF can handle this already mostly, [1] is needed before this
    can be applied. This stores the resulting tree in the spapr machine to have
    the latest valid FDT copy possible (this should not matter much as
    H_UPDATE_DT happens right after that but nevertheless).
    
    The benefit is reduced code size as there is no need for another set of
    DT rendering helpers such as spapr_fixup_cpu_dt().
    
    The downside is that the updates are bigger now (as they include all
    nodes and properties) but the difference on a '-smp 256,threads=1' system
    before/after is 2.35s vs. 2.5s.
  
    [1] https://patchwork.ozlabs.org/patch/1152915/
    
    Signed-off-by: Alexey Kardashevskiy <aik>
    Signed-off-by: David Gibson <david.id.au>

Comment 6 Laurent Vivier 2019-12-03 16:06:22 UTC
Bisected using SLOF git-bcc3c4e5c21a015f

bcc3c4e (tag: qemu-slof-20190911) version: update to 20190911
a09b722 usb-host: Do not override USB node name
ea22160 (tag: qemu-slof-20190827) version: update to 20190827
eeed8a1 libnet: Fix the check of the argument lengths of the "ping" command
44d06f9 fdt: Update phandles after H_CAS
674d0d0 rtas: Reserve space for FWNMI log
7bfe584 (tag: qemu-slof-20190719) version: update to 20190719
5e4ed1f rtas: Integrate RTAS blob
ba1ab36 (tag: qemu-slof-20190703) version: update to 20190703

That includes "fdt: Update phandles after H_CAS"
 [1] https://patchwork.ozlabs.org/patch/1152915/
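
For reference, a listing like the one above can be regenerated from an SLOF checkout with something along these lines (the tag range is assumed from the decorated entries shown):

git log --oneline --decorate qemu-slof-20190703~1..qemu-slof-20190911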

Comment 7 Laurent Vivier 2019-12-03 18:02:16 UTC
This is only a problem at boot if there is a device behind the pci-bridge.

The kernel can boot if the pci-bridge is empty, and hotplugging a device onto the pci-bridge afterwards works too.
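
As an illustration of the hotplug path (a sketch only, assuming the guest was started without the virtio-blk-pci device but with the same drive_image1 and pci_bridge1 definitions as in the description), a device can then be added from the HMP monitor:

(qemu) device_add virtio-blk-pci,id=image1,drive=drive_image1,bus=pci_bridge1,addr=0x7
(qemu) info pci    # the new device should show up under the bridge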

Comment 8 David Gibson 2019-12-04 05:21:45 UTC
I've discussed this with Alexey, and he's working on a fix upstream.  The problem is in SLOF, it turns out.

Comment 9 David Gibson 2019-12-06 01:33:58 UTC
Mirek,

Alas we're going to have to rebase SLOF again for this :/.  I thought we were done, but I was wrong.

Comment 10 Miroslav Rezanina 2019-12-06 04:41:24 UTC
(In reply to David Gibson from comment #9)
> Mirek,
> 
> Alas we're going to have to rebase SLOF again for this :/.  I thought we
> were done, but I was wrong.

As discussed online, we can go with backporting. With no downstream changes, it's just cherry-picking all relevant fixes and their prerequisites.
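
In practice that boils down to something like the following in the downstream SLOF tree (the commit id is a placeholder; the exact fix and prerequisite commits are not listed in this bug):

git cherry-pick -x <upstream-fix-sha>    # repeat for each prerequisite and for the fix itself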

Comment 11 David Gibson 2019-12-11 01:52:22 UTC
I'm attempting to make a downstream backport of this fix, but I'm getting an error:

$ make -C redhat rh-srpm
make: Entering directory '/home/dwg/src/SLOF/redhat'
--2019-12-11 12:50:38--  https://github.com/aik/SLOF/archive/qemu-slof-20191022.tar.gz
Resolving github.com (github.com)... 13.236.229.21
Connecting to github.com (github.com)|13.236.229.21|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/aik/SLOF/tar.gz/qemu-slof-20191022 [following]
--2019-12-11 12:50:39--  https://codeload.github.com/aik/SLOF/tar.gz/qemu-slof-20191022
Resolving codeload.github.com (codeload.github.com)... 3.105.64.153
Connecting to codeload.github.com (codeload.github.com)|3.105.64.153|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-gzip]
Saving to: ‘qemu-slof-20191022.tar.gz’

qemu-slof-20191022.tar.gz     [  <=>                                 ] 832.97K  3.31MB/s    in 0.2s    

2019-12-11 12:50:39 (3.31 MB/s) - ‘qemu-slof-20191022.tar.gz’ saved [852958]

qemu-slof-20191022.tar.gz sha256sum does not match (expected: fd5546947c40d2cbf8e1e4afd1354780635c94c59f570dd47aa48ad603a4d20b)
make: *** [Makefile:51: rh-prep] Error 1
make: Leaving directory '/home/dwg/src/SLOF/redhat'
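
For reference, the actual checksum of the downloaded tarball can be compared against the expected value printed in the error with:

sha256sum qemu-slof-20191022.tar.gz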

Mirek, any ideas?

Comment 17 Gu Nini 2020-01-16 09:54:42 UTC
Created attachment 1652696 [details]
serial_log_when_used_pci_bridge_as_system_disk-01162020

I have tried to verify the bug with the newest qemu/SLOF versions, but unfortunately there are still problems when starting the guest with devices under a pci-bridge:

1) When the guest is started with a virtio-scsi or virtio-blk-pci disk under a pci-bridge as the system disk, it fails to boot up and enters emergency mode, as shown in the attached 'serial_log_when_used_pci_bridge_as_system_disk-01162020'.
2) When the guest is started with a virtio-balloon device under a pci-bridge, ballooning the memory via QMP fails (see the example after this list).
3) When the guest is started with a virtio-net-pci device under a pci-bridge, the virtual network card cannot be found inside the guest after it boots up.
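
For item 2, a minimal way to exercise ballooning over the guest's QMP socket (socket path and target value are illustrative, following the command line in the description):

# nc -U /var/tmp/avocado_1
{"execute": "qmp_capabilities"}
{"execute": "balloon", "arguments": {"value": 4294967296}}
{"execute": "query-balloon"}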

Software versions used:
Host kernel: 4.18.0-169.el8.ppc64le
qemu-kvm-4.2.0-6.module+el8.2.0+5453+31b2b136.ppc64le
SLOF-20191022-3.git899d9883.module+el8.2.0+5449+efc036dd.noarch

So let me reopen the bug. Please feel free to contact me if anything is unclear.

Comment 19 David Gibson 2020-02-18 04:38:35 UTC
Ok, I've re-investigated this.  It looks like there's a second, independent problem preventing devices from working under bridges (at least virtio devices).

It turns out not to be a SLOF problem at all, but a problem with qemu commit 6c3829a2 "spapr_pci: Advertise BAR reallocation capability".

Comment 20 David Gibson 2020-02-18 05:04:53 UTC
I've filed bug 1804038 to track the new problem.  Moving this back to the MODIFIED state, since the necessary SLOF change is included, but adding a dependency since it can't be (easily) tested until bug 1804038 is fixed.

Comment 21 Gu Nini 2020-02-25 11:09:41 UTC
According to https://bugzilla.redhat.com/show_bug.cgi?id=1804038#c13, the bug is verified, so I am changing the status.

Comment 23 errata-xmlrpc 2020-05-05 09:51:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2017

