Bug 1459755 - [ppc64le] Guest fails to boot up if attach usb-storage device to the second pci-bridge
Status: ON_QA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: SLOF
Version: 7.4
Hardware: ppc64le Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 7.5
Assigned To: Thomas Huth
QA Contact: Qunfang Zhang
Depends On:
Blocks:
Reported: 2017-06-08 02:09 EDT by yilzhang
Modified: 2017-10-09 02:49 EDT
CC List: 8 users

See Also:
Fixed In Version: SLOF-20170724-1.git89f519f.el7
Type: Bug


Attachments
Guest's booting result (30.42 KB, image/png)
2017-06-09 02:20 EDT, yilzhang

Description yilzhang 2017-06-08 02:09:41 EDT
Description of problem:
The guest cannot boot if a usb-storage device is attached to the second pci-bridge, but there is no problem if the usb-storage device is attached to the first pci-bridge.

Version-Release number of selected component (if applicable):
Host kernel: kernel-3.10.0-677.el7
qemu: qemu-kvm-rhev-2.9.0-7.el7
SLOF-20170303-4.git66d250e.el7

How reproducible: 100%


Steps to Reproduce:
1. Boot up guest with the following command:
/usr/libexec/qemu-kvm \
 -smp 8,sockets=2,cores=2,threads=2 \
 -m 8192 \
 -serial unix:/tmp/ttyS0,server,nowait \
 -rtc base=localtime,clock=host \
 -vga std -nodefaults \
 -boot menu=on \
 -monitor stdio \
 -vnc 0:90 \
 -qmp tcp:0:9990,server,nowait \
 \
 -device virtio-scsi-pci,id=scsi0 \
 -drive file=rhel6.9,format=raw,id=drive_sysdisk,if=none,cache=none,aio=native,werror=report,rerror=report \
 -device scsi-hd,drive=drive_sysdisk,bus=scsi0.0,id=sysdisk,bootindex=0 \
 \
 -device pci-bridge,bus=pci.0,id=bridge1,chassis_nr=1 \
 -drive file=image/storage0,if=none,id=drive_blk,format=raw,cache=none,werror=report,rerror=report \
 -device virtio-blk-pci,drive=drive_blk,id=device_blk,bus=bridge1,addr=0x1 \
 \
 -device pci-bridge,bus=pci.0,id=bridge2,chassis_nr=2 \
 -device virtio-scsi-pci,id=scsi1,bus=bridge2,addr=0x2 \
 -drive file=image/storage1,if=none,id=drive_scsi,format=raw,cache=none,werror=report,rerror=report \
 -device scsi-hd,drive=drive_scsi,id=device_scsi,bus=scsi1.0 \
 \
 -device pci-bridge,bus=pci.0,id=bridge3,chassis_nr=3 \
 -device nec-usb-xhci,id=xhci1,bus=bridge2,addr=0x3 \
 -drive file=image/storage2,if=none,id=drive_usb,format=raw,cache=none,werror=report,rerror=report \
 -device usb-storage,drive=drive_usb,id=device_usb,bus=xhci1.0

2. Check the guest's graphical console over VNC.



Actual results:
The guest fails to boot.

Expected results:
The guest should boot and show its boot process.

Additional info:
Comment 2 yilzhang 2017-06-09 02:20 EDT
Created attachment 1286305
Guest's booting result
Comment 3 David Gibson 2017-06-11 21:58:36 EDT
Looks like a SLOF bug, assigning to Thomas.
Comment 4 David Gibson 2017-06-11 22:40:19 EDT
AIUI RHV doesn't use PCI bridges on Power anyway, and this is not a regression.

Therefore, punting to 7.5.
Comment 5 Thomas Huth 2017-06-12 04:01:25 EDT
That's a weird bug ... apparently SLOF does not like the slot number in the device tree path in this case:

0 > devalias usb0
usb0 : /pci@800000020000000/pci@3/usb@3
0 > dev usb0
   No such device path
0 > dev /pci@800000020000000/pci@3/usb@3
   No such device path
0 > dev /pci@800000020000000/pci@3/usb
   ok
0 > pwd
  /pci@800000020000000/pci@3/usb@3 ok

Not sure how this can happen at all ... I'll have a closer look...
Comment 6 Thomas Huth 2017-06-12 05:54:29 EDT
The issue is caused by a bad setting of the pci-bus-number variable, which confuses the "decode-unit" function so that the device tree node cannot be entered when the unit address (like "@3" in the example above) has been specified. I've suggested a fix upstream here:
 https://lists.ozlabs.org/pipermail/slof/2017-June/001598.html
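
To make the term "unit address" concrete, here is a minimal shell sketch that splits a device path like the one from comment 5 into node names and unit addresses. It only illustrates the path syntax; it is not SLOF's "decode-unit" and does not model the pci-bus-number problem. The function name split_dev_path is made up for this example.

```shell
#!/bin/sh
# Illustration only: split an Open Firmware style device path into
# "node-name unit-address" pairs, one per line.
# This is NOT SLOF's decode-unit; split_dev_path is a hypothetical helper.
split_dev_path() {
    # Drop the leading "/" and walk the "/"-separated components.
    path=${1#/}
    old_ifs=$IFS
    IFS=/
    for comp in $path; do
        case $comp in
            # Component carries a unit address after "@".
            *@*) printf '%s %s\n' "${comp%%@*}" "${comp#*@}" ;;
            # No unit address given; print "-" as a placeholder.
            *)   printf '%s -\n' "$comp" ;;
        esac
    done
    IFS=$old_ifs
}

split_dev_path /pci@800000020000000/pci@3/usb@3
```

The last component of the failing path is the node "usb" with unit address "3"; the path that worked at the SLOF prompt in comment 5 is the same one with that unit address left out.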
Comment 7 yilzhang 2017-08-03 05:34:51 EDT
Hi Thomas,

Do we support attaching the guest image to the second pci-bridge (not nested)?

I find that the guest cannot boot if its image is attached to the second pci-bridge. The qemu command line may be:
/usr/libexec/qemu-kvm \
 -smp 8,sockets=2,cores=4,threads=1 -m 8192 \
 -serial unix:/tmp/myserial.log,server,nowait \
 -nodefaults \
 -rtc base=localtime,clock=host \
 -boot menu=on \
 -monitor stdio \
 -vnc :18 \
 -qmp tcp:0:9999,server,nowait \
 -device pci-bridge,id=bridge1,chassis_nr=1,bus=pci.0 \
 -device pci-bridge,id=bridge2,chassis_nr=2,bus=pci.0,addr=0x2 \
 -drive file=rhel.qcow2,if=none,id=drive_sysdisk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop \
 -device virtio-blk-pci,drive=drive_sysdisk,bus=bridge2,addr=0x3,id=sysdisk,bootindex=0



Console output is as follows:
No NVRAM common partition, re-initializing...
Scanning USB 
Using default console: /vdevice/vty@71000000
     
  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php


Trying to load:  from: /pci@800000020000000/pci-bridge@2/scsi@3 ... 
E3405: No such device
Trying to load:  from: /pci@800000020000000/pci@2/scsi@3 ... 
E3405: No such device
Trying to load:  from: cdrom ... 
E3405: No such device
Trying to load:  from: net ... 
E3405: No such device

E3407: Load failed

  Type 'boot' and press return to continue booting the system.
  Type 'reset-all' and press return to reboot the system.



Note:
Both Power8 with qemu-kvm-rhev-2.9.0-14.el7.ppc64le and Power9 with qemu-kvm-2.9.0-19.el7a.ppc64le have this issue.
x86 does not have this issue: on the x86 platform, the guest boots.
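
When comparing several configurations, it may help to pull the relevant lines out of a saved console transcript. The following is a rough sketch that assumes the console text has been captured somehow and is fed in on stdin, and that the messages look like the ones quoted above. The function names are made up for this example.

```shell
#!/bin/sh
# Sketch only: helpers for reading a captured SLOF console transcript on stdin.
# The message formats are assumed to match the console output quoted above.

# Print each device SLOF tried to boot from, one per line.
slof_tried_devices() {
    sed -n 's/^Trying to load:.* from: \(.*\) \.\.\..*$/\1/p'
}

# Succeed if the transcript contains the final "Load failed" error.
slof_load_failed() {
    grep -q 'E3407: Load failed'
}
```

For example, with the transcript saved in a file (console.txt is a placeholder name), `slof_tried_devices < console.txt` lists the boot candidates, and `slof_load_failed < console.txt && echo "no bootable device found"` flags the failure.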
Comment 8 Thomas Huth 2017-08-03 06:46:46 EDT
Hi Yilin,
since this BZ has been moved to RHEL 7.5, this has not been fixed in the SLOF from RHEL 7.4 yet. But the patch has already been included upstream:

https://github.com/aik/SLOF/commit/62674aabe20612a9786fa03e87cf6916ba97a99a

... so we'll get the fix with the next rebase.
Comment 9 yilzhang 2017-08-03 21:01:22 EDT
Hi Thomas,

So you mean the failure I just reported has the same root cause as this bug, right? If so, thank you for the detailed explanation.
Comment 10 Thomas Huth 2017-08-04 02:22:00 EDT
Yes, it's the same root cause. Your example from comment 7 works for me when I use the upstream SLOF version, and I just double-checked that it is commit 62674aabe20612a9786fa03e87cf6916ba97a99a which fixes this issue there. So this will be fixed in RHEL 7.5, too.
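
For anyone checking a host, a minimal sketch of comparing the installed SLOF package against the "Fixed In Version" of this bug (SLOF-20170724-1.git89f519f.el7) follows. It assumes the package is named SLOF and that its VERSION field is the date-like number seen in the package names above. It compares only that field, not the release or the git hash. The function name slof_has_fix is made up for this example.

```shell
#!/bin/sh
# Sketch only: compare a SLOF package VERSION field (e.g. 20170724) with the
# version named in this bug's "Fixed In Version" field.
# Assumption: VERSION is a plain date-like integer, as in SLOF-20170303-4.
FIXED_SLOF_VERSION=20170724

# slof_has_fix VERSION: succeed if VERSION is at or after the fixed version.
slof_has_fix() {
    [ "$1" -ge "$FIXED_SLOF_VERSION" ] 2>/dev/null
}

# Only query rpm when it exists and the SLOF package is installed.
if command -v rpm >/dev/null 2>&1 && rpm -q SLOF >/dev/null 2>&1; then
    installed=$(rpm -q --queryformat '%{VERSION}' SLOF)
    if slof_has_fix "$installed"; then
        echo "SLOF $installed: at or after the fixed version"
    else
        echo "SLOF $installed: older than the fixed version"
    fi
fi
```

With the versions from this bug, `slof_has_fix 20170303` fails and `slof_has_fix 20170724` succeeds.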
