Bug 1434706

Summary: [pci-bridge] Hotplug devices to pci-bridge failed
Product: Red Hat Enterprise Linux 7
Reporter: yduan
Component: qemu-kvm-rhev
Assignee: Marcel Apfelbaum <marcel>
Status: CLOSED ERRATA
QA Contact: yduan
Severity: high
Docs Contact:
Priority: high
Version: 7.4
CC: ailan, chayang, jinzhao, juzhang, knoel, laine, qzhang, virt-maint, xfu, yduan
Target Milestone: rc
Keywords: Regression
Target Release: ---
Hardware: x86_64
OS: Linux
Fixed In Version: qemu-kvm-rhev-2.9.0-7.el7
Last Closed: 2017-08-02 03:39:56 UTC
Type: Bug

Description yduan 2017-03-22 08:29:03 UTC
Description of problem:
Hotplugging devices to a pci-bridge fails.

Version-Release number of selected component (if applicable):
Host:
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.9.0-0.el7.mrezanin201703210848.x86_64
# uname -r
3.10.0-623.el7.x86_64
Guest:
3.10.0-623.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with a pci-bridge:
/usr/libexec/qemu-kvm \
 -machine pc \
 ...
 -device pci-bridge,id=bridge1,chassis_nr=1 \
 ...

2. Hotplug a virtio-blk-pci device under the pci-bridge:
{"execute":"__com.redhat_drive_add", "arguments":{"file":"/home/rhel7.4test/images/stg1.qcow2", "format":"qcow2", "id":"drive_blk"}}
{"execute":"device_add", "arguments":{"driver":"virtio-blk-pci", "drive":"drive_blk", "id":"device_blk", "bus":"bridge1", "addr":1}}

3. Check for the newly added device inside the guest:
# fdisk -l
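
The two QMP commands in step 2 can also be sent from a script. The following is a minimal sketch rather than the reporter's actual tooling: it assumes QEMU was started with a QMP UNIX socket, and the function names `build_hotplug_commands` and `send_qmp` and any socket path passed to them are hypothetical. The command names and arguments are taken from step 2.

```python
import json
import socket


def build_hotplug_commands(image, drive_id, device_id, bus, addr):
    """Return the two QMP commands from step 2 as Python dicts."""
    drive_add = {
        "execute": "__com.redhat_drive_add",
        "arguments": {"file": image, "format": "qcow2", "id": drive_id},
    }
    device_add = {
        "execute": "device_add",
        "arguments": {
            "driver": "virtio-blk-pci",
            "drive": drive_id,
            "id": device_id,
            "bus": bus,
            "addr": addr,
        },
    }
    return [drive_add, device_add]


def send_qmp(sock_path, commands):
    """Send commands over a QMP UNIX socket and return the replies.

    sock_path is an assumption: it must match whatever socket the
    guest's QMP monitor was configured with.
    """
    replies = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        stream = sock.makefile("rw")
        json.loads(stream.readline())  # consume the QMP greeting
        # QMP requires capability negotiation before other commands.
        for cmd in [{"execute": "qmp_capabilities"}] + commands:
            stream.write(json.dumps(cmd) + "\n")
            stream.flush()
            # Skip asynchronous events; keep the return/error reply.
            while True:
                reply = json.loads(stream.readline())
                if "return" in reply or "error" in reply:
                    replies.append(reply)
                    break
    return replies
```

Calling `build_hotplug_commands("/home/rhel7.4test/images/stg1.qcow2", "drive_blk", "device_blk", "bridge1", 1)` reproduces the two commands shown in step 2; passing the result to `send_qmp` requires a running guest.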

Actual results:
Hotplugged block device doesn't show up.

Expected results:
Hotplugged block device should show up.

Additional info:
1. Cannot be reproduced with qemu-kvm-rhev-2.8.0-6.el7.x86_64.

2. A hotplugged virtio-net-pci device does not show up either.

3. The hotplugged virtio-blk-pci device shows up after a guest reboot.

4. 'info pci' output:
Before hotplug:
(qemu) info block
drive_sysdisk (#block172): images/rhel74-64-virtio.qcow2 (qcow2)
    Cache mode:       writeback, direct
(qemu) info pci
...
  Bus  0, device   6, function 0:
    PCI bridge: PCI device 1b36:0001
      BUS 0.
      secondary bus 1.
      subordinate bus 1.
      IO range [0xf000, 0x0fff]
      memory range [0xfff00000, 0x000fffff]
      prefetchable memory range [0xfff00000, 0x000fffff]
      id "bridge1"
...

After hotplug:
(qemu) info block
drive_sysdisk (#block172): images/rhel74-64-virtio.qcow2 (qcow2)
    Cache mode:       writeback, direct

drive_blk (#block321): /home/rhel7.4test/images/stg1.qcow2 (qcow2)
    Cache mode:       writeback
(qemu) info pci
...
  Bus  0, device   6, function 0:
    PCI bridge: PCI device 1b36:0001
      BUS 0.
      secondary bus 1.
      subordinate bus 1.
      IO range [0xf000, 0x0fff]
      memory range [0xfff00000, 0x000fffff]
      prefetchable memory range [0xfff00000, 0x000fffff]
      id "bridge1"
  Bus  1, device   1, function 0:
    SCSI controller: PCI device 1af4:1001
      IRQ 0.
      BAR0: I/O at 0xffffffffffffffff [0x003e].
      BAR1: 32 bit memory at 0xffffffffffffffff [0x00000ffe].
      BAR4: 64 bit prefetchable memory at 0xffffffffffffffff [0x00003ffe].
      id "device_blk"
...

After guest reboot:
  Bus  0, device   6, function 0:
    PCI bridge: PCI device 1b36:0001
      BUS 0.
      secondary bus 1.
      subordinate bus 1.
      IO range [0xc000, 0xcfff]
      memory range [0xfc000000, 0xfc1fffff]
      prefetchable memory range [0xfe800000, 0xfe9fffff]
      id "bridge1"
  Bus  1, device   1, function 0:
    SCSI controller: PCI device 1af4:1001
      IRQ 11.
      BAR0: I/O at 0xc000 [0xc03f].
      BAR1: 32 bit memory at 0xfc000000 [0xfc000fff].
      BAR4: 64 bit prefetchable memory at 0xfe800000 [0xfe803fff].
      id "device_blk"
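
The difference between the "After hotplug" and "After guest reboot" listings is that every BAR of the hotplugged device reads 0xffffffffffffffff until the reboot. A small sketch for spotting this in captured `info pci` text follows. It assumes that an all-ones address means the BAR was never programmed by the guest, and the helper name `unassigned_bars` is hypothetical.

```python
import re

# Assumption: an all-ones address marks a BAR that has no mapping yet.
UNASSIGNED = 0xFFFFFFFFFFFFFFFF

HEADER_RE = re.compile(r"Bus\s+(\d+), device\s+(\d+), function\s+(\d+):")
BAR_RE = re.compile(r"BAR(\d+): .* at (0x[0-9a-fA-F]+) \[")


def unassigned_bars(info_pci_text):
    """Map (bus, device, function) to the list of BAR indexes whose
    address is all ones in the given 'info pci' output."""
    result = {}
    current = None
    for line in info_pci_text.splitlines():
        header = HEADER_RE.search(line)
        if header:
            current = tuple(int(g) for g in header.groups())
            continue
        bar = BAR_RE.search(line)
        if bar and current is not None:
            if int(bar.group(2), 16) == UNASSIGNED:
                result.setdefault(current, []).append(int(bar.group(1)))
    return result
```

Applied to the "After hotplug" listing, this reports BARs 0, 1 and 4 of bus 1, device 1, function 0; applied to the "After guest reboot" listing, it reports nothing.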

Comment 4 Qunfang Zhang 2017-03-23 02:44:27 UTC
This seems similar to the following bug:

Bug 1432891 - The device hot pluged to pci-bridge doesn't show up

Comment 5 yduan 2017-03-23 03:04:46 UTC
(In reply to Qunfang Zhang from comment #4)
> Seems a similar bug with the following one: 
> 
> Bug 1432891 - The device hot pluged to pci-bridge doesn't show up

I think there are some differences.

1.Bug 1432891 - The device hot pluged to pci-bridge doesn't show up
  "Actual results:The device hotplugged didn't show up in the guest or "info qtree" in hmp"
  Bug 1434706 - [pci-bridge] Hotplug devices to pci-bridge failed
  The hotplugged device is missing only inside the guest; it is visible in QEMU (e.g., 'info qtree', 'info pci', 'info block', 'info network').

2. More importantly, the affected versions differ:
  Bug 1432891 - The device hot pluged to pci-bridge doesn't show up
  qemu-kvm-rhev-2.8.0-5.el7
  Bug 1434706 - [pci-bridge] Hotplug devices to pci-bridge failed
  Reproducible with "qemu-kvm-rhev-2.9.0-0.el7.mrezanin201703210848.x86_64" but not "qemu-kvm-rhev-2.8.0-6.el7.x86_64".

Comment 6 Marcel Apfelbaum 2017-05-08 09:25:09 UTC
Hi,

I found the root cause, but I am not sure we should do anything about it.

QEMU 2.9 introduced a new feature for PCI bridges: slot 0 is now a hot-pluggable slot. This is done by disabling the PCI bridge's built-in SHPC controller, which is not currently used by any architecture.

The side effect of not having the controller is that the PCI bridge has no devices attached to it at boot time (previously the SHPC occupied slot 0).
The PCI-bridge spec dictates that if no device behind a PCI bridge requests IO/MEM, the software should not allocate resources for the bridge, so the hotplug fails.
Conclusion: an empty bridge becomes unusable. (By the way, running the PCI bridge with shpc=on works as before.)
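
The missing allocation is visible in the 'info pci' output from the description: before the reboot, each bridge range has a base above its limit (for example IO range [0xf000, 0x0fff]), which is how a PCI-to-PCI bridge encodes a disabled forwarding window. The sketch below checks this on captured text; the helper name `bridge_windows` is hypothetical, and it assumes the text describes a single bridge.

```python
import re

# Longest alternative first, so "prefetchable memory" is not read as "memory".
WINDOW_RE = re.compile(
    r"\s*(prefetchable memory|memory|IO) range "
    r"\[(0x[0-9a-fA-F]+), (0x[0-9a-fA-F]+)\]"
)


def bridge_windows(info_pci_text):
    """Map each window name to (base, limit, enabled).

    A window forwards transactions only when base <= limit; a base
    above the limit means the window is disabled.
    """
    windows = {}
    for line in info_pci_text.splitlines():
        match = WINDOW_RE.match(line)
        if match:
            base = int(match.group(2), 16)
            limit = int(match.group(3), 16)
            windows[match.group(1)] = (base, limit, base <= limit)
    return windows
```

With the "Before hotplug" listing, all three windows come back as disabled; with the "After guest reboot" listing, all three are enabled, which matches the device becoming usable only after the firmware re-enumerates the bus.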

I am not sure if we should do something about it. It follows the spec, even if it is different from what worked until now.

If libvirt never allocates an empty bridge, we are OK.
Laine, can you please tell us whether libvirt allows creation of empty bridges?

Thanks,
Marcel

Comment 7 Marcel Apfelbaum 2017-05-08 11:44:17 UTC
*** Bug 1442371 has been marked as a duplicate of this bug. ***

Comment 8 Laine Stump 2017-05-08 15:38:01 UTC
Yes, libvirt of course will create an empty bridge if a user explicitly requests it. This would be done if someone planned to hotplug "many" devices.

I don't think it's a good idea to prevent such a configuration.

(I still don't see why everyone is so uptight about making slot 0 usable for regular devices - sure it's giving a theoretical 1/32 improvement in PCI density, but libvirt will never use slot 0 of a pci-bridge anyway (for historical / backward compatibility reasons) so in practice we're gaining nothing.)

Comment 9 Marcel Apfelbaum 2017-05-10 14:08:03 UTC
Revert the commit and leave the SHPC controller in place even if it is not
actually used by any architecture. Slot 0 remains unusable at boot time.

Patch posted upstream:
http://patchwork.ozlabs.org/patch/760515/

Comment 11 Marcel Apfelbaum 2017-05-16 08:00:36 UTC
*** Bug 1440584 has been marked as a duplicate of this bug. ***

Comment 12 Miroslav Rezanina 2017-05-30 15:03:04 UTC
Fix included in qemu-kvm-rhev-2.9.0-7.el7

Comment 13 yduan 2017-06-05 02:36:28 UTC
Version-Release number of selected component:
Host: 3.10.0-677.el7.x86_64
Guest: 3.10.0-677.el7.x86_64

  This problem can be reproduced with qemu-kvm-rhev-2.9.0-1.el7.x86_64 but not with qemu-kvm-rhev-2.9.0-7.el7.x86_64. The steps are the same as in Comment 0.

Comment 14 yduan 2017-06-05 02:37:23 UTC
Version-Release number of selected component:
Host: 3.10.0-677.el7.x86_64
Guest: 3.10.0-677.el7.x86_64

  This problem can be reproduced with qemu-kvm-rhev-2.9.0-1.el7.x86_64 but not with qemu-kvm-rhev-2.9.0-7.el7.x86_64. The steps are the same as in Comment 0.

Thanks,
yduan

Comment 18 errata-xmlrpc 2017-08-02 03:39:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392