Bug 1479674 - Start vm after remove some vPHBs will fail at first try
Summary: Start vm after remove some vPHBs will fail at first try
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.4-Alt
Hardware: ppc64le
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 7.4-Alt
Assignee: Andrea Bolognani
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 1440030
Reported: 2017-08-09 07:20 UTC by Wayne Sun
Modified: 2017-09-21 09:28 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-21 09:28:26 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
IBM Linux Technology Center | 157817 | 0 | None | None | None | 2017-08-18 16:02:35 UTC

Description Wayne Sun 2017-08-09 07:20:05 UTC
Description of problem:
Starting a VM with multiple vPHBs fails on the first try after some of the vPHBs have been removed by editing the domain XML.

Version-Release number of selected component (if applicable):
# rpm -q libvirt qemu-kvm kernel
libvirt-3.2.0-18.el7a.ppc64le
qemu-kvm-2.9.0-20.el7a.ppc64le
kernel-4.11.0-19.el7a.ppc64le

How reproducible:
always

Steps to Reproduce:
1. Define a guest and add a pci-bridge on a non-zero PCI bus:
...
    <controller type='pci' index='3' model='pci-bridge'>
      <model name='pci-bridge'/>
      <target chassisNr='3'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x07' function='0x0'/>
    </controller>
...

2. Start the VM.
The VM starts, and libvirt automatically adds vPHBs 1 and 2, with the pci-bridge attached to the bus of vPHB 2:
... 
    <controller type='pci' index='0' model='pci-root'>
      <model name='spapr-pci-host-bridge'/>
      <target index='0'/>
    </controller>
    <controller type='pci' index='1' model='pci-root'>
      <model name='spapr-pci-host-bridge'/>
      <target index='1'/>
    </controller>
    <controller type='pci' index='2' model='pci-root'>
      <model name='spapr-pci-host-bridge'/>
      <target index='2'/>
    </controller>
    <controller type='pci' index='3' model='pci-bridge'>
      <model name='pci-bridge'/>
      <target chassisNr='3'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x07' function='0x0'/>
    </controller>
...

3. Destroy the VM, then edit it to remove the vPHBs with index 1 and 2:
# virsh edit vm2
Domain vm2 XML configuration edited.

# virsh dumpxml vm2  
...
    <controller type='pci' index='0' model='pci-root'>
      <model name='spapr-pci-host-bridge'/>
      <target index='0'/>
    </controller>
    <controller type='pci' index='3' model='pci-bridge'>
      <model name='pci-bridge'/>
      <target chassisNr='3'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x07' function='0x0'/>
    </controller>
    <controller type='pci' index='1' model='pci-root'>
      <model name='spapr-pci-host-bridge'/>
      <target index='1'/>
    </controller>
    <controller type='pci' index='2' model='pci-root'>
      <model name='spapr-pci-host-bridge'/>
      <target index='2'/>
    </controller>
...

The XML is automatically updated with vPHBs 1 and 2 added back. The difference is that the pci-bridge now appears ahead of vPHBs 1 and 2.

3. start vm2
# virsh start vm2                                                               
error: Failed to start domain vm2
error: internal error: qemu unexpectedly closed the monitor: 2017-08-09T10:52:30.963370Z qemu-kvm: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-09T10:52:30.994513Z qemu-kvm: -device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.2.0,addr=0x7: Bus 'pci.2.0' not found

In the QEMU log for the VM:
2017-08-09 10:52:30.520+0000: starting up libvirt version: 3.2.0, package: 18.el7a (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2017-08-03-07:52:53, ppc-059.build.eng.bos.redhat.com), qemu version: 2.9.0(qemu-kvm-2.9.0-20.el7a), hostname: c155f1-u31
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=vm2,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-6-vm2/master-key.aes -machine pseries-rhel7.4.0alt,accel=kvm,usb=off,dump-guest-core=off -m size=1048576k,slots=16,maxmem=2621440k -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 38c3f179-fa77-473e-98fd-04e1f17c2ad7 -display none -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-6-vm2/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.2.0,addr=0x7 -device spapr-pci-host-bridge,index=1,id=pci.1 -device spapr-pci-host-bridge,index=2,id=pci.2 -device qemu-xhci,id=usb,bus=pci.0,addr=0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x3 -drive file=/var/lib/avocado/data/avocado-vt/images/jeos-25-64-clone.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=26 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:e5:e5:ef,bus=pci.0,addr=0x1 -chardev pty,id=charserial0 -device spapr-vty,chardev=charserial0,reg=0x30000000 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-6-vm2/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -device usb-kbd,id=input0,bus=usb.0,port=1 -device usb-mouse,id=input1,bus=usb.0,port=2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
2017-08-09T10:52:30.963370Z qemu-kvm: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-09T10:52:30.994513Z qemu-kvm: -device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.2.0,addr=0x7: Bus 'pci.2.0' not found
2017-08-09 10:52:31.262+0000: shutting down, reason=failed

4. start vm again:
# virsh start vm2
Domain vm2 started

# ps aux|grep qemu
qemu     12405  4.7  2.0 1332544 648832 ?      SLl  06:52   0:41 /usr/libexec/qemu-kvm -name guest=vm2,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-7-vm2/master-key.aes -machine pseries-rhel7.4.0alt,accel=kvm,usb=off,dump-guest-core=off -m size=1048576k,slots=16,maxmem=2621440k -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 38c3f179-fa77-473e-98fd-04e1f17c2ad7 -display none -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-7-vm2/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device spapr-pci-host-bridge,index=1,id=pci.1 -device spapr-pci-host-bridge,index=2,id=pci.2 -device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.2.0,addr=0x7 -device qemu-xhci,id=usb,bus=pci.0,addr=0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x3 -drive file=/var/lib/avocado/data/avocado-vt/images/jeos-25-64-clone.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=26 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:e5:e5:ef,bus=pci.0,addr=0x1 -chardev pty,id=charserial0 -device spapr-vty,chardev=charserial0,reg=0x30000000 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-7-vm2/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -device usb-kbd,id=input0,bus=usb.0,port=1 -device usb-mouse,id=input1,bus=usb.0,port=2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on


# virsh dumpxml vm2
...
    <controller type='pci' index='0' model='pci-root'>
      <model name='spapr-pci-host-bridge'/>
      <target index='0'/>
      <alias name='pci.0'/>
    </controller>
    <controller type='pci' index='1' model='pci-root'>
      <model name='spapr-pci-host-bridge'/>
      <target index='1'/>
      <alias name='pci.1'/>
    </controller>
    <controller type='pci' index='2' model='pci-root'>
      <model name='spapr-pci-host-bridge'/>
      <target index='2'/>
      <alias name='pci.2'/>
    </controller>
    <controller type='pci' index='3' model='pci-bridge'>
      <model name='pci-bridge'/>
      <target chassisNr='3'/>
      <alias name='pci.3'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x07' function='0x0'/>
    </controller>
...

The XML is now automatically updated with the controllers in the right order.

Actual results:
After the vPHBs are removed by editing, the VM fails to start on the first try and succeeds on the second.

Expected results:
The VM starts on the first try.

Additional info:

Comment 2 David Gibson 2017-08-21 04:16:46 UTC
AFAICT what's happening here is that when you run the VM with the extra vPHBs, libvirt is assigning some devices on the second vPHB, and updating the XML to give those devices explicit addresses on the second vPHB.  When it goes away, libvirt obviously can't place those devices on the second vPHB any more, hence the failure.

Comment 3 Andrea Bolognani 2017-08-21 08:12:19 UTC
(In reply to David Gibson from comment #2)
> AFAICT what's happening here is that when you run the VM with the extra
> vPHBs, libvirt is assigning some devices on the second vPHB, and updating
> the XML to give those devices explicit addresses on the second vPHB.  When
> it goes away, libvirt obviously can't place those devices on the second vPHB
> any more, hence the failure.

That's not quite what happens: as you can see (step 3 in the
description) the PHBs get re-added automatically, but for
some reason they end up after the devices rather than before
them, and QEMU can't handle having device and controller
specified in that order.

I'll look into making it so the PHBs get re-added before the
devices using them.
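
The ordering condition described in this comment can also be checked directly on the domain XML. The following Python sketch is illustrative only and is not libvirt code: the function name misordered_pci_controllers is hypothetical, and wrapping the controllers in a <devices> element is an assumption made so that the fragment from step 3 of the description parses on its own. It reports every PCI controller whose address refers to a bus whose controller has not yet appeared in document order.

```python
import xml.etree.ElementTree as ET


def misordered_pci_controllers(devices_xml):
    """Return (controller_index, bus) pairs where the bus's controller appears later."""
    root = ET.fromstring(devices_xml)
    seen = set()
    problems = []
    for ctrl in root.iter("controller"):
        if ctrl.get("type") != "pci":
            continue
        index = int(ctrl.get("index"))
        addr = ctrl.find("address")
        if addr is not None and addr.get("type") == "pci":
            bus = int(addr.get("bus"), 16)  # bus is written as hex, e.g. '0x02'
            if bus not in seen:
                problems.append((index, bus))
        seen.add(index)
    return problems


# Controller order taken from "virsh dumpxml vm2" in step 3 of the description.
EDITED = """
<devices>
  <controller type='pci' index='0' model='pci-root'>
    <model name='spapr-pci-host-bridge'/>
    <target index='0'/>
  </controller>
  <controller type='pci' index='3' model='pci-bridge'>
    <model name='pci-bridge'/>
    <target chassisNr='3'/>
    <address type='pci' domain='0x0000' bus='0x02' slot='0x07' function='0x0'/>
  </controller>
  <controller type='pci' index='1' model='pci-root'>
    <model name='spapr-pci-host-bridge'/>
    <target index='1'/>
  </controller>
  <controller type='pci' index='2' model='pci-root'>
    <model name='spapr-pci-host-bridge'/>
    <target index='2'/>
  </controller>
</devices>
"""

print(misordered_pci_controllers(EDITED))
```

For the XML from step 3, the pci-bridge (index 3) is reported because it sits on bus 2 while the controller with index 2 appears after it.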

Comment 6 Andrea Bolognani 2017-09-05 13:26:13 UTC
Fix posted upstream.

  https://www.redhat.com/archives/libvir-list/2017-September/msg00084.html

Comment 7 Andrea Bolognani 2017-09-07 17:00:51 UTC
v2 patches posted upstream.

  https://www.redhat.com/archives/libvir-list/2017-September/msg00168.html

Comment 8 David Gibson 2017-09-14 00:55:50 UTC
Andrea, any update on getting this upstream and downstream?

Comment 9 Andrea Bolognani 2017-09-21 09:28:26 UTC
(In reply to David Gibson from comment #8)
> Andrea, any update on getting this upstream and downstream?

Sorry for taking so long to reply.

It turns out that reordering controllers can break migration[1],
so given that the problem scenario described above was kinda
convoluted to begin with, I think it's better to avoid further
breakage and close the bug as CANTFIX. Doing so now.


[1] https://www.redhat.com/archives/libvir-list/2017-September/msg00734.html

