Bug 1001881

Summary: Guest fails to start after cold-plugging more than one VF from a pool of SR-IOV VFs
Product: Red Hat Enterprise Linux 6    Reporter: hongming <honzhang>
Component: libvirt    Assignee: Laine Stump <laine>
Status: CLOSED ERRATA    QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium    Docs Contact:
Priority: medium
Version: 6.5    CC: acathrow, anande, dallan, dkelson, dyuan, honzhang, jdenemar, mzhan, tlavigne, xuzhang
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-0.10.2-29.el6 Doc Type: Bug Fix
Doc Text:
Cause: libvirt erroneously attempted to use the same "alias" name for multiple hostdev network devices.
Consequence: It was impossible to start a guest that had more than one hostdev network device in its configuration.
Fix: libvirt now ensures that each device has a distinct alias name.
Result: It is now possible to start a guest that has multiple hostdev network devices in its configuration.
Last Closed: 2013-11-21 09:09:28 UTC Type: Bug

Description hongming 2013-08-28 03:13:59 UTC
Description of problem:
The guest fails to start after cold-plugging more than one VF from a pool of SR-IOV VFs. The following error occurs.

error: Failed to start domain r6
error: internal error Process exited while reading console log output: qemu-kvm: -device pci-assign,host=11:10.4,id=hostdev0,configfd=24,bus=pci.0,addr=0x6: Duplicate ID 'hostdev0' for device

Cold-plugging a single VF, or hot-plugging VFs, works fine.

Version-Release number of selected component (if applicable):
libvirt-0.10.2-23.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
# virsh net-list
Name                 State      Autostart     Persistent
--------------------------------------------------
hostdev-net1         active     no            yes

# virsh net-dumpxml hostdev-net1
<network>
  <name>hostdev-net1</name>
  <uuid>110794e9-387e-26d9-58d8-561f71af5961</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x3'/>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x4'/>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x5'/>
  </forward>
</network>

# virsh destroy r6
Domain r6 destroyed

# virsh attach-device r6 vfpool.xml --config
Device attached successfully

# virsh attach-device r6 vfpool.xml --config
Device attached successfully

# virsh start r6
error: Failed to start domain r6
error: internal error Process exited while reading console log output: qemu-kvm: -device pci-assign,host=11:10.4,id=hostdev0,configfd=24,bus=pci.0,addr=0x6: Duplicate ID 'hostdev0' for device


Actual results:
The guest fails to start after cold-plugging more than one VF from a pool of SR-IOV VFs.

Expected results:
The guest starts successfully after cold-plugging more than one VF from a pool of SR-IOV VFs.

Additional info:

Comment 2 Jiri Denemark 2013-08-28 11:02:13 UTC
Could you give us the content of vfpool.xml and the domain XML before you try to start it?

Comment 3 Laine Stump 2013-08-28 11:25:14 UTC
Also, please attach the final qemu command line from /var/log/libvirt/qemu/vfpool.log.

This sounds familiar. I'm fairly certain this was found and fixed upstream, but I haven't found the patch yet. (It could also be that I'm just thinking of Bug 827519, which has similarities but is a different problem.)

Comment 4 hongming 2013-08-29 03:35:33 UTC
[root@sriov2 images]# virsh net-list 
Name                 State      Autostart     Persistent
--------------------------------------------------
hostdev-net1         active     no            yes

[root@sriov2 images]# virsh net-dumpxml hostdev-net1
<network>
  <name>hostdev-net1</name>
  <uuid>110794e9-387e-26d9-58d8-561f71af5961</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x3'/>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x4'/>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x5'/>
  </forward>
</network>

[root@sriov2 images]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     r6                             shut off

[root@sriov2 images]# virsh dumpxml r6
<domain type='kvm'>
  <name>r6</name>
  <uuid>60f88110-4c02-0358-46eb-afdce5dc1002</uuid>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64' machine='rhel6.5.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/lib/libvirt/images/kvm-rhel6.3-i386-raw.img'/>
      <target dev='hda' bus='ide'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'/>
    <sound model='ich6'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </sound>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
</domain>

[root@sriov2 images]# cat vfpool.xml 
<interface type='network'>
   <source network='hostdev-net1'/>
</interface>

[root@sriov2 images]# virsh attach-device r6 vfpool.xml --config
Device attached successfully

[root@sriov2 images]# virsh attach-device r6 vfpool.xml --config
Device attached successfully

[root@sriov2 images]# virsh start r6
error: Failed to start domain r6
error: internal error Process exited while reading console log output: qemu-kvm: -device pci-assign,host=11:10.4,id=hostdev0,configfd=24,bus=pci.0,addr=0x6: Duplicate ID 'hostdev0' for device

[root@sriov2 images]# cat /var/log/libvirt/qemu/r6.log
2013-08-29 02:33:22.328+0000: starting up
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name r6 -S -M rhel6.5.0 -enable-kvm -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 60f88110-4c02-0358-46eb-afdce5dc1002 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/r6.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/kvm-rhel6.3-i386-raw.img,if=none,id=drive-ide0-0-0,format=raw,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0 -vga cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device pci-assign,host=11:10.3,id=hostdev0,configfd=23,bus=pci.0,addr=0x3 -device pci-assign,host=11:10.4,id=hostdev0,configfd=24,bus=pci.0,addr=0x6 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
qemu-kvm: -device pci-assign,host=11:10.4,id=hostdev0,configfd=24,bus=pci.0,addr=0x6: Duplicate ID 'hostdev0' for device
2013-08-29 02:33:22.495+0000: shutting down

Comment 7 Dax Kelson 2013-10-01 14:08:21 UTC
I also encountered this bug when I already had a VF NIC assigned to a guest and then attempted to add an additional PCI-passthrough NIC (one that isn't SR-IOV capable, hence the plain PCI passthrough). This was on RHEL 6.4.

Comment 9 Laine Stump 2013-10-02 11:11:07 UTC
I've verified this problem doesn't exist in current upstream libvirt (I can boot a guest that has one, two, or more <interface type='network'> devices pointing to a pool of VFs, then use attach-device to add more, and all operations complete without error). So I *wasn't* misremembering that the problem had been encountered and fixed.

So now the task is to find the patch that fixed it, and backport that patch.

Comment 12 Laine Stump 2013-10-04 14:09:10 UTC
I *think* this is the patch that eliminated the problem - at least it removes the code that is creating the error condition in 0.10.2:

commit 8cd40e7e0d92a0edbe08941fdf728a81c2e6cf15
Author: Laine Stump <laine>
Date:   Mon May 6 15:43:56 2013 -0400

    qemu: allocate network connections sooner during domain startup
    

In spite of the rather long commit message discussing why the patch is needed for VFIO support (which doesn't exist in 0.10.2), it is fairly straightforward, and was constructed to minimize the change and the likelihood of backport merge conflicts. I still want to go through the implications by hand and with some testing to verify there are no unwanted side effects, though.

Comment 13 Laine Stump 2013-10-04 15:14:32 UTC
Yes, that is the patch. I've tested it for both hot-plug and "cold-plug" of multiple <interface type='network'> devices that resolve to a hostdev network, and in all cases it works properly. I also tested hot and cold plug of additional emulated devices and they have not been adversely affected.

Comment 14 Laine Stump 2013-10-07 10:20:34 UTC
Here is a much simpler patch (one line) that also fixes this bug. It is RHEL6-specific because it fixes code that was removed upstream, rather than eliminating it.

I have tested both the upstream patch and this patch with many different combinations of interface types at startup; both patches fixed the bug described here, and neither introduced any regressions. So it appears that either patch would be safe to apply, but this newer patch removes any doubt about collateral regression damage:

commit 08439fcb7768590969c31240b0c1077783b7340c
Author: Laine Stump <laine>
Date:   Mon Oct 7 05:35:25 2013 -0400

    qemu: generate correct name for hostdev network devices
    
    This patch resolves:
    
       https://bugzilla.redhat.com/show_bug.cgi?id=1001881
    
    in the simplest manner possible. This problem was fixed upstream with
    commit 8cd40e7e0d92a0edbe08941fdf728a81c2e6cf15, which restructured
    the code enough to completely eliminate the erroneous passage. This
    patch instead makes a small change to that code, thus minimizing the
    possibility of inadvertent regression.
    
    The problem that is solved is that, during qemuBuildCommandline(),
    qemuAssignDeviceHostdevAlias() had been called with an index of
    "def->nhostdevs-1", when it really should have been called with
    "def->nhostdevs" (because the hostdev for which the alias was being
    created hadn't yet been added to the hostdev array). The original code
    just coincidentally worked if there was only a single interface
    defined (since the argument in that case would be "0 - 1", and an
    index of -1 means "auto-determine an appropriate index").
    
    Since the offending code has been removed upstream, this patch is
    *not* cherry picked from upstream, as it is unnecessary.

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index b8af9e3..47b5c56 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -6005,7 +6005,7 @@ qemuBuildCommandLine(virConnectPtr conn,
                      * add the newly minted hostdev to the hostdevs array.
                      */
                     if (qemuAssignDeviceHostdevAlias(def, hostdev,
-                                                     (def->nhostdevs-1)) < 0) {
+                                                     def->nhostdevs) < 0) {
                         goto error;
                     }

Comment 22 Xuesong Zhang 2013-10-11 09:30:33 UTC
Verified this bug on the latest build: libvirt-0.10.2-29.el6.x86_64
The result is as expected, so the bug status is changed to VERIFIED.
Steps:
1. prepare one hostdev network:
# virsh net-dumpxml hostnet
<network>
  <name>hostnet</name>
  <uuid>6b49be3c-bb91-c16d-b475-2929678720f4</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x4'/>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x2'/>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x5'/>
  </forward>
</network>

2. prepare one shutoff guest
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     a                              shut off

3. prepare the interface for assignment
# cat vfpool.xml 
<interface type='network'>
  <source network='hostnet'/>
</interface>

4. cold-plug the vf to the guest
# virsh attach-device a vfpool.xml --config
Device attached successfully

# virsh attach-device a vfpool.xml --config
Device attached successfully

5. start the guest
# virsh start a
Domain a started

6. check the dumpxml
......
    <interface type='network'>
      <mac address='52:54:00:d1:62:b5'/>
      <source network='hostnet'/>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </interface>
    <interface type='network'>
      <mac address='52:54:00:40:65:2f'/>
      <source network='hostnet'/>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </interface>
......

Comment 24 errata-xmlrpc 2013-11-21 09:09:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1581.html