Bug 2052424

Summary: Hostdev interface can not be plugged back after unplugged with failover setting
Product: Red Hat Enterprise Linux 9 Reporter: yalzhang <yalzhang>
Component: qemu-kvm Assignee: Laurent Vivier <lvivier>
qemu-kvm sub component: Networking QA Contact: Yanhui Ma <yama>
Status: CLOSED MIGRATED Docs Contact: Jiri Herrmann <jherrman>
Severity: unspecified    
Priority: medium CC: aadam, chayang, coli, jherrman, jinzhao, juzhang, lvivier, virt-maint, wquan, yama, yanghliu, yanqzhan
Version: 9.0 Keywords: MigratedToJIRA, Regression, Reopened, Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Known Issue
Doc Text:
.A `hostdev` interface with failover settings cannot be hot-plugged after being hot-unplugged
After removing a `hostdev` network interface with failover configuration from a running virtual machine (VM), the interface currently cannot be re-attached to the same running VM.
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-09-22 16:14:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2027125    

Description yalzhang@redhat.com 2022-02-09 09:11:27 UTC
Description of problem:
A hostdev interface with failover settings cannot be hot-plugged back after it has been hot-unplugged.

Version-Release number of selected component (if applicable):
libvirt-8.0.0-3.el9.x86_64
qemu-kvm-6.2.0-7.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a VM with failover interface settings:

# virsh dumpxml rhel9 | grep /interface -B9
      ......
    <interface type='network'>
      <mac address='52:54:00:aa:1c:ef'/>
      <source network='hostbridge'/>
      <model type='virtio'/>
      <teaming type='persistent'/>
      <alias name='ua-backup0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <interface type='network'>
      <mac address='52:54:00:aa:1c:ef'/>
      <source network='hostdev-net'/>
      <teaming type='transient' persistent='ua-backup0'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </interface>
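
The failover pair in the dump above is tied together in two ways: both interfaces carry the same MAC address, and the transient interface names the persistent one through its alias (`ua-backup0`). A minimal Python sketch that checks both properties for an XML fragment is shown below. The function name `check_teaming_pairs` and the wrapping `<devices>` element are illustrative assumptions, not part of libvirt.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the dump above, wrapped in <devices> so that it
# parses as a single document (the wrapper is an assumption of this sketch).
DOMAIN_XML = """
<devices>
  <interface type='network'>
    <mac address='52:54:00:aa:1c:ef'/>
    <source network='hostbridge'/>
    <model type='virtio'/>
    <teaming type='persistent'/>
    <alias name='ua-backup0'/>
  </interface>
  <interface type='network'>
    <mac address='52:54:00:aa:1c:ef'/>
    <source network='hostdev-net'/>
    <teaming type='transient' persistent='ua-backup0'/>
  </interface>
</devices>
"""


def _attr(parent, tag, name):
    """Return attribute `name` of child element `tag`, or None if absent."""
    child = parent.find(tag)
    return None if child is None else child.get(name)


def check_teaming_pairs(xml_text):
    """Return a list of problems found in the failover (teaming) pairs."""
    root = ET.fromstring(xml_text)
    persistent = {}  # alias of the persistent interface -> its MAC address
    transient = []   # (alias the transient interface refers to, its MAC)

    for iface in root.iter("interface"):
        kind = _attr(iface, "teaming", "type")
        mac = (_attr(iface, "mac", "address") or "").lower()
        if kind == "persistent":
            persistent[_attr(iface, "alias", "name")] = mac
        elif kind == "transient":
            transient.append((_attr(iface, "teaming", "persistent"), mac))

    problems = []
    for ref, mac in transient:
        if ref not in persistent:
            problems.append(f"no persistent interface with alias {ref!r}")
        elif persistent[ref] != mac:
            problems.append(f"MAC mismatch for pair {ref!r}")
    return problems
```

Calling `check_teaming_pairs(DOMAIN_XML)` on the fragment from this report returns an empty list, meaning the pair is consistent and the failure described below is not caused by a mismatched configuration.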

2. Log in to the VM and check that there are 3 interfaces and that the network works;

3. Hot-unplug the hostdev device. Everything behaves as expected: the unplug succeeds and the network keeps working;

The hostdev interface XML used for the unplug is:
# cat hostdevinterface.xml
<interface type='hostdev' managed='yes'>
  <mac address='52:54:00:aa:1c:ef'/>
  <driver name='vfio'/>
  <source>
    <address type='pci' domain='0x0000' bus='0x82' slot='0x10' function='0x1'/>
  </source>
  <teaming type='transient' persistent='ua-backup0'/>
  <alias name='hostdev0'/>
  <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
</interface>

# virsh detach-device rhel9 hostdevinterface.xml
Device detached successfully

On the guest, check that the network still works:
# ping www.baidu.com
...
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=151 ttl=48 time=2.97 ms
[  206.393009] pcieport 0000:00:02.3: pciehp: Slot(0-3): Attention button pressed
[  206.394346] pcieport 0000:00:02.3: pciehp: Slot(0-3): Powering off due to button press
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=152 ttl=48 time=3.04 ms
...
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=156 ttl=48 time=30.4 ms
[  211.519313] virtio_net virtio1 enp1s0: failover primary slave:enp4s0 unregistered
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=158 ttl=48 time=3.14 ms
...
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=221 ttl=48 time=3.03 ms
64 bytes from 182.61.200.7 (182.61.200.7): icmp_seq=222 ttl=48 time=3.05 ms

4. Try to hot-plug the hostdev interface back; it fails:
# virsh attach-device rhel9 hostdevinterface.xml
error: Failed to attach device from hostdevinterface.xml
error: internal error: unable to execute QEMU command 'device_add': Duplicate ID 'hostdev0' for device

Updating the XML to remove the alias does not help; the attach still fails:
# cat hostdevinterface2.xml
<interface type='hostdev' managed='yes'>
  <mac address='52:54:00:aa:1c:ef'/>
  <driver name='vfio'/>
  <source>
    <address type='pci' domain='0x0000' bus='0x82' slot='0x10' function='0x1'/>
  </source>
  <teaming type='transient' persistent='ua-backup0'/>
</interface>

# virsh attach-device rhel9 hostdevinterface.xml
error: Failed to attach device from hostdevinterface.xml
error: internal error: unable to execute QEMU command 'device_add': Duplicate ID 'hostdev0' for device
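
The detach and re-attach from steps 3 and 4 can be scripted for repeated runs. The sketch below is a hypothetical helper, not part of any existing test suite: the names `virsh`, `reproduce`, and the `run` parameter are assumptions, and it uses only the `virsh detach-device` and `virsh attach-device` invocations shown above. It does not wait for the guest to complete the unplug, so a real run may need a delay between the two commands.

```python
import subprocess


def virsh(*args):
    """Run a virsh command and return (exit code, combined stdout+stderr)."""
    proc = subprocess.run(["virsh", *args], capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def reproduce(domain, xml_path, run=virsh):
    """Detach the device described by xml_path, then try to attach it again.

    Returns one of:
      "detach-failed"   the initial detach did not succeed
      "not-reproduced"  the re-attach succeeded
      "reproduced"      the re-attach failed with a Duplicate ID error
      "attach-failed"   the re-attach failed for another reason
    """
    rc, _ = run("detach-device", domain, xml_path)
    if rc != 0:
        return "detach-failed"

    rc, out = run("attach-device", domain, xml_path)
    if rc == 0:
        return "not-reproduced"
    if "Duplicate ID" in out:
        return "reproduced"
    return "attach-failed"
```

With the domain and file names from this report, `reproduce("rhel9", "hostdevinterface.xml")` would be expected to return `"reproduced"` on the affected builds. The `run` parameter exists so the logic can be exercised with a stub instead of a live host.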

Actual results:
The hostdev interface with failover settings cannot be hot-plugged back after it has been hot-unplugged.

Expected results:
Hot-plugging the interface back should succeed.

Additional info:
No such issue with qemu-kvm-6.1.0-8.el9.x86_64

Comment 1 Yanhui Ma 2022-02-10 06:53:06 UTC
The issue can also be reproduced with qemu-kvm directly.
qemu-kvm-6.2.0-4.el9.x86_64

qemu cmd line:

-device virtio-net-pci,mac=36:d8:d5:83:4c:0b,id=idHE7r7C,netdev=idXVstXA,bus=pcie-root-port-3,addr=0x0,failover=on  \
-netdev tap,id=idXVstXA,vhost=on,script=/etc/qemu-ifup-private \
-device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
-device vfio-pci,host=0000:06:0a.0,id=idXVst,bus=pcie-root-port-4,failover_pair_id=idHE7r7C \



{"execute": "device_del","arguments":{"id":"idXVst"}}  
{"return": {}}
{"timestamp": {"seconds": 1644474150, "microseconds": 399755}, "event": "DEVICE_DELETED", "data": {"device": "idXVst", "path": "/machine/peripheral/idXVst"}}
{"execute":"device_add","arguments":{"driver":"vfio-pci","host":"0000:06:0a.0","id":"idXVst","failover_pair_id":"idHE7r7C","bus":"pcie-root-port-4"}}
{"error": {"class": "GenericError", "desc": "Duplicate ID 'idXVst' for device"}}

Comment 2 Yanhui Ma 2022-02-10 07:00:19 UTC
If the VF is hot-plugged with a different ID after being unplugged, the following error is printed:

{"execute": "device_del","arguments":{"id":"idXVst"}}
{"return": {}}
{"timestamp": {"seconds": 1644476217, "microseconds": 527371}, "event": "DEVICE_DELETED", "data": {"device": "idXVst", "path": "/machine/peripheral/idXVst"}}
{"execute":"device_add","arguments":{"driver":"vfio-pci","host":"0000:06:0a.0","id":"hostnet0","failover_pair_id":"idHE7r7C","bus":"pcie-root-port-4"}}
{"error": {"class": "GenericError", "desc": "Cannot attach more than one primary device to 'idHE7r7C': 'idXVst' and 'hostnet0'"}}

Comment 7 Yanqiu Zhang 2022-08-02 05:42:58 UTC
This also reproduces on the latest RHEL 8.7:
libvirt-8.0.0-10.module+el8.7.0+16047+746a126c.x86_64
qemu-kvm-6.2.0-18.module+el8.7.0+15999+d24f860e.x86_64

# virsh detach-device avocado-vt-vm1 hostdev.xml 
Device detached successfully

# virsh attach-device avocado-vt-vm1 hostdev.xml 
error: Failed to attach device from hostdev.xml
error: internal error: unable to execute QEMU command 'device_add': Duplicate ID 'hostdev0' for device

Comment 11 RHEL Program Management 2023-08-09 07:28:22 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 13 Yanhui Ma 2023-08-11 02:25:16 UTC
Although the priority of this failover bug is low, it is a real bug that can be reproduced through libvirt, so it is worth continuing to track.

Comment 14 RHEL Program Management 2023-09-22 16:11:39 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 15 RHEL Program Management 2023-09-22 16:14:27 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.