Bug 1111455 - libvirtd crash when hotplug a nic from macvtap-passthrough network with specified 'pf'
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Laine Stump
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-06-20 05:41 UTC by hongming
Modified: 2014-10-14 04:22 UTC

Fixed In Version: libvirt-0.10.2-44.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-10-14 04:22:35 UTC
Target Upstream Version:
Embargoed:


Attachments
gdb backtrace (3.65 KB, text/plain)
2014-06-20 05:42 UTC, hongming
libvirtd debug log (260.31 KB, text/plain)
2014-06-20 05:44 UTC, hongming


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2014:1374 | 0 | normal | SHIPPED_LIVE | libvirt bug fix and enhancement update | 2014-10-14 08:11:54 UTC

Description hongming 2014-06-20 05:41:30 UTC
Description of problem:
libvirtd crashes when hot-plugging a NIC from a macvtap-passthrough
network that specifies a 'pf'.

Version-Release number of selected component (if applicable):
libvirt-0.10.2-38.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
[root@sriov2 images]# virsh net-start passthrough
Network passthrough started

[root@sriov2 images]# virsh net-dumpxml passthrough
<network>
  <name>passthrough</name>
  <uuid>a0d30d24-96fd-c62c-9de6-48f6afda28f6</uuid>
  <forward mode='passthrough'>
    <pf dev='eth2'/>
  </forward>
</network>


[root@sriov2 images]# virsh net-list
Name                 State      Autostart     Persistent
--------------------------------------------------
default              active     yes           yes
passthrough          active     no            yes

[root@sriov2 images]# virsh list
 Id    Name                           State
----------------------------------------------------
 57    rhel6                          running

[root@sriov2 images]# cat if.xml 
<interface type='network'>
   <source network='passthrough'/>
</interface>

[root@sriov2 images]# virsh attach-device rhel6 if.xml
error: Failed to attach device from if.xml
error: internal error Direct mode types require interface names

[root@sriov2 images]# virsh attach-device rhel6 if.xml
error: Failed to attach device from if.xml
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the
hypervisor
error: Failed to reconnect to the hypervisor


Actual results:
libvirtd crashes on the second attempt to hot-plug a NIC from a
macvtap-passthrough network that specifies a 'pf'.


Expected results:
libvirtd keeps running; the attach either succeeds or fails with a clear error.


Additional info:

Comment 1 hongming 2014-06-20 05:42:10 UTC
Created attachment 910636
gdb backtrace

Comment 2 hongming 2014-06-20 05:44:01 UTC
Created attachment 910637
libvirtd debug log

Comment 4 Laine Stump 2014-06-25 14:22:59 UTC
There are 3 levels of problem here:

1) Although it makes sense to support the <pf> element for the macvtap network pools, not just for hostdev, that isn't the case - <pf> is only supported for <forward mode='hostdev'>. This is a good feature to add upstream and in RHEL7, but isn't appropriate for the stage of development RHEL6 is in.

2) We should (but don't) log a proper error message and fail when there is an attempt to use <pf> for a network other than hostdev.

3) In spite of that, if someone still manages to get such a network defined, we *really* need to do a better job of error recovery :-)

jdenemar already provided me with a patch for (3) to test, and I will also make a patch for (2).

Comment 5 Laine Stump 2014-06-26 08:50:38 UTC
Actually I spoke too soon - <pf> *is* supported for macvtap network pools. The problem in this case is that the VFs exist, but for some reason don't have network interface names assigned to them. On my RHEL6 system, if the VFs exist at all, they have a network device name. Did you possibly blacklist the igbvf driver?

(BTW, the network device name of a VF can be found in /sys/bus/pci/devices/$pciaddress/net - that directory should contain a subdirectory whose name is the net device name.)
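
The lookup described above can be sketched in a few lines of Python. This is only an illustration, not libvirt code: the function name `vf_netdev_names` and the `sysfs_root` parameter are hypothetical (the latter exists so the function can be pointed at a test directory), and it assumes the standard sysfs layout /sys/bus/pci/devices/<address>/net.

```python
import os

def vf_netdev_names(pci_address, sysfs_root="/sys"):
    """Return the net device names sysfs lists for a PCI device.

    An empty list means the device has no net/ directory, which is the
    "VF exists but has no interface name" situation discussed in this bug.
    """
    net_dir = os.path.join(sysfs_root, "bus", "pci", "devices", pci_address, "net")
    try:
        # Each subdirectory of net/ is named after a net device (e.g. eth6).
        return sorted(os.listdir(net_dir))
    except (FileNotFoundError, NotADirectoryError):
        return []

if __name__ == "__main__":
    # PCI address taken from the example in comment 6; adjust for your host.
    print(vf_netdev_names("0000:0f:10.0"))
```

On a host where the VF is bound to its network driver this should list a single name; an empty list suggests the VF currently has no netdev.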

Comment 6 hongming 2014-06-26 09:10:17 UTC
Hi Laine 

I didn't blacklist the igbvf driver. The VFs have interface names on my host. The pf value used in this bug is a PF device name on the host.

For example:
# ll /sys/bus/pci/drivers/igbvf/0000:0f:10.0/net
total 0
drwxr-xr-x. 5 root root 0 Jun 26 16:15 eth6


virsh nodedev-list also shows them.

# virsh nodedev-list --tree
computer
......
 |       |   +- pci_0000_0f_10_0
 |       |   |   |
 |       |   |   +- net_eth6_7a_f4_c5_1e_6c_41
......

Comment 7 Laine Stump 2014-06-26 10:33:31 UTC
Okay, I have been able to reproduce the "no net device name" problem on my system. The reason this happens is that one of the VFs is currently not bound to the net driver (igbvf in our case). Most commonly this is because it is bound to the pci-stub driver (or for RHEL7 to the vfio-pci driver), which in turn usually means that this VF is already in use for PCI passthrough device assignment, either on the current guest or on another guest. (It could also just mean that someone has manually detached the VF with "virsh nodedev-detach".)

At any rate, when this is the case, the device shouldn't be included in the pool (as it is obviously either in use or intended to be used for a different purpose).
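
To check which driver a VF is bound to, one option is to read the `driver` symlink in the device's sysfs directory. The following is a minimal sketch under that assumption; the helper names and the `sysfs_root` parameter are hypothetical, and the set of "passthrough" drivers contains only the two named in this comment.

```python
import os

# Drivers mentioned in this bug as indicating PCI passthrough use.
PASSTHROUGH_DRIVERS = {"pci-stub", "vfio-pci"}

def bound_driver(pci_address, sysfs_root="/sys"):
    """Return the name of the driver a PCI device is bound to, or None."""
    link = os.path.join(sysfs_root, "bus", "pci", "devices", pci_address, "driver")
    try:
        # The symlink target's last path component is the driver name.
        return os.path.basename(os.readlink(link))
    except OSError:
        # No symlink: the device is not bound to any driver (or does not exist).
        return None

def looks_reserved_for_passthrough(pci_address, sysfs_root="/sys"):
    """True if the device is bound to one of the passthrough drivers above."""
    return bound_driver(pci_address, sysfs_root) in PASSTHROUGH_DRIVERS
```

A VF for which `bound_driver` returns None or a passthrough driver would be one that should be left out of the macvtap pool.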

Comment 8 Laine Stump 2014-08-11 21:57:41 UTC
A fix has been pushed upstream:

commit cd7759cb96db642aaa556f78f15801609885a650
Author: Laine Stump <laine>
Date:   Tue Aug 5 16:40:52 2014 -0400

    network: make networkCreateInterfacePool more robust
    
    networkCreateInterfacePool was a bit loose in its error cleanup, which
    could result in a network definition with interfaces in the pool that
    were NULL. This would in turn lead to a libvirtd crash when a guest
    tried to attach an interface using the network with that pool.
    
    In particular this would happen when creating a pool to be used for
    macvtap connections. macvtap needs the netdev name of the virtual
    function in order to use it, and each VF only has a netdev name if it
    is currently bound to a network driver. If one of the VFs of a PF
    happened to be bound to the pci-stub or vfio-pci driver (indicating
    it's already in use for PCI passthrough), or no driver at all, it
    would have no name. In this case networkCreateInterfacePool would
    return an error, but would leave the netdef->forward.nifs set to the
    total number of VFs in the PF. The interface attach that triggered
    calling of networkCreateInterfacePool (it uses a "lazy fill" strategy)
    would simply fail, but the very next attempt to attach an interface
    using the same network pool would result in a crash.
    
    This patch refactors networkCreateInterfacePool to bring it more in
    line with current coding practices (label name, use of a switch with
    no default case) as well as providing the following two changes to
    behavior:
    
    1) If a VF with no netdev name is encountered, just log a warning and
    continue; only fail if exactly 0 devices are found to put in the pool.
    
    2) If the function fails, clean up any partial interface pool and set
    netdef->forward.nifs to 0.
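
The two behavior changes listed in the commit message can be modeled roughly as follows. This is a Python sketch of the logic only, not the actual C code of networkCreateInterfacePool: the `Network` class, `PoolError`, and `fill_interface_pool` are hypothetical names, `len(network.interfaces)` stands in for netdef->forward.nifs, and the message texts are illustrative.

```python
import logging

log = logging.getLogger("pool-sketch")

class PoolError(Exception):
    """Raised when no usable VF could be put in the pool."""

class Network:
    def __init__(self, pf_name):
        self.pf_name = pf_name
        self.interfaces = []  # the interface pool

def fill_interface_pool(network, vf_netdev_names):
    """Fill network.interfaces from one entry per VF (None = VF has no netdev name)."""
    # Change 2: never leave a partial pool behind; start clean, assign only on success.
    network.interfaces = []
    usable = []
    for index, name in enumerate(vf_netdev_names):
        if name is None:
            # Change 1: a VF without a netdev name is skipped with a warning.
            log.warning("VF %d of PF %s has no netdev name, skipping",
                        index, network.pf_name)
            continue
        usable.append(name)
    if not usable:
        # Only fail if exactly 0 devices were found; the pool stays empty.
        raise PoolError("no usable VFs on PF %s" % network.pf_name)
    network.interfaces = usable
```

With this shape, a failed fill leaves an empty pool, so a later attach attempt retries the fill instead of walking a pool that contains missing entries.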

Comment 10 Laine Stump 2014-08-11 22:09:46 UTC
Notes for testing: To reliably reproduce the crash *without* the patch, and prove that the patch fixes it, use the following sequence:

1) Define a network as in the description of this bz (i.e. one that uses "<forward mode='passthrough'> <pf dev='eth2'/>..."). Start this network.

2) Define a guest with an <interface type='hostdev' managed='yes'> that specifies the PCI address of VF 0 of the above PF as its <source>.

3) After starting this guest, hot-plug an interface using the if.xml given in this bug's description.

4) Again attempt to hot-plug the same interface.

Without the patch, step 3 will fail, and step 4 will cause libvirtd to crash. With the patch, both step 3 and step 4 will succeed (unless there are fewer than 3 VFs, in which case step 3 or 4 could fail, but libvirtd would not crash).
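
For step 2, the <interface> element can be generated from the VF's PCI address. The helper below is a hypothetical convenience sketch, not part of libvirt or virsh; it only builds the element shape shown elsewhere in this bug (type='hostdev', managed='yes', with a PCI <address> inside <source>), and assumes the address is given in the usual "domain:bus:slot.function" form.

```python
import xml.etree.ElementTree as ET

def hostdev_interface_xml(pci_address):
    """Build <interface type='hostdev' managed='yes'> XML for e.g. '0000:11:10.1'."""
    domain, bus, rest = pci_address.split(":")
    slot, function = rest.split(".")
    iface = ET.Element("interface", {"type": "hostdev", "managed": "yes"})
    source = ET.SubElement(iface, "source")
    ET.SubElement(source, "address", {
        "type": "pci",
        "domain": "0x" + domain,
        "bus": "0x" + bus,
        "slot": "0x" + slot,
        "function": "0x" + function,
    })
    return ET.tostring(iface, encoding="unicode")

if __name__ == "__main__":
    # Example address only; use the address of VF 0 of your PF.
    print(hostdev_interface_xml("0000:11:10.1"))
```

The resulting element would go into the guest's <devices> section before the guest is started.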

Comment 12 Hu Jianwei 2014-08-20 07:50:32 UTC
I cannot reproduce the bug with the versions below.

Version:
libvirt-0.10.2-44.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.428.el6.x86_64

<1> Check the case where VF 0 of the SRIOV PF isn't bound to a network driver.
[root@sriov2 jiahu]# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     rhel6                          shut off

[root@sriov2 jiahu]# virsh net-list --all
Name                 State      Autostart     Persistent
--------------------------------------------------
default              active     yes           yes
passthrough          inactive   no            yes

[root@sriov2 jiahu]# virsh dumpxml rhel6 | grep interface -b5
...
1142:    <interface type='hostdev' managed='yes'>
1187-      <mac address='52:54:00:73:a0:1d'/>
1228-      <source>
1243-        <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x1'/>
1327-      </source>
1343-      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
1425:    </interface>
...

[root@sriov2 jiahu]# ip link show eth5
101: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:1b:21:55:b3:bd brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 0e:6e:8a:f9:68:ed
    vf 1 MAC d6:96:02:9b:3d:40
    vf 2 MAC be:7d:5c:78:27:6b
    vf 3 MAC 5e:bf:87:c4:6f:49
    vf 4 MAC ca:f9:79:05:ab:93
    vf 5 MAC 02:b1:28:0b:aa:eb
    vf 6 MAC 02:1c:01:06:d5:75

[root@sriov2 jiahu]# virsh net-dumpxml passthrough
<network>
  <name>passthrough</name>
  <uuid>dcf96e03-5b98-de86-437a-3fcfe7334019</uuid>
  <forward mode='passthrough'>
    <pf dev='eth5'/>
  </forward>
</network>

[root@sriov2 jiahu]# virsh net-start passthrough
Network passthrough started

[root@sriov2 jiahu]# virsh start rhel6
Domain rhel6 started

[root@sriov2 jiahu]# service libvirtd status
libvirtd (pid  30709) is running...

[root@sriov2 jiahu]# cat if.xml 
<interface type='network'>
   <source network='passthrough'/>
   </interface>

[root@sriov2 jiahu]# virsh attach-device rhel6 if.xml
Device attached successfully

Checked the libvirtd.log:
2014-08-20 06:23:42.938+0000: 448: warning : networkCreateInterfacePool:3555 : VF 0 of SRIOV PF eth5 couldn't be added to the interface pool because it isn't bound to a network driver - possibly in use elsewhere

[root@sriov2 jiahu]# virsh attach-device rhel6 if.xml
Device attached successfully

[root@sriov2 jiahu]# virsh attach-device rhel6 if.xml
Device attached successfully

[root@sriov2 jiahu]# virsh attach-device rhel6 if.xml
Device attached successfully

[root@sriov2 jiahu]# virsh attach-device rhel6 if.xml
Device attached successfully

[root@sriov2 jiahu]# virsh attach-device rhel6 if.xml
Device attached successfully

[root@sriov2 jiahu]# virsh attach-device rhel6 if.xml
error: Failed to attach device from if.xml
error: internal error network 'passthrough' requires exclusive access to interfaces, but none are available

[root@sriov2 jiahu]# service libvirtd status
libvirtd (pid  30709) is running...
[root@sriov2 jiahu]# virsh net-dumpxml passthrough
<network connections='5'>
  <name>passthrough</name>
  <uuid>dcf96e03-5b98-de86-437a-3fcfe7334019</uuid>
  <forward mode='passthrough'>
    <pf dev='eth5'/>
    <interface dev='eth12' connections='1'/>
    <interface dev='eth28' connections='1'/>
    <interface dev='eth29' connections='1'/>
    <interface dev='eth30' connections='1'/>
    <interface dev='eth31' connections='1'/>
    <interface dev='eth34' connections='1'/>
  </forward>
</network>

Check one of the macvtap interfaces on the host:
[root@sriov2 jiahu]# virsh dumpxml rhel6 | grep interface -b5
...
2954:    <interface type='direct'>
2984-      <mac address='52:54:00:a4:57:6e'/>
3025-      <source dev='eth34' mode='passthrough'/>
3072-      <target dev='macvtap5'/>
3103-      <alias name='net5'/>
3130-      <address type='pci' domain='0x0000' bus='0x00' slot='0x0d' function='0x0'/>
3212:    </interface>
...
[root@sriov2 jiahu]# ip ad | grep macvtap5
76: macvtap5@eth34: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 500
[root@sriov2 jiahu]# 

<2> Check the case where no usable VFs are present.
[root@sriov2 jiahu]# rmmod igb
[root@sriov2 jiahu]# modprobe igb max_vfs=0
[root@sriov2 jiahu]# ip link show eth5
97: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:1b:21:55:b3:bd brd ff:ff:ff:ff:ff:ff

[root@sriov2 jiahu]# virsh net-start passthrough
Network passthrough started

[root@sriov2 jiahu]# virsh attach-device rhel6 if.xml
error: Failed to attach device from if.xml
error: internal error No usable Vf's present on SRIOV PF eth5

[root@sriov2 jiahu]# service libvirtd status
libvirtd (pid  30709) is running...

<3> Common scenario: no VFs are assigned to the running domain.
[root@sriov2 jiahu]# virsh start rhel6
Domain rhel6 started

[root@sriov2 jiahu]# virsh dumpxml rhel6 | grep interface -b5
[root@sriov2 jiahu]# ip link show eth5
7: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:1b:21:55:b3:bd brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 1e:06:5a:0f:47:a8
    vf 1 MAC ca:9b:fb:0d:fb:7d
    vf 2 MAC c2:58:73:6b:e2:8b
    vf 3 MAC b2:09:51:b2:e2:c2
    vf 4 MAC 66:57:83:6f:77:a6
    vf 5 MAC e6:cc:7f:4d:71:2a
    vf 6 MAC 5e:36:72:61:dc:7b
[root@sriov2 jiahu]# virsh net-dumpxml passthrough
<network>
  <name>passthrough</name>
  <uuid>dcf96e03-5b98-de86-437a-3fcfe7334019</uuid>
  <forward mode='passthrough'>
    <pf dev='eth5'/>
  </forward>
</network>

[root@sriov2 jiahu]# virsh net-start passthrough
Network passthrough started

[root@sriov2 jiahu]# service libvirtd status
libvirtd (pid  2653) is running...

[root@sriov2 jiahu]# virsh attach-device rhel6 if.xml
Device attached successfully

[root@sriov2 jiahu]# virsh attach-device rhel6 if.xml
Device attached successfully

[root@sriov2 jiahu]# virsh attach-device rhel6 if.xml
Device attached successfully

[root@sriov2 jiahu]# service libvirtd status
libvirtd (pid  2653) is running...

Comment 14 errata-xmlrpc 2014-10-14 04:22:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1374.html

