Bug 1020135

Summary: Libvirt should not reduce the connections value on hostdev network after attaching interface failed
Product: Red Hat Enterprise Linux 7
Reporter: Hu Jianwei <jiahu>
Component: libvirt
Assignee: Laine Stump <laine>
Status: CLOSED CURRENTRELEASE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.0
CC: acathrow, dallan, dyuan, honzhang, jishao, laine, mzhan, shyu
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: libvirt-1.1.1-12.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1300843 (view as bug list) Environment:
Last Closed: 2014-06-13 10:33:41 UTC
Type: Bug

Description Hu Jianwei 2013-10-17 06:11:57 UTC
Description of problem:
Libvirt should not reduce the connections value of a hostdev network after attaching an interface fails.

Version-Release number of selected component (if applicable):
libvirt-1.1.1-9.el7.x86_64
qemu-kvm-rhev-1.5.3-9.el7.x86_64
kernel-3.10.0-33.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Add four VF interfaces to VF pool network "hostnet"
[root@sriov2 ~]# virsh net-dumpxml hostnet
<network>
  <name>hostnet</name>
  <uuid>c1fb4ead-21b8-4d69-8ad9-669c55b3dfc7</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x0'/>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x1'/>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x2'/>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x3'/>
  </forward>
</network>

2. Cold-plug two "hostnet" network interfaces to domain
[root@sriov2 ~]# virsh dumpxml r7 | grep interface -A10
    <interface type='network'>
      <mac address='52:54:00:8e:2b:e9'/>
      <source network='hostnet'/>
      <model type='rtl8139'/>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='network'>
      <mac address='52:54:00:01:97:b7'/>
      <source network='hostnet'/>
      <model type='rtl8139'/>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </interface>

3. Start the domain and check the connections value of "hostnet"; it is 2.
[root@sriov2 ~]# virsh start r7
Domain r7 started

[root@sriov2 ~]#
[root@sriov2 ~]# virsh net-dumpxml hostnet
<network connections='2'>                         <=========it's 2.
  <name>hostnet</name>
...

4. Attach the rest of the VFs using attach-interface, then check the connections value of "hostnet"; it is 4.
[root@sriov2 ~]# virsh attach-interface r7 network hostnet
Interface attached successfully
[root@sriov2 ~]# virsh net-dumpxml hostnet
<network connections='4'>
  <name>hostnet</name>
...

5. Because there are only 4 VFs in the "hostnet" network, only four cold-plug/hot-plug operations can succeed. A further attach fails, and the connections value is reduced.
[root@sriov2 ~]# virsh attach-interface r7 network hostnet
error: Failed to attach interface
error: internal error: network 'hostnet' requires exclusive access to interfaces, but none are available

[root@sriov2 ~]# virsh net-dumpxml hostnet
<network connections='3'>                                              <===== should be 4, do not reduce to 3.
  <name>hostnet</name>
...

6. Continue to execute attach-interface more than three times; the connections value is lost from the "hostnet" network.
[root@sriov2 ~]# virsh net-dumpxml hostnet
<network>                                         <===== should be 4, do not reduce to 0.
  <name>hostnet</name>
  <uuid>c1fb4ead-21b8-4d69-8ad9-669c55b3dfc7</uuid>
  <forward mode='hostdev' managed='yes'>
...

Actual results:
See step 5 and step 6 for detail.

Expected results:
When attaching an interface fails, the connections value should remain the same as before the attempt.
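
A check of this expectation could be scripted by reading the connections attribute from the net-dumpxml output before and after the failed attach. The following is a minimal Python sketch, not part of the original report: the helper name connections_from_xml is hypothetical, it expects the complete XML document (not the truncated "..." excerpts shown in the steps), and it assumes a missing connections attribute means 0, as the annotation in step 6 suggests.

```python
import xml.etree.ElementTree as ET


def connections_from_xml(xml_text):
    """Return the connections value from `virsh net-dumpxml` output.

    Hypothetical helper. A missing attribute is treated as 0, which is
    an assumption based on the behaviour described in step 6.
    """
    root = ET.fromstring(xml_text)
    return int(root.get("connections", "0"))


# Sample documents shaped like the output in this report.
with_count = "<network connections='4'><name>hostnet</name></network>"
without_count = "<network><name>hostnet</name></network>"

print(connections_from_xml(with_count))
print(connections_from_xml(without_count))
```

Comparing the value returned before and after a failed attach-interface call shows whether the count was changed by the failure.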

Comment 2 Laine Stump 2013-10-18 10:14:38 UTC
I can see where the problem is, but need to think awhile about the cleanest way to fix it.

Explanation:

qemuDomainAttachNetDevice() calls networkAllocateActualDevice() to allocate a device from the pool. If that call fails, qemuDomainAttachNetDevice() goes to its cleanup: label, and ends up calling networkReleaseActualDevice(). The problem is that when networkReleaseActualDevice() sees that the netdev's "actual" pointer is NULL, it says "oh, so there is nothing to release!" and jumps down to its success: label, where the connections count for the network is decremented. Most of the time this is what should be done, but there are times when networkAllocateActualDevice() has failed but networkReleaseActualDevice() is still called as part of the error cleanup, and in that case the count *shouldn't* be decremented.
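
The imbalance described above can be modelled in a few lines. The following Python sketch is a simplified simulation, not libvirt's C code: the classes, the counted flag, and release_tracked are hypothetical names introduced for illustration, and the tracked variant is only one possible way to keep the count balanced, not necessarily what the eventual fix does.

```python
class Network:
    """Toy model of a hostdev pool network (hypothetical)."""

    def __init__(self, num_vfs):
        self.free_vfs = num_vfs
        self.connections = 0


class NetDev:
    """Toy model of a domain interface (hypothetical)."""

    def __init__(self):
        self.actual = None    # stays None unless allocation succeeds
        self.counted = False  # hypothetical flag: was connections incremented?


def allocate_actual_device(net, dev):
    """Take one VF from the pool; fail without counting if none is free."""
    if net.free_vfs == 0:
        return False
    net.free_vfs -= 1
    dev.actual = "vf"
    dev.counted = True
    net.connections += 1
    return True


def release_buggy(net, dev):
    """Mirror the reported behaviour: always decrement, even if nothing
    was allocated for this device."""
    if dev.actual is not None:
        net.free_vfs += 1
        dev.actual = None
    net.connections -= 1


def release_tracked(net, dev):
    """Assumed alternative: decrement only if this device was counted."""
    if dev.actual is not None:
        net.free_vfs += 1
        dev.actual = None
    if dev.counted:
        net.connections -= 1
        dev.counted = False


def attach(net, release):
    """Attach one interface; on failure the cleanup path calls release."""
    dev = NetDev()
    if not allocate_actual_device(net, dev):
        release(net, dev)
        return False
    return True


def simulate(release, num_vfs=4, attempts=5):
    """Run several attach attempts and return the final connections count."""
    net = Network(num_vfs)
    for _ in range(attempts):
        attach(net, release)
    return net.connections
```

With four VFs and five attach attempts, simulate(release_buggy) ends at 3 while simulate(release_tracked) stays at 4, matching the result in step 5 and the expected result respectively.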

Comment 3 Laine Stump 2013-11-06 11:49:34 UTC
A fix has been pushed upstream:

commit b4e0299d4ff059c8707a760b7ec2063ccd57cc21
Author: Laine Stump <laine>
Date:   Mon Nov 4 17:01:17 2013 +0200

    network: fix connections count in case of allocate failure

Comment 6 hongming 2013-11-13 08:53:26 UTC
Verified as follows. The result is as expected.

Version
libvirt-1.1.1-12.el7.x86_64

[root@sriov2 ~]# virsh net-start hostnet
Network hostnet started

[root@sriov2 ~]# virsh attach-interface r7 network hostnet
Interface attached successfully

[root@sriov2 ~]# virsh attach-interface r7 network hostnet
Interface attached successfully

[root@sriov2 ~]# virsh attach-interface r7 network hostnet
Interface attached successfully

[root@sriov2 ~]# virsh attach-interface r7 network hostnet
Interface attached successfully

[root@sriov2 ~]# virsh net-dumpxml hostnet
<network connections='4'>
  <name>hostnet</name>
  <uuid>c1fb4ead-21b8-4d69-8ad9-669c55b3dfc7</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x0'/>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x2'/>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x4'/>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x6'/>
  </forward>
</network>

[root@sriov2 ~]# virsh attach-interface r7 network hostnet
error: Failed to attach interface
error: internal error: network 'hostnet' requires exclusive access to interfaces, but none are available

[root@sriov2 ~]# virsh net-dumpxml hostnet
<network connections='4'>
  <name>hostnet</name>
  <uuid>c1fb4ead-21b8-4d69-8ad9-669c55b3dfc7</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x0'/>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x2'/>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x4'/>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x6'/>
  </forward>
</network>

[root@sriov2 ~]# virsh detach-interface r7 network --mac 52:54:00:78:02:07
Interface detached successfully

[root@sriov2 ~]# virsh net-dumpxml hostnet
<network connections='3'>
  <name>hostnet</name>
  <uuid>c1fb4ead-21b8-4d69-8ad9-669c55b3dfc7</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x0'/>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x2'/>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x4'/>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x6'/>
  </forward>
</network>

Comment 7 Ludek Smid 2014-06-13 10:33:41 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.

Comment 9 Jingjing Shao 2016-01-18 09:18:09 UTC
Hi Laine,
The bug can be reproduced on RHEL 6.7 with libvirt-0.10.2-55.el6.x86_64.
Does it need to be fixed in RHEL 6.8 or not?

Comment 10 Laine Stump 2016-01-18 15:47:23 UTC
Yes, and as a matter of fact I had backported the patch locally last week as part of a scratch build, and it's on my list to clone the bug to RHEL6 this week.