Bug 1003537

Summary: libvirtd crashes when destroying a guest after hot-unplugging an interface that uses a hostdev network
Product: Red Hat Enterprise Linux 7 Reporter: Xuesong Zhang <xuzhang>
Component: libvirt Assignee: Peter Krempa <pkrempa>
Status: CLOSED CURRENTRELEASE QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 7.0 CC: acathrow, dyuan, honzhang, xuzhang
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-1.1.1-4.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-06-13 11:07:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                   Flags
libvirt.log                   none
backtrace                     none
core dump file of libvirtd    none

Description Xuesong Zhang 2013-09-02 10:01:38 UTC
Description of problem:
libvirtd crashes when a guest is destroyed after an interface backed by a hostdev network has been hot-unplugged from it.

The libvirt.log and the backtrace are attached.

Version-Release number of selected component (if applicable):
libvirt-1.1.1-3.el7.x86_64
qemu-kvm-1.5.3-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. prepare one healthy guest.
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 4     r7                             running

2. prepare one network with hostdev forward mode 
# virsh net-dumpxml hostdev-net1
<network connections='1'>
  <name>hostdev-net1</name>
  <uuid>6208c354-0713-4d90-b12c-a893546e44ae</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x0'/>
    <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x1'/>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x0'/>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x1'/>
  </forward>
</network>

# virsh net-list --all
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 br1                  inactive   no            yes
 default              active     yes           yes
 hostdev-net1         active     no            yes

3. check the vf in the host
# lspci|grep 82576
0e:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
0e:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
0f:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
0f:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
0f:10.2 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
0f:10.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
10:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
10:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
11:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
11:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
11:10.2 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
11:10.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)

4. prepare one xml like the following one:
# cat vfpool.xml 
<interface type='network'>
   <source network='hostdev-net1'/>
</interface>

5. hot-plug the vf to the guest.
# virsh attach-device r7 vfpool.xml 
Device attached successfully

6. make sure the vf is attached.
# virsh dumpxml r7
.......
<interface type='network'>
      <mac address='52:54:00:67:c2:3f'/>
      <source network='hostdev-net1'/>
      <model type='rtl8139'/>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </interface>
......

7. create a new vf xml for the detach; its content is the same as the interface element shown by dumpxml.
# cat vf-detach.xml 
<interface type='network'>
      <mac address='52:54:00:67:c2:3f'/>
      <source network='hostdev-net1'/>
      <model type='rtl8139'/>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </interface>

8. detach the vf from guest.
# virsh detach-device r7 vf-detach.xml 
Device detached successfully

9. destroy the guest.
# virsh destroy r7
error: Failed to destroy domain r7
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor


Actual result:
libvirtd crashes in step 9.

Expected result:
libvirtd should be running.

Comment 1 Xuesong Zhang 2013-09-02 10:02:24 UTC
Created attachment 792784 [details]
libvirt.log

Comment 2 Xuesong Zhang 2013-09-02 10:03:13 UTC
Created attachment 792785 [details]
backtrace

Comment 4 Peter Krempa 2013-09-02 13:47:16 UTC
Could you please also provide a core dump from the crashed daemon?

Thanks

Comment 5 Xuesong Zhang 2013-09-03 09:13:07 UTC
Created attachment 793075 [details]
core dump file of libvirtd

Comment 6 Xuesong Zhang 2013-09-03 09:16:18 UTC
hi,

  I have uploaded the core dump file and the related files from the libvirtd crash.
Please let me know if you need any other info.

(In reply to Peter Krempa from comment #4)
> Could you please also provide a core dump from the crashed daemon?
> 
> Thanks

Comment 7 Peter Krempa 2013-09-03 15:29:18 UTC
Patch posted upstream: http://www.redhat.com/archives/libvir-list/2013-September/msg00159.html

Comment 9 Peter Krempa 2013-09-05 08:00:12 UTC
Fixed upstream with two commits:

commit a3d24862df9d717b95fae3951019fa150a9d4e09
Author: Peter Krempa <pkrempa>
Date:   Wed Sep 4 17:32:12 2013 +0200

    conf: Don't deref NULL actual network in virDomainNetGetActualHostdev()
    
    In commit 991270db99690 I've used virDomainNetGetActualHostdev() to get
    the actual hostdev from a network when removing the network from the
    list to avoid leaving the hostdev in the list. I didn't notice that this
    function doesn't check if the actual network is allocated and
    dereferences it. This crashes the daemon when cleaning up a domain
    object in early startup phases when the actual network definition isn't
    allocated. When the actual definition isn't present, the hostdev that
    might correspond to it won't be present anyways so it's safe to return
    NULL.
    
    Thanks to Cole Robinson for noticing this problem.
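
The guard described in this commit message can be illustrated with a minimal sketch in C. It does not reproduce the libvirt source: the types `net_def`, `actual_net`, and `hostdev` and the function `net_get_actual_hostdev()` are hypothetical, simplified stand-ins for the real structures and for virDomainNetGetActualHostdev(). The sketch only shows the idea of returning NULL when the actual network definition has not been allocated, rather than dereferencing it.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; the real libvirt structures differ. */
enum net_type { NET_TYPE_NETWORK, NET_TYPE_HOSTDEV };

struct hostdev {
    int pci_slot;
};

struct actual_net {
    enum net_type type;
    struct hostdev hostdev;
};

struct net_def {
    enum net_type type;
    struct hostdev hostdev;     /* used when type == NET_TYPE_HOSTDEV */
    struct actual_net *actual;  /* NULL until the network has been resolved */
};

/* Return the hostdev backing this interface, or NULL if there is none. */
struct hostdev *net_get_actual_hostdev(struct net_def *net)
{
    if (net->type == NET_TYPE_HOSTDEV)
        return &net->hostdev;

    /* The guard: make sure "actual" is allocated before dereferencing it. */
    if (net->type == NET_TYPE_NETWORK &&
        net->actual &&
        net->actual->type == NET_TYPE_HOSTDEV)
        return &net->actual->hostdev;

    /* No actual definition means no corresponding hostdev can exist. */
    return NULL;
}
```

With this shape, a caller that cleans up a domain object before the actual definition exists gets NULL back and can skip the hostdev removal.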


commit 991270db9969026876c3f5911143dab13ab9050d
Author: Peter Krempa <pkrempa>
Date:   Tue Sep 3 17:25:56 2013 +0200

    conf: Remove the actual hostdev when removing a network
    
    Commit 50348e6edfa reused the code to remove the hostdev portion of a
    network definition on multiple places but forgot to take into account
    that sometimes the "actual" network is passed and in some cases the
    parent of that.
    
    This patch uses the virDomainNetGetActualHostdev() helper to acquire the
    correct pointer all the time while removing the hostdev portion from the
    list.
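
One way to picture why the pointer passed in matters is the following sketch in C. It is an assumption-laden illustration, not libvirt code: `domain_def`, `hostdev`, `MAX_HOSTDEVS`, and `domain_hostdev_remove()` are hypothetical names. It assumes the domain keeps a list of hostdev pointers and that removal matches entries by pointer identity, so a lookup with the wrong pointer (for example one taken from the parent instead of the actual network) finds nothing and leaves a stale entry behind.

```c
#include <stddef.h>
#include <string.h>

#define MAX_HOSTDEVS 8  /* hypothetical fixed capacity for the sketch */

struct hostdev {
    int pci_slot;
};

struct domain_def {
    struct hostdev *hostdevs[MAX_HOSTDEVS]; /* pointers owned elsewhere */
    size_t nhostdevs;
};

/* Remove "target" from the domain's hostdev list.
 * Returns 1 if an entry was removed, 0 if no entry matched. */
int domain_hostdev_remove(struct domain_def *def, const struct hostdev *target)
{
    size_t i;

    if (!target)
        return 0;  /* no hostdev behind this interface, nothing to do */

    for (i = 0; i < def->nhostdevs; i++) {
        if (def->hostdevs[i] == target) {
            /* Close the gap so no stale pointer stays in the list. */
            memmove(&def->hostdevs[i], &def->hostdevs[i + 1],
                    (def->nhostdevs - i - 1) * sizeof(def->hostdevs[0]));
            def->nhostdevs--;
            return 1;
        }
    }
    return 0;
}
```

Under these assumptions, a stale entry that survives a detach would still be in the list when the guest is destroyed, which is consistent with the crash appearing only at the destroy step.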

Comment 10 Xuesong Zhang 2013-09-09 07:06:47 UTC
Verified this bug with build libvirt-1.1.1-4.el7. The bug is fixed; libvirtd no longer crashes.

Steps:
1. prepare one healthy guest.
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 4     a                             running

2. prepare one network with hostdev forward mode 
# virsh net-dumpxml hostnet
<network>
  <name>hostnet</name>
  <uuid>c1fb4ead-21b8-4d69-8ad9-669c55b3dfc7</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x1'/>
    <address type='pci' domain='0x0000' bus='0x11' slot='0x10' function='0x0'/>
  </forward>
</network>


# virsh net-list --all
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes
 hostnet              active     no            yes

3. check the vf in the host
# lspci|grep 82576
0e:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
0e:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
0f:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
0f:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
0f:10.2 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
0f:10.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
10:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
10:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
11:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
11:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
11:10.2 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
11:10.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)

4. prepare one xml like the following one:
# cat vfpool.xml 
<interface type='network'>
   <source network='hostnet'/>
</interface>

5. hot-plug the vf to the guest.
# virsh attach-device a vfpool.xml 
Device attached successfully

6. make sure the vf is attached.
# virsh dumpxml a
.......
<interface type='network'>
      <mac address='52:54:00:67:c2:3f'/>
      <source network='hostnet'/>
      <model type='rtl8139'/>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </interface>
......

7. create a new vf xml for the detach; its content is the same as the interface element shown by dumpxml.
# cat vf-detach.xml 
<interface type='network'>
      <mac address='52:54:00:67:c2:3f'/>
      <source network='hostnet'/>
      <model type='rtl8139'/>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </interface>

8. detach the vf from guest.
# virsh detach-device a vf-detach.xml 
Device detached successfully

9. destroy the guest.
# virsh destroy a
Domain a destroyed

Comment 11 Ludek Smid 2014-06-13 11:07:25 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.