Bug 971325

Summary: libvirtd crashes when starting a guest that uses a hostdev network specifying a nonexistent PF
Product: Red Hat Enterprise Linux 7 Reporter: hongming <honzhang>
Component: libvirt Assignee: Laine Stump <laine>
Status: CLOSED CURRENTRELEASE QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 7.0 CC: acathrow, dyuan, gsun, jmiao, mzhan, xuzhang
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-1.1.0-1.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 980339 (view as bug list) Environment:
Last Closed: 2014-06-13 09:28:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
libvirt debug log none

Description hongming 2013-06-06 09:32:21 UTC
Description of problem:
libvirtd crashes when starting a guest whose interface uses an inactive hostdev network with a wrong pf value in its definition.


Version-Release number of selected component (if applicable):
libvirt-1.0.6-1.el7.x86_64
qemu-kvm-1.5.0-2.el7.x86_64
3.9.0-0.55.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
# virsh nodedev-list --tree

computer
  |
......
  +- pci_0000_00_1c_6
  |   |
  |   +- pci_0000_07_00_0
  |       |
  |       +- pci_0000_08_02_0
  |       |   |
  |       |   +- pci_0000_09_00_0
  |       |   |   |
  |       |   |   +- net_p1p1_00_1b_21_55_b3_b8
  |       |   |    
  |       |   +- pci_0000_09_00_1
  |       |   |   |
  |       |   |   +- net_eth3_00_1b_21_55_b3_b9  <======== right device name
  |       |   |    
  |       |   +- pci_0000_0a_10_0
  |       |   |   |
  |       |   |   +- net_p1p1_0_5a_12_12_ed_a5_1b
......

# cat passthrough.xml
<network>
   <name>passthrough</name>
   <forward mode='hostdev' managed='yes'>
     <pf dev='eth1'/>                            <======== wrong device name 
   </forward>
</network>

# virsh net-define passthrough.xml
Network passthrough defined from passthrough.xml


# virsh net-list --all
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes
 passthrough          inactive   no            yes


# virsh net-dumpxml passthrough
<network>
  <name>passthrough</name>
  <uuid>3be5c577-c1bb-4a6d-8641-adda7b2b9b16</uuid>
  <forward mode='hostdev' managed='yes'>
    <pf dev='eth1'/>
  </forward>
</network>


Add the passthrough interface to the guest:
# virsh edit rhel7 
Domain rhel7 XML configuration edited.
# virsh dumpxml rhel7
<domain type='kvm'>
  <name>rhel7</name>
  ......

    <interface type='network'>
      <source network='passthrough'/>
    </interface>

# virsh start rhel7
error: Failed to start domain rhel7
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor



Actual results:
libvirtd crashes.

Expected results:
libvirtd reports an error and keeps running.
Additional info:
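
For anyone scripting a check against this report: the `<pf dev='...'/>` names in a network definition can be extracted and compared with the `net_*` entries shown by `virsh nodedev-list --tree`. The following is a minimal sketch, not part of libvirt; the function name `hostdev_pf_names` is hypothetical, and it only looks at `<forward mode='hostdev'>` elements directly under `<network>`.

```python
import xml.etree.ElementTree as ET


def hostdev_pf_names(network_xml):
    """Return the dev names of <pf> elements in hostdev-mode <forward> blocks.

    Hypothetical helper for illustration; it does not validate the XML
    against the libvirt network schema.
    """
    root = ET.fromstring(network_xml)
    names = []
    for forward in root.findall("forward"):
        if forward.get("mode") != "hostdev":
            continue  # only hostdev networks hand out VFs of a PF
        for pf in forward.findall("pf"):
            dev = pf.get("dev")
            if dev:
                names.append(dev)
    return names


if __name__ == "__main__":
    # Same definition as the net-dumpxml output in this report.
    example = """
    <network>
      <name>passthrough</name>
      <forward mode='hostdev' managed='yes'>
        <pf dev='eth1'/>
      </forward>
    </network>
    """
    print(hostdev_pf_names(example))  # prints ['eth1']
```

In the reproduction above, the extracted name `eth1` does not appear in the node device tree, which is the condition that triggered the crash.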

Comment 1 hongming 2013-06-06 09:42:56 UTC
Created attachment 757587 [details]
libvirt debug log

Note: the device name in this log differs from the device name in the bug description.

Comment 3 hongming 2013-06-06 09:55:33 UTC
(In reply to hongming from comment #1)
> Created attachment 757587 [details]
> libvirt debug log
> 
> the device name in log is different from the device name in the bug
> description.

I mean that the log and the description come from two different test runs.

Comment 4 Laine Stump 2013-07-01 04:04:15 UTC
I have reproduced this crash and posted a fix upstream:

https://www.redhat.com/archives/libvir-list/2013-July/msg00002.html

A note for testing this fix: libvirtd only crashed when a *nonexistent* interface was specified. Specifying an existing interface that has no SRIOV capabilities did not crash; that is a separate failure path, and it should also be covered by the regression tests to prevent future breakage.
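
The two failure paths mentioned here can be told apart on a test host before defining the network. The following is a minimal sketch, assuming the usual sysfs layout in which a PF's `/sys/class/net/<dev>/device` directory contains `virtfnN` links for its virtual functions. The function name `classify_pf` and its return labels are hypothetical, and this is not how libvirt itself reports the result.

```python
import os


def classify_pf(dev, sysfs_net="/sys/class/net"):
    """Classify a candidate PF name as 'missing', 'no-vfs', or 'has-vfs'.

    'missing' means the device directory could not be listed, which is the
    case that used to crash libvirtd. Purely virtual interfaces without a
    'device' link also end up here.
    """
    device_dir = os.path.join(sysfs_net, dev, "device")
    try:
        entries = os.listdir(device_dir)
    except (FileNotFoundError, NotADirectoryError):
        return "missing"
    # The kernel exposes each virtual function as a virtfnN entry.
    if any(name.startswith("virtfn") for name in entries):
        return "has-vfs"
    return "no-vfs"


if __name__ == "__main__":
    for dev in ("eth0", "eth99"):
        print(dev, classify_pf(dev))
```

The `sysfs_net` parameter exists only so the check can be pointed at a fake directory tree in a test, without needing SRIOV hardware.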

Comment 5 Laine Stump 2013-07-01 04:31:58 UTC
The fix was pushed upstream and will be in libvirt-1.1.0:

commit 2c2525ab6a6f0ad5d75a6c60711e2e28cb1cebe9
Author: Laine Stump <laine>
Date:   Sun Jun 30 23:52:43 2013 -0400

    pci: initialize virtual_functions array pointer to avoid segfault
    
    This fixes https://bugzilla.redhat.com/show_bug.cgi?id=971325
    
    The problem was that if virPCIGetVirtualFunctions was given the name
    of a non-existent interface, it would return to its caller without
    initializing the pointer to the array of virtual functions to NULL,
    and the caller (virNetDevGetVirtualFunctions) would try to VIR_FREE()
    the invalid pointer.
    
    The final error message before the crash would be:
    
     virPCIGetVirtualFunctions:2088 :
      Failed to open dir '/sys/class/net/eth2/device':
      No such file or directory
    
    In this patch I move the initialization in virPCIGetVirtualFunctions()
    to the beginning of the function, and also do an explicit
    initialization in virNetDevGetVirtualFunctions, just in case someone
    in the future adds code into that function prior to the call to
    virPCIGetVirtualFunctions.

Comment 6 Jincheng Miao 2013-07-03 01:30:50 UTC
I tested this bug with libvirt-1.1.0-1.el7.x86_64, and the error is now handled correctly, so I am changing the status to VERIFIED.

1. find out the SRIOV PF and a normal interface
# virsh nodedev-list --tree
  ...
  +- pci_0000_00_01_0
  |   |
  |   +- pci_0000_03_00_0
  |   +- pci_0000_03_00_1          <---- PF
  |   |   |
  |   |   +- net_ens1f1_00_1b_21_39_8b_19
  |   |     
  |   +- pci_0000_03_10_1
  |   |   |
  |   |   +- net_enp3s16f1_ce_da_81_95_c3_2e
  |   |     
  |   +- pci_0000_03_10_3
  |   |   |
  |   |   +- net_enp3s16f3_3e_0c_a6_c0_5e_a7
  |   |     
  ...
  |   +- pci_0000_01_00_0           <---- a normal interface
  |       |
  |       +- net_eth0_d8_d3_85_7e_61_9b
  ...

# lspci | grep 82576
03:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
03:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
03:10.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
03:10.5 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
03:10.7 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
03:11.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
03:11.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
03:11.5 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)

2. assign a non-existent interface
# cat passthrough.xml
<network>
   <name>passthrough</name>
   <forward mode='hostdev' managed='yes'>
     <pf dev='eth99'/>   <!-- non-existent interface -->
   </forward>
</network>

# virsh net-define passthrough.xml
Network passthrough defined from passthrough.xml

# virsh net-list --all
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes
 passthrough          inactive   no            yes


# virsh edit a 
Domain a XML configuration edited.

# virsh dumpxml a
<domain type='kvm'>
  <name>a</name>
  ......

    <interface type='network'>
      <source network='passthrough'/>
    </interface>

# virsh start a
error: Failed to start domain a
error: internal error Could not get Virtual functions on eth99

# virsh start a
error: Failed to start domain a
error: internal error Could not get Virtual functions on eth99

3. assign a normal interface (e.g. eth0)
# cat passthrough.xml
<network>
   <name>passthrough</name>
   <forward mode='hostdev' managed='yes'>
     <pf dev='eth0'/>   <!-- normal interface -->
   </forward>
</network>

# virsh net-define passthrough.xml
Network passthrough defined from passthrough.xml

# virsh start a
error: Failed to start domain a
error: internal error No Vf's present on SRIOV PF eth0

# virsh start a
error: Failed to start domain a
error: internal error No Vf's present on SRIOV PF eth0
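
The outcomes seen in this report can be distinguished from the text that `virsh start` prints. The following is a minimal sketch of such a check for a regression script. The function name `classify_start_error` and its labels are hypothetical, the matched substrings are copied from the outputs in this report, and the wording may differ in other libvirt versions.

```python
def classify_start_error(stderr_text):
    """Map 'virsh start' error text to a coarse outcome label.

    Labels are illustrative only; the substrings come from this bug report.
    """
    if "End of file while reading data" in stderr_text:
        # The connection to libvirtd dropped, as seen when the daemon crashed.
        return "daemon-crash"
    if "Could not get Virtual functions on" in stderr_text:
        return "pf-lookup-failed"
    if "No Vf's present on SRIOV PF" in stderr_text:
        return "no-vfs"
    return "unknown"


if __name__ == "__main__":
    sample = (
        "error: Failed to start domain a\n"
        "error: internal error Could not get Virtual functions on eth99\n"
    )
    print(classify_start_error(sample))  # prints pf-lookup-failed
```

A regression test would treat `daemon-crash` as a failure and either of the other two labels as the daemon handling the bad PF gracefully.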

Comment 7 Xuesong Zhang 2014-03-17 07:49:27 UTC
(In reply to Jincheng Miao from comment #6)
> I test this bug on libvirt-1.1.0-1.el7.x86_64, and it can be handled
> correctly.
> So change the status to verified
> 
> 1. find out the SRIOV PF and a normal interface
> # virsh nodedev-list --tree
>   ...
>   +- pci_0000_00_01_0
>   |   |
>   |   +- pci_0000_03_00_0
>   |   +- pci_0000_03_00_1          <---- PF
>   |   |   |
>   |   |   +- net_ens1f1_00_1b_21_39_8b_19
>   |   |     
>   |   +- pci_0000_03_10_1
>   |   |   |
>   |   |   +- net_enp3s16f1_ce_da_81_95_c3_2e
>   |   |     
>   |   +- pci_0000_03_10_3
>   |   |   |
>   |   |   +- net_enp3s16f3_3e_0c_a6_c0_5e_a7
>   |   |     
>   ...
>   |   +- pci_0000_01_00_0           <---- a normal interface
>   |       |
>   |       +- net_eth0_d8_d3_85_7e_61_9b
>   ...
> 
> # lspci | grep 82576
> 03:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network
> Connection (rev 01)
> 03:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network
> Connection (rev 01)
> 03:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev
> 01)
> 03:10.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev
> 01)
> 03:10.5 Ethernet controller: Intel Corporation 82576 Virtual Function (rev
> 01)
> 03:10.7 Ethernet controller: Intel Corporation 82576 Virtual Function (rev
> 01)
> 03:11.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev
> 01)
> 03:11.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev
> 01)
> 03:11.5 Ethernet controller: Intel Corporation 82576 Virtual Function (rev
> 01)
> 
> 2. assign a non-existent interface
> # cat passthrough.xml
> <network>
>    <name>passthrough</name>
>    <forward mode='hostdev' managed='yes'>
>      <pf dev='eth99'/>   <!-- non-existent interface -->
>    </forward>
> </network>
> 
> #virsh net-define passthrough.xml
> Network passthrough defined from passthrough.xml
> 
> # virsh net-list --all
>  Name                 State      Autostart     Persistent
> ----------------------------------------------------------
>  default              active     yes           yes
>  passthrough          inactive   no            yes
> 
> 
> # virsh edit a 
> Domain a XML configuration edited.
> 
> # virsh dumpxml a
> <domain type='kvm'>
>   <name>a</name>
>   ......
> 
>     <interface type='network'>
>       <source network='passthrough'/>
>     </interface>
> 
> # virsh start a
> error: Failed to start domain a
> error: internal error Could not get Virtual functions on eth99
> 
> # virsh start a
> error: Failed to start domain a
> error: internal error Could not get Virtual functions on eth99
> 

Tested with the latest build, libvirt-1.1.1-27.el7.x86_64; the behavior differs slightly here. When a non-existent interface is specified in the network, starting the guest reports the following error:

# virsh start a
error: Failed to start domain a
error: internal error: No Vf's present on SRIOV PF eth99

> 3. assign a normal interface (eg. eth0)
> # cat passthrough.xml
> <network>
>    <name>passthrough</name>
>    <forward mode='hostdev' managed='yes'>
>      <pf dev='eth0'/>   <!-- normal interface -->
>    </forward>
> </network>
> 
> #virsh net-define passthrough.xml
> Network passthrough defined from passthrough.xml
> 
> # virsh start a
> error: Failed to start domain a
> error: internal error No Vf's present on SRIOV PF eth0
> 
> # virsh start a
> error: Failed to start domain a
> error: internal error No Vf's present on SRIOV PF eth0

Comment 8 Ludek Smid 2014-06-13 09:28:21 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.