Bug 980339 - libvirtd crashes when starting a guest that uses a hostdev network specifying a nonexistent PF
Summary: libvirtd crashes when starting a guest that uses a hostdev network specifying...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Laine Stump
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-07-02 06:28 UTC by Laine Stump
Modified: 2013-11-21 09:04 UTC (History)

Fixed In Version: libvirt-0.10.2-20.el6
Doc Type: Bug Fix
Doc Text:
Cause: if an incorrect device name was given in the <pf> element of a libvirt network definition, libvirt would crash when a guest attempted to create an interface using that network. Fix: libvirt now validates the pf device name to verify that it exists and that it is an sriov-capable network device. Result: libvirt no longer crashes when a network with an incorrect <pf> is referenced. Instead it logs an appropriate error message and prevents the operation.
Clone Of: 971325
Environment:
Last Closed: 2013-11-21 09:04:53 UTC
Target Upstream Version:
Embargoed:




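Links
 System                  ID              Private  Priority  Status        Summary                                 Last Updated
-------------------------------------------------------------------------------------------------------------------------------------
 Red Hat Product Errata  RHBA-2013:1581  0        normal    SHIPPED_LIVE  libvirt bug fix and enhancement update  2013-11-21 01:11:35 UTC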

Description Laine Stump 2013-07-02 06:28:06 UTC
This bug also exists in RHEL6.4 and is a trivial backport.

+++ This bug was initially created as a clone of Bug #971325 +++

Description of problem:
libvirtd crashes when starting a guest whose interface uses an inactive network with a wrong pf value


Version-Release number of selected component (if applicable):
libvirt-1.0.6-1.el7.x86_64
qemu-kvm-1.5.0-2.el7.x86_64
3.9.0-0.55.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
# virsh nodedev-list --tree

computer
  |
......
  +- pci_0000_00_1c_6
  |   |
  |   +- pci_0000_07_00_0
  |       |
  |       +- pci_0000_08_02_0
  |       |   |
  |       |   +- pci_0000_09_00_0
  |       |   |   |
  |       |   |   +- net_p1p1_00_1b_21_55_b3_b8
  |       |   |    
  |       |   +- pci_0000_09_00_1
  |       |   |   |
  |       |   |   +- net_eth3_00_1b_21_55_b3_b9  <======== right device name
  |       |   |    
  |       |   +- pci_0000_0a_10_0
  |       |   |   |
  |       |   |   +- net_p1p1_0_5a_12_12_ed_a5_1b
......

# cat passthrough.xml
<network>
   <name>passthrough</name>
   <forward mode='hostdev' managed='yes'>
     <pf dev='eth1'/>                            <======== wrong device name 
   </forward>
</network>

# virsh net-define passthrough.xml
Network passthrough defined from passthrough.xml


# virsh net-list --all
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes
 passthrough          inactive   no            yes


# virsh net-dumpxml passthrough
<network>
  <name>passthrough</name>
  <uuid>3be5c577-c1bb-4a6d-8641-adda7b2b9b16</uuid>
  <forward mode='hostdev' managed='yes'>
    <pf dev='eth1'/>
  </forward>
</network>


Add the passthrough interface to the guest:
# virsh edit rhel7 
Domain rhel7 XML configuration edited.
# virsh dumpxml rhel7
<domain type='kvm'>
  <name>rhel7</name>
  ......

    <interface type='network'>
      <source network='passthrough'/>
    </interface>

# virsh start rhel7
error: Failed to start domain rhel7
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor



Actual result:
libvirtd crashes

Expected result:
An error is thrown
Additional info:

--- Additional comment from hongming on 2013-06-06 05:42:56 EDT ---

the device name in log is different from the device name in the bug description.

--- Additional comment from hongming on 2013-06-06 05:55:33 EDT ---

(In reply to hongming from comment #1)
> Created attachment 757587 [details]
> libvirt debug log
> 
> the device name in log is different from the device name in the bug
> description.

I mean they are two different tests.

--- Additional comment from Laine Stump on 2013-07-01 00:04:15 EDT ---

I have reproduced this crash and posted a fix upstream:

https://www.redhat.com/archives/libvir-list/2013-July/msg00002.html

For reference when testing for this fix - note that it would only crash if a *nonexistent* interface was specified (it wasn't enough to specify an interface that had no SRIOV capabilities; that is yet another failure path that should be in the regression tests to prevent future breakage).

--- Additional comment from Laine Stump on 2013-07-01 00:31:58 EDT ---

The fix was pushed upstream and will be in libvirt-1.1.0:

commit 2c2525ab6a6f0ad5d75a6c60711e2e28cb1cebe9
Author: Laine Stump <laine>
Date:   Sun Jun 30 23:52:43 2013 -0400

    pci: initialize virtual_functions array pointer to avoid segfault
    
    This fixes https://bugzilla.redhat.com/show_bug.cgi?id=971325
    
    The problem was that if virPCIGetVirtualFunctions was given the name
    of a non-existent interface, it would return to its caller without
    initializing the pointer to the array of virtual functions to NULL,
    and the caller (virNetDevGetVirtualFunctions) would try to VIR_FREE()
    the invalid pointer.
    
    The final error message before the crash would be:
    
     virPCIGetVirtualFunctions:2088 :
      Failed to open dir '/sys/class/net/eth2/device':
      No such file or directory
    
    In this patch I move the initialization in virPCIGetVirtualFunctions()
    to the begining of the function, and also do an explicit
    initialization in virNetDevGetVirtualFunctions, just in case someone
    in the future adds code into that function prior to the call to
    virPCIGetVirtualFunctions.
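
For anyone reading along, the pattern described in the commit message above can be sketched roughly as follows. This is only an illustration, not the libvirt code: the type vf_addr and the functions get_virtual_functions and count_virtual_functions are hypothetical names invented for this sketch, and plain free() stands in for VIR_FREE(). The callee sets its out-parameters to a safe state before anything can fail, and the caller also initializes its own pointer, so freeing on the error path is harmless.

```c
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VF address; not libvirt's real type. */
typedef struct { unsigned int slot; } vf_addr;

/*
 * Sketch of the callee. The out-parameters are put into a safe state
 * BEFORE anything can fail, so every early return leaves them defined.
 */
static int get_virtual_functions(const char *sysfs_path,
                                 vf_addr ***vfs, size_t *nvfs)
{
    DIR *dir;

    *vfs = NULL;   /* initialize first, ahead of any failure path */
    *nvfs = 0;

    dir = opendir(sysfs_path);
    if (dir == NULL)
        return -1; /* early return: *vfs is NULL, not garbage */

    /* A real implementation would scan the directory for VF entries. */
    closedir(dir);
    return 0;
}

/* Sketch of the caller, which frees the array on every path. */
static int count_virtual_functions(const char *sysfs_path, size_t *count)
{
    vf_addr **vfs = NULL; /* defensive initialization in the caller too */
    size_t nvfs = 0;
    int ret = get_virtual_functions(sysfs_path, &vfs, &nvfs);

    if (ret == 0)
        *count = nvfs;
    free(vfs); /* free(NULL) is a no-op, so the failure path is safe */
    return ret;
}
```

With both initializations in place, passing a path that does not exist makes count_virtual_functions return -1 without touching an uninitialized pointer.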

Comment 5 Jincheng Miao 2013-07-09 09:39:45 UTC
The patch libvirt-pci-initialize-virtual_functions-array-pointer-to-avoid-segfault.patch does not completely fix this bug: it does not initialize pciConfigAddr to NULL.

The bug still exists in libvirt-0.10.2-19.el6, so it cannot be verified.

My reproduction steps:
# virsh nodedev-list --tree
computer
  |
...        
  +- pci_0000_00_16_0
  +- pci_0000_00_16_3
  +- pci_0000_00_19_0
  |   |
  |   +- net_eth0_10_60_4b_78_2a_74  <== this is my network interface, named eth0
  |     
  +- pci_0000_00_1a_0
  |   |
...

# cat passthrough.xml
<network>
   <name>passthrough</name>
   <forward mode='hostdev' managed='yes'>
     <pf dev='eth99'/>                            <======== wrong interface 
   </forward>
</network>

# virsh net-define passthrough.xml
Network passthrough defined from passthrough.xml


# virsh net-list --all
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes
 passthrough          inactive   no            yes


# virsh edit a
Add the following to domain a's XML:
    <interface type='network'>
      <source network='passthrough'/>
    </interface>

# virsh start a
error: Failed to start domain a
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor

Comment 6 Laine Stump 2013-07-09 18:03:30 UTC
It turns out there was an additional problem that had already been silently fixed upstream several months prior to the original bug (Bug 971325) being filed.

commit ac5cb26a32300d03517692cd15a604dd0517fbd6
Author: John Ferlan <jferlan>
Date:   Tue Jan 22 09:15:41 2013 -0500

    virnetdev: Need to initialize 'pciConfigAddr'
    
    It was possible to call VIR_FREE in cleanup prior to initialization

Comment 7 Laine Stump 2013-07-09 18:04:40 UTC
I backported and posted this additional patch to rhvirt-patches.

  http://post-office.corp.redhat.com/archives/rhvirt-patches/2013-July/msg00176.html

Additionally, I tested and it does eliminate this slightly different crash.

Comment 9 Jincheng Miao 2013-07-16 03:37:12 UTC
This bug fix is verified. The verification steps are as follows:

# rpm -q libvirt
libvirt-0.10.2-20.el6.x86_64

# virsh nodedev-list --tree
computer
  |
...        
  +- pci_0000_00_16_0
  +- pci_0000_00_16_3
  +- pci_0000_00_19_0
  |   |
  |   +- net_eth0_10_60_4b_78_2a_74  <== this is my network interface, named eth0
  |     
  +- pci_0000_00_1a_0
  |   |
...

# cat passthrough.xml
<network>
   <name>passthrough</name>
   <forward mode='hostdev' managed='yes'>
     <pf dev='eth99'/>                            <======== wrong interface 
   </forward>
</network>

# virsh net-define passthrough.xml
Network passthrough defined from passthrough.xml


# virsh net-list --all
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes
 passthrough          inactive   no            yes


# virsh edit r6
Add the following to domain r6's XML:
    <interface type='network'>
      <source network='passthrough'/>
    </interface>

# virsh start r6
error: Failed to start domain r6
error: internal error Could not get Virtual functions on eth99

# service libvirtd status
libvirtd (pid  3408) is running...

Changing the status to VERIFIED.

Comment 10 Jincheng Miao 2013-07-29 08:25:44 UTC
In addition, for a network card that has no SRIOV capability, the verification steps for this fix are:

# vim network.xml
<network>
   <name>passthrough</name>
   <forward mode='hostdev' managed='yes'>
     <pf dev='eth1'/>                            <======== no SRIOV interface 
   </forward>
</network>

# virsh net-define network.xml 
Network passthrough defined from network.xml

# virsh net-list --all
Name                 State      Autostart     Persistent
--------------------------------------------------
default              active     yes           yes
passthrough          inactive   no            yes

# virsh edit r6m
Domain r6m XML configuration edited.
Add the following to domain r6m's XML:
    <interface type='network'>
      <source network='passthrough'/>
    </interface>

# virsh start r6m
error: Failed to start domain r6m
error: internal error No Vf's present on SRIOV PF eth1
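
The two failure paths verified in this bug (a nonexistent interface, and an existing interface with no VFs) can be told apart with a check along the following lines. This is a rough sketch under stated assumptions, not what libvirt does internally: check_pf and enum pf_status are hypothetical names, the sysfs root is passed in as a parameter so the function can be exercised against any directory, and the presence of a virtfn0 entry under the device directory is used as a simple indicator that at least one VF exists.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical result codes for this sketch. */
enum pf_status { PF_OK, PF_NO_SUCH_DEVICE, PF_NO_VFS };

/*
 * net_root is normally "/sys/class/net"; it is a parameter here only
 * to keep the sketch testable.
 */
static enum pf_status check_pf(const char *net_root, const char *ifname)
{
    char path[512];
    struct stat st;
    int n;

    /* Step 1: does the interface have a device directory at all? */
    n = snprintf(path, sizeof(path), "%s/%s/device", net_root, ifname);
    if (n < 0 || (size_t)n >= sizeof(path) || stat(path, &st) != 0)
        return PF_NO_SUCH_DEVICE;

    /* Step 2: is at least one virtual function exposed (virtfn0)? */
    n = snprintf(path, sizeof(path), "%s/%s/device/virtfn0",
                 net_root, ifname);
    if (n < 0 || (size_t)n >= sizeof(path) || stat(path, &st) != 0)
        return PF_NO_VFS;

    return PF_OK;
}
```

Calling check_pf("/sys/class/net", "eth99") on a host without an eth99 interface would correspond to the first case, and an interface without VFs to the second.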

Comment 12 errata-xmlrpc 2013-11-21 09:04:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1581.html

