Bug 957416
| Summary: | Libvirt daemon crash while attaching VF interface or dumping PF interface using nodedev-dumpxml. | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Hu Jianwei <jiahu> |
| Component: | libvirt | Assignee: | Laine Stump <laine> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 7.0 | CC: | acathrow, ajia, bili, cwei, dyuan, honzhang, laine, mzhan |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | libvirt-1.0.5-1.el7 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-06-13 13:06:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
This device should have no PM reset capability; for details, see the following debug information:

Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: internal error Unable to reset PCI device 0000:00:07.0: no FLR, PM reset or bus reset available
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: Caught Segmentation violation dumping internal log buffer:
<ignore/>
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.533+0000: 6364: debug : virPCIDeviceConfigOpen:207 : 8086 10ca 0000:03:10.1: opened /sys/bus/pci/devices/0000:03:10.1/config
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.533+0000: 6364: debug : virPCIDeviceFindCapabilityOffset:395 : 8086 10ca 0000:03:10.1: found cap 0x10 at 0xa0
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.533+0000: 6364: debug : virPCIDeviceFindCapabilityOffset:402 : 8086 10ca 0000:03:10.1: failed to find cap 0x01
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.533+0000: 6364: debug : virPCIDeviceDetectFunctionLevelReset:453 : 8086 10ca 0000:03:10.1: detected PCIe FLR capability
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.533+0000: 6364: debug : virPCIDeviceDetectPowerManagementReset:513 : 8086 10ca 0000:03:10.1: no PM reset capability found
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.533+0000: 6364: debug : virFileClose:72 : Closed fd 22
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.533+0000: 6364: debug : virPCIDeviceConfigOpen:207 : 8086 340e 0000:00:07.0: opened /sys/bus/pci/devices/0000:00:07.0/config
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.533+0000: 6364: debug : virPCIDeviceFindCapabilityOffset:395 : 8086 340e 0000:00:07.0: found cap 0x10 at 0x90
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.534+0000: 6364: debug : virPCIDeviceFindCapabilityOffset:395 : 8086 340e 0000:00:07.0: found cap 0x01 at 0xe0
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.534+0000: 6364: debug : virPCIDeviceFindCapabilityOffset:402 : 8086 340e 0000:00:07.0: failed to find cap 0x13
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.534+0000: 6364: debug : virPCIDeviceDetectFunctionLevelReset:490 : 8086 340e 0000:00:07.0: no FLR capability found
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.534+0000: 6364: debug : virPCIDeviceDetectPowerManagementReset:513 : 8086 340e 0000:00:07.0: no PM reset capability found
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.534+0000: 6364: error : virPCIDeviceReset:829 : internal error Unable to reset PCI device 0000:00:07.0: no FLR, PM reset or bus reset available
Apr 28 03:19:27 dhcp-66-72-126 libvirtd[6362]: 2013-04-28 07:19:27.534+0000: 6364: debug : virFileClose:72 : Closed fd 22
<ignore/>

I will try to fix this NULL pointer dereference issue.
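The reset probing visible in the log follows a fixed order: libvirt looks for FLR, then a PM reset capability, then falls back to a bus reset; if none is available it raises the "Unable to reset PCI device" error seen above. A toy sketch of that decision order (the function name and boolean inputs are illustrative only, not libvirt's actual code):

```python
def choose_reset_method(has_flr, has_pm_reset, bus_reset_possible):
    """Mirror the probe order seen in the log: FLR, then PM reset, then bus reset."""
    if has_flr:
        return "flr"
    if has_pm_reset:
        return "pm"
    if bus_reset_possible:
        return "bus"
    raise RuntimeError(
        "Unable to reset PCI device: no FLR, PM reset or bus reset available")

# The VF 0000:03:10.1 above advertises FLR, so it resets fine:
print(choose_reset_method(True, False, False))  # → flr

# The PF 0000:00:07.0 has neither FLR nor PM reset, and no bus reset is
# possible, so the probe ends in the error that precedes the crash:
try:
    choose_reset_method(False, False, False)
except RuntimeError as err:
    print(err)
```

Note that this error by itself is expected behavior for such a device; the bug is that libvirtd then segfaults instead of merely reporting it.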
(gdb) bt
#0 virPCIGetVirtualFunctionIndex (pf_sysfs_device_link=0x7fc04400f470 "/sys/bus/pci/devices/0000:03:00.1", vf_sysfs_device_link=<optimized out>, vf_index=vf_index@entry=0x7fc06897b8f4)
at util/virpci.c:2107
#1 0x00007fc0785dcacf in virPCIGetVirtualFunctionInfo (vf_sysfs_device_path=<optimized out>, pfname=pfname@entry=0x7fc06897b8f8, vf_index=vf_index@entry=0x7fc06897b8f4) at util/virpci.c:2217
#2 0x00007fc062672ed0 in qemuDomainHostdevNetDevice (hostdev=hostdev@entry=0x7fc05c345048, linkdev=linkdev@entry=0x7fc06897b8f8, vf=vf@entry=0x7fc06897b8f4) at qemu/qemu_hostdev.c:257
#3 0x00007fc0626735a1 in qemuDomainHostdevNetConfigRestore (hostdev=0x7fc05c345048, stateDir=0x7fc05c012250 "/var/run/libvirt/qemu") at qemu/qemu_hostdev.c:400
#4 0x00007fc0626746ba in qemuDomainReAttachHostdevDevices (driver=driver@entry=0x7fc05c104860, name=<optimized out>, hostdevs=0x7fc05c344fa0, nhostdevs=2) at qemu/qemu_hostdev.c:922
#5 0x00007fc0626748c3 in qemuDomainReAttachHostDevices (driver=0x7fc05c104860, def=0x7fc05c332160) at qemu/qemu_hostdev.c:1018
#6 0x00007fc0626842fc in qemuProcessStop (driver=driver@entry=0x7fc05c104860, vm=vm@entry=0x7fc05c335e70, reason=reason@entry=VIR_DOMAIN_SHUTOFF_FAILED, flags=flags@entry=2) at qemu/qemu_process.c:4051
#7 0x00007fc0626860c6 in qemuProcessStart (conn=conn@entry=0x7fc058000ad0, driver=driver@entry=0x7fc05c104860, vm=vm@entry=0x7fc05c335e70, migrateFrom=migrateFrom@entry=0x0, stdin_fd=stdin_fd@entry=-1,
stdin_path=stdin_path@entry=0x0, snapshot=snapshot@entry=0x0, vmop=vmop@entry=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=<optimized out>, flags@entry=1) at qemu/qemu_process.c:3859
#8 0x00007fc0626ccae6 in qemuDomainObjStart (conn=0x7fc058000ad0, driver=driver@entry=0x7fc05c104860, vm=vm@entry=0x7fc05c335e70, flags=flags@entry=0) at qemu/qemu_driver.c:5458
#9 0x00007fc0626cd09a in qemuDomainStartWithFlags (dom=0x7fc0440008c0, flags=0) at qemu/qemu_driver.c:5514
#10 0x00007fc07865de07 in virDomainCreate (domain=domain@entry=0x7fc0440008c0) at libvirt.c:8450
#11 0x00007fc0790440d7 in remoteDispatchDomainCreate (server=<optimized out>, msg=<optimized out>, args=<optimized out>, rerr=0x7fc06897cc90, client=0x7fc0796bbcd0) at remote_dispatch.h:1066
#12 remoteDispatchDomainCreateHelper (server=<optimized out>, client=0x7fc0796bbcd0, msg=<optimized out>, rerr=0x7fc06897cc90, args=<optimized out>, ret=<optimized out>) at remote_dispatch.h:1044
#13 0x00007fc0786b1527 in virNetServerProgramDispatchCall (msg=0x7fc0796bae90, client=0x7fc0796bbcd0, server=0x7fc0796b0400, prog=0x7fc0796b66e0) at rpc/virnetserverprogram.c:439
#14 virNetServerProgramDispatch (prog=0x7fc0796b66e0, server=server@entry=0x7fc0796b0400, client=0x7fc0796bbcd0, msg=0x7fc0796bae90) at rpc/virnetserverprogram.c:305
#15 0x00007fc0786ac738 in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x7fc0796b0400) at rpc/virnetserver.c:162
#16 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x7fc0796b0400) at rpc/virnetserver.c:183
#17 0x00007fc0785e5235 in virThreadPoolWorker (opaque=opaque@entry=0x7fc07968af70) at util/virthreadpool.c:144
#18 0x00007fc0785e4cc1 in virThreadHelper (data=<optimized out>) at util/virthreadpthread.c:161
#19 0x00007fc075e98c53 in start_thread () from /lib64/libpthread.so.0
#20 0x00007fc0757beecd in clone () from /lib64/libc.so.6
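The backtrace bottoms out in virPCIGetVirtualFunctionIndex at util/virpci.c:2107. The general failure mode — a lookup that can fail, whose output the caller uses without checking — can be sketched abstractly. The names below (`find_vf_index`, `pf_vfs`) are hypothetical illustrations, not libvirt's actual structures; in the C code the unchecked result is a pointer and the consequence is a segfault rather than an exception:

```python
def find_vf_index(pf_vfs, vf_addr):
    """Return the index of vf_addr in the PF's VF list, or None if absent.

    A caller that assumes success and uses the result directly is the
    analog of the NULL-pointer dereference in the backtrace above.
    """
    for i, addr in enumerate(pf_vfs):
        if addr == vf_addr:
            return i
    return None  # must be checked before use

pf_vfs = ["0000:03:10.0", "0000:03:10.1"]

idx = find_vf_index(pf_vfs, "0000:03:10.1")
assert idx is not None  # the defensive check a caller needs
print(idx)  # → 1

# A VF address absent from this PF must be reported, not dereferenced:
print(find_vf_index(pf_vfs, "0000:06:00.0"))  # → None
```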
Patch on upstream: http://www.redhat.com/archives/libvir-list/2013-April/msg01995.html

(In reply to comment #4)
> Patch on upstream:
> http://www.redhat.com/archives/libvir-list/2013-April/msg01995.html

That patch is incorrect (reasons in my reply to the email). The problem was already found and solved upstream post-1.0.4. Here is the correct patch:

http://libvirt.org/git/?p=libvirt.git;a=commit;h=9579b6bc209b46a0f079b21455b598c817925b48

Can reproduce the bug in libvirt-1.0.4-1.1.el7.x86_64, but cannot reproduce it in libvirt-1.0.5-1.el7.x86_64.
1. Create a domain with the following VF xml.
<interface type='hostdev' managed='yes'>
<mac address='52:54:00:84:33:71'/>
<source>
<address type='pci' domain='0x0000' bus='0x0a' slot='0x10' function='0x1'/>
</source>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
2. Start the rhel7 domain.
[root@SRIOV2 ~]# virsh list --all
Id Name State
----------------------------------------------------
14 rhel7 running
[root@SRIOV2 ~]# virsh dumpxml rhel7
...
<interface type='hostdev' managed='yes'>
<mac address='52:54:00:84:33:71'/>
<source>
<address type='pci' domain='0x0000' bus='0x0a' slot='0x10' function='0x1'/>
</source>
<alias name='hostdev0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
...
3. Dump the PCI node device XML.
[root@SRIOV2 ~]# virsh nodedev-list --tree
...
+- pci_0000_00_1c_6
| |
| +- pci_0000_07_00_0
| |
| +- pci_0000_08_02_0
| | |
| | +- pci_0000_09_00_0
| | | |
| | | +- net_p1p1_00_1b_21_55_b3_b8
| | |
| | +- pci_0000_09_00_1
| | | |
| | | +- net_eth3_00_1b_21_55_b3_b9
| | |
| | +- pci_0000_0a_10_0
| | | |
| | | +- net_p1p1_0_d2_23_65_b4_70_44
...
[root@SRIOV2 ~]# virsh nodedev-dumpxml pci_0000_09_00_1
<device>
<name>pci_0000_09_00_1</name>
<parent>pci_0000_08_02_0</parent>
<driver>
<name>igb</name>
</driver>
<capability type='pci'>
<domain>0</domain>
<bus>9</bus>
<slot>0</slot>
<function>1</function>
<product id='0x10e8'>82576 Gigabit Network Connection</product>
<vendor id='0x8086'>Intel Corporation</vendor>
<capability type='virt_functions'>
<address domain='0x0000' bus='0x0a' slot='0x10' function='0x1'/>
<address domain='0x0000' bus='0x0a' slot='0x10' function='0x3'/>
</capability>
</capability>
</device>
[root@SRIOV2 ~]# virsh nodedev-dumpxml pci_0000_0a_10_0
<device>
<name>pci_0000_0a_10_0</name>
<parent>pci_0000_08_02_0</parent>
<driver>
<name>igbvf</name>
</driver>
<capability type='pci'>
<domain>0</domain>
<bus>10</bus>
<slot>16</slot>
<function>0</function>
<product id='0x10ca'>82576 Virtual Function</product>
<vendor id='0x8086'>Intel Corporation</vendor>
<capability type='phys_function'>
<address domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
</capability>
</capability>
</device>
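The phys_function capability shown above is what the crashing code path walks when mapping a VF back to its PF. Extracting it from nodedev-dumpxml output can be sketched with the Python standard library alone; the XML below is a trimmed copy of the dump above, and the helper name is illustrative:

```python
import xml.etree.ElementTree as ET

# Trimmed nodedev-dumpxml output for the VF pci_0000_0a_10_0, as captured above.
VF_XML = """
<device>
  <name>pci_0000_0a_10_0</name>
  <capability type='pci'>
    <capability type='phys_function'>
      <address domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
    </capability>
  </capability>
</device>
"""

def physical_function_address(nodedev_xml):
    """Return the PF's PCI address as a dddd:bb:ss.f string, or None.

    Returns None when no phys_function capability exists, e.g. when the
    device is itself a PF rather than a VF.
    """
    root = ET.fromstring(nodedev_xml)
    addr = root.find(".//capability[@type='phys_function']/address")
    if addr is None:
        return None
    return "%04x:%02x:%02x.%x" % tuple(
        int(addr.get(k), 16) for k in ("domain", "bus", "slot", "function"))

print(physical_function_address(VF_XML))  # → 0000:09:00.0
```

Checking for the missing-capability case is exactly the kind of defensiveness the crash above was lacking at the C level.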
4. After the above commands, check the libvirtd status.
[root@SRIOV2 ~]# systemctl status libvirtd
libvirtd.service - Virtualization daemon
Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled)
Active: active (running) since Mon 2013-05-06 02:01:15 EDT; 1h 33min ago
Main PID: 6917 (libvirtd)
CGroup: name=systemd:/system/libvirtd.service
├─1778 /sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf
├─6565 /sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/br1.conf
├─6917 /usr/sbin/libvirtd
└─8461 /usr/libexec/qemu-kvm -name rhel7 -S -machine pc-i440fx-1.4,accel=kvm,usb=off -cpu qemu64,-kvmclock -bios /usr/share...
May 06 03:29:25 SRIOV2.qe.lab.eng.nay.redhat.com dnsmasq[6565]: using local addresses only for unqualified names
May 06 03:29:25 SRIOV2.qe.lab.eng.nay.redhat.com dnsmasq[1778]: using local addresses only for unqualified names
May 06 03:29:48 SRIOV2.qe.lab.eng.nay.redhat.com dnsmasq[6565]: reading /etc/resolv.conf
May 06 03:29:48 SRIOV2.qe.lab.eng.nay.redhat.com dnsmasq[1778]: reading /etc/resolv.conf
May 06 03:29:48 SRIOV2.qe.lab.eng.nay.redhat.com dnsmasq[1778]: using nameserver 10.68.5.26#53
May 06 03:29:48 SRIOV2.qe.lab.eng.nay.redhat.com dnsmasq[1778]: using nameserver 10.66.127.17#53
May 06 03:29:48 SRIOV2.qe.lab.eng.nay.redhat.com dnsmasq[1778]: using local addresses only for unqualified names
May 06 03:29:48 SRIOV2.qe.lab.eng.nay.redhat.com dnsmasq[6565]: using nameserver 10.68.5.26#53
May 06 03:29:48 SRIOV2.qe.lab.eng.nay.redhat.com dnsmasq[6565]: using nameserver 10.66.127.17#53
May 06 03:29:48 SRIOV2.qe.lab.eng.nay.redhat.com dnsmasq[6565]: using local addresses only for unqualified names
We get the expected results and the libvirt daemon does not crash; it keeps running normally.
This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.
Description of problem:
Libvirt daemon crash while attaching VF interface or dumping PF interface using nodedev-dumpxml.

Version-Release number of selected component (if applicable):
libvirt-1.0.4-1.1.el7.x86_64
qemu-kvm-1.4.0-3.el7.x86_64
kernel-3.9.0-0.rc8.54.el7.x86_64

How reproducible:
100%

Steps:
Setup on SR-IOV1 or SR-IOV2 server:
1. Reload KVM with unsafe assigned interrupts allowed:
modprobe -r kvm_intel
modprobe -r kvm
modprobe kvm allow_unsafe_assigned_interrupts=1
modprobe kvm_intel
2. Generate VFs:
modprobe -r igb
modprobe igb max_vfs=2
3. [root@#localhost ~]# lspci | grep Ethernet
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5764M Gigabit Ethernet PCIe (rev 10)
03:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
03:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
03:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
03:10.2 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
03:10.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)

Reproduction steps:
1. Create a domain with the VF interface below.
<interface type='hostdev' managed='yes'>
<mac address='52:54:00:e0:2c:31'/>
<source>
<address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x1'/>
</source>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
2. [root@#localhost ~]# virsh list --all
 Id Name State
----------------------------------------------------
 - rhel6_local shut off
3. [root@#localhost ~]# virsh start rhel6_local
error: Failed to start domain rhel6_local
error: End of file while reading data: Input/output error
error: Failed to reconnect to the hypervisor
4. [root@#localhost ~]# virsh list --all
error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': Connection refused
5. [root@SRIOV2 ~]# systemctl status libvirtd
libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled)
   Active: failed (Result: signal) since Sat 2013-04-27 05:12:34 EDT; 5min ago
  Process: 5738 ExecStart=/usr/sbin/libvirtd $LIBVIRTD_ARGS (code=killed, signal=SEGV)
   CGroup: name=systemd:/system/libvirtd.service
           └─1200 /sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf

Apr 27 05:12:23 SRIOV2.qe.lab.eng.nay.redhat.com systemd[1]: Started Virtualization daemon.
Apr 27 05:12:30 SRIOV2.qe.lab.eng.nay.redhat.com dnsmasq[1200]: read /etc/hosts - 3 addresses
Apr 27 05:12:30 SRIOV2.qe.lab.eng.nay.redhat.com dnsmasq[1200]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Apr 27 05:12:30 SRIOV2.qe.lab.eng.nay.redhat.com dnsmasq-dhcp[1200]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Apr 27 05:12:34 SRIOV2.qe.lab.eng.nay.redhat.com systemd[1]: libvirtd.service: main process exited, code=killed, status=11/SEGV
Apr 27 05:12:34 SRIOV2.qe.lab.eng.nay.redhat.com systemd[1]: MESSAGE=Unit libvirtd.service entered failed state.

Another reproduction scenario:
1. [root@#localhost ~]# virsh nodedev-list --tree
...
  +- pci_0000_03_00_0
  |   |
  |   +- net_eth1_00_1b_21_39_8b_18
  |
  +- pci_0000_03_00_1
  |   |
  |   +- net_eth2_00_1b_21_39_8b_19
  |
  +- pci_0000_03_10_0
  +- pci_0000_03_10_1
  +- pci_0000_03_10_2
  |   |
  |   +- net_p1p1_1_8a_e9_09_2b_1a_93
  |
  +- pci_0000_03_10_3
      |
      +- net_p1p2_1_0a_6f_3b_d0_c0_a7
+- pci_0000_00_03_0
  |
  +- pci_0000_0f_00_0
...
2. [root@#localhost ~]# virsh nodedev-dumpxml pci_0000_03_00_0
error: End of file while reading data: Input/output error
error: Failed to reconnect to the hypervisor
[root@#localhost ~]#

Actual results:
The libvirt daemon crashes; only restarting libvirtd resolves the issue.

Expected results:
The domain with the VF interface should boot up normally, and the libvirt daemon should not crash. The nodedev-dumpxml command should dump the PF information without affecting libvirtd.

BTW, nodedev-dumpxml can dump XML for a VF normally in rhel7.
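The lspci listing in the setup steps can be filtered programmatically to pick out the VFs generated by `modprobe igb max_vfs=2`. A small sketch (the sample output below is the listing from the setup steps; the parsing itself is generic, and the helper name is illustrative):

```python
# lspci output captured in the setup steps above.
LSPCI = """\
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5764M Gigabit Ethernet PCIe (rev 10)
03:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
03:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
03:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
03:10.2 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
03:10.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
"""

def virtual_function_addresses(lspci_output):
    """Return the bus:slot.fn address of every line describing a VF."""
    return [line.split()[0]
            for line in lspci_output.splitlines()
            if "Virtual Function" in line]

print(virtual_function_addresses(LSPCI))
# → ['03:10.0', '03:10.1', '03:10.2', '03:10.3']
```

With two ports on the 82576 and max_vfs=2, four VFs appear, matching the two virt_functions addresses per PF seen in the nodedev-dumpxml output earlier in this bug.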