Description of problem:
Boot up a guest with an assigned VF, rebind the parent PF, then change the number of VFs through sysfs. The host kernel outputs a call trace.

Version-Release number of selected component (if applicable):
host:
RHEL6.5-20130905.1
qemu-kvm-0.12.1.2-2.400.el6.x86_64
kernel-2.6.32-417.el6.x86_64

# lspci -v -s 06:00.0
06:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
	Subsystem: Intel Corporation Gigabit ET Dual Port Server Adapter
	Flags: bus master, fast devsel, latency 0, IRQ 38
	Memory at dd740000 (32-bit, non-prefetchable) [size=128K]
	Memory at dd800000 (32-bit, non-prefetchable) [size=4M]
	I/O ports at ecc0 [size=32]
	Memory at dd738000 (32-bit, non-prefetchable) [size=16K]
	Expansion ROM at dd000000 [disabled] [size=4M]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
	Capabilities: [70] MSI-X: Enable+ Count=10 Masked-
	Capabilities: [a0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 90-e2-ba-ff-ff-05-63-5e
	Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
	Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
	Kernel driver in use: igb
	Kernel modules: igb

# lspci -v -s 06:10.0
06:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
	Subsystem: Intel Corporation Device a03c
	Flags: fast devsel
	[virtual] Memory at dd400000 (64-bit, non-prefetchable) [size=16K]
	[virtual] Memory at dd420000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [70] MSI-X: Enable- Count=3 Masked-
	Capabilities: [a0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
	Kernel driver in use: pci-stub
	Kernel modules: igbvf

guest:
RHEL6.5-20130905.1

How reproducible:
50%

Steps to Reproduce:
1. Bring up the VFs through sysfs, then unbind the VF and bind it to pci-stub.
# echo 2 > /sys/bus/pci/devices/0000\:06\:00.0/sriov_numvfs
# echo "8086 10ca" >/sys/bus/pci/drivers/pci-stub/new_id
# echo 0000:06:10.0 >/sys/bus/pci/devices/0000\:06\:10.0/driver/unbind
# echo 0000:06:10.0 >/sys/bus/pci/drivers/pci-stub/bind

2. Boot up the guest with this VF.
cli:
...
-device pci-assign,host=06:10.0,id=vf,romfile=/home/808610ca.rom \
...

3. Rebind its parent PF.
# echo "8086 10c9" >/sys/bus/pci/drivers/pci-stub/new_id
# echo 0000:06:00.0 >/sys/bus/pci/devices/0000\:06\:00.0/driver/unbind
# echo 0000:06:00.0 >/sys/bus/pci/drivers/pci-stub/bind
# echo "8086 10c9" >/sys/bus/pci/drivers/igb/new_id
# echo 0000:06:00.0 >/sys/bus/pci/drivers/pci-stub/unbind
# echo 0000:06:00.0 >/sys/bus/pci/drivers/igb/bind

4. Change the number of VFs through sysfs.
# echo 0 > /sys/bus/pci/devices/0000\:06\:00.0/sriov_numvfs

Actual results:
host call trace:
kernel: ------------[ cut here ]------------
kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26b/0x280() (Not tainted)
kernel: Hardware name: PowerEdge R710
kernel: NETDEV WATCHDOG: p4p1 (igb): transmit queue 6 timed out
kernel: Modules linked in: igbvf ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables autofs4 bridge stp llc ipv6 vhost_net macvtap macvlan tun kvm_intel kvm power_meter microcode dcdbas serio_raw lpc_ich mfd_core i7core_edac edac_core be2net igb dca i2c_algo_bit i2c_core ptp pps_core ses enclosure sg bnx2x libcrc32c mdio bnx2 ext4 jbd2 mbcache sr_mod cdrom usb_storage sd_mod crc_t10dif pata_acpi ata_generic ata_piix megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
kernel: Pid: 0, comm: swapper Not tainted 2.6.32-417.el6.x86_64 #1
kernel: Call Trace:
kernel: <IRQ> [<ffffffff81071f47>] ? warn_slowpath_common+0x87/0xc0
kernel: [<ffffffff81072036>] ? warn_slowpath_fmt+0x46/0x50
kernel: [<ffffffff8147b8ab>] ? dev_watchdog+0x26b/0x280
kernel: [<ffffffff8105df0e>] ? scheduler_tick+0x11e/0x260
kernel: [<ffffffff8147b640>] ? dev_watchdog+0x0/0x280
kernel: [<ffffffff81084c27>] ? run_timer_softirq+0x197/0x340
kernel: [<ffffffff810aca95>] ? tick_dev_program_event+0x65/0xc0
kernel: [<ffffffff8107aa01>] ? __do_softirq+0xc1/0x1e0
kernel: [<ffffffff810acb6a>] ? tick_program_event+0x2a/0x30
kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30
kernel: [<ffffffff8100fa75>] ? do_softirq+0x65/0xa0
kernel: [<ffffffff8107a8b5>] ? irq_exit+0x85/0x90
kernel: [<ffffffff8153129a>] ? smp_apic_timer_interrupt+0x4a/0x60
kernel: [<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
kernel: <EOI> [<ffffffff812e0d0e>] ? intel_idle+0xde/0x170
kernel: [<ffffffff812e0cf1>] ? intel_idle+0xc1/0x170
kernel: [<ffffffff814269c7>] ? cpuidle_idle_call+0xa7/0x140
kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
kernel: [<ffffffff81520fe7>] ? start_secondary+0x2ac/0x2ef
kernel: ---[ end trace 26d94eabc924252b ]---
kernel: igb 0000:06:00.0: p4p1: Reset adapter
kernel: igb: p4p1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
dhclient[2987]: DHCPDISCOVER on p4p1 to 255.255.255.255 port 67 interval 19 (xid=0x178e9a46)
kernel: igb 0000:06:00.0: Detected Tx Unit Hang
kernel: Tx Queue <6>
kernel: TDH <0>
kernel: TDT <1>
kernel: next_to_use <1>
kernel: next_to_clean <0>
kernel: buffer_info[next_to_clean]
kernel: time_stamp <1000a5342>
kernel: next_to_watch <ffff88012a8a8000>
kernel: jiffies <1000a5733>
kernel: desc.status <558000>
kernel: igb 0000:06:00.0: Detected Tx Unit Hang
kernel: Tx Queue <6>
kernel: TDH <0>
kernel: TDT <1>
kernel: next_to_use <1>
kernel: next_to_clean <0>
kernel: buffer_info[next_to_clean]
kernel: time_stamp <1000a5342>
kernel: next_to_watch <ffff88012a8a8000>
kernel: jiffies <1000a5b1b>
kernel: desc.status <558000>
kernel: igb 0000:06:00.0: Detected Tx Unit Hang
kernel: Tx Queue <6>
kernel: TDH <0>
kernel: TDT <1>
kernel: next_to_use <1>
kernel: next_to_clean <0>
kernel: buffer_info[next_to_clean]
kernel: time_stamp <1000a5342>
kernel: next_to_watch <ffff88012a8a8000>
kernel: jiffies <1000a5f03>
kernel: desc.status <558000>
kernel: igb 0000:06:00.0: Detected Tx Unit Hang
kernel: Tx Queue <6>
kernel: TDH <0>
kernel: TDT <1>
kernel: next_to_use <1>
kernel: next_to_clean <0>
kernel: buffer_info[next_to_clean]
kernel: time_stamp <1000a5342>
kernel: next_to_watch <ffff88012a8a8000>
kernel: jiffies <1000a62eb>
kernel: desc.status <558000>
kernel: igb 0000:06:00.0: Detected Tx Unit Hang
kernel: Tx Queue <6>
kernel: TDH <0>
kernel: TDT <1>
kernel: next_to_use <1>
kernel: next_to_clean <0>
kernel: buffer_info[next_to_clean]
kernel: time_stamp <1000a5342>
kernel: next_to_watch <ffff88012a8a8000>
kernel: jiffies <1000a66d3>
kernel: desc.status <558000>
kernel: igb 0000:06:00.0: p4p1: Reset adapter
kernel: igb: p4p1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX

Expected results:
no call trace

Additional info:
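For anyone re-running this, the sysfs writes in the steps above can be wrapped in a few small shell helpers so that a failed write is reported instead of silently ignored. This is only a sketch: `SYSFS_ROOT`, `PF`, `sysfs_write`, `rebind` and `set_numvfs` are hypothetical names introduced here, not part of any existing tool, and the default PCI address is simply the one from this report.

```shell
#!/bin/sh
# Hypothetical helpers for the reproduction steps in this report.
# SYSFS_ROOT can point at a scratch directory for a dry run; PF is the
# parent physical function (default taken from this report).
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"
PF="${PF:-0000:06:00.0}"

# Write a value to a sysfs attribute and report a failed write.
# $1 = value, $2 = path relative to SYSFS_ROOT
sysfs_write() {
    if ! printf '%s\n' "$1" > "$SYSFS_ROOT/$2"; then
        echo "write of '$1' to $2 failed" >&2
        return 1
    fi
}

# Unbind a device from its current driver and bind it to another one.
# $1 = PCI address, $2 = target driver name
rebind() {
    sysfs_write "$1" "bus/pci/devices/$1/driver/unbind" || return 1
    sysfs_write "$1" "bus/pci/drivers/$2/bind"
}

# Set the number of VFs on the parent PF.
# $1 = VF count
set_numvfs() {
    sysfs_write "$1" "bus/pci/devices/$PF/sriov_numvfs"
}

# Example sequence mirroring steps 3 and 4 (run as root on the host):
#   sysfs_write "8086 10c9" bus/pci/drivers/pci-stub/new_id
#   rebind "$PF" pci-stub
#   sysfs_write "8086 10c9" bus/pci/drivers/igb/new_id
#   rebind "$PF" igb
#   set_numvfs 0
```

The helpers only define functions; nothing touches sysfs until one of them is called, so the file can be sourced safely before the guest is started.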
This seems to be fixed by bug 985733
Tested this scenario on kernel-2.6.32-486.el6.x86_64 with an X540-AT2 NIC; did not hit the problem.

[root@dell-per720-02 ~]# lspci -v -s 01:00.0
01:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
	Subsystem: Dell Ethernet 10G 4P X540/I350 rNDC
	Flags: bus master, fast devsel, latency 0, IRQ 109
	Memory at d5000000 (64-bit, prefetchable) [size=2M]
	Memory at d55f8000 (64-bit, prefetchable) [size=16K]
	Expansion ROM at d8000000 [disabled] [size=512K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable+ 64bit+
	Capabilities: [70] MSI-X: Enable- Count=64 Masked-
	Capabilities: [a0] Express Endpoint, MSI 00
	Capabilities: [e0] Vital Product Data
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
	Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
	Capabilities: [1d0] Access Control Services
	Kernel driver in use: ixgbe
	Kernel modules: ixgbe
(In reply to Alex Williamson from comment #3)
> This seems to be fixed by bug 985733

Marking this one TestOnly.
Tested this bug on kernel-2.6.32-488.el6.x86_64 with an 82576 NIC; the host works well.

[root@amd-6168-256-1 pci-stub]# lspci -v -s 23:00.0
23:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
	Subsystem: Intel Corporation Gigabit ET Dual Port Server Adapter
	Flags: bus master, fast devsel, latency 0, IRQ 64
	Memory at e53a0000 (32-bit, non-prefetchable) [size=128K]
	Memory at e4400000 (32-bit, non-prefetchable) [size=4M]
	I/O ports at ccc0 [size=32]
	Memory at e53f8000 (32-bit, non-prefetchable) [size=16K]
	Expansion ROM at e4c00000 [disabled] [size=4M]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
	Capabilities: [70] MSI-X: Enable+ Count=10 Masked-
	Capabilities: [a0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 90-e2-ba-ff-ff-05-63-5e
	Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
	Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
	Kernel driver in use: igb
	Kernel modules: igb

[root@amd-6168-256-1 pci-stub]# lspci -v -s 23:10.0
23:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
	Subsystem: Intel Corporation Device a03c
	Flags: bus master, fast devsel, latency 0
	[virtual] Memory at e5000000 (64-bit, non-prefetchable) [size=16K]
	[virtual] Memory at e5020000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [70] MSI-X: Enable+ Count=3 Masked-
	Capabilities: [a0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
	Kernel driver in use: pci-stub
	Kernel modules: igbvf

So this bug has been fixed.
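When repeating this verification, the host log can be checked for the two symptoms from the original report (the NETDEV WATCHDOG warning and "Detected Tx Unit Hang") rather than inspected by eye. The function below is a minimal sketch, assuming the kernel log text is supplied on stdin; `count_symptoms` is a hypothetical name, and the pattern only covers the message formats quoted in this bug.

```shell
#!/bin/sh
# Hypothetical helper: count lines on stdin that match the symptoms seen
# in this report. Prints the count; a clean log prints 0.
count_symptoms() {
    # grep -c exits 1 when nothing matches; only exit codes above 1 are
    # treated as a real error.
    n=$(grep -c -E 'NETDEV WATCHDOG: .* transmit queue [0-9]+ timed out|Detected Tx Unit Hang') \
        || [ $? -eq 1 ] || return 2
    printf '%s\n' "$n"
}

# Example (run on the host after step 4):
#   dmesg | count_symptoms
```

A result of 0 after several runs is consistent with the fix, though with a 50% reproduction rate a single clean run is not conclusive.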
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2014-1490.html