Bug 918015 - [abrt] WARNING: at include/linux/kref.h:42 handle_tx+0x5a4/0x680 [vhost_net]()
Summary: [abrt] WARNING: at include/linux/kref.h:42 handle_tx+0x5a4/0x680 [vhost_net]()
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 18
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Michael S. Tsirkin
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: abrt_hash:f3f8c0f70bdfc866423b6473786...
Duplicates: 922455
Depends On:
Blocks:
Reported: 2013-03-05 10:11 UTC by Marcel Wysocki
Modified: 2014-09-29 13:51 UTC
CC List: 13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-11-27 16:19:20 UTC
Type: ---
Embargoed:


Attachments
File: dmesg (94.62 KB, text/plain)
2013-03-05 10:11 UTC, Marcel Wysocki

Description Marcel Wysocki 2013-03-05 10:11:06 UTC
Additional info:
WARNING: at include/linux/kref.h:42 handle_tx+0x5a4/0x680 [vhost_net]()
Hardware name: Latitude E6320
Modules linked in: xt_REDIRECT xt_hl nls_utf8 fuse ebtable_nat xt_CHECKSUM bridge stp llc ipt_MASQUERADE lockd nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i ip6table_filter cxgb4 ip6_tables cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi rfcomm bnep snd_hda_codec_hdmi snd_hda_codec_idt uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev snd_hda_intel btusb bluetooth snd_hda_codec arc4 media iwldvm mac80211 snd_hwdep iwlwifi ses enclosure snd_seq snd_seq_device cfg80211 snd_pcm snd_page_alloc snd_timer snd soundcore rfkill iTCO_wdt coretemp iTCO_vendor_support lpc_ich mfd_core mei i2c_i801 ppdev microcode parport_pc dell_laptop dcdbas dell_wmi sparse_keymap parport vhost_net tun macvtap macvlan kvm_intel kvm uinput binfmt_misc crc32c_intel i915 ghash_clmulni_intel sdhci_pci sdhci mmc_core i2c_algo_bit drm_kms_helper drm e1000e i2c_core wmi video usb_storage sunrpc
Pid: 2141, comm: vhost-2137 Not tainted 3.8.1-201.fc18.x86_64 #1
Call Trace:
 [<ffffffff8105e61f>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff8105e67a>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa01ef0b4>] handle_tx+0x5a4/0x680 [vhost_net]
 [<ffffffffa01ef1c5>] handle_tx_kick+0x15/0x20 [vhost_net]
 [<ffffffffa01eb95d>] vhost_worker+0xed/0x190 [vhost_net]
 [<ffffffffa01eb870>] ? vhost_work_flush+0x110/0x110 [vhost_net]
 [<ffffffff81081f50>] kthread+0xc0/0xd0
 [<ffffffff81010000>] ? ftrace_define_fields_xen_mc_entry+0xa0/0xf0
 [<ffffffff81081e90>] ? kthread_create_on_node+0x120/0x120
 [<ffffffff81657b6c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81081e90>] ? kthread_create_on_node+0x120/0x120

Comment 1 Marcel Wysocki 2013-03-05 10:11:09 UTC
Created attachment 705350
File: dmesg

Comment 2 Luis Flores 2013-03-07 15:04:39 UTC
I'm having the same problem:
[95714.520436] ------------[ cut here ]------------
[95714.553056] WARNING: at include/linux/kref.h:42 handle_tx+0x5a4/0x680 [vhost_net]()
[95714.586210] Hardware name: System x3250 M3 -[4252EAG]-
[95714.619277] Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_CHECKSUM iptable_mangle bridge stp llc be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi vfat fat binfmt_misc iTCO_wdt tpm_tis iTCO_vendor_support i7core_edac tpm coretemp edac_core lpc_ich vhost_net tpm_bios shpchp i2c_i801 mfd_core tun macvtap macvlan e1000e serio_raw microcode kvm_intel kvm raid1 mgag200 i2c_algo_bit drm_kms_helper ttm crc32c_intel lpfc drm mptsas i2c_core mptscsih mptbase scsi_transport_fc scsi_transport_sas scsi_tgt
[95714.767621] Pid: 2686, comm: vhost-2685 Tainted: G        W    3.8.1-201.fc18.x86_64 #1
[95714.806420] Call Trace:
[95714.844560]  [<ffffffff8105e61f>] warn_slowpath_common+0x7f/0xc0
[95714.883071]  [<ffffffff8105e67a>] warn_slowpath_null+0x1a/0x20
[95714.922294]  [<ffffffffa02230b4>] handle_tx+0x5a4/0x680 [vhost_net]
[95714.963985]  [<ffffffffa02231c5>] handle_tx_kick+0x15/0x20 [vhost_net]
[95715.002260]  [<ffffffffa021f95d>] vhost_worker+0xed/0x190 [vhost_net]
[95715.040453]  [<ffffffffa021f870>] ? vhost_work_flush+0x110/0x110 [vhost_net]
[95715.078751]  [<ffffffff81081f50>] kthread+0xc0/0xd0
[95715.116451]  [<ffffffff81010000>] ? ftrace_define_fields_xen_mc_entry+0xa0/0xf0
[95715.153562]  [<ffffffff81081e90>] ? kthread_create_on_node+0x120/0x120
[95715.189566]  [<ffffffff81657b6c>] ret_from_fork+0x7c/0xb0
[95715.224178]  [<ffffffff81081e90>] ? kthread_create_on_node+0x120/0x120
[95715.257835] ---[ end trace f1acb04ccddca217 ]---
[95718.680500] ------------[ cut here ]------------

The machine is running KVM with 6 VMs on 64-bit Fedora 18.

Comment 3 David Carlson 2013-04-08 18:57:09 UTC
I have this problem too, but I think I found the culprit:

Before the change, /sys/module/vhost_net/parameters/experimental_zcopytx was set to 1.

I added /etc/modprobe.d/vhost_net.conf, containing:

options vhost_net experimental_zcopytx=0

Previously, my guest (Windows 7) would freeze at guest shutdown and the qemu process would become a zombie. With the option set: so far, so good.
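
To see which value is currently in effect, the sysfs file mentioned above can be read directly. The following is a minimal sketch, not an official tool: the function name zcopytx_state and its return strings are made up for illustration, and it assumes the parameter file holds a single integer. If the file is absent, the module is most likely not loaded.

```python
from pathlib import Path

# sysfs path quoted in this comment; only present while vhost_net is loaded
PARAM = Path("/sys/module/vhost_net/parameters/experimental_zcopytx")


def zcopytx_state(param_path=PARAM):
    """Return 'enabled', 'disabled', or 'absent' (hypothetical helper)."""
    try:
        value = Path(param_path).read_text().strip()
    except FileNotFoundError:
        # No parameter file: vhost_net is probably not loaded
        return "absent"
    # Assumption: "0" means zero-copy TX is off, anything else means on
    return "disabled" if value == "0" else "enabled"


if __name__ == "__main__":
    print(zcopytx_state())
```

A result of "enabled" on an affected host suggests the workaround has not taken effect yet, for example because the module was loaded before the config file was created.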

Comment 4 Josh Boyer 2013-04-11 19:24:57 UTC
*** Bug 922455 has been marked as a duplicate of this bug. ***

Comment 5 g. artim 2013-04-11 19:46:33 UTC
I don't have /sys/module/vhost_net/parameters/experimental_zcopytx in /sys, but I do have


/usr/lib/modules/3.8.4-202.fc18.x86_64/kernel/drivers/vhost/vhost_net.ko

and the module is not loaded (verified with lsmod).

Kernel: 3.8.4-202.fc18.x86_64

(My report was marked as a dup; it is more of the same.)
https://bugzilla.redhat.com/show_bug.cgi?id=922455

Comment 6 g. artim 2013-04-15 15:31:31 UTC
Just more info; I know it's a dup. -g.


berkeley.edu 3.8.6-203.fc18.x86_64 (new kernel, same problem)

[51906.045046] ------------[ cut here ]------------
[51906.049574] WARNING: at include/linux/kref.h:42 handle_tx+0x5a4/0x680 [vhost_net]()
[51906.054172] Hardware name: H8QM8
[51906.058699] Modules linked in: binfmt_misc ebtable_nat ebtables nfsv4 auth_rpcgss nfs dns_resolver fscache lockd sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 nf_conntrack_ipv4 nf_defrag_ipv4 cxgb3i cxgb3 xt_conntrack nf_conntrack ip6table_filter mdio libcxgbi ib_iser ip6_tables rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev nv_tco igb amd64_edac_mod ptp e1000 parport_pc pps_core edac_core edac_mce_amd shpchp dca parport serio_raw i2c_nforce2 k10temp microcode vhost_net tun macvtap macvlan kvm_amd kvm radeon i2c_algo_bit drm_kms_helper ttm ata_generic pata_acpi drm i2c_core sata_nv pata_amd
[51906.087094] Pid: 1017, comm: vhost-1016 Tainted: G        W    3.8.6-203.fc18.x86_64 #1
[51906.091703] Call Trace:
[51906.096218]  [<ffffffff8105e62f>] warn_slowpath_common+0x7f/0xc0
[51906.100766]  [<ffffffff8105e68a>] warn_slowpath_null+0x1a/0x20
[51906.105279]  [<ffffffffa01b00b4>] handle_tx+0x5a4/0x680 [vhost_net]
[51906.109825]  [<ffffffffa01b01c5>] handle_tx_kick+0x15/0x20 [vhost_net]
[51906.114369]  [<ffffffffa01ac95d>] vhost_worker+0xed/0x190 [vhost_net]
[51906.118919]  [<ffffffffa01ac870>] ? vhost_work_flush+0x110/0x110 [vhost_net]
[51906.123482]  [<ffffffff81081fe0>] kthread+0xc0/0xd0
[51906.128012]  [<ffffffff81010000>] ? ftrace_define_fields_xen_mc_entry+0xa0/0xf0
[51906.132577]  [<ffffffff81081f20>] ? kthread_create_on_node+0x120/0x120
[51906.137128]  [<ffffffff8165922c>] ret_from_fork+0x7c/0xb0
[51906.141655]  [<ffffffff81081f20>] ? kthread_create_on_node+0x120/0x120
[51906.146202] ---[ end trace f7e3ecd35d354772 ]---

Comment 7 Chris Murphy 2013-05-15 05:13:31 UTC
Description of problem:
Very unsure what triggered this. A qemu VM was running and I was changing hostnames with hostnamectl in both the VM and the host at the same time, and then the remote ssh sessions for the VM and host became unresponsive. However, the logs indicate the oops happened well before the unresponsiveness.

Version-Release number of selected component:
kernel

Additional info:
cmdline:        BOOT_IMAGE=/vmlinuz-3.9.2-200.fc18.x86_64 root=UUID=4622d24d-b9cf-4f63-a343-e2a6abb561c6 ro rootflags=subvol=root rd.md=0 rd.lvm=0 rd.dm=0 rd.luks=0 vconsole.keymap=us rhgb quiet LANG=en_US.UTF-8
kernel:         3.9.2-200.fc18.x86_64
type:           Kerneloops
ureports_counter: 1

Truncated backtrace:
WARNING: at include/linux/kref.h:42 handle_tx+0x5a4/0x680 [vhost_net]()
Hardware name: MacBookPro4,1
Modules linked in: fuse ebtable_nat xt_CHECKSUM bridge stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables rfcomm bnep be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nls_utf8 hfsplus arc4 uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core snd_hda_codec_realtek acpi_cpufreq b43 mperf videodev bcma snd_hda_intel snd_hda_codec mac80211 media snd_hwdep btusb coretemp cfg80211 snd_seq iTCO_wdt iTCO_vendor_support applesmc snd_seq_device ssb bcm5974 snd_pcm snd_page_alloc i2c_i801 lpc_ich bluetooth rfkill snd_timer snd soundcore mfd_core sky2 mmc_core input_polldev microcode apple_bl vhost_net tun macvtap macvlan kvm_intel kvm uinput btrfs zlib_deflate raid6_pq libcrc32c xor nouveau firewire_ohci mxm_wmi wmi firewire_core i2c_algo_bit crc_itu_t drm_kms_helper ttm drm i2c_core video
Pid: 2406, comm: vhost-2403 Not tainted 3.9.2-200.fc18.x86_64 #1
Call Trace:
 [<ffffffff8105f105>] warn_slowpath_common+0x75/0xa0
 [<ffffffff8105f14a>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa03540a4>] handle_tx+0x5a4/0x680 [vhost_net]
 [<ffffffffa03541b5>] handle_tx_kick+0x15/0x20 [vhost_net]
 [<ffffffffa035095d>] vhost_worker+0xed/0x190 [vhost_net]
 [<ffffffffa0350870>] ? vhost_work_flush+0x110/0x110 [vhost_net]
 [<ffffffff81082c40>] kthread+0xc0/0xd0
 [<ffffffff81010000>] ? ftrace_define_fields_xen_mc_flush+0x40/0xb0
 [<ffffffff81082b80>] ? kthread_create_on_node+0x120/0x120
 [<ffffffff816699ac>] ret_from_fork+0x7c/0xb0
 [<ffffffff81082b80>] ? kthread_create_on_node+0x120/0x120

Comment 8 Chris Murphy 2013-05-15 05:50:27 UTC
I can reproduce this oops with the total implosion of guest and host:
1. Host is F18 with all updates, and kernel 3.9.2-200.
2. Guest is F19 beta TC4 with all updates and kernel 3.9.1 or 3.9.0.
3. Virtual Machine Manager networking has been changed from NAT to bridging. If I go back to NAT the oops doesn't happen.
4. From a 2nd computer, ssh into the F18 host.
5. From the same 2nd computer, in a different terminal window, ssh into the F19 guest. Then type almost anything; dmesg will do it.

I get a few lines of return, then kaboom, both guest and host are gone.

Comment 9 David Carlson 2013-05-15 06:08:11 UTC
Definitely try the workaround.  It has been running flawlessly for me for a month and a half with the bridge driver on guests.

The core issue is a crash in the "experimental" transmit zero-copy path inside the vhost_net driver, which setting the option disables. Note that in every backtrace above, the frame right before the warning is a transmit function (handle_tx). The option should be opt-in instead of opt-out if it doesn't work in all cases.
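
The observation about the backtraces can be checked mechanically. The sketch below assumes the trace format pasted in this bug ("[<address>] function+0xoff/0xlen [module]"); the name warning_caller is hypothetical. It skips the warn_slowpath_* helpers and the unreliable "?" frames and returns the first remaining frame.

```python
import re

# One call-trace frame: [<addr>] [? ]func+0xoff/0xlen [module]
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(\w+)\])?"
)


def warning_caller(trace_lines):
    """Return (function, module) of the frame that hit the WARNING, or None."""
    for line in trace_lines:
        m = FRAME.search(line)
        if not m or m.group(1):
            # Not a frame line, or a '?' frame the unwinder was unsure about
            continue
        func, module = m.group(2), m.group(3)
        if func.startswith("warn_slowpath"):
            # Generic warning helpers, not the code that triggered it
            continue
        return func, module
    return None
```

Run against the traces in the description and in comments 2, 6 and 7, this should report handle_tx in vhost_net each time, which matches the claim that the warning always comes from the transmit path.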

As root, create /etc/modprobe.d/vhost_net.conf containing:

options vhost_net experimental_zcopytx=0

It has to be in effect at module load time, so either shut down the guests and reload the module with modprobe, or reboot the host.
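
For hosts where this is scripted, the config step could look roughly like the following. This is only a sketch: ensure_zcopytx_disabled is a made-up helper name, writing to the real path needs root, and the module still has to be reloaded (or the host rebooted) afterwards as described above.

```python
from pathlib import Path

# The option line from the workaround
OPTION_LINE = "options vhost_net experimental_zcopytx=0"


def ensure_zcopytx_disabled(conf_path="/etc/modprobe.d/vhost_net.conf"):
    """Append the option line unless it is already there.

    Returns True if the file was changed, False otherwise.
    """
    path = Path(conf_path)
    text = path.read_text() if path.exists() else ""
    if OPTION_LINE in (line.strip() for line in text.splitlines()):
        return False
    # Start on a fresh line if the existing file lacks a trailing newline
    prefix = "" if not text or text.endswith("\n") else "\n"
    with path.open("a") as fh:
        fh.write(prefix + OPTION_LINE + "\n")
    return True
```

Calling it a second time leaves the file untouched, so it is safe to run from a provisioning script.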

-Dave

Comment 10 Justin M. Forbes 2013-10-18 21:10:21 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 18 kernel bugs.

Fedora 18 has now been rebased to 3.11.4-101.fc18.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 19, and are still experiencing this issue, please change the version to Fedora 19.

If you experience different issues, please open a new bug report for those.

Comment 11 Justin M. Forbes 2013-11-27 16:19:20 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  

It has been over a month since we asked you to test the 3.11 kernel updates and let us know if your issue has been resolved or is still a problem. When this happened, the bug was set to needinfo.  Because the needinfo is still set, we assume either this is no longer a problem, or you cannot provide additional information to help us resolve the issue.  As a result we are closing with insufficient data. If this is still a problem, we apologize. Feel free to reopen the bug and provide more information so that we can work towards a resolution.

If you experience different issues, please open a new bug report for those.

