Bug 985671

Summary: Kernel Crash in networking code after shutting down Windows VM
Product: [Fedora] Fedora
Reporter: galens
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 19
CC: galens, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-10-08 16:52:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description galens 2013-07-18 04:30:33 UTC
Description of problem:
After shutting down or rebooting my Windows VM (qemu-kvm, virt-manager, Windows XP 64-bit Pro SP2), my host OS (Fedora 19, kernel 3.9.9-302.fc19.x86_64) crashes within an hour, usually within minutes.

The stack trace and the lines immediately preceding it in /var/log/messages are below.


Version-Release number of selected component (if applicable):
kernel 3.9.9-302.fc19.x86_64
qemu-kvm-1.4.2-4.fc19
virt-manager 0.10.0-1.fc19
libvirt 1.0.5.2-1.fc19

Not using NetworkManager
Not using Paravirtualized drivers


How reproducible:
100% reproducible since upgrading to 3.9.9-302: shut down or restart the Windows VM and the system will crash.


Steps to Reproduce:
1. Boot Windows VM.
2. Shut down the Windows VM from within the guest (command line, Start menu button, etc.).
3. Wait.

Note: This behavior does not occur if I use virt-manager or virsh to force the VM off instead of restarting or shutting down from within the VM.

I have used the system for hours after a force-off and reboot of the VM without a crash. However, if I subsequently shut the VM down cleanly, the Linux host will crash.

Actual results:
See stack trace below

Expected results:
No kernel oops.

Additional info:
I no longer believe this crash is related to bug 980254, although I was experiencing that crash with earlier 3.9.x kernels (including 3.9.9-201).


Messages immediately prior to the crash:
(@ time 86425.520070)
device vnet0 entered promiscuous mode
br0: port 2(vnet0) entered forwarding state
br0: port 2(vnet0) entered forwarding state
qemu-system-x86: sending ioctl 5326 to a partition!
qemu-system-x86: sending ioctl 80200204 to a partition!
[avahi-daemon] Registering new address record for fe80::fc54:ff:fec0:3ed2 on vnet0.*.


My crash from this afternoon:
(@ time 86493.252165; that is, 66 seconds later)
WARNING: at lib/list_debug.c:33 __list_add+0xac/0xc0()
Hardware name:         
list_add corruption. prev->next should be next (ffff880129b75568), but was ffffffff8116bf70. (prev=ffff88011fd37f40).
Modules linked in: tun ebtable_nat nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE bridge stp llc ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables dm_service_time snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm r8169 iTCO_wdt iTCO_vendor_support acpi_cpufreq mperf coretemp kvm_intel kvm snd_page_alloc snd_timer snd soundcore usblp lpc_ich mfd_core microcode i2c_i801 usb_storage mii binfmt_misc dm_multipath nouveau mxm_wmi wmi i2c_algo_bit drm_kms_helper ttm drm i2c_core video
Pid: 5300, comm: Socket Thread Not tainted 3.9.9-302.fc19.x86_64 #1
Call Trace:
 [<ffffffff81306d00>] ? __list_add+0x30/0xc0
 [<ffffffff8105cc56>] warn_slowpath_common+0x66/0x80
 [<ffffffff8105ccbc>] warn_slowpath_fmt+0x4c/0x50
 [<ffffffff8116bf70>] ? end_swap_bio_read+0x70/0x70
 [<ffffffff81306d7c>] __list_add+0xac/0xc0
 [<ffffffff8106bed3>] __internal_add_timer+0x113/0x130
 [<ffffffff8106c527>] internal_add_timer+0x17/0x40
 [<ffffffff8106d812>] mod_timer+0x102/0x210
 [<ffffffff8106d938>] add_timer+0x18/0x20
 [<ffffffffa033be0b>] __nf_conntrack_confirm+0x2cb/0x460 [nf_conntrack]
 [<ffffffffa0356278>] ipv4_confirm+0xc8/0x110 [nf_conntrack_ipv4]
 [<ffffffff8156880b>] nf_iterate+0x8b/0xa0
 [<ffffffff815757f0>] ? ip_fragment+0x880/0x880
 [<ffffffff81568894>] nf_hook_slow+0x74/0x130
 [<ffffffff815757f0>] ? ip_fragment+0x880/0x880
 [<ffffffff81576ef2>] ip_output+0x82/0x90
 [<ffffffff81576635>] ip_local_out+0x25/0x30
 [<ffffffff8157698a>] ip_queue_xmit+0x14a/0x3f0
 [<ffffffff8158f2cd>] tcp_transmit_skb+0x44d/0x970
 [<ffffffff81591d36>] tcp_connect+0x4e6/0x5c0
 [<ffffffff810af07e>] ? getnstimeofday+0xe/0x30
 [<ffffffff810af106>] ? ktime_get_real+0x16/0x50
 [<ffffffff815953f1>] tcp_v4_connect+0x321/0x480
 [<ffffffff815a8aa5>] __inet_stream_connect+0xa5/0x320
 [<ffffffff8157c83e>] ? inet_csk_init_xmit_timers+0x6e/0xa0
 [<ffffffff815a8d58>] inet_stream_connect+0x38/0x50
 [<ffffffff81525157>] sys_connect+0xe7/0x120
 [<ffffffff810dcb36>] ? __audit_syscall_exit+0x1f6/0x2a0
 [<ffffffff8164f319>] system_call_fastpath+0x16/0x1b
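
For triage, the trace above can be cross-checked mechanically. The warning reports that prev->next held ffffffff8116bf70 instead of the expected list node, and that same address shows up in the call trace as an unreliable ("?") frame. The following is a minimal Python sketch, not a tool used in this report: the regular expressions and the helper names (parse_oops, symbol_for) are illustrative assumptions, and the input is a trimmed copy of the lines quoted above. It extracts the pointers from the warning and looks the unexpected one up among the trace frames.

```python
import re

# Trimmed copy of the warning and a few frames from the report above.
OOPS = """\
list_add corruption. prev->next should be next (ffff880129b75568), but was ffffffff8116bf70. (prev=ffff88011fd37f40).
 [<ffffffff81306d00>] ? __list_add+0x30/0xc0
 [<ffffffff8116bf70>] ? end_swap_bio_read+0x70/0x70
 [<ffffffff81306d7c>] __list_add+0xac/0xc0
 [<ffffffff8106d812>] mod_timer+0x102/0x210
 [<ffffffffa033be0b>] __nf_conntrack_confirm+0x2cb/0x460 [nf_conntrack]
"""

# Pointers named in the list_debug warning text.
CORRUPTION_RE = re.compile(
    r"prev->next should be next \((?P<expected>[0-9a-f]+)\), "
    r"but was (?P<actual>[0-9a-f]+)\. \(prev=(?P<prev>[0-9a-f]+)\)"
)

# One call-trace frame: [<addr>] [? ]symbol+0xoffset/0xsize [module]
# Assumption: symbols are plain identifiers (no compiler suffixes such as ".isra.0").
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\] (?P<unreliable>\? )?"
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_oops(text):
    """Return (corruption pointers, list of frames) parsed from oops text."""
    corruption = CORRUPTION_RE.search(text).groupdict()
    frames = [m.groupdict() for m in FRAME_RE.finditer(text)]
    return corruption, frames


def symbol_for(addr, frames):
    """Return the symbol of the first frame whose address equals addr, else None."""
    for frame in frames:
        if frame["addr"] == addr:
            return frame["symbol"]
    return None


corruption, frames = parse_oops(OOPS)
bad_symbol = symbol_for(corruption["actual"], frames)
print("prev->next held", corruption["actual"], "which the trace labels", bad_symbol)
```

That the unexpected value resolves to a kernel text symbol rather than another list node suggests, without proving, that the memory holding the timer's list entry was overwritten or reused before add_timer() ran in __nf_conntrack_confirm.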

Comment 1 Josh Boyer 2013-09-18 20:55:29 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.11.1-200.fc19.  Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you experience different issues, please open a new bug report for those.

Comment 2 Josh Boyer 2013-10-08 16:52:36 UTC
This was fixed with an update some time ago.