Bug 1018208

Summary: spinlock from lock_sock_nested() / tcp_recvmsg(): BUG: soft lockup - CPU#0 stuck for 22s! [Chrome_IOThread:7007]
Product: [Fedora] Fedora
Reporter: David Howells <dhowells>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 19
CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, marcelo.barbosa
Target Milestone: ---
Flags: jforbes: needinfo?
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-03-10 10:40:57 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Attachments:
    dmesg from boot to crash (flags: none)
    Partial SysRq+t dump (flags: none)

Description David Howells 2013-10-11 09:02:22 EDT
Created attachment 811091
dmesg from boot to crash

Description of problem:

BUG: soft lockup - CPU#0 stuck for 22s! [Chrome_IOThread:7007]
Modules linked in: udp_diag tcp_diag inet_diag tun fuse snd_pcm_oss snd_mixer_oss bridge stp llc rfcomm bnep nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables it87 hwmon_vid dm_crypt mb86a16 nouveau snd_hda_codec_hdmi iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal coretemp ppdev kvm_intel mxm_wmi kvm crc32_pclmul wmi crc32c_intel ghash_clmulni_intel i2c_algo_bit ttm microcode drm_kms_helper serio_raw cdc_acm ftdi_sio btusb bluetooth rfkill snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device drm mantis mantis_core i2c_i801 dvb_core i2c_core snd_pcm r8169 mei_me mii snd_page_alloc snd_timer
 mei lpc_ich snd mfd_core soundcore shpchp parport_pc parport acpi_cpufreq mperf nfsd auth_rpcgss nfs_acl lockd sunrpc binfmt_misc uinput raid1 video ecryptfs encrypted_keys trusted tpm tpm_bios [last unloaded: ipmi_msghandler]
irq event stamp: 5301873
hardirqs last  enabled at (5301872): [<ffffffff817325b3>] restore_args+0x0/0x30
hardirqs last disabled at (5301873): [<ffffffff8173c6ed>] apic_timer_interrupt+0x6d/0x80
softirqs last  enabled at (5250126): [<ffffffff8107b461>] __do_softirq+0x1a1/0x410
softirqs last disabled at (5250316): [<ffffffff817315b8>] _raw_spin_lock_bh+0x18/0x80
CPU: 0 PID: 7007 Comm: Chrome_IOThread Not tainted 3.11.3-201.fc19.x86_64.debug #1
Hardware name: Gigabyte Technology Co., Ltd. H87-HD3/H87-HD3, BIOS F3 05/09/2013
task: ffff8803aa3624b0 ti: ffff88027571c000 task.ti: ffff88027571c000
RIP: 0010:[<ffffffff8137b680>]  [<ffffffff8137b680>] delay_loop+0x40/0x40
RSP: 0018:ffff88027571dbe0  EFLAGS: 00000206
RAX: ffff88027571dfd8 RBX: ffffffffffffff10 RCX: 0000000000007f20
RDX: 0000000000002625 RSI: ffff88041ce00000 RDI: 0000000000000001
RBP: ffff88027571dc00 R08: ffff8803aa362bb0 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88041ce00000
R13: 000000000000c235 R14: 0000000000007f20 R15: 0000000067511b5f
FS:  00007f57000e4700(0000) GS:ffff88041ce00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f56cb580440 CR3: 000000026aa3e000 CR4: 00000000001407f0
Stack:
 ffffffff81383f51 ffff8802a33bbc50 ffff8802a33bbc68 0000000000000000
 ffff88027571dc28 ffffffff8173160b ffffffff815d7eac ffff8802a33bbc00
 ffff8802a33bbc50 ffff88027571dc58 ffffffff815d7eac ffff8803aa362bb0
Call Trace:
 [<ffffffff81383f51>] ? do_raw_spin_lock+0xe1/0x130
 [<ffffffff8173160b>] _raw_spin_lock_bh+0x6b/0x80
 [<ffffffff815d7eac>] ? lock_sock_nested+0x3c/0xa0
 [<ffffffff815d7eac>] lock_sock_nested+0x3c/0xa0
 [<ffffffff81647189>] tcp_recvmsg+0x89/0xf90
 [<ffffffff810b76c5>] ? sched_clock_cpu+0xb5/0x100
 [<ffffffff810e5bbd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff810b77ff>] ? local_clock+0x5f/0x70
 [<ffffffff81677309>] inet_recvmsg+0x129/0x220
 [<ffffffff815d47f6>] sock_aio_read.part.8+0x116/0x130
 [<ffffffff81303ff9>] ? avc_has_perm_flags+0x29/0x350
 [<ffffffff815d4831>] sock_aio_read+0x21/0x30
 [<ffffffff811f17d0>] do_sync_read+0x80/0xb0
 [<ffffffff811f1eed>] vfs_read+0x14d/0x170
 [<ffffffff811f29ac>] SyS_read+0x4c/0xa0
 [<ffffffff8173ba59>] system_call_fastpath+0x16/0x1b
Code: 66 66 2e 0f 1f 84 00 00 00 00 00 eb 0e 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 48 ff c8 75 fb 48 ff c8 5d c3 66 0f 1f 44 00 00 <0f> 1f 44 00 00 55 48 89 e5 ff 15 89 97 92 00 5d c3 66 66 66 66
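
When reading a trace like the one above, it can help to separate the frames the unwinder considers reliable from those prefixed with "?", which are return addresses found on the stack that may or may not belong to the current call chain. The following is only a minimal sketch of such a filter, assuming the x86-64 "[<address>] symbol+offset/size" frame layout shown in this report; the helper names (parse_call_trace, reliable_chain) and the SAMPLE excerpt are illustrative and not part of any kernel tooling.

```python
import re

# One frame of a "Call Trace:" section in the layout used in this report, e.g.
#  [<ffffffff8173160b>] _raw_spin_lock_bh+0x6b/0x80
#  [<ffffffff81383f51>] ? do_raw_spin_lock+0xe1/0x130
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<guess>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def parse_call_trace(text):
    """Return (symbol, offset, reliable) tuples in the order they appear."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((
                match.group("symbol"),
                int(match.group("offset"), 16),
                match.group("guess") is None,  # no "?" means a reliable frame
            ))
    return frames


def reliable_chain(frames):
    """Keep only the symbols of frames that are not marked with '?'."""
    return [symbol for symbol, _offset, reliable in frames if reliable]


# Excerpt copied from the call trace in this report (hypothetical test input).
SAMPLE = """\
 [<ffffffff81383f51>] ? do_raw_spin_lock+0xe1/0x130
 [<ffffffff8173160b>] _raw_spin_lock_bh+0x6b/0x80
 [<ffffffff815d7eac>] ? lock_sock_nested+0x3c/0xa0
 [<ffffffff815d7eac>] lock_sock_nested+0x3c/0xa0
 [<ffffffff81647189>] tcp_recvmsg+0x89/0xf90
 [<ffffffff815d47f6>] sock_aio_read.part.8+0x116/0x130
"""

if __name__ == "__main__":
    # Innermost frame first, as in the kernel's own output.
    print(" <- ".join(reliable_chain(parse_call_trace(SAMPLE))))
```

For the excerpt, the reliable chain is _raw_spin_lock_bh, lock_sock_nested, tcp_recvmsg, sock_aio_read.part.8, which matches the path named in the summary: the read path spinning on the socket lock taken in lock_sock_nested().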

Version-Release number of selected component (if applicable):

kernel-debug-3.11.3-201.fc19.x86_64

How reproducible:

Seems fairly regular.  The desktop gets slower and slower until this happens (unless the desktop locks up entirely first and has to be rebooted).  Previously, before I was running a debug kernel, I'd perhaps see just a CPU lockup message after the desktop had stopped responding for a while.

I moved my drives to a new motherboard a few days ago (this one has a Realtek NIC rather than Intel NICs) and upgraded the kernel to the latest.  I didn't see this problem prior to that.

Additional info:

00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 04)
00:16.0 Communication controller: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 (rev 04)
00:16.3 Serial controller: Intel Corporation 8 Series/C220 Series Chipset Family KT Controller (rev 04)
00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 (rev d4)
00:1c.2 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #3 (rev d4)
00:1c.3 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d4)
00:1d.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation H87 Express LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller (rev 04)
01:00.0 VGA compatible controller: NVIDIA Corporation GF106 [GeForce GTS 450] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GF106 High Definition Audio Controller (rev a1)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
04:00.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 41)

Comment 1 David Howells 2013-10-11 09:04:38 EDT
Created attachment 811094
Partial SysRq+t dump

I managed to get a partial SysRq+t dump.  Unfortunately, it seems the kernel printk buffer is not big enough to capture the whole thing.

Comment 2 David Howells 2013-10-11 09:10:48 EDT
This may relate to:

    https://retrace.fedoraproject.org/faf/reports/97576/

and:

    bug 863671

Comment 3 David Howells 2013-10-12 08:55:37 EDT
This doesn't seem to happen with kernel-3.9.5-301.fc19.x86_64.

Comment 4 Justin M. Forbes 2014-01-03 17:04:52 EST
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.12.6-200.fc19.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20.

If you experience different issues, please open a new bug report for those.

Comment 5 Justin M. Forbes 2014-03-10 10:40:57 EDT
*********** MASS BUG UPDATE **************

This bug has been in a needinfo state for more than 1 month and is being closed with insufficient data due to inactivity. If this is still an issue with Fedora 19, please feel free to reopen the bug and provide the additional information requested.