Bug 923840

Summary: BUG: soft lockup - CPU#6 stuck for 22s! [sh:1951] when connecting as ovirt node
Product: [Fedora] Fedora
Reporter: Ohad Basan <obasan>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 18
CC: crobinso, eedri, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mgoldboi
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-04-02 13:39:41 UTC
Type: Bug

Description Ohad Basan 2013-03-20 14:59:36 UTC
Description of problem:
I have an F18 machine that is connected to an oVirt engine (3.2 stable).
It is fully updated:
[root@cinteg02 log]# uname -a
Linux cinteg02.ci.lab.tlv.redhat.com 3.8.3-201.fc18.x86_64 #1 SMP Thu Mar 14 21:28:05 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
It is running vdsm with the nested KVM hook.
Every few minutes the following error appears on the screen:
Message from syslogd@cinteg02 at Mar 20 16:57:32 ...
 kernel:[  627.323524] BUG: soft lockup - CPU#6 stuck for 23s! [sh:1951]

The server hiccups, then freezes and stops responding.
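
To track how often these messages recur, the watchdog line can be picked apart mechanically. The following is a minimal sketch, assuming the message keeps the exact format shown above; the function name `parse_soft_lockup` is hypothetical and not part of any existing tool.

```python
import re

# Pattern for the soft-lockup watchdog line as it appears in this report.
# Assumption: the format "CPU#<n> stuck for <s>s! [<comm>:<pid>]" is stable.
SOFT_LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def parse_soft_lockup(line):
    """Return cpu, secs, comm and pid as a dict, or None if the line does not match."""
    m = SOFT_LOCKUP_RE.search(line)
    if m is None:
        return None
    return {
        "cpu": int(m.group("cpu")),
        "secs": int(m.group("secs")),
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
    }


if __name__ == "__main__":
    sample = " kernel:[  627.323524] BUG: soft lockup - CPU#6 stuck for 23s! [sh:1951]"
    print(parse_soft_lockup(sample))
```

Feeding every syslog line through this function and counting the non-None results per `comm` would show whether the same process is always the one that gets stuck.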

Comment 1 Ohad Basan 2013-03-20 15:00:36 UTC
snippet from dmesg:

[  767.108800] BUG: soft lockup - CPU#6 stuck for 22s! [sh:1951]
[  767.109421] Modules linked in: nfsv3 nfs_acl nfs lockd sunrpc dns_resolver fscache 8021q garp bonding ip6table_filter ip6_tables ebtable_nat ebtables softdog bridge stp llc be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_service_time xt_physdev nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack dm_multipath joydev iTCO_wdt iTCO_vendor_support igb i7core_edac i2c_i801 ptp pps_core edac_core ioatdma lpc_ich dca mfd_core vhost_net tun macvtap macvlan coretemp dcdbas binfmt_misc crc32c_intel ghash_clmulni_intel kvm_intel microcode kvm ast i2c_algo_bit drm_kms_helper ttm drm i2c_core
[  767.109461] CPU 6 
[  767.109463] Pid: 1951, comm: sh Not tainted 3.8.3-201.fc18.x86_64 #1 Dell       C6100           /0D61XP
[  767.109465] RIP: 0010:[<ffffffff8106e21a>]  [<ffffffff8106e21a>] lock_timer_base.isra.37+0x2a/0x70
[  767.109471] RSP: 0018:ffff8803166bdcd8  EFLAGS: 00000246
[  767.109473] RAX: ffff8803166bdfd8 RBX: ffff88033ffedb00 RCX: 0000000000000080
[  767.109474] RDX: 0000000000000000 RSI: ffff8803166bdd10 RDI: ffff880316984ce8
[  767.109475] RBP: ffff8803166bdcf8 R08: ffff8803332d6b80 R09: 00000000ffffffff
[  767.109476] R10: 0000000000016b80 R11: ffffffff812ebe42 R12: 0000000000000000
[  767.109478] R13: ffff8803166bdc88 R14: ffff88033ffedb08 R15: 0000000000000000
[  767.109479] FS:  00007f1e51b24740(0000) GS:ffff8803332c0000(0000) knlGS:0000000000000000
[  767.109480] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  767.109482] CR2: 00007f1e5119f100 CR3: 000000032c914000 CR4: 00000000000007e0
[  767.109483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  767.109484] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  767.109486] Process sh (pid: 1951, threadinfo ffff8803166bc000, task ffff880316729760)
[  767.109486] Stack:
[  767.109487]  ffff880316984cd0 ffff880316984c00 ffff88032a763e80 ffff880316984d20
[  767.109490]  ffff8803166bdd28 ffffffff8106eb20 ffff88033ffecd80 ffff880300000000
[  767.109492]  ffff880316984cd0 ffff880316984c00 ffff8803166bdd48 ffffffff8106ebca
[  767.109494] Call Trace:
[  767.109497]  [<ffffffff8106eb20>] try_to_del_timer_sync+0x20/0x70
[  767.109499]  [<ffffffff8106ebca>] del_timer_sync+0x5a/0x70
[  767.109503]  [<ffffffff812f1126>] cfq_exit_queue+0x36/0xf0
[  767.109506]  [<ffffffff812cfe58>] elevator_exit+0x38/0x60
[  767.109508]  [<ffffffff812d0718>] elevator_change+0x148/0x230
[  767.109511]  [<ffffffff812d131b>] elv_iosched_store+0x2b/0x60
[  767.109514]  [<ffffffff812d94d4>] queue_attr_store+0x64/0xc0
[  767.109517]  [<ffffffff81211108>] sysfs_write_file+0xd8/0x150
[  767.109521]  [<ffffffff8119d9ac>] vfs_write+0xac/0x180
[  767.109523]  [<ffffffff8119dcf2>] sys_write+0x52/0xa0
[  767.109527]  [<ffffffff8165408e>] ? do_page_fault+0xe/0x10
[  767.109530]  [<ffffffff81658699>] system_call_fastpath+0x16/0x1b
[  767.109531] Code: 00 66 66 66 66 90 55 48 89 e5 48 83 ec 20 4c 89 6d f0 4c 89 75 f8 49 89 fd 48 89 5d e0 4c 89 65 e8 49 89 f6 49 8b 5d 00 49 89 dc <49> 83 e4 fc 74 31 4c 89 e7 e8 78 1e 5e 00 49 89 06 49 3b 5d 00
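
Read from the bottom up, the trace above suggests that `sh` was writing to a block device's I/O scheduler attribute in sysfs (`elv_iosched_store`), and that the CPU got stuck in `del_timer_sync` while the outgoing CFQ elevator was being torn down (`cfq_exit_queue`). This is an interpretation of the trace, not a confirmed root cause. A minimal sketch for pulling the symbol names out of such a dump, assuming the frame format shown above (the helper name `parse_call_trace` is hypothetical):

```python
import re

# Matches call-trace entries such as:
#   [  767.109497]  [<ffffffff8106eb20>] try_to_del_timer_sync+0x20/0x70
# An optional "? " before the symbol marks an unreliable frame,
# as on the do_page_fault line in this report.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_call_trace(text, include_unreliable=False):
    """Return the symbol names after 'Call Trace:', innermost frame first."""
    frames = []
    in_trace = False
    for line in text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        m = FRAME_RE.search(line)
        if m is None:
            break  # first non-frame line (e.g. "Code: ...") ends the trace
        if m.group("unreliable") and not include_unreliable:
            continue
        frames.append(m.group("symbol"))
    return frames
```

Calling `parse_call_trace` on the dmesg text and reversing the result gives the call path in the order it was entered, from the system call down to the timer code.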

Comment 2 Cole Robinson 2013-04-02 13:39:41 UTC

*** This bug has been marked as a duplicate of bug 902012 ***