Bug 690395 - kernel: BUG: soft lockup - CPU#7 stuck for 10s! [ls:14944]
Summary: kernel: BUG: soft lockup - CPU#7 stuck for 10s! [ls:14944]
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.8
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Red Hat Kernel Manager
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-03-24 08:22 UTC by dushy2010
Modified: 2013-02-26 15:50 UTC
CC List: 0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-26 15:50:12 UTC
Target Upstream Version:



Description dushy2010 2011-03-24 08:22:51 UTC
Description of problem:

Our customer has been seeing soft lockups for some time now. They are running a (RHEL 5.1) 2.6.18-128 kernel with patches through 2.6.18-194 on x86_64 servers.

Here's the stack trace:
Mar 18 17:17:54 cu0login3 kernel: BUG: soft lockup - CPU#7 stuck for 10s! [ls:14944]
Mar 18 17:17:54 cu0login3 kernel: CPU 7:
Mar 18 17:17:54 cu0login3 kernel: Modules linked in: ecount(U) blcr(U) blcr_imports(U) eeprom(U) openafs(PU) ipmi_devintf(U) ipmi_si(U) ipmi_msghandler(U) fuse(U) mgc(U) lustre(U) lov(U) mdc(U) osc(U) lquota(U) ko2iblnd(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) rdma_ucm(U) ib_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_uverbs(U) ib_umad(U) mlx4_ib(U) ib_mthca(U) ib_ipoib(U) ipoib_helper(U) ib_cm(U) ib_sa(U) ib_mad(U) ib_core(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) ext3(U) jbd(U) dm_mirror(U) dm_log(U) dm_multipath(U) scsi_dh(U) dm_mod(U) video(U) hwmon(U) backlight(U) sbs(U) i2c_ec(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) nx_nic(U) shpchp(U) mlx4_core(U) i2c_piix4(U) ehci_hcd(U) serio_raw(U) i2c_core(U) ohci_hcd(U) pcspkr(U) nfs(U) nfs_acl(U) fscache(U) lockd(U) sunrpc(U) e1000e(U) bnx2(U) scsi_transport_fc(U) aacraid(U) sata_nv(U) mptscsih(U) mptbase(U) ata_piix(U) usb_storage(U) sata_svw(U) libata(U) cciss(U) sd_mod(U) scsi_mod(U) tg3(U) libphy(U)
Mar 18 17:17:54 cu0login3 kernel: Pid: 14944, comm: ls Tainted: P      2.6.18-128.1.14.el5.8hp.2sp #1
Mar 18 17:17:54 cu0login3 kernel: RIP: 0010:[<ffffffff80065cef>]  [<ffffffff80065cef>] .text.lock.spinlock+0x5/0x30
Mar 18 17:17:54 cu0login3 kernel: RSP: 0018:ffff8101103fdbc0  EFLAGS: 00000282
Mar 18 17:17:54 cu0login3 kernel: RAX: 0000000000000000 RBX: ffff81016f7bdc48 RCX: 000000000000cdfb
Mar 18 17:17:54 cu0login3 kernel: RDX: ffff810160d3c960 RSI: 000000000000cdfb RDI: ffffffff803da380
Mar 18 17:17:54 cu0login3 kernel: RBP: ffff810213991800 R08: ffff810211e7cc48 R09: 0000000000000282
Mar 18 17:17:54 cu0login3 kernel: R10: 00000000deadbeef R11: 0000000000000088 R12: ffff81016f7bdc48
Mar 18 17:17:54 cu0login3 kernel: R13: ffff8101880cc7c0 R14: ffff8101880cc7c0 R15: 0000000000000000
Mar 18 17:17:54 cu0login3 kernel: FS:  00002b58a737ef40(0000) GS:ffff81023f2f13c0(0000) knlGS:00000000557466c0
Mar 18 17:17:54 cu0login3 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 18 17:17:54 cu0login3 kernel: CR2: 00002b58a6c9d3c0 CR3: 0000000198d24000 CR4: 00000000000006e0
Mar 18 17:17:54 cu0login3 kernel:
Mar 18 17:17:54 cu0login3 kernel: Call Trace:
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8002ecaf>] prune_dcache+0xe0/0x149
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8004d960>] shrink_dcache_parent+0x1c/0xe1
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff800e8188>] d_invalidate+0x36/0xc4
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8000d40b>] do_lookup+0x1a0/0x1e6
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8000d52a>] file_read_actor+0x25/0x154
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8000a6d3>] __link_path_walk+0xa01/0xf42
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8000ef13>] link_path_walk+0x5c/0xe5
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8000d0ef>] do_path_lookup+0x270/0x2e8
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff80023a64>] __path_lookup_intent_open+0x56/0x97
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8001b3d6>] open_namei+0x73/0x6d5
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff80067c10>] do_page_fault+0x4fe/0x830
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff800278eb>] do_filp_open+0x1c/0x38
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8001a19b>] do_sys_open+0x44/0xbe
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8005e28d>] tracesys+0xd5/0xe0
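
For comparing this trace against later occurrences, the frames can be pulled out of the syslog text with a short script. The following is only a minimal sketch: the names `FRAME_RE` and `parse_frames` are made up for illustration, and it assumes every frame has the `[<address>] symbol+0xoffset/0xsize` shape seen in the trace above. It has not been run against the customer's logs.

```python
import re

# Assumed frame shape, taken from the trace above:
#   [<ffffffff8002ecaf>] prune_dcache+0xe0/0x149
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{16})>\]\s+(\S+?)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frames(log_text):
    """Return (address, symbol, offset, size) tuples for the frames
    that follow the first 'Call Trace:' line in log_text."""
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            break  # the first non-frame line ends the trace
        addr, symbol, offset, size = match.groups()
        frames.append((int(addr, 16), symbol, int(offset, 16), int(size, 16)))
    return frames


# Three lines copied from the trace above, used as sample input.
sample = """\
Mar 18 17:17:54 cu0login3 kernel: Call Trace:
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8002ecaf>] prune_dcache+0xe0/0x149
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8004d960>] shrink_dcache_parent+0x1c/0xe1
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff800e8188>] d_invalidate+0x36/0xc4
"""

for addr, symbol, offset, size in parse_frames(sample):
    print(f"{addr:#x} {symbol} +{offset:#x}/{size:#x}")
```

Reducing each occurrence to its list of symbols makes it easy to check whether repeated lockups all spin in the same `prune_dcache` path.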


How reproducible:
Not reproduced

Additional info: 
Is there a patch which can solve this?

Comment 1 Jes Sorensen 2013-02-26 15:50:12 UTC
There is no information in this bug about the system configuration, workload,
etc.

If you see something like this again, please contact your Red Hat Technical
Support representative.

