Description of problem:
Our customer has been seeing soft lockups for some time now. They are running the (RHEL 5.1) 2.6.18-128 kernel plus patches through 2.6.18-194 on x86_64 servers.

Here's the stack trace:

Mar 18 17:17:54 cu0login3 kernel: BUG: soft lockup - CPU#7 stuck for 10s! [ls:14944]
Mar 18 17:17:54 cu0login3 kernel: CPU 7:
Mar 18 17:17:54 cu0login3 kernel: Modules linked in: ecount(U) blcr(U) blcr_imports(U) eeprom(U) openafs(PU) ipmi_devintf(U) ipmi_si(U) ipmi_msghandler(U) fuse(U) mgc(U) lustre(U) lov(U) mdc(U) osc(U) lquota(U) ko2iblnd(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) rdma_ucm(U) ib_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_uverbs(U) ib_umad(U) mlx4_ib(U) ib_mthca(U) ib_ipoib(U) ipoib_helper(U) ib_cm(U) ib_sa(U) ib_mad(U) ib_core(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) ext3(U) jbd(U) dm_mirror(U) dm_log(U) dm_multipath(U) scsi_dh(U) dm_mod(U) video(U) hwmon(U) backlight(U) sbs(U) i2c_ec(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) nx_nic(U) shpchp(U) mlx4_core(U) i2c_piix4(U) ehci_hcd(U) serio_raw(U) i2c_core(U) ohci_hcd(U) pcspkr(U) nfs(U) nfs_acl(U) fscache(U) lockd(U) sunrpc(U) e1000e(U) bnx2(U) scsi_transport_fc(U) aacraid(U) sata_nv(U) mptscsih(U) mptbase(U) ata_piix(U) usb_storage(U) sata_svw(U) libata(U) cciss(U) sd_mod(U) scsi_mod(U) tg3(U) libphy(U)
Mar 18 17:17:54 cu0login3 kernel: Pid: 14944, comm: ls Tainted: P 2.6.18-128.1.14.el5.8hp.2sp #1
Mar 18 17:17:54 cu0login3 kernel: RIP: 0010:[<ffffffff80065cef>] [<ffffffff80065cef>] .text.lock.spinlock+0x5/0x30
Mar 18 17:17:54 cu0login3 kernel: RSP: 0018:ffff8101103fdbc0 EFLAGS: 00000282
Mar 18 17:17:54 cu0login3 kernel: RAX: 0000000000000000 RBX: ffff81016f7bdc48 RCX: 000000000000cdfb
Mar 18 17:17:54 cu0login3 kernel: RDX: ffff810160d3c960 RSI: 000000000000cdfb RDI: ffffffff803da380
Mar 18 17:17:54 cu0login3 kernel: RBP: ffff810213991800 R08: ffff810211e7cc48 R09: 0000000000000282
Mar 18 17:17:54 cu0login3 kernel: R10: 00000000deadbeef R11: 0000000000000088 R12: ffff81016f7bdc48
Mar 18 17:17:54 cu0login3 kernel: R13: ffff8101880cc7c0 R14: ffff8101880cc7c0 R15: 0000000000000000
Mar 18 17:17:54 cu0login3 kernel: FS: 00002b58a737ef40(0000) GS:ffff81023f2f13c0(0000) knlGS:00000000557466c0
Mar 18 17:17:54 cu0login3 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 18 17:17:54 cu0login3 kernel: CR2: 00002b58a6c9d3c0 CR3: 0000000198d24000 CR4: 00000000000006e0
Mar 18 17:17:54 cu0login3 kernel:
Mar 18 17:17:54 cu0login3 kernel: Call Trace:
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8002ecaf>] prune_dcache+0xe0/0x149
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8004d960>] shrink_dcache_parent+0x1c/0xe1
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff800e8188>] d_invalidate+0x36/0xc4
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8000d40b>] do_lookup+0x1a0/0x1e6
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8000d52a>] file_read_actor+0x25/0x154
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8000a6d3>] __link_path_walk+0xa01/0xf42
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8000ef13>] link_path_walk+0x5c/0xe5
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8000d0ef>] do_path_lookup+0x270/0x2e8
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff80023a64>] __path_lookup_intent_open+0x56/0x97
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8001b3d6>] open_namei+0x73/0x6d5
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff80067c10>] do_page_fault+0x4fe/0x830
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff800278eb>] do_filp_open+0x1c/0x38
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8001a19b>] do_sys_open+0x44/0xbe
Mar 18 17:17:54 cu0login3 kernel: [<ffffffff8005e28d>] tracesys+0xd5/0xe0

How reproducible:
Not reproduced

Additional info:
Is there a patch which can solve this?
There is no information in this bug about the system configuration, workload, etc. If you see something like this again, please contact your Red Hat Technical Support representative.