Bug 180641
Summary: | System hung: module load/unload | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | James Smart <james.smart> |
Component: | kernel | Assignee: | Tom Coughlan <coughlan> |
Status: | CLOSED WONTFIX | QA Contact: | Brian Brock <bbrock> |
Severity: | high | Docs Contact: | |
Priority: | medium | | |
Version: | 4.0 | CC: | jbaron |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | powerpc | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2012-06-20 15:55:13 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Attachments: | | | |

Description
James Smart
2006-02-09 17:07:57 UTC
Created attachment 124442
Attached is the process listing
Here's another occurrence. Happened on the insmod this time...

-- james

====

I saw this hang again, only this time it was in insmod. The common thread, however, seems to be pcnet32 and a segment table fault. Here are the stacks for each processor:

    0:mon> t
    [link register ] c00000000004bd80 .__ste_allocate+0xc8/0x15c
    [c00000000fffba20] 0000000043ecc6d0 (unreliable)
    [c00000000fffbaa0] c00000000000b314 .do_ste_alloc+0x4/0x70
    --- Exception: 401 (Instruction Access) at d00000000019889c .pcnet32_dwio_read_]
    [c00000000fffbe10] d00000000019cc60 .pcnet32_interrupt+0xc0/0x760 [pcnet32]
    [c00000000fffbef0] c000000000013798 .handle_irq_event+0x80/0xf8
    [c00000000fffbf90] c000000000018634 .call_handle_irq_event+0x14/0x24
    [c0000000003a3950] c000000000013c5c .ppc_irq_dispatch_handler+0x278/0x398
    [c0000000003a3a10] c000000000013e14 .do_IRQ+0x98/0x100
    [c0000000003a3a90] c00000000000ae34 HardwareInterrupt_entry+0x8/0x54
    --- Exception: 501 (Hardware Interrupt) at c000000000014550 .default_idle+0x80/4
    [c0000000003a3e00] c000000000014954 .cpu_idle+0x38/0x50
    [c0000000003a3e70] c00000000000c488 .rest_init+0x78/0x90
    [c0000000003a3ef0] c00000000036998c .start_kernel+0x284/0x29c
    [c0000000003a3f90] c00000000000c0c8 .__setup_cpu_power3+0x0/0x4
    0:mon> c1
    1:mon> t
    [c00000000fff7da0] d00000000019e258 .pcnet32_watchdog+0x44/0xac [pcnet32]
    [c00000000fff7e30] c00000000006bdf8 .run_timer_softirq+0x1c8/0x230
    [c00000000fff7ef0] c000000000065704 .__do_softirq+0xa0/0x17c
    [c00000000fff7f90] c000000000018610 .call_do_softirq+0x14/0x24
    [c00000003486a730] c0000000000144a8 .do_softirq+0x74/0x9c
    [c00000003486a7c0] c0000000000153e0 .timer_interrupt+0x344/0x374
    [c00000003486a8a0] c00000000000a2b4 Decrementer_common+0xb4/0x100
    --- Exception: 901 (Decrementer) at c0000000001d1590 .serial_in+0x154/0x190
    [c00000003486ac10] c0000000001d47b4 .serial8250_console_write+0xa0/0x360
    [c00000003486acc0] c00000000005ee78 .__call_console_drivers+0x94/0xc0
    [c00000003486ad50] c00000000005f0a0 .call_console_drivers+0x120/0x16c
    [c00000003486adf0] c00000000005f58c .release_console_sem+0x80/0x13c
    [c00000003486ae90] c00000000005f40c .vprintk+0x1cc/0x208
    [c00000003486af20] c00000000005f22c .printk+0x30/0x44
    [c00000003486afa0] d00000000002d4ec .sd_probe+0x344/0x394 [sd_mod]
    [c00000003486b050] c0000000001d7d90 .bus_match+0x94/0xd8
    [c00000003486b0e0] c0000000001d7e4c .device_attach+0x78/0xf8
    [c00000003486b170] c0000000001d8318 .bus_add_device+0xc0/0x138
    [c00000003486b210] c0000000001d6384 .device_add+0xa4/0x1ec
    [c00000003486b2b0] d00000000006ea8c .scsi_sysfs_add_sdev+0x17c/0x38c [scsi_mod]
    [c00000003486b360] d00000000006cb24 .scsi_add_lun+0x388/0x3b4 [scsi_mod]
    [c00000003486b3f0] d00000000006cd94 .scsi_probe_and_add_lun+0x194/0x250 [scsi_m]
    [c00000003486b4c0] d00000000006d418 .scsi_report_lun_scan+0x38c/0x40c [scsi_mod]
    [c00000003486b5d0] d00000000006d7c4 .scsi_scan_target+0xb4/0x114 [scsi_mod]
    [c00000003486b680] d00000000006d89c .scsi_scan_channel+0x78/0xd8 [scsi_mod]
    [c00000003486b720] d00000000006da24 .scsi_scan_host_selected+0x128/0x1c0 [scsi_]
    [c00000003486b7d0] d000000000106224 .lpfc_pci_probe_one+0x5a8/0x700 [lpfc]
    [c00000003486b890] c000000000179714 .pci_device_probe_static+0x6c/0xa8
    [c00000003486b920] c000000000179794 .__pci_device_probe+0x44/0x7c
    [c00000003486b9b0] c000000000179808 .pci_device_probe+0x3c/0x6c
    [c00000003486ba40] c0000000001d7d90 .bus_match+0x94/0xd8
    [c00000003486bad0] c0000000001d7f3c .driver_attach+0x70/0xe4
    [c00000003486bb60] c0000000001d8750 .bus_add_driver+0xf4/0x158
    [c00000003486bc00] c0000000001d8e4c .driver_register+0x38/0x4c
    [c00000003486bc80] c000000000179c3c .pci_register_driver+0x80/0xcc
    [c00000003486bd10] d0000000001066b4 .lpfc_init+0x54/0x84 [lpfc]
    [c00000003486bd90] c000000000082b48 .sys_init_module+0x24c/0x4b4
    [c00000003486be30] c000000000011280 syscall_exit+0x0/0x18
    --- Exception: c01 (System Call) at 000000000ff32190
    SP (ffffe510) is in userspace

cpu1 is doing the insmod. While this is going on, cpu0 begins processing a hardware interrupt for pcnet32.
pcnet32_interrupt acquires a local spin lock and then takes a segment table fault. While all this is going on, cpu1 takes a timer interrupt and begins processing pcnet32_watchdog. This blocks trying to acquire the same spin lock that cpu0 acquired in pcnet32_interrupt and still owns. At this point we appear to be deadlocked.

If you look at the rmmod hang below, it's very similar: cpu1 took a segment table fault while in pcnet32_start_xmit, and cpu0 is blocked in pcnet32_interrupt trying to acquire the spin lock that cpu1 owns. At this point I'm suspecting a problem with pcnet32.

David, does this look familiar at all? Any chance of a fix in more recent kernels?

Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release for which you requested us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.
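
For readers following the hang analysis in the description (one CPU takes the driver's spin lock in pcnet32_interrupt and then stalls on a segment table fault, while the other CPU waits on the same lock in pcnet32_watchdog), here is a rough user-space sketch of that pattern. It is only a model, not the pcnet32 code: `simulate_stalled_holder`, `dev_lock`, and the two events are hypothetical names, Python threads stand in for the CPUs, and a timeout replaces the endless spin so the sketch can report the stall instead of hanging.

```python
import threading
from typing import Tuple


def simulate_stalled_holder(wait_seconds: float = 0.2) -> Tuple[bool, bool]:
    """Return (acquired_while_holder_stalled, acquired_after_holder_resumed)."""
    dev_lock = threading.Lock()            # hypothetical stand-in for the driver's spin lock
    holder_has_lock = threading.Event()    # set once the "interrupt" path owns the lock
    fault_resolved = threading.Event()     # set when the simulated fault is resolved

    def interrupt_path() -> None:
        # Models cpu0: take the lock, then stall before releasing it.
        with dev_lock:
            holder_has_lock.set()
            fault_resolved.wait()

    holder = threading.Thread(target=interrupt_path)
    holder.start()
    holder_has_lock.wait()

    # Models cpu1's watchdog path. A real spin lock would spin here
    # indefinitely; the timeout only exists so this sketch terminates.
    acquired_while_stalled = dev_lock.acquire(timeout=wait_seconds)
    if acquired_while_stalled:
        dev_lock.release()

    # Let the holder finish, then try again.
    fault_resolved.set()
    holder.join()
    acquired_after_resume = dev_lock.acquire(timeout=wait_seconds)
    if acquired_after_resume:
        dev_lock.release()

    return acquired_while_stalled, acquired_after_resume


if __name__ == "__main__":
    print(simulate_stalled_holder())
```

The waiter makes no progress for as long as the holder is stalled, which matches the reported symptom: if the fault on the lock-holding CPU never completes, the other CPU spins forever.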