Description of problem:
During system boot, the kernel panics because the PROM assumes one size for the spinlock_t struct while the kernel actually uses another.

Version-Release number of selected component (if applicable):
fc5-test1/fc-devel

How reproducible:
100%

Steps to Reproduce:
1. Boot fc5-test1

Actual results:
ACPI: PCI Interrupt 0003:01:01.0[A]: no GSI
mptbase: Initiating ioc0 bringup
ioc0: 53C1030: Capabilities={Initiator,Target}
scsi1 : ioc0: LSI53C1030, FwRev=01032710h, Ports=1, MaxQ=255, IRQ=62
ACPI: PCI Interrupt 0003:01:01.1[B]: no GSI
mptbase: Initiating ioc1 bringup
ioc1: 53C1030: Capabilities={Initiator,Target}
BUG: spinlock bad magic on CPU#2, swapper/0
 lock: e0000030793cc148, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0

Call Trace:
 [<a0000001000107e0>] show_stack+0x80/0xa0  sp=e0000fb004a479f0 bsp=e0000fb004a410e0
 [<a000000100010830>] dump_stack+0x30/0x60  sp=e0000fb004a47bc0 bsp=e0000fb004a410c8
 [<a000000100341c40>] spin_bug+0x100/0x120  sp=e0000fb004a47bc0 bsp=e0000fb004a410a0
 [<a000000100341cb0>] _raw_spin_lock+0x50/0x260  sp=e0000fb004a47bc0 bsp=e0000fb004a41060
 [<a00000010072edd0>] _spin_lock_irqsave+0x30/0x60  sp=e0000fb004a47bc0 bsp=e0000fb004a41038
 [<a000000100645e60>] sn_dma_flush+0x580/0x660  sp=e0000fb004a47bc0 bsp=e0000fb004a41010
 [<a00000010063fee0>] ___sn_readl+0x40/0x60  sp=e0000fb004a47bc0 bsp=e0000fb004a40fe0
 [<a0000001005d1340>] mpt_interrupt+0x60/0xbc0  sp=e0000fb004a47bc0 bsp=e0000fb004a40f78
 [<a0000001000bc530>] handle_IRQ_event+0x90/0x120  sp=e0000fb004a47bc0 bsp=e0000fb004a40f38
 [<a0000001000bc840>] __do_IRQ+0x280/0x380  sp=e0000fb004a47bc0 bsp=e0000fb004a40ee0
 [<a00000010000f810>] ia64_handle_irq+0xf0/0x180  sp=e0000fb004a47bc0 bsp=e0000fb004a40e98
 [<a00000010000bb60>] ia64_leave_kernel+0x0/0x280  sp=e0000fb004a47bc0 bsp=e0000fb004a40e98
 [<a00000010000fd00>] ia64_pal_call_static+0xa0/0xc0  sp=e0000fb004a47d90 bsp=e0000fb004a40e48
 [<a0000001000113b0>] default_idle+0x110/0x1c0  sp=e0000fb004a47d90 bsp=e0000fb004a40e00
 [<a000000100011a70>] cpu_idle+0x250/0x2e0  sp=e0000fb004a47e30 bsp=e0000fb004a40da0
 [<a0000001000562c0>] start_secondary+0x300/0x320  sp=e0000fb004a47e30 bsp=e0000fb004a40d60
 [<a000000100008240>] __end_ivt_text+0x320/0x350  sp=e0000fb004a47e30 bsp=e0000fb004a40d60
mptbase: Initiating ioc1 recovery
BUG: spinlock lockup on CPU#2, swapper/0, e0000030793cc148

Call Trace:
 [<a0000001000107e0>] show_stack+0x80/0xa0  sp=e0000fb004a479f0 bsp=e0000fb004a410b8
 [<a000000100010830>] dump_stack+0x30/0x60  sp=e0000fb004a47bc0 bsp=e0000fb004a410a0
 [<a000000100341e80>] _raw_spin_lock+0x220/0x260  sp=e0000fb004a47bc0 bsp=e0000fb004a41060
 [<a00000010072edd0>] _spin_lock_irqsave+0x30/0x60  sp=e0000fb004a47bc0 bsp=e0000fb004a41038
 [<a000000100645e60>] sn_dma_flush+0x580/0x660  sp=e0000fb004a47bc0 bsp=e0000fb004a41010
 [<a00000010063fee0>] ___sn_readl+0x40/0x60  sp=e0000fb004a47bc0 bsp=e0000fb004a40fe0
 [<a0000001005d1340>] mpt_interrupt+0x60/0xbc0  sp=e0000fb004a47bc0 bsp=e0000fb004a40f78
 [<a0000001000bc530>] handle_IRQ_event+0x90/0x120  sp=e0000fb004a47bc0 bsp=e0000fb004a40f38
 [<a0000001000bc840>] __do_IRQ+0x280/0x380  sp=e0000fb004a47bc0 bsp=e0000fb004a40ee0
 [<a00000010000f810>] ia64_handle_irq+0xf0/0x180  sp=e0000fb004a47bc0 bsp=e0000fb004a40e98
 [<a00000010000bb60>] ia64_leave_kernel+0x0/0x280  sp=e0000fb004a47bc0 bsp=e0000fb004a40e98
 [<a00000010000fd00>] ia64_pal_call_static+0xa0/0xc0  sp=e0000fb004a47d90 bsp=e0000fb004a40e48
 [<a0000001000113b0>] default_idle+0x110/0x1c0  sp=e0000fb004a47d90 bsp=e0000fb004a40e00
 [<a000000100011a70>] cpu_idle+0x250/0x2e0  sp=e0000fb004a47e30 bsp=e0000fb004a40da0
 [<a0000001000562c0>] start_secondary+0x300/0x320  sp=e0000fb004a47e30 bsp=e0000fb004a40d60
 [<a000000100008240>] __end_ivt_text+0x320/0x350  sp=e0000fb004a47e30 bsp=e0000fb004a40d60
mptbase: Initiating ioc1 recovery
mptbase: Initiating ioc1 recovery
mptbase: Initiating ioc1 recovery
mptbase: Initiating ioc1 recovery
mptbase: Initiating ioc1 recovery
mptbase: Initiating ioc1 recovery

System unresponsive at this point.

Expected results:
No panic should occur.

Additional info:
I have submitted a patch to fix this upstream and will backport this to Fedora...
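For illustration only, here is a minimal C sketch of the failure mode. These are hypothetical stand-in types, not the actual kernel or PROM headers: a structure shared between the PROM and the kernel embeds a lock whose size depends on kernel CONFIG options (a debug spinlock adds the .magic/.owner/.owner_cpu fields seen in the oops), so the two sides disagree on the offsets of everything after the lock:

```c
#include <stddef.h>
#include <stdint.h>

/* Two layouts for the lock.  The second mimics what a
 * CONFIG_DEBUG_SPINLOCK-style kernel adds (.magic, .owner,
 * .owner_cpu -- the fields printed in the oops above). */
typedef struct { volatile uint32_t slock; } spinlock_small_t;
typedef struct {
    volatile uint32_t slock;
    uint32_t magic;
    void    *owner;
    int      owner_cpu;
} spinlock_debug_t;

/* The shared flush structure embeds the lock, so every field after
 * it shifts when the lock grows. */
struct flush_list_small {
    spinlock_small_t sfdl_flush_lock;
    uint64_t         sfdl_flush_addr;   /* written by the PROM via SAL */
};
struct flush_list_debug {
    spinlock_debug_t sfdl_flush_lock;
    uint64_t         sfdl_flush_addr;
};

/* Nonzero when the two builds disagree on where sfdl_flush_addr
 * lives: a PROM compiled against the small layout would then write
 * into the middle of the kernel's debug lock. */
int layouts_disagree(void)
{
    return offsetof(struct flush_list_small, sfdl_flush_addr)
        != offsetof(struct flush_list_debug, sfdl_flush_addr);
}
```

On any 64-bit ABI the debug lock is larger, so the offsets differ and the PROM's writes corrupt the kernel structure, which is exactly the "spinlock bad magic" signature above.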
Sorry -- got the version wrong -- should be 'devel' :( P.
Created attachment 122717 [details]
Patch to fix sn_flush_device_kernel & spinlock initialization

This patch separates the sn_flush_device_list struct into kernel and common (both kernel- and PROM-accessible) structures. As it was, if the size of a spinlock_t changed (due to additional CONFIG options, etc.), the SAL call which populated the sn_flush_device_list structs would erroneously write data and cause memory corruption and/or a panic.

This patch does the following:
1. Removes sn_flush_device_list and adds sn_flush_device_common and sn_flush_device_kernel.
2. Adds a new SAL call to populate a sn_flush_device_common struct per device, not per widget as previously done.
3. Correctly initializes each device's sn_flush_device_kernel spinlock_t struct (previously only each widget's first device was initialized).
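To sketch the shape of the fix in C (field names and the spin_lock_init stand-in are illustrative assumptions; the real definitions are in the attached patch): the PROM-visible layout is frozen in a common struct, and the kernel wraps it in its own struct that holds the lock.

```c
#include <stdint.h>

/* PROM-visible part: fixed-width fields only, so its layout can no
 * longer drift with kernel CONFIG options.  A SAL call fills in one
 * of these per device.  (Field names are illustrative.) */
struct sn_flush_device_common {
    int32_t  sfdl_bus;
    int32_t  sfdl_slot;
    int32_t  sfdl_pin;
    uint64_t sfdl_flush_addr;
};

/* Kernel-only part: the spinlock lives here, out of the PROM's
 * reach, so it is free to change size between configs. */
typedef struct { volatile uint32_t slock; } spinlock_t;   /* stand-in */

struct sn_flush_device_kernel {
    spinlock_t                     sfdl_flush_lock;
    struct sn_flush_device_common *common;
};

static inline void spin_lock_init_standin(spinlock_t *l) { l->slock = 0; }

/* Initialize the lock for every device, not just each widget's first
 * device as the old code did. */
void init_flush_devices(struct sn_flush_device_kernel *devs,
                        struct sn_flush_device_common *commons, int ndevs)
{
    for (int i = 0; i < ndevs; i++) {
        spin_lock_init_standin(&devs[i].sfdl_flush_lock);
        devs[i].common = &commons[i];
    }
}
```

With this split, the SAL call only ever writes into sn_flush_device_common, whose size is the same for the PROM and for every kernel config.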
Merged in today's CVS; will be in tomorrow's build.