Bug 176827

Summary: Altix: Fix sn_flush_device_kernel & spinlock initialization
Product: [Fedora] Fedora
Reporter: Prarit Bhargava <prarit>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED RAWHIDE
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: rawhide
CC: pfrields, wtogami
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2006-01-13 05:07:40 UTC
Bug Blocks: 163350    
Attachments:
Patch to fix sn_flush_device_kernel & spinlock initialization (flags: none)

Description Prarit Bhargava 2006-01-03 14:27:08 UTC
Description of problem: 
 
During system boot, the kernel panics because the PROM assumes one size 
for the spinlock_t struct while the kernel actually uses a different size. 
 
Version-Release number of selected component (if applicable): 
fc5-test1/fc-devel 
 
 
How reproducible: 100% 
 
 
Steps to Reproduce: 
 
1. Boot fc5-test1 
   
Actual results: 
 
ACPI: PCI Interrupt 0003:01:01.0[A]: no GSI 
mptbase: Initiating ioc0 bringup 
ioc0: 53C1030: Capabilities={Initiator,Target} 
scsi1 : ioc0: LSI53C1030, FwRev=01032710h, Ports=1, MaxQ=255, IRQ=62 
ACPI: PCI Interrupt 0003:01:01.1[B]: no GSI 
mptbase: Initiating ioc1 bringup 
ioc1: 53C1030: Capabilities={Initiator,Target} 
BUG: spinlock bad magic on CPU#2, swapper/0 
 lock: e0000030793cc148, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 
 
Call Trace: 
 [<a0000001000107e0>] show_stack+0x80/0xa0 
                                sp=e0000fb004a479f0 bsp=e0000fb004a410e0 
 [<a000000100010830>] dump_stack+0x30/0x60 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a410c8 
 [<a000000100341c40>] spin_bug+0x100/0x120 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a410a0 
 [<a000000100341cb0>] _raw_spin_lock+0x50/0x260 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a41060 
 [<a00000010072edd0>] _spin_lock_irqsave+0x30/0x60 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a41038 
 [<a000000100645e60>] sn_dma_flush+0x580/0x660 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a41010 
 [<a00000010063fee0>] ___sn_readl+0x40/0x60 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a40fe0 
 [<a0000001005d1340>] mpt_interrupt+0x60/0xbc0 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a40f78 
 [<a0000001000bc530>] handle_IRQ_event+0x90/0x120 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a40f38 
 [<a0000001000bc840>] __do_IRQ+0x280/0x380 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a40ee0 
 [<a00000010000f810>] ia64_handle_irq+0xf0/0x180 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a40e98 
 [<a00000010000bb60>] ia64_leave_kernel+0x0/0x280 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a40e98 
 [<a00000010000fd00>] ia64_pal_call_static+0xa0/0xc0 
                                sp=e0000fb004a47d90 bsp=e0000fb004a40e48 
 [<a0000001000113b0>] default_idle+0x110/0x1c0 
                                sp=e0000fb004a47d90 bsp=e0000fb004a40e00 
 [<a000000100011a70>] cpu_idle+0x250/0x2e0 
                                sp=e0000fb004a47e30 bsp=e0000fb004a40da0 
 [<a0000001000562c0>] start_secondary+0x300/0x320 
                                sp=e0000fb004a47e30 bsp=e0000fb004a40d60 
 [<a000000100008240>] __end_ivt_text+0x320/0x350 
                                sp=e0000fb004a47e30 bsp=e0000fb004a40d60 
mptbase: Initiating ioc1 recovery 
BUG: spinlock lockup on CPU#2, swapper/0, e0000030793cc148 
 
Call Trace: 
 [<a0000001000107e0>] show_stack+0x80/0xa0 
                                sp=e0000fb004a479f0 bsp=e0000fb004a410b8 
 [<a000000100010830>] dump_stack+0x30/0x60 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a410a0 
 [<a000000100341e80>] _raw_spin_lock+0x220/0x260 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a41060 
 [<a00000010072edd0>] _spin_lock_irqsave+0x30/0x60 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a41038 
 [<a000000100645e60>] sn_dma_flush+0x580/0x660 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a41010 
 [<a00000010063fee0>] ___sn_readl+0x40/0x60 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a40fe0 
 [<a0000001005d1340>] mpt_interrupt+0x60/0xbc0 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a40f78 
 [<a0000001000bc530>] handle_IRQ_event+0x90/0x120 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a40f38 
 [<a0000001000bc840>] __do_IRQ+0x280/0x380 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a40ee0 
 [<a00000010000f810>] ia64_handle_irq+0xf0/0x180 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a40e98 
 [<a00000010000bb60>] ia64_leave_kernel+0x0/0x280 
                                sp=e0000fb004a47bc0 bsp=e0000fb004a40e98 
 [<a00000010000fd00>] ia64_pal_call_static+0xa0/0xc0 
                                sp=e0000fb004a47d90 bsp=e0000fb004a40e48 
 [<a0000001000113b0>] default_idle+0x110/0x1c0 
                                sp=e0000fb004a47d90 bsp=e0000fb004a40e00 
 [<a000000100011a70>] cpu_idle+0x250/0x2e0 
                                sp=e0000fb004a47e30 bsp=e0000fb004a40da0 
 [<a0000001000562c0>] start_secondary+0x300/0x320 
                                sp=e0000fb004a47e30 bsp=e0000fb004a40d60 
 [<a000000100008240>] __end_ivt_text+0x320/0x350 
                                sp=e0000fb004a47e30 bsp=e0000fb004a40d60 
mptbase: Initiating ioc1 recovery 
mptbase: Initiating ioc1 recovery 
mptbase: Initiating ioc1 recovery 
mptbase: Initiating ioc1 recovery 
mptbase: Initiating ioc1 recovery 
 
The system is unresponsive at this point. 
 
 
Expected results: 
 
No panic should occur. 
 
Additional info: 
 
I have submitted a patch to fix this upstream and will backport this to 
Fedora...

Comment 1 Prarit Bhargava 2006-01-03 14:30:55 UTC
Sorry -- got the version wrong -- should be 'devel' :( 
 
P. 

Comment 2 Prarit Bhargava 2006-01-03 15:47:51 UTC
Created attachment 122717
Patch to fix sn_flush_device_kernel & spinlock initialization

This patch separates the sn_flush_device_list struct into kernel and
common (both kernel and PROM accessible) structures.  Previously, if the
size of a spinlock_t changed (due to additional CONFIG options, etc.), the
SAL call that populated the sn_flush_device_list structs would write data
using the size the PROM assumed rather than the size the kernel actually
used, causing memory corruption and/or a panic.
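
To make the size mismatch concrete, here is a minimal sketch in plain user-space C. All of the demo_* types and fields are hypothetical stand-ins, not the real spinlock_t or sn_flush_device_list definitions. It only illustrates that embedding a lock whose size depends on build options changes the size of the enclosing record, so a writer and a reader that assume different lock sizes disagree about where each array element begins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for spinlock_t as built without and with
 * lock-debugging options; the real kernel definitions differ. */
struct demo_lock_plain {
    unsigned int slock;
};

struct demo_lock_debug {
    unsigned int slock;
    unsigned int magic;
    unsigned int owner_cpu;
    void *owner;
};

/* Hypothetical flush records that embed the lock directly, as the old
 * combined structure did. Field names are illustrative only. */
struct demo_rec_plain {
    unsigned long force_int_addr;
    unsigned long flush_value;
    struct demo_lock_plain lock;
};

struct demo_rec_debug {
    unsigned long force_int_addr;
    unsigned long flush_value;
    struct demo_lock_debug lock;
};

/* Byte offset of element idx in an array of elem_size-byte elements. */
static size_t demo_elem_offset(size_t elem_size, size_t idx)
{
    return elem_size * idx;
}

/* Nonzero when a writer using the plain layout and a reader using the
 * debug layout disagree about where element idx starts. */
static int demo_layouts_disagree(size_t idx)
{
    return demo_elem_offset(sizeof(struct demo_rec_plain), idx) !=
           demo_elem_offset(sizeof(struct demo_rec_debug), idx);
}

/* Small self-check: element 0 always lines up, later elements do not. */
static void demo_check(void)
{
    assert(!demo_layouts_disagree(0));
    assert(demo_layouts_disagree(1));
}
```

Calling demo_check() from a test program exercises both cases. In this sketch the first element happens to line up and every later element does not, which is one way a populated array could look partly sane and still be corrupt.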

This patch does the following:

1.  Removes sn_flush_device_list and adds sn_flush_device_common and
sn_flush_device_kernel.

2.  Adds a new SAL call to populate an sn_flush_device_common struct per
device, not per widget as was previously done.

3.  Correctly initializes each device's sn_flush_device_kernel spinlock_t
struct (previously, only each widget's first device was initialized).
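
The three steps above can be sketched roughly as follows, in user-space C rather than kernel code. Everything prefixed demo_ is hypothetical: the field names, the fake firmware fill routine, the device count, and the magic value are assumptions for illustration and are not taken from the attached patch. The idea shown is that the firmware-visible record contains only fixed-width fields, the kernel-only record holds the lock plus a pointer to the common record, and the lock is initialized for every device rather than only the first.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_DEVS_PER_WIDGET 4            /* assumed count, illustration only */
#define DEMO_LOCK_MAGIC      0xdead4eadu  /* assumed "initialized" marker */

/* Hypothetical lock with a magic field, loosely modeled on the
 * ".magic" value printed in the log above. */
struct demo_lock {
    unsigned int locked;
    unsigned int magic;
};

static void demo_lock_init(struct demo_lock *l)
{
    l->locked = 0;
    l->magic = DEMO_LOCK_MAGIC;
}

/* Shared with firmware: fixed-width fields only, no kernel types. */
struct demo_flush_common {
    uint64_t force_int_addr;
    uint64_t flush_value;
    uint32_t bus;
    uint32_t slot;
};

/* Kernel-only: firmware never sees this layout, so the lock may change size. */
struct demo_flush_kernel {
    struct demo_lock lock;
    struct demo_flush_common *common;
};

/* Stand-in for the per-device firmware call; fills one common record. */
static void demo_firmware_fill(struct demo_flush_common *c, int widget, int dev)
{
    c->force_int_addr = 0;
    c->flush_value = 0;
    c->bus = (uint32_t)widget;
    c->slot = (uint32_t)dev;
}

static void demo_free_widget(struct demo_flush_kernel *k)
{
    if (!k)
        return;
    for (int dev = 0; dev < DEMO_DEVS_PER_WIDGET; dev++)
        free(k[dev].common);   /* free(NULL) is a no-op */
    free(k);
}

/* Allocate and set up every device on one widget. */
static struct demo_flush_kernel *demo_setup_widget(int widget)
{
    struct demo_flush_kernel *k = calloc(DEMO_DEVS_PER_WIDGET, sizeof(*k));

    if (!k)
        return NULL;
    for (int dev = 0; dev < DEMO_DEVS_PER_WIDGET; dev++) {
        k[dev].common = calloc(1, sizeof(*k[dev].common));
        if (!k[dev].common) {
            demo_free_widget(k);
            return NULL;
        }
        demo_firmware_fill(k[dev].common, widget, dev);
        demo_lock_init(&k[dev].lock);   /* every device, not just the first */
    }
    return k;
}
```

Keeping the lock out of the firmware-visible record means a change in lock size affects only the kernel-side allocation, which is the property the split is meant to provide.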

Comment 3 Dave Jones 2006-01-03 21:29:15 UTC
Merged in today's CVS; will be in tomorrow's build.