Bug 52486 - kernel panic with heavy I/O
Summary: kernel panic with heavy I/O
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: ia64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2001-08-24 05:31 UTC by Shinya Narahara
Modified: 2005-10-31 22:00 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2001-09-06 14:12:23 UTC
Embargoed:



Description Shinya Narahara 2001-08-24 05:31:57 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.75 [ja] (WinNT; U)

Description of problem:
Kernels 2.4.3-12smp and 2.4.7-2smp can panic.
Only the SMP kernel is affected; the UP kernel is not.


Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. Attach many I/O boards to an IA-64 machine.
2. Boot with the SMP kernel.
3. Generate very heavy I/O.
	

Actual Results: Kernel panic.

Expected Results:  The I/O completes normally.

Additional info:

The panic occurs in build_script() in arch/ia64/kernel/unwind.c,
in the loop that searches the list for the matching unwind table:

>        for (table = unw.tables; table; table = table->next) {
>                if (ip >= table->start && ip < table->end) {           <-- Here

This list needs spinlock protection on the SMP kernel, because
unw_remove_unwind_table() deletes elements from the list and can
be run by another CPU while this loop is walking it.
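
To illustrate the race, here is a minimal userspace sketch, not the kernel code. It assumes a pthread mutex as a stand-in for the spinlock, and the names table_add(), table_remove(), table_find() and tables_lock are hypothetical. The point is only that the reader walking the list and the writer unlinking an element must take the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one unwind table; covers [start, end). */
struct table {
	unsigned long start, end;
	struct table *next;
};

static struct table *tables;	/* list head, plays the role of unw.tables */
static pthread_mutex_t tables_lock = PTHREAD_MUTEX_INITIALIZER; /* role of unw.lock */

/* Writer: link a new table at the head of the list. NULL on allocation failure. */
struct table *table_add(unsigned long start, unsigned long end)
{
	struct table *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->start = start;
	t->end = end;
	pthread_mutex_lock(&tables_lock);
	t->next = tables;
	tables = t;
	pthread_mutex_unlock(&tables_lock);
	return t;
}

/* Writer: unlink a table under the lock, then free it if it was on the list. */
void table_remove(struct table *victim)
{
	struct table **pp;
	int unlinked = 0;

	pthread_mutex_lock(&tables_lock);
	for (pp = &tables; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			unlinked = 1;
			break;
		}
	}
	pthread_mutex_unlock(&tables_lock);
	if (unlinked)
		free(victim);
}

/*
 * Reader: the same search loop as in the report, but done under the lock.
 * The bounds are copied out while the lock is held, so the caller never
 * touches a table that another thread may already have freed.
 * Returns 1 if ip is covered by some table, 0 otherwise.
 */
int table_find(unsigned long ip, unsigned long *start, unsigned long *end)
{
	struct table *t;
	int found = 0;

	assert(start && end);
	pthread_mutex_lock(&tables_lock);
	for (t = tables; t; t = t->next) {
		if (ip >= t->start && ip < t->end) {
			*start = t->start;
			*end = t->end;
			found = 1;
			break;
		}
	}
	pthread_mutex_unlock(&tables_lock);
	return found;
}
```

Without the lock in table_find(), a concurrent table_remove() could free the element the loop is currently looking at, and the next dereference of t would read freed memory.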

The quick hack for kernel-2.4.3-12smp is below:

--- arch/ia64/kernel/unwind.c.original	Wed Aug  1 06:54:24 2001
+++ arch/ia64/kernel/unwind.c	Thu Aug  2 04:49:40 2001
@@ -1398,6 +1398,7 @@
 	struct unw_insn insn;
 	u8 *dp, *desc_end;
 	u64 hdr;
+	unsigned long flags;
 	int i;
 	STAT(unsigned long start, parse_start;)
 
@@ -1420,6 +1421,8 @@
 	/* search the kernels and the modules' unwind tables for IP: */
 
 	STAT(parse_start = ia64_get_itc());
+	
+	spin_lock_irqsave(&unw.lock, flags);
 
 	for (table = unw.tables; table; table = table->next) {
 		if (ip >= table->start && ip < table->end) {
@@ -1427,6 +1430,9 @@
 			break;
 		}
 	}
+
+	spin_unlock_irqrestore(&unw.lock, flags);
+
 	if (!e) {
 		/* no info, return default unwinder (leaf proc, no mem stack, no saved regs)  */
 		dprintk("unwind: no unwind info for ip=0x%lx (prev ip=0x%lx)\n", ip,
@@ -2057,3 +2063,13 @@
 			return -EFAULT;
 	return unw.gate_table_size;
 }

Comment 1 Glen Foster 2001-08-24 15:50:49 UTC
This defect is considered MUST-FIX for Fairfax.

Comment 2 Arjan van de Ven 2001-09-06 14:12:18 UTC
Looks like a problem indeed; patch added to the kernel.


Comment 3 Arjan van de Ven 2001-09-06 14:17:27 UTC
This will show up in 2.4.9-0.1 or later in rawhide.

