Bug 215274

Summary: totally busted hotplug cpu locking
Product: [Fedora] Fedora
Reporter: Joachim Backes <joachim.backes>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED NEXTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 5
CC: wtogami
Hardware: athlon
OS: Linux
Doc Type: Bug Fix
Last Closed: 2006-12-06 16:17:19 UTC

Description Joachim Backes 2006-11-13 06:23:44 UTC
Description of problem:
Running FC5 on a Sun Fire V40Z with 4 dual-core Opterons and 16 GB of memory.
The following message appears very often in the kernel logs:

Nov 11 21:51:01 lindb kernel: BUG: warning at kernel/cpu.c:56/unlock_cpu_hotplug() (Not tainted)
Nov 11 21:51:01 lindb kernel:
Nov 11 21:51:01 lindb kernel: Call Trace:
Nov 11 21:51:01 lindb kernel:  [<ffffffff80269387>] show_trace+0x34/0x47
Nov 11 21:51:01 lindb kernel:  [<ffffffff802693ac>] dump_stack+0x12/0x17
Nov 11 21:51:01 lindb kernel:  [<ffffffff802a04bc>] unlock_cpu_hotplug+0x47/0x74
Nov 11 21:51:01 lindb kernel:  [<ffffffff802884fa>] sched_getaffinity+0x86/0xab
Nov 11 21:51:01 lindb kernel:  [<ffffffff802ab674>] compat_sys_sched_getaffinity+0x1d/0x47
Nov 11 21:51:01 lindb kernel:  [<ffffffff8025f2f4>] cstar_do_call+0x1b/0x65
Nov 11 21:51:01 lindb kernel: DWARF2 unwinder stuck at cstar_do_call+0x1b/0x65
Nov 11 21:51:01 lindb kernel: Leftover inexact backtrace:


Version-Release number of selected component (if applicable):
2.6.18-1.2200.fc5

How reproducible:
don't know

Steps to Reproduce:
1.
2.
3.
  
Actual results:
--

Expected results:
--

Additional info:

Comment 1 Dave Jones 2006-11-21 00:36:35 UTC
this needs some pretty extensive surgery that isn't going to land upstream until
at least 2.6.20.

Comment 2 Prarit Bhargava 2006-12-06 15:47:38 UTC
Dave, it's a minor bug.

I posted this upstream and on lkml a while ago, but for some reason akpm hasn't
picked it up.  I will re-ping him today.

The patch I posted to lhcs-devel (sorry for the cut-and-paste):

--- linux-2.6.18.ia64-orig/kernel/cpu.c.orig	2006-10-31 10:57:37.000000000 -0500
+++ linux-2.6.18.ia64/kernel/cpu.c	2006-10-31 10:57:46.000000000 -0500
@@ -58,8 +58,8 @@ void unlock_cpu_hotplug(void)
 		recursive_depth--;
 		return;
 	}
-	mutex_unlock(&cpu_bitmask_lock);
 	recursive = NULL;
+	mutex_unlock(&cpu_bitmask_lock);
 }
 EXPORT_SYMBOL_GPL(unlock_cpu_hotplug);
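
Why the ordering matters can be sketched in userspace. This assumes something the
patch context does not show: that lock_cpu_hotplug() stores the calling task in
"recursive" after taking the mutex, and that unlock_cpu_hotplug() warns when
"recursive" is not the caller. Under that assumption, the old order leaves a
window after mutex_unlock() in which a second task can take the mutex and record
itself as owner, only to have the first task overwrite that record with NULL; the
second task's unlock then trips the warning. The names demo_lock, demo_unlock,
run_demo, hotplug_lock, owner and owner_mismatches below are hypothetical
stand-ins, and pthreads replace the kernel primitives.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for cpu_bitmask_lock, recursive and
 * recursive_depth; this is a sketch, not the kernel code. */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic(void *) owner;        /* plays the role of "recursive" */
static int depth;                    /* only touched by the lock holder */
static atomic_int owner_mismatches;  /* counts what the warning would report */

/* The address of a thread-local object is unique per thread, so it can
 * stand in for the kernel's "current". */
static _Thread_local char self_tag;

static void demo_lock(void)
{
	void *me = &self_tag;

	if (atomic_load(&owner) == me) {  /* nested call by the holder */
		depth++;
		return;
	}
	pthread_mutex_lock(&hotplug_lock);
	atomic_store(&owner, me);
}

static void demo_unlock(void)
{
	if (atomic_load(&owner) != (void *)&self_tag)
		atomic_fetch_add(&owner_mismatches, 1);
	if (depth) {
		depth--;
		return;
	}
	/* Patched order: clear the owner while the mutex is still held, so
	 * the next holder's store to "owner" cannot be overwritten. */
	atomic_store(&owner, NULL);
	pthread_mutex_unlock(&hotplug_lock);
}

static void *worker(void *arg)
{
	int iterations = *(int *)arg;

	for (int i = 0; i < iterations; i++) {
		demo_lock();
		demo_lock();    /* nested acquire */
		demo_unlock();
		demo_unlock();
	}
	return NULL;
}

/* Runs up to 16 worker threads and returns the number of owner mismatches. */
int run_demo(int nthreads, int iterations)
{
	pthread_t tids[16];

	if (nthreads > 16)
		nthreads = 16;
	atomic_store(&owner_mismatches, 0);
	for (int i = 0; i < nthreads; i++)
		pthread_create(&tids[i], NULL, worker, &iterations);
	for (int i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);
	return atomic_load(&owner_mismatches);
}
```

With the patched order, run_demo() should report no mismatches because the
owner field is only written while the mutex is held. Swapping the last two
statements of demo_unlock() reintroduces the window described above.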



Comment 3 Prarit Bhargava 2006-12-06 16:17:19 UTC
Fixed in 2.6.19.

P.