Bug 190592

Summary: spinlock bad magic init'ing udev
Product: [Fedora] Fedora
Reporter: Steve Grubb <sgrubb>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED DUPLICATE
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: rawhide
CC: wtogami
Hardware: x86_64
OS: Linux
Last Closed: 2006-05-07 12:23:13 UTC
Attachments:
test patch (flags: none)

Description Steve Grubb 2006-05-03 19:35:23 UTC
Description of problem:
Starting udev: BUG: spinlock bad magic on CPU#0, events/0/5 (Not tainted)
 lock: ffff810037c90280, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
Kernel panic - not syncing: bad locking

Version-Release number of selected component (if applicable):
kernel builds 2181 & 2185

How reproducible:
It locks up on most boots. Adding dontpanic to the kernel boot line leads to a
spinlock lockup on CPU#0 instead. Occasionally it boots normally.


Steps to Reproduce:
1. Boot
  
Actual results:
Locks up requiring a power cycle to get going again.

Expected results:
No lockup.

Comment 1 Steve Grubb 2006-05-03 22:09:24 UTC
I modified the start_udev script to find out what is happening. Calling
udevtrigger seems to be the culprit: if I add --dry-run, I do not get the
spinlock bad magic and boot proceeds, though not everything works. I am trying
to narrow it down to a subsystem, but I don't know exactly how. About 179
events are being replayed. Any ideas?

Comment 2 Steve Grubb 2006-05-07 01:19:42 UTC
I modified the kernel a little to get some info. The call chain seems to be
include/linux/netdevice.h:895, spinlock.c:95.

I also added a printk("%s", dev->name) at line 895 of netdevice.h. When it
fails to boot, it outputs "eth%d <0>". When bootup succeeds, it prints "eth0 <6>
SoftMAC".

This looks like a race: sometimes the device is initialized by that point and
sometimes it is not. It probably needs some sync mechanism. Does this help
narrow things down any?
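
For anyone reading along: "eth%d" is the unexpanded name template an Ethernet device carries until registration gives it a unit number, so seeing it at this point suggests the device was touched before it was registered. A minimal userspace sketch of that check in C follows. The helper name name_is_template is hypothetical and is not a kernel function; it only illustrates the distinction between the two printk outputs above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a kernel API): true when an interface name
 * still contains a '%' placeholder such as "eth%d", i.e. no unit number
 * has been filled in yet. */
bool name_is_template(const char *name)
{
    return name != NULL && strchr(name, '%') != NULL;
}

/* Usage with the two names seen in the printk output. */
void name_is_template_examples(void)
{
    assert(name_is_template("eth%d"));  /* failing boot: still a template */
    assert(!name_is_template("eth0"));  /* good boot: name was assigned */
}
```

A check like this in the debug printk would make the failing case stand out without having to eyeball the output.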

Comment 3 Alexander Viro 2006-05-07 12:05:59 UTC
Created attachment 128706 [details]
test patch

Comment 4 Alexander Viro 2006-05-07 12:10:14 UTC
I see what's going on here: bcm43xx_attach_board() does
schedule_work(), causing execution of ieee80211softmac_assoc_work(),
and does that before the caller of that sucker gets to registering
net_dev.  I'm not familiar with the ieee80211 code, so the suggested
variant may be BS and definitely needs an ACK from somebody who knows
the area; that said, try the patch attached above and see what happens.
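
The ordering problem described in this comment can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions. All model_* and probe_* names are hypothetical. The model assumes that registration is what initializes the lock, which fits the zero .magic in the report but is not confirmed in this thread. It also assumes the worker runs the moment work is queued, the worst case for the race. 0xdead4ead is the magic value the kernel's spinlock debugging checks for. The second ordering is just one way to avoid the problem in the model, and is not a description of what the attached patch does.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_LOCK_MAGIC 0xdead4eadu  /* value spinlock debugging expects */

struct model_dev {
    unsigned int lock_magic;  /* stays 0 until the lock is initialized */
    bool registered;
};

typedef bool (*model_work_fn)(struct model_dev *);

/* Stand-in for the deferred work: it "takes the lock" and reports
 * whether the lock had been initialized at that moment. */
bool model_assoc_work(struct model_dev *dev)
{
    return dev->lock_magic == MODEL_LOCK_MAGIC;
}

/* Assumption of this model: registration initializes the lock. */
void model_register(struct model_dev *dev)
{
    dev->lock_magic = MODEL_LOCK_MAGIC;
    dev->registered = true;
}

/* Worst case for the race: the work runs as soon as it is queued. */
bool model_schedule_work(model_work_fn fn, struct model_dev *dev)
{
    return fn(dev);
}

/* Order described in the comment: schedule first, register afterwards. */
bool probe_schedule_then_register(void)
{
    struct model_dev dev = {0};
    bool lock_ok = model_schedule_work(model_assoc_work, &dev);
    model_register(&dev);
    return lock_ok;
}

/* Alternative order: register first, then schedule the work. */
bool probe_register_then_schedule(void)
{
    struct model_dev dev = {0};
    model_register(&dev);
    return model_schedule_work(model_assoc_work, &dev);
}

void model_examples(void)
{
    assert(!probe_schedule_then_register());  /* work saw magic == 0 */
    assert(probe_register_then_schedule());   /* work saw a valid lock */
}
```

In the real kernel the worker may or may not win the race on any given boot, which would match the "sometimes boots" behaviour reported above.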

Comment 5 David Woodhouse 2006-05-07 12:23:13 UTC

*** This bug has been marked as a duplicate of 190776 ***