Bug 738387

Summary: ipw2200 driver deadlocks with itself trying to take rtnl_mutex
Product: [Fedora] Fedora
Reporter: Mads Kiilerich <mads>
Component: kernel
Assignee: Josh Boyer <jwboyer>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 16
CC: awilliam, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, orion, sgruszka, tflink, wwoods
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: AcceptedBlocker
Fixed In Version: kernel-3.1.0-0.rc6.git0.3.fc16
Doc Type: Bug Fix
Last Closed: 2011-09-24 04:37:25 UTC
Bug Blocks: 713564    
Attachments (description: flags):
udevd killing modprobe: none
dmesg from successful boot with kernel-PAE-3.1.0-0.rc4.git0.1.fc16.i686: none
pci ipw lock: none
working 3.1.0-0.rc6.git0.2.1.fc16.i686.PAE dmesg: none

Description Mads Kiilerich 2011-09-14 17:06:23 UTC
Created attachment 523202 [details]
udevd killing modprobe

Starting with kernel-PAE-3.1.0-0.rc5.git0.0.fc16.i686 I have had problems booting several machines. kernel-PAE-3.1.0-0.rc4.git0.1.fc16.i686 works fine.

The boot process stops after systemd says "Started /boot."

50 seconds later udevd starts reporting errors twice a second:
timeout: killing '/sbin/modprobe -bv pci:
- see the attached screenshot.

I guess the modprobe came from
/lib/udev/rules.d/80-drivers.rules:
DRIVER!="?*", ENV{MODALIAS}=="?*", RUN+="/sbin/modprobe -bv $env{MODALIAS}"
and I don't understand why it keeps trying to kill it - it should either kill harder or try to continue anyway. That is however not the main problem.

I think I have also seen partially working boots. In those cases I ended up with a system without networking, where just running 'ifconfig' hung. Sometimes it could be interrupted with Ctrl-C, sometimes it couldn't.
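One way to check whether a hung command such as 'ifconfig' or the udev-spawned modprobe is stuck inside the kernel is to look for processes in state D (uninterruptible sleep). The following is only a sketch: the helper name list_blocked is made up for illustration, and it is an assumption that the hung processes in this report would show up in that state.

```sh
#!/bin/sh
# Hypothetical helper (not from this report): read "ps -eo pid,stat,comm"
# output on stdin and print the PID and command name of every process whose
# state starts with D (uninterruptible sleep). Only the first word of the
# command name is printed.
list_blocked() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3 }'
}

# Example on a live system:
#   ps -eo pid,stat,comm | list_blocked
# As root, and with sysrq enabled, "echo w > /proc/sysrq-trigger" asks the
# kernel to log stack traces of blocked tasks, which can then be read with dmesg.
```

Reading the ps output from stdin keeps the filter separate from the data source, so it can also be run against a saved process listing.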

http://www.smolts.org/client/show/pub_0ee17054-39dd-440a-8a15-98109cec9d28

Comment 1 Mads Kiilerich 2011-09-14 17:09:36 UTC
Created attachment 523203 [details]
dmesg from successful boot with kernel-PAE-3.1.0-0.rc4.git0.1.fc16.i686

Comment 2 Josh Boyer 2011-09-14 18:47:19 UTC
Can you try 3.1-rc6?  There were a handful of PCI fixes that went into that release.

Comment 3 Mads Kiilerich 2011-09-14 19:19:16 UTC
I see the same problem with rc5 and rc6. Actually the photo might be from rc6 ... and it is possible that most of the successes have been with rc5.

Comment 4 Mads Kiilerich 2011-09-14 20:47:22 UTC
Created attachment 523255 [details]
pci ipw lock

I caught a lock stack trace with rc5.

Comment 5 Chuck Ebbert 2011-09-15 06:22:34 UTC
(In reply to comment #4)
> I caught a lock stack trace with rc5.

It does look like ipw2200 has a serious problem there. I assume blacklisting that lets the system boot normally?

Comment 6 Mads Kiilerich 2011-09-15 09:37:31 UTC
(In reply to comment #5)
> It does look like ipw2200 has a serious problem there. I assume blacklisting
> that lets the system boot normally?

Confirmed. 

Manual "modprobe ipw2200" after boot will hang and bring down the wired network too.
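The blacklisting workaround confirmed here can be sketched as below. This assumes the standard modprobe.d mechanism; the file name blacklist-ipw2200.conf and the helper write_ipw2200_blacklist are illustrative and not taken from this report. The target directory is a parameter so the sketch can be tried in a scratch directory; on a real system it would be /etc/modprobe.d and need root.

```sh
#!/bin/sh
# Hypothetical helper: write a modprobe blacklist entry for ipw2200 into the
# given configuration directory.
write_ipw2200_blacklist() {
    conf_dir="$1"
    mkdir -p "$conf_dir" || return 1
    # "blacklist" only stops alias-based autoloading, such as udev's
    # "modprobe -b <modalias>". An explicit "modprobe ipw2200" still loads
    # the driver, which matches the manual-modprobe hang described above.
    printf '%s\n' 'blacklist ipw2200' > "$conf_dir/blacklist-ipw2200.conf"
}

# Example (scratch directory, not the live configuration):
#   write_ipw2200_blacklist /tmp/modprobe.d-test
```

This is only a stopgap to get an affected machine booted; it leaves the system without the wireless device until a fixed kernel is installed.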

Comment 7 Orion Poplawski 2011-09-15 23:03:43 UTC
Me too.  ThinkPad X32.

Comment 8 Adam Williamson 2011-09-15 23:15:41 UTC
Zoiks!

Comment 9 Josh Boyer 2011-09-15 23:29:48 UTC
John, Stanislaw, is this the fix for this issue?

http://www.spinics.net/lists/linux-wireless/msg76673.html

Comment 10 Josh Boyer 2011-09-16 00:34:42 UTC
I've started a scratch build with the patch from comment #9.  For those of you hitting this issue, could you please test when this completes and let us know if it resolves the problem?

http://koji.fedoraproject.org/koji/taskinfo?taskID=3354471

Comment 11 Adam Williamson 2011-09-16 00:43:10 UTC
I also did a build, if you're really impatient, it's available now:

http://adamwill.fedorapeople.org/kernel-3.1.0-0.rc6.git0.2.1.fc16.x86_64.rpm

Comment 12 Adam Williamson 2011-09-16 00:43:31 UTC
oh, damn, 32-bit might have been a better idea.

Comment 13 Stanislaw Gruszka 2011-09-16 07:28:33 UTC
(In reply to comment #9)
> John, Stanislaw, is this the fix for this issue?
> 
> http://www.spinics.net/lists/linux-wireless/msg76673.html

Yes.

Comment 14 Adam Williamson 2011-09-16 09:16:55 UTC
I've built a 32-bit live image with Josh's kernel in it, to aid testing. It's uploading now to:

http://adamwill.fedorapeople.org/desktop-20110914-i686.iso

(ignore the dumb name :>) 

it should be up in about 30 mins. sha256sum is d1c58cac71e69ddbe1f50f6ebfdc25ba64dcfd078db0fde920a0aba0e34d06c3 .
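A downloaded copy of the image can be checked against the digest quoted above. A minimal sketch, assuming GNU coreutils' sha256sum is available; the helper name verify_sha256 is made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: succeed only if the SHA-256 digest of the file in $1
# equals the expected hex digest in $2.
verify_sha256() {
    file="$1"
    expected="$2"
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | awk '{ print $1 }')
    # An empty digest (e.g. unreadable file) never counts as a match.
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example, using the image name and digest from the comment above:
#   verify_sha256 desktop-20110914-i686.iso \
#       d1c58cac71e69ddbe1f50f6ebfdc25ba64dcfd078db0fde920a0aba0e34d06c3 \
#       && echo "checksum OK"
```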

Comment 15 Mads Kiilerich 2011-09-16 09:24:58 UTC
Created attachment 523524 [details]
working 3.1.0-0.rc6.git0.2.1.fc16.i686.PAE dmesg

I confirm that 3.1.0-0.rc6.git0.2.1 from koji can boot and the wireless seems to work.

Comment 16 Josh Boyer 2011-09-16 11:42:05 UTC
(In reply to comment #15)
> Created attachment 523524 [details]
> working 3.1.0-0.rc6.git0.2.1.fc16.i686.PAE dmesg
> 
> I confirm that 3.1.0-0.rc6.git0.2.1 from koji can boot and the wireless seems
> to work.

Excellent.  I'll commit this today.

Comment 17 Fedora Update System 2011-09-16 13:58:17 UTC
kernel-3.1.0-0.rc6.git0.3.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/kernel-3.1.0-0.rc6.git0.3.fc16

Comment 18 Tim Flink 2011-09-17 02:27:13 UTC
Discussed in the 2011-09-16 blocker review meeting. Even though this bug is HW specific, that hardware is common enough and this bug is serious enough to accept it as a Fedora 16 beta blocker as it violates the following alpha release criterion [1]:

The installed system must be able to download and install updates with yum and the default graphical package manager in all release-blocking desktops 

[1] https://fedoraproject.org/wiki/Fedora_16_Alpha_Release_Criteria

Comment 19 Fedora Update System 2011-09-17 19:33:21 UTC
Package kernel-3.1.0-0.rc6.git0.3.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.1.0-0.rc6.git0.3.fc16'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/kernel-3.1.0-0.rc6.git0.3.fc16
then log in and leave karma (feedback).

Comment 20 Adam Williamson 2011-09-22 00:53:56 UTC
Mads, can you re-confirm this is fixed in git0.3? thanks!

Comment 21 Mads Kiilerich 2011-09-22 01:02:48 UTC
Yes, git0.3 is fine - I gave it karma in bodhi.

Comment 22 Adam Williamson 2011-09-22 01:19:09 UTC
great, setting VERIFIED then.

Comment 23 Adam Williamson 2011-09-22 22:44:42 UTC
Please don't close bugs until they are actually pushed stable when we're frozen, Dave. Otherwise they risk not being pulled into future Beta composes: closed bugs don't appear on the list we use to find updates that are not yet in stable and need to be pulled into the compose.

Comment 24 Fedora Update System 2011-09-24 04:36:49 UTC
kernel-3.1.0-0.rc6.git0.3.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.