Bug 738387 - ipw2200 driver deadlocks with itself trying to take rtnl_mutex
Summary: ipw2200 driver deadlocks with itself trying to take rtnl_mutex
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 16
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Josh Boyer
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker
Depends On:
Blocks: F16Beta, F16BetaBlocker
Reported: 2011-09-14 17:06 UTC by Mads Kiilerich
Modified: 2011-09-24 04:37 UTC (History)

Fixed In Version: kernel-3.1.0-0.rc6.git0.3.fc16
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-09-24 04:37:25 UTC
Type: ---
Embargoed:


Attachments
udevd killing modprobe (514.30 KB, image/jpeg)
2011-09-14 17:06 UTC, Mads Kiilerich
dmesg from successful boot with kernel-PAE-3.1.0-0.rc4.git0.1.fc16.i686 (72.29 KB, text/plain)
2011-09-14 17:09 UTC, Mads Kiilerich
pci ipw lock (1.25 MB, image/jpeg)
2011-09-14 20:47 UTC, Mads Kiilerich
working 3.1.0-0.rc6.git0.2.1.fc16.i686.PAE dmesg (75.41 KB, text/plain)
2011-09-16 09:24 UTC, Mads Kiilerich

Description Mads Kiilerich 2011-09-14 17:06:23 UTC
Created attachment 523202 [details]
udevd killing modprobe

Starting with kernel-PAE-3.1.0-0.rc5.git0.0.fc16.i686 I have had problems booting several machines. kernel-PAE-3.1.0-0.rc4.git0.1.fc16.i686 works fine.

The boot process stops after systemd says "Started /boot."

50 seconds later udevd starts reporting errors twice a second:
timeout: killing '/sbin/modprobe -bv pci:
- see the attached screenshot.

I guess the modprobe came from
/lib/udev/rules.d/80-drivers.rules:
DRIVER!="?*", ENV{MODALIAS}=="?*", RUN+="/sbin/modprobe -bv $env{MODALIAS}"
and I don't understand why it keeps killing it - it should either kill harder or try to continue anyway. That is, however, not the main problem.

I think I have also seen partially working boots, where I ended up with a system without networking. Just running 'ifconfig' would hang. Sometimes it could be interrupted with Ctrl-C, sometimes it couldn't.

http://www.smolts.org/client/show/pub_0ee17054-39dd-440a-8a15-98109cec9d28

Comment 1 Mads Kiilerich 2011-09-14 17:09:36 UTC
Created attachment 523203 [details]
dmesg from successful boot with kernel-PAE-3.1.0-0.rc4.git0.1.fc16.i686

Comment 2 Josh Boyer 2011-09-14 18:47:19 UTC
Can you try 3.1-rc6?  There were a handful of PCI fixes that went into that release.

Comment 3 Mads Kiilerich 2011-09-14 19:19:16 UTC
I see the same problem with rc5 and rc6. Actually the photo might be from rc6 ... and it is possible that most of the successes have been with rc5.

Comment 4 Mads Kiilerich 2011-09-14 20:47:22 UTC
Created attachment 523255 [details]
pci ipw lock

I caught a lock stack trace with rc5.

Comment 5 Chuck Ebbert 2011-09-15 06:22:34 UTC
(In reply to comment #4)
> I caught a lock stack trace with rc5.

It does look like ipw2200 has a serious problem there. I assume blacklisting that module lets the system boot normally?

Comment 6 Mads Kiilerich 2011-09-15 09:37:31 UTC
(In reply to comment #5)
> It does look like ipw2200 has a serious problem there. I assume blacklisting
> that lets the system boot normally?

Confirmed. 

Running "modprobe ipw2200" manually after boot hangs and brings down the wired network too.
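
For readers who need the workaround discussed above, the following is a minimal sketch of blacklisting the module, not something posted in this bug. It assumes the standard modprobe.d "blacklist" directive; the helper name blacklist_module and the file name blacklist-ipw2200.conf are hypothetical choices. The udev rule quoted in the description calls modprobe with -b, which makes it honor such blacklist entries.

```shell
# Sketch only: blacklist_module is a hypothetical helper, not part of any package.
# Usage: blacklist_module MODULE [CONF_DIR]
# CONF_DIR defaults to /etc/modprobe.d (writing there requires root).
blacklist_module() {
    module="$1"
    conf_dir="${2:-/etc/modprobe.d}"
    # A "blacklist" line makes modprobe ignore the module's own aliases,
    # so alias-based autoloading (as done by the udev rule) skips it.
    printf 'blacklist %s\n' "$module" > "$conf_dir/blacklist-$module.conf"
}

# Example (as root): blacklist_module ipw2200
```

A blacklist entry only stops alias-based autoloading. An explicit "modprobe ipw2200" still loads the driver, which matches the manual test reported in this comment.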

Comment 7 Orion Poplawski 2011-09-15 23:03:43 UTC
Me too.  ThinkPad X32.

Comment 8 Adam Williamson 2011-09-15 23:15:41 UTC
Zoiks!

Comment 9 Josh Boyer 2011-09-15 23:29:48 UTC
John, Stanislaw, is this the fix for this issue?

http://www.spinics.net/lists/linux-wireless/msg76673.html

Comment 10 Josh Boyer 2011-09-16 00:34:42 UTC
I've started a scratch build with the patch from comment #9.  For those of you hitting this issue, could you please test when this completes and let us know if it resolves the problem?

http://koji.fedoraproject.org/koji/taskinfo?taskID=3354471

Comment 11 Adam Williamson 2011-09-16 00:43:10 UTC
I also did a build; if you're really impatient, it's available now:

http://adamwill.fedorapeople.org/kernel-3.1.0-0.rc6.git0.2.1.fc16.x86_64.rpm

Comment 12 Adam Williamson 2011-09-16 00:43:31 UTC
oh, damn, 32-bit might have been a better idea.

Comment 13 Stanislaw Gruszka 2011-09-16 07:28:33 UTC
(In reply to comment #9)
> John, Stanislaw, is this the fix for this issue?
> 
> http://www.spinics.net/lists/linux-wireless/msg76673.html

Yes.

Comment 14 Adam Williamson 2011-09-16 09:16:55 UTC
I've built a 32-bit live image with Josh's kernel in it, to aid testing. It's uploading now to:

http://adamwill.fedorapeople.org/desktop-20110914-i686.iso

(ignore the dumb name :>) 

It should be up in about 30 minutes. The sha256sum is d1c58cac71e69ddbe1f50f6ebfdc25ba64dcfd078db0fde920a0aba0e34d06c3.
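
One way to check a downloaded image against a published checksum is sketched below. This is an illustration rather than anything posted in the bug: the helper name verify_sha256 is hypothetical, and it assumes the coreutils sha256sum tool is installed.

```shell
# Sketch only: verify_sha256 is a hypothetical helper name.
# Usage: verify_sha256 FILE EXPECTED_HEX_DIGEST
# Returns 0 when the file's SHA-256 digest matches, non-zero otherwise.
verify_sha256() {
    file="$1"
    expected="$2"
    # sha256sum prints "<digest>  <filename>"; keep only the digest field.
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Example, run in the directory holding the downloaded image:
#   verify_sha256 desktop-20110914-i686.iso \
#       d1c58cac71e69ddbe1f50f6ebfdc25ba64dcfd078db0fde920a0aba0e34d06c3 \
#       && echo "checksum matches"
```

The function reports the result only through its exit status, so it can be chained with && or used in an if statement.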

Comment 15 Mads Kiilerich 2011-09-16 09:24:58 UTC
Created attachment 523524 [details]
working 3.1.0-0.rc6.git0.2.1.fc16.i686.PAE dmesg

I confirm that 3.1.0-0.rc6.git0.2.1 from koji can boot and the wireless seems to work.

Comment 16 Josh Boyer 2011-09-16 11:42:05 UTC
(In reply to comment #15)
> Created attachment 523524 [details]
> working 3.1.0-0.rc6.git0.2.1.fc16.i686.PAE dmesg
> 
> I confirm that 3.1.0-0.rc6.git0.2.1 from koji can boot and the wireless seems
> to work.

Excellent.  I'll commit this today.

Comment 17 Fedora Update System 2011-09-16 13:58:17 UTC
kernel-3.1.0-0.rc6.git0.3.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/kernel-3.1.0-0.rc6.git0.3.fc16

Comment 18 Tim Flink 2011-09-17 02:27:13 UTC
Discussed in the 2011-09-16 blocker review meeting. Even though this bug is HW specific, that hardware is common enough and this bug is serious enough to accept it as a Fedora 16 beta blocker as it violates the following alpha release criterion [1]:

The installed system must be able to download and install updates with yum and the default graphical package manager in all release-blocking desktops 

[1] https://fedoraproject.org/wiki/Fedora_16_Alpha_Release_Criteria

Comment 19 Fedora Update System 2011-09-17 19:33:21 UTC
Package kernel-3.1.0-0.rc6.git0.3.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.1.0-0.rc6.git0.3.fc16'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/kernel-3.1.0-0.rc6.git0.3.fc16
then log in and leave karma (feedback).

Comment 20 Adam Williamson 2011-09-22 00:53:56 UTC
Mads, can you re-confirm this is fixed in git0.3? thanks!

Comment 21 Mads Kiilerich 2011-09-22 01:02:48 UTC
Yes, git0.3 is fine - I gave it karma in bodhi.

Comment 22 Adam Williamson 2011-09-22 01:19:09 UTC
great, setting VERIFIED then.

Comment 23 Adam Williamson 2011-09-22 22:44:42 UTC
Dave, please don't close bugs until they are actually pushed to stable while we're frozen. Otherwise they risk not being pulled into future Beta composes: closed bugs don't appear on the list we use to find updates that are not yet in stable and need to be pulled into the compose.

Comment 24 Fedora Update System 2011-09-24 04:36:49 UTC
kernel-3.1.0-0.rc6.git0.3.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

