The 2.6.25.6-55 kernel generates this BUG message very frequently on my Dell Latitude D620 laptop:
  BUG: sleeping function called from invalid context at include/asm/semaphore_64.h:104
  in_atomic():0, irqs_disabled():1
  Pid: 66, comm: kacpi_notify Not tainted 2.6.25.6-55.fc9.x86_64 #1
  Call Trace:
   [__might_sleep+180/182] __might_sleep+0xb4/0xb6
   [acpi_os_wait_semaphore+120/240] acpi_os_wait_semaphore+0x78/0xf0
   [acpi_ut_acquire_mutex+62/130] acpi_ut_acquire_mutex+0x3e/0x82
   [acpi_ex_enter_interpreter+11/43] acpi_ex_enter_interpreter+0xb/0x2b
   [acpi_evaluate_object+456/511] acpi_evaluate_object+0x1c8/0x1ff
   [acpi_evaluate_integer+147/209] acpi_evaluate_integer+0x93/0xd1
   [_end+112757364/2114620868] :libata:ata_acpi_handle_hotplug+0xdb/0x227
   [acpi_bus_get_status+57/144] ? acpi_bus_get_status+0x39/0x90
   [acpi_get_data+94/112] ? acpi_get_data+0x5e/0x70
   [acpi_bus_check_device+78/119] ? acpi_bus_check_device+0x4e/0x77
   [acpi_os_execute_deferred+0/57] ? acpi_os_execute_deferred+0x0/0x39
   [_end+112757749/2114620868] :libata:ata_acpi_dev_notify+0x18/0x1a
   [acpi_ev_notify_dispatch+95/108] acpi_ev_notify_dispatch+0x5f/0x6c
   [acpi_os_execute_deferred+44/57] acpi_os_execute_deferred+0x2c/0x39
   [run_workqueue+132/268] run_workqueue+0x84/0x10c
   [worker_thread+221/238] worker_thread+0xdd/0xee
   [autoremove_wake_function+0/56] ? autoremove_wake_function+0x0/0x38
   [worker_thread+0/238] ? worker_thread+0x0/0xee
   [kthread+73/118] kthread+0x49/0x76
   [child_rip+10/18] child_rip+0xa/0x12
   [kthread+0/118] ? kthread+0x0/0x76
   [child_rip+0/18] ? child_rip+0x0/0x12
The system is still responsive, but since one of the CPU cores is pegged at 100%, the laptop is all but useless unless it's on AC power. Reverting to kernel 2.6.25.3-18 makes the problem go away. (2.6.25.4-30 might work as well, but I'm not using that kernel due to bug 450158.)
If it matters, the laptop was docked at the time.
Is this reproducible using any of the upstream kernel.org kernels, or is it specific to Fedora?
I built and am now running a generic 2.6.25.6 kernel. I'll report back on my results.
After moderately extensive testing, I cannot reproduce this problem with the vanilla 2.6.25.6 kernel; the problem only occurs with the Fedora 2.6.25.6-55 kernel. I can now say that this problem does not depend on docking; I experienced the problem after a cold boot, suspend, and unsuspend sequence. This problem may (and I suspect, does) depend on running a suspend/unsuspend sequence, as I have not been able to reproduce the problem from a cold boot. Is there any other information I can gather/provide that would be helpful?
2.6.25.9-76.fc9 is broken in the same way. I'm now rebuilding a local version of 2.6.25.9-76.fc9 with all ACPI-related patches removed, to see if that fixes the problem...
I've been unable to reproduce the problem with the 2.6.25.9-76 kernel I built locally. The patches I removed were:
  linux-2.6-acpi-fix-sizeof.patch
  linux-2.6-acpi-fix-error-with-external-methods.patch
  linux-2.6-eeepc-laptop-base.patch
  linux-2.6-eeepc-laptop-backlight.patch
  linux-2.6-eeepc-laptop-fan.patch
  linux-2.6-libata-acpi-hotplug-fixups.patch
  linux-2.6-libata-acpi-handle-bay-devices-in-dock-stations.patch
  linux-2.6-libata-pata_atiixp-dont-disable.patch
I'm going to start adding these patches back in until I can find which ones trigger the problem.
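For anyone repeating this kind of patch isolation: adding the eight patches back one at a time can take up to eight rebuilds, while halving the candidate list each round takes about four. Below is a minimal sketch of that search, not a record of what was actually run here. It assumes exactly one patch is responsible and that any subset of the patches still applies and builds (the eeepc patches may not apply independently of each other and might have to be moved as a group). `is_broken` is a hypothetical callback standing in for "build a kernel with these patches, suspend, resume, and check for the problem".

```python
from typing import Callable, List, Optional, Sequence

# Patch names exactly as listed in the comment above.
ACPI_PATCHES = [
    "linux-2.6-acpi-fix-sizeof.patch",
    "linux-2.6-acpi-fix-error-with-external-methods.patch",
    "linux-2.6-eeepc-laptop-base.patch",
    "linux-2.6-eeepc-laptop-backlight.patch",
    "linux-2.6-eeepc-laptop-fan.patch",
    "linux-2.6-libata-acpi-hotplug-fixups.patch",
    "linux-2.6-libata-acpi-handle-bay-devices-in-dock-stations.patch",
    "linux-2.6-libata-pata_atiixp-dont-disable.patch",
]


def find_culprit(
    patches: Sequence[str],
    is_broken: Callable[[List[str]], bool],
) -> Optional[str]:
    """Bisect a patch list down to the one patch that triggers the problem.

    is_broken(subset) is a hypothetical test: it should return True when a
    kernel built with only `subset` applied shows the problem.
    Assumes a single culprit and independently applicable patches.
    """
    candidates = list(patches)
    # Sanity check: the full set must reproduce the problem at all.
    if not is_broken(candidates):
        return None
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if is_broken(half):
            candidates = half                     # culprit is in the first half
        else:
            candidates = candidates[len(half):]   # so it must be in the rest
    return candidates[0]
```

Each call to `is_broken` costs one kernel build plus a suspend/resume cycle, so the saving is modest for eight patches but grows quickly for longer patch lists.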
Created attachment 311123 [details] patch: don't call sleeping function from invalid context this bug is introduced by 664d080c41463570b95717b5ad86e79dc1be0877. please try this patch. :)
IMO, commit 664d080c41463570b95717b5ad86e79dc1be0877 should correspond to "linux-2.6-libata-acpi-handle-bay-devices-in-dock-stations.patch".
I built another version of 2.6.25.9-76 with all of the ACPI patches I listed in comment 6 added back in, plus the patch in comment 7. I'll test this kernel to see if I can reproduce the problem...
Ok... with the patch in comment 7, the problem is still there, but slightly different: kacpi_notify and kacpid are still chewing the CPU after I resume from a suspend, but I'm not seeing the BUG tracebacks anymore. Perhaps the BUG tracebacks and kacpi_notify/kacpid crushing the CPU are actually different bugs? Barring any better suggestions, I'm going to start adding the ACPI-related patches back in until I can figure out which ones are causing problems...
2.6.25.10-86 from updates-testing still has the kacpi_notify/kacpid max CPU problem. Will continue patch isolation against this kernel.
After some fairly extensive testing, I've narrowed the problem down to this patch: linux-2.6-libata-acpi-handle-bay-devices-in-dock-stations.patch
When this patch is applied and I resume my Dell Latitude D620 laptop from a suspend, two things happen:
1. The BUG message I listed in the Description of this bug is logged frequently.
2. The kacpi_notify and kacpid processes begin to consume CPU constantly.
Zhang Rui's patch in comment 7 (thanks Zhang!) fixes the first problem (the BUG messages) but not the second (kacpi_notify/kacpid constantly consuming the CPU). Would anyone like to pass a critical eye over linux-2.6-libata-acpi-handle-bay-devices-in-dock-stations.patch and see if they can spot what's causing problem #2? (I have no familiarity with the ACPI code; about all I can do now is to start adding printk() debugging statements until I can figure out where the failure is. That's going to be a slow process...)
Please try 2.6.25.11-97; at least one fix for that problem went in.
I'm not a Fedora user, but I can confirm the kacpi_notify/kacpid issue with vanilla 2.6.26 x86_64 on a Dell Latitude D820. git bisect confirmed that the bug was introduced by commit 664d080c41463570b95717b5ad86e79dc1be0877:
  [libata] ACPI: Properly handle bay devices in dock stations
I've also tested with today's git master (commit 024e8ac04453b3525448c31ef39848cf675ba6db) with the same results.
Could you please run "grep . /sys/firmware/acpi/interrupts/*" with
1. patch "ACPI: Properly handle bay devices in dock stations" applied, before hotplug
2. patch "ACPI: Properly handle bay devices in dock stations" applied, after hotplug
3. patch "ACPI: Properly handle bay devices in dock stations" not applied, before hotplug
4. patch "ACPI: Properly handle bay devices in dock stations" not applied, after hotplug
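To compare two such snapshots without reading them by eye, something like the following could be used. It is only a sketch: it assumes each line of the grep output looks like "path:count", optionally followed by a status word, which may differ between kernel versions. The sample numbers are made up for illustration and are not taken from any attachment in this bug.

```python
def parse_interrupt_snapshot(text):
    """Parse 'grep . /sys/firmware/acpi/interrupts/*' output into {name: count}.

    Assumed line format: '<path>:<count>[ <status>]'. Lines that do not
    match (no colon, or no leading integer after it) are skipped.
    """
    counts = {}
    for line in text.splitlines():
        path, sep, value = line.partition(":")
        fields = value.split()
        if not sep or not fields or not fields[0].isdigit():
            continue
        name = path.strip().rsplit("/", 1)[-1]  # e.g. 'gpe19'
        counts[name] = int(fields[0])
    return counts


def counter_deltas(before, after):
    """Return (name, increase) pairs for counters that changed, largest first."""
    changed = [
        (name, after[name] - before.get(name, 0))
        for name in after
        if after[name] != before.get(name, 0)
    ]
    return sorted(changed, key=lambda item: -item[1])


# Illustrative data only.
BEFORE = """\
/sys/firmware/acpi/interrupts/gpe18:       0
/sys/firmware/acpi/interrupts/gpe19:       3
/sys/firmware/acpi/interrupts/sci:      10
"""
AFTER = """\
/sys/firmware/acpi/interrupts/gpe18:       0
/sys/firmware/acpi/interrupts/gpe19:  500003
/sys/firmware/acpi/interrupts/sci:  500012
"""

for name, delta in counter_deltas(parse_interrupt_snapshot(BEFORE),
                                  parse_interrupt_snapshot(AFTER)):
    print(name, delta)
```

A counter whose increase dwarfs all the others between the "before" and "after" snapshots is the natural first suspect.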
With 2.6.25.11-97, kacpi_notify/kacpid still exhibit the "runaway CPU consumption after resume" problem. Zhang: understood; I am collecting this data now...
The kacpi thrashing after resume sounds like the same behaviour as what I've been seeing in kernels after kernel-2.6.25.4-30.fc9.i686. However since it wasn't related to a hotplug of docking stations I posted a new bugreport: Bug 451896 If I can help with more testing etc. please just let me know. Cheers, Rich
Bug 451896 is almost certainly a dupe of this one, as I do NOT have to perform a warm dock/undock to reproduce the bug. Simply suspending and then resuming triggers the bug, regardless of the docking status (either at suspend or resume time).
Then please also do this test:
1. kill acpid
2. cat /proc/acpi/event
3. suspend
and attach the output after resume.
I get this:
  button/sleep SBTN 00000080 00000001
  ac_adapter AC 00000080 00000001
  battery BAT0 00000081 00000001
  battery BAT1 00000081 00000001
  battery BAT0 00000080 00000001
  battery BAT1 00000080 00000001
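For longer captures, the events can be tallied per source with a few lines of Python. This is a rough sketch that assumes each line of /proc/acpi/event has four whitespace-separated fields (device class, bus ID, then type and data as hexadecimal); the field names used here are my own labels. The sample is the output quoted above.

```python
from collections import Counter, namedtuple

# Field names are labels chosen for this sketch.
AcpiEvent = namedtuple("AcpiEvent", "device_class bus_id type data")


def parse_acpi_events(text):
    """Parse /proc/acpi/event lines into AcpiEvent tuples.

    Assumed format per line: '<device_class> <bus_id> <type hex> <data hex>'.
    Lines with a different number of fields are skipped.
    """
    events = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue
        device_class, bus_id, type_hex, data_hex = fields
        events.append(
            AcpiEvent(device_class, bus_id, int(type_hex, 16), int(data_hex, 16))
        )
    return events


SAMPLE = """\
button/sleep SBTN 00000080 00000001
ac_adapter AC 00000080 00000001
battery BAT0 00000081 00000001
battery BAT1 00000081 00000001
battery BAT0 00000080 00000001
battery BAT1 00000080 00000001
"""

events = parse_acpi_events(SAMPLE)
per_source = Counter((e.device_class, e.bus_id) for e in events)
for (device_class, bus_id), n in per_source.most_common():
    print(device_class, bus_id, n)
```

With only a handful of events per source, as here, the event stream itself does not look like the origin of the constant CPU load.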
Created attachment 312909 [details] ACPI interrupts, no patch, before suspend The output of "grep . /sys/firmware/acpi/interrupts/*", without the linux-2.6-libata-acpi-handle-bay-devices-in-dock-stations.patch, before suspend.
Created attachment 312910 [details] ACPI interrupts, no patch, after resume from suspend The output of "grep . /sys/firmware/acpi/interrupts/*", without the linux-2.6-libata-acpi-handle-bay-devices-in-dock-stations.patch, after resume from suspend.
Created attachment 312911 [details] ACPI events, no patch The output of "cat /proc/acpi/event" during a suspend/resume, without the linux-2.6-libata-acpi-handle-bay-devices-in-dock-stations.patch.
Created attachment 312912 [details] ACPI interrupts, with patch, before suspend The output of "grep . /sys/firmware/acpi/interrupts/*", with the linux-2.6-libata-acpi-handle-bay-devices-in-dock-stations.patch, before suspend.
Created attachment 312913 [details] ACPI interrupts, with patch, after resume from suspend The output of "grep . /sys/firmware/acpi/interrupts/*", with the linux-2.6-libata-acpi-handle-bay-devices-in-dock-stations.patch, after resume from suspend.
Created attachment 312914 [details] ACPI events, with patch The output of "cat /proc/acpi/event" during a suspend/resume, with the linux-2.6-libata-acpi-handle-bay-devices-in-dock-stations.patch.
Zhang, I believe I've attached everything you asked for. If there's any additional information you need, please let me know...
(In reply to comment #27)
> Zhang, I believe I've attached everything you asked for.
yes, thanks, :)
> If there's any additional information you need, please let me know...
It seems that there is an interrupt storm caused by GPE 0x19. It would be great if you could attach the acpidump output so that we can find out what GPE 0x19 is for. You can use the latest pmtools at http://www.lesswatts.org/patches/linux_acpi/
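To watch a suspected storm live rather than from snapshots, the counter for one GPE can be sampled twice and turned into a rate. The sketch below assumes the per-GPE sysfs file is named "gpe" followed by the GPE number as two uppercase hex digits (so GPE 0x19 would be gpe19) and that the file starts with the count; both are assumptions to verify on the running kernel. The helper names are made up for this example.

```python
import time


def gpe_sysfs_path(gpe_number, base="/sys/firmware/acpi/interrupts"):
    """Build the assumed sysfs path for a GPE counter, e.g. 0x19 -> .../gpe19."""
    return "%s/gpe%02X" % (base, gpe_number)


def read_count(path):
    """Read the leading integer from a counter file."""
    with open(path) as f:
        return int(f.read().split()[0])


def interrupt_rate(read, interval=1.0, sleep=time.sleep):
    """Sample a counter twice, `interval` seconds apart; return events/second.

    `read` is any zero-argument callable returning the current count, which
    keeps the function usable with something other than sysfs.
    """
    first = read()
    sleep(interval)
    second = read()
    return (second - first) / interval


if __name__ == "__main__":
    path = gpe_sysfs_path(0x19)
    try:
        rate = interrupt_rate(lambda: read_count(path))
        print("%s: %.0f interrupts/s" % (path, rate))
    except (OSError, ValueError, IndexError) as exc:
        print("could not sample %s: %s" % (path, exc))
```

On an idle machine the rate should stay close to zero; a sustained rate in the thousands per second would be consistent with the storm described above.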
Created attachment 313011 [details] output of "acpidump" on my Dell Latitude D620 If it matters, note that when I run acpidump, it prints this to stderr: Wrong checksum for TCPA!
Created attachment 313445 [details] avoid checking _STA method Can someone please test the attached patch with the D620? This fixes the suspend to disk issue with 2.6.26 on my system.
I'm now testing the patch in comment 30; I'll report back on the results.
The patch in comment 30 does NOT resolve the problem of GPE 0x19 causing an interrupt storm after resume from suspend. ACPI folks (Zhang et al.): is there anything additional I can do to help you resolve this bug?
I am having the same problem on a Dell Latitude D620. I updated the BIOS to A10 to make sure there wasn't a known hardware problem; it made no difference. 2.6.25.4-30.fc9 is the most recent kernel on which I don't experience the problem. Newer kernels (from at least 2.6.25.14-108.fc9) up to and including 2.6.27-0.314.rc5.git9.fc10 exhibit the same problem, with kacpid and kacpi_notify using near 100% of the CPU after a resume. Backing out the patch noted in the title of this report appears to un-break the later 2.6.25.* kernels, but it seems that patch has been mainlined: it's no longer in the 2.6.27 rpms, yet the functionality (breakage) is still there. Is there any information I can add to help in fixing this?
Well, there is a patch set which changes the dock driver a lot and fixes a couple of things: http://marc.info/?l=linux-acpi&m=121988834907742&w=2 It would be great if you could give it a try. :) Note that there are 11 patches in all.
Success! I've applied all of those patches except #2 (it removes code that isn't in this kernel) and I'm now able to resume without kacpid/kacpi_notify gunning the CPU. This is on 2.6.26.3-29.fc9.x86_64. I'll continue to use this to make sure it's actually fixed, but the problem happened consistently before. I'm not sure that I've tried this exact kernel rpm (26.3-29) before without the patches, but all the ones I have tried have failed. I can also try to figure out which of those patches actually fixed the problem. (I think some are dependent on others, not sure.) I'm attaching the patch I used, which is just patches 1 & 3-11 off the linux-acpi mailing list, with a whitespace tweak to one of the Makefile diffs. This applies cleanly as the last patch to the 2.6.26.3-29 rpm.
Created attachment 316628 [details] Patches from linux-acpi list that resolve issue on 2.6.26.3-29
Is that entire patch needed to fix this problem?
Nope. At the time I posted that I had only tested them all together. It seems the actual fix is contained in patch #4, but patches #1 and #3 are required for #4 to apply cleanly.
I can also confirm that this fixes the problem on my D610. I patched the (previously not working) fedora kernel-2.6.26.3-29 with 1, 3 and 4 (obviously in that order) from the patch-set above (http://marc.info/?l=linux-acpi&m=121988834907742&w=2 ), and followed https://fedoraproject.org/wiki/Docs/CustomKernel to build a custom kernel rpm kernel-2.6.26.3-29.local.d610.acpifix.fc9.i686.rpm to install from. I'm now happily running that patched kernel on my D610, while posting here... I'll post again if I have any problems, but don't expect any (touch wood)!
hmm, one question. does hot dock/undock work well on your laptop before the patch set applied? if no, does this patch fix the problem? thanks for testing. :)
I never had a station to dock it to... so I can't help testing that sorry :-P
that's okay. James, can you help me do the test in comment #40 please?
(In reply to comment #40)
> hmm, one question.
> does hot dock/undock work well on your laptop before the patch set applied?
> if no, does this patch fix the problem?
I've tested the newest F9 kernel 2.6.26.3-29 with the mentioned patches (1, 3 and 4) and I'd like to confirm that both problems are fixed for the Dell Latitude D610:
1. Docking works now. (Before, the kernel always froze whenever I re-docked the laptop, although undocking always worked fine.)
2. The kacpi kernel threads no longer consume 100% CPU after waking up from suspend (bug #451896).
It would be great if these 3 patches could be integrated into the official Fedora kernel.
I'm actually running Rawhide at the moment, and I'm stuck at kernel-2.6.27-0.166.rc0.git8 due to bug 462540. I'll check to see what patches from the patchset mentioned in comment #34 have landed in the latest Rawhide kernel. (I suspect the answer is "none of them", though.)
Ok, now that I've hacked around bug 462540... The answer is indeed "none of them"; in the latest rawhide kernel, kacpi_notify and kacpid still consume the CPU after resume. I'll build a custom Rawhide kernel with the patches in comment 39 and report back.
I've built kernel-2.6.27-0.370.rc8.0.local with all of the patches in comment #34. Results:
  kacpid/kacpi_notify behave normally upon resume; they do not consume the CPU.
  Hot docking now works. (Before, this would cause a hang.)
  Hot undocking [still] works.
  Hot bay device swapping [still] works.
I'll continue to run these patches to see if I can uncover any problems, but so far so good!
Are there any news regarding this bug? It would be great if these patches would find their way into Fedora's official kernel. This would help a lot so that I wouldn't have to recompile the kernel each time it is updated. ;-) It looks like some people have tested the patches, and they work very well (at least for the Dell D610) and solve a couple of problems (docking & CPU usage after resume). So if there is anything I can do to help get these patches applied to Fedora's kernel, please tell me and I'll do my best to help.
(In reply to comment #47)
> Are there any news regarding this bug? It would be great if these patches would
> find their way into Fedora's official kernel. This would help a lot so that I
> wouldn't have to recompile the kernel each time it is updated. ;-)
>
Does this new 2.6.27.4 kernel fix it?
http://koji.fedoraproject.org/koji/buildinfo?buildID=68158
kernel-2.6.27.4-19.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/kernel-2.6.27.4-19.fc9
Provisionally, yes. I have downloaded and installed that new kernel-2.6.27.4-19.fc9.i686 (and required kernel-firmware.noarch) and the suspend resume works again - at least the first time. I can't comment on the docking issue. But the acpi fix appears to be included in this kernel. w00t!
(In reply to comment #48)
> (In reply to comment #47)
> > Are there any news regarding this bug? It would be great if these patches would
> > find their way into Fedora's official kernel. This would help a lot so that I
> > wouldn't have to recompile the kernel each time it is updated. ;-)
> >
>
> Does this new 2.6.27.4 kernel fix it?
>
> http://koji.fedoraproject.org/koji/buildinfo?buildID=68158
Partially: ;-)
Problem 1:
- kacpid eats up 100% CPU
  -> fixed
Problem 2:
- docking (undocking usually works) crashes kernel:
  -> _not_ fixed; after undocking and re-docking 2 times the complete laptop still crashes
  -> I've tried 2.6.26.6-79.fc9.i686 again with the mentioned 3 patches, and in this case docking always works fine
The patches that fix the rest of this problem are merged upstream: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=history;f=drivers/acpi/dock.c;h=5b30b8d91d716126cdd3305d61ee8fd625eea016;hb=HEAD If we could figure out what the minimum patchset to fix this is, that would be great. ;)
Well, the minimum patchset is going to be contained in the patches from Shaohua Li that were merged on 2008-09-24, as that's the patchset I've been testing. However, given that all of these patches are now officially upstream, is it really worth it to try to find the exact minimum set necessary to fix hot [un]docking? If the patches cause breakage, we're going to hit it sooner rather than later, so I'd almost rather see the patchset hit updates-testing as-is... Also, as of 2.6.27.4-68, problem 1 (kacpid eats CPU on resume) still isn't fixed in Rawhide...
kernel-2.6.27.4-24.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/kernel-2.6.27.4-24.fc9
kernel-2.6.27.4-26.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/kernel-2.6.27.4-26.fc9
I've filed a new bug (bug 470321) against rawhide and added F10Blocker as a blocker, to ensure the bug is fixed for F10 before the release. Chuck, any thoughts on what I proposed in comment 53?
kernel-2.6.27.4-26.fc9 has been pushed to the Fedora 9 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update kernel'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-9467
(In reply to comment #56)
> Chuck, any thoughts on what I proposed in comment #53?
We can't put that big of an update in a stable kernel. If it causes new bugs they won't get fixed until 2.6.30... I'll put the first two patches in and see how much that helps.
The first two patches that went into the 2.6.28 dock driver are in kernel-2.6.27.5-29.fc9.
Can someone please try 2.6.27.5-30.fc9 from koji? http://koji.fedoraproject.org/koji/buildinfo?buildID=68916
kernel-2.6.27.5-32.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/kernel-2.6.27.5-32.fc9
kernel-2.6.27.5-32.fc9 has been pushed to the Fedora 9 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update kernel'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-9583
kernel-2.6.27.5-37.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/kernel-2.6.27.5-37.fc9
kernel-2.6.27.5-41.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/kernel-2.6.27.5-41.fc9
kernel-2.6.27.5-41.fc9 has been pushed to the Fedora 9 stable repository. If problems still persist, please make note of it in this bug report.