| Summary: | block devices not flushed predictably | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Pádraig Brady <p> |
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
| Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 15 | CC: | alevy, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2012-07-11 17:49:08 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
|
Description
Pádraig Brady
2011-10-17 00:39:20 UTC
Gah, I just tried another usb stick there, and no my whole desktop is locked up I noticed this in dmesg while the dd process was stuck in D state: "iwlagn 0000:09:00.0: Failed to alloc_pages with GFP_ATOMIC. Only 0 free buffers remaining" So looks like some mem entity is exhausted in the kernel? I could switch to a different virtual terminal. I could not kill -9 the dd process. Upon removing the USB stick, everything "unstuck" again. You should really use oflag=direct when doing dd to a block device like that. I added oflag=direct which "fixed" things. I could ^C the dd process with immediate effect No stalls (mem exhaustion) I'm guessing it's just bypassing some broken kernel logic, though I'm struggling to see what logic could be invoked by the kernel for a process that doesn't use O_DIRECT but is writing directly to a block device. Perhaps the kernel is trying to schedule the writes to the device? Though that seems a bit daft given the device isn't used by anything else. Perhaps the kernel is trying to cache the writes, but that seems daft too. Anyway here is a stack trace that happens during the stalls. [711520.363875] swapper: page allocation failure: order:1, mode:0x4020 [711520.363894] Pid: 0, comm: swapper Not tainted 2.6.40.4-5.fc15.x86_64 #1 [711520.363897] Call Trace: [711520.363898] <IRQ> [<ffffffff810dfe56>] warn_alloc_failed+0x111/0x13d [711520.363911] [<ffffffff810e1c20>] ? __free_pages_ok+0xc9/0xd8 [711520.363916] [<ffffffff810e2e09>] __alloc_pages_nodemask+0x6c5/0x736 [711520.363921] [<ffffffff8110dc72>] alloc_pages_current+0xbe/0xd8 [711520.363926] [<ffffffff810834ff>] ? arch_local_irq_save+0x15/0x1b [711520.363947] [<ffffffffa02a764f>] iwlagn_rx_allocate+0x91/0x268 [iwlagn] [711520.363956] [<ffffffffa02a79de>] iwlagn_rx_replenish_now+0x18/0x24 [iwlagn] [711520.363963] [<ffffffffa029bf85>] iwl_irq_tasklet+0x911/0xad8 [iwlagn] [711520.363968] [<ffffffff8105a0b3>] tasklet_action+0x7f/0xd2 [711520.363972] [<ffffffff8105a954>] __do_softirq+0xc9/0x1b5 [711520.363977] [<ffffffff8148804e>] ? _raw_spin_lock+0xe/0x10 [711520.363981] [<ffffffff8148fa5c>] call_softirq+0x1c/0x30 [711520.363984] [<ffffffff8100abb9>] do_softirq+0x46/0x81 [711520.363987] [<ffffffff8105ac36>] irq_exit+0x57/0xb1 [711520.363990] [<ffffffff814902de>] do_IRQ+0x8e/0xa5 [711520.363994] [<ffffffff81488493>] common_interrupt+0x13/0x13 [711520.363996] <EOI> [<ffffffff81072e8c>] ? enqueue_hrtimer+0x3f/0x7a [711520.364004] [<ffffffff812841b0>] ? intel_idle+0xd8/0x100 [711520.364007] [<ffffffff81284192>] ? intel_idle+0xba/0x100 [711520.364012] [<ffffffff813af949>] cpuidle_idle_call+0xd7/0x168 [711520.364016] [<ffffffff81008307>] cpu_idle+0xa5/0xdf [711520.364020] [<ffffffff8146741e>] rest_init+0x72/0x74 [711520.364023] [<ffffffff81b66b8b>] start_kernel+0x3ca/0x3d5 [711520.364027] [<ffffffff81b662c4>] x86_64_start_reservations+0xaf/0xb3 [711520.364030] [<ffffffff81b66140>] ? early_idt_handlers+0x140/0x140 [711520.364033] [<ffffffff81b663ca>] x86_64_start_kernel+0x102/0x111 [711520.364037] iwlagn 0000:09:00.0: Failed to alloc_pages with GFP_ATOMIC. Only 0 free buffers remaining. Are you still seeing this stack trace with 2.6.43/3.3? I don't have the bug reported in the summary, but I do have almost the same backtrace, reporting it here instead of creating a new bug since I don't actually have a problem other then this trace. If a new bug would still be helpful please let me know and I'll do it. I'm getting the same error in my logs, no locking involved, everything seems to work just fine. F17, 3.3.4-5.fc17.x86_64, Thinkpad T510: [507785.954572] Pid: 0, comm: swapper/0 Tainted: G W 3.3.4-5.fc17.x86_64 #1 [507785.954575] Call Trace: [507785.954577] <IRQ> [<ffffffff811236cb>] warn_alloc_failed+0xeb/0x150 [507785.954595] [<ffffffff81130729>] ? wakeup_kswapd+0x69/0x140 [507785.954599] [<ffffffff81127173>] __alloc_pages_nodemask+0x663/0x920 [507785.954649] [<ffffffffa049aa60>] ? ieee80211_rx+0x310/0xa20 [mac80211] [507785.954655] [<ffffffff81086df6>] ? ttwu_do_activate.constprop.80+0x66/0x70 [507785.954660] [<ffffffff8115eed4>] alloc_pages_current+0xa4/0x110 [507785.954677] [<ffffffffa053eb70>] iwlagn_rx_allocate+0xf0/0x270 [iwlwifi] [507785.954689] [<ffffffffa05401ad>] iwl_irq_tasklet+0xc2d/0xd30 [iwlwifi] [507785.954694] [<ffffffff8105d6e3>] tasklet_action+0x73/0x120 [507785.954697] [<ffffffff8105dc90>] __do_softirq+0xc0/0x1e0 [507785.954702] [<ffffffff815f4d9c>] call_softirq+0x1c/0x30 [507785.954706] [<ffffffff81015465>] do_softirq+0x75/0xb0 [507785.954709] [<ffffffff8105e055>] irq_exit+0xb5/0xc0 [507785.954712] [<ffffffff815f5603>] do_IRQ+0x63/0xe0 [507785.954716] [<ffffffff815ebc2e>] common_interrupt+0x6e/0x6e [507785.954718] <EOI> [<ffffffff813181d6>] ? intel_idle+0xe6/0x150 [507785.954725] [<ffffffff813181b7>] ? intel_idle+0xc7/0x150 [507785.954732] [<ffffffff8149ddd7>] cpuidle_idle_call+0xb7/0x230 [507785.954735] [<ffffffff8101220f>] cpu_idle+0xaf/0x120 [507785.954740] [<ffffffff815c9eae>] rest_init+0x72/0x74 [507785.954746] [<ffffffff81aedc00>] start_kernel+0x3ba/0x3c5 [507785.954750] [<ffffffff81aed346>] x86_64_start_reservations+0x131/0x135 [507785.954753] [<ffffffff81aed44a>] x86_64_start_kernel+0x100/0x10f [507785.954758] iwlwifi 0000:03:00.0: Failed to alloc_pages with GFP_ATOMIC.Only 0 free buffers remaining. (In reply to comment #5) > I don't have the bug reported in the summary, but I do have almost the same > backtrace, reporting it here instead of creating a new bug since I don't > actually have a problem other then this trace. If a new bug would still be > helpful please let me know and I'll do it. > > I'm getting the same error in my logs, no locking involved, everything seems > to work just fine. F17, 3.3.4-5.fc17.x86_64, Thinkpad T510: Basically the kernel ran out of atomic memory and the wireless driver needed to do an allocation. Your kernel is already tainted with some other warning though, so something else happened before this. If you can recreate this using the 3.4.4-5.fc17 update that went out a couple of days ago please open a new bug with the full dmesg and what was happening on the machine at the time. OK, I'll give it a go, thanks for the fast response. Unrelated: Just randomly stumbled on bug 822877, hope it isn't still the case of high idle load. Fedora 15 has reached it's end of life as of June 26, 2012. As a result, we will not be fixing any remaining bugs found in Fedora 15. In the event that you have upgraded to a newer release and the bug you reported is still present, please reopen the bug and set the version field to the newest release you have encountered the issue with. Before doing so, please ensure you are testing the latest kernel update in that release and attach any new and relevant information you may have gathered. Thank you for taking the time to file a report. We hope newer versions of Fedora suit your needs. |