Bug 472752
| Summary: | BUG() in end_buffer_async_write() | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Bryn M. Reeves <bmr> |
| Component: | kernel | Assignee: | Jeff Moyer <jmoyer> |
| Status: | CLOSED ERRATA | QA Contact: | Eryu Guan <eguan> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 4.7 | CC: | eguan, jmoyer, qcai, tao, tumeya, vgoyal |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2011-02-16 15:34:20 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
This was fixed upstream with the following three commits:
commit f0fbd5fc09b20f7ba7bc8c80be33e39925bb38e1
Author: Andrew Morton <akpm>
Date: Thu May 5 16:15:48 2005 -0700
[PATCH] __block_write_full_page() simplification
The `last_bh' logic probably isn't worth much. In those situations where only
the front part of the page is being written out we will save some looping but
in the vastly more common case of an all-page writeout if just adds more code.
Nick Piggin <nickpiggin.au>
Signed-off-by: Andrew Morton <akpm>
Signed-off-by: Linus Torvalds <torvalds>
commit 05937baae9fc27b64bcd4378da7d2b14edf7931c
Author: Andrew Morton <akpm>
Date: Thu May 5 16:15:47 2005 -0700
[PATCH] __block_write_full_page speedup
Remove all those get_bh()'s and put_bh()'s by extending lock_page() to cover
the troublesome regions.
(get_bh() and put_bh() happen every time whereas contention on a page's lock
in there happens basically never).
Cc: Nick Piggin <nickpiggin.au>
Signed-off-by: Andrew Morton <akpm>
Signed-off-by: Linus Torvalds <torvalds>
commit ad576e63e0c8b274a8558b8e05a6d0526e804dc0
Author: Nick Piggin <nickpiggin.au>
Date: Thu May 5 16:15:46 2005 -0700
[PATCH] __block_write_full_page race fix
When running
fsstress -v -d $DIR/tmp -n 1000 -p 1000 -l 2
on an ext2 filesystem with 1024 byte block size, on SMP i386 with 4096 byte
page size over loopback to an image file on a tmpfs filesystem, I would
very quickly hit
BUG_ON(!buffer_async_write(bh));
in fs/buffer.c:end_buffer_async_write
It seems that more than one request would be submitted for a given bh
at a time.
What would happen is the following:
2 threads doing __mpage_writepages on the same page.
Thread 1 - lock the page first, and enter __block_write_full_page.
Thread 1 - (eg.) mark_buffer_async_write on the first 2 buffers.
Thread 1 - set page writeback, unlock page.
Thread 2 - lock page, wait on page writeback
Thread 1 - submit_bh on the first 2 buffers.
=> both requests complete, none of the page buffers are async_write,
end_page_writeback is called.
Thread 2 - wakes up. enters __block_write_full_page.
Thread 2 - mark_buffer_async_write on (eg.) the last buffer
Thread 1 - finds the last buffer has async_write set, submit_bh on that.
Thread 2 - submit_bh on the last buffer.
=> oops.
So change __block_write_full_page to explicitly keep track of the last bh
we need to issue, so we don't touch anything after issuing the last
request.
Signed-off-by: Nick Piggin <nickpiggin.au>
Signed-off-by: Andrew Morton <akpm>
Signed-off-by: Linus Torvalds <torvalds>
ad576e63e0c8b274a8558b8e05a6d0526e804dc0 is the fix, the remaining two are cleanups of the surrounding code.
I grabbed a test machine to try to reproduce this, and I hit a different BUG:
------------[ cut here ]------------
kernel BUG at fs/mpage.c:417!
invalid operand: 0000 [#1]
SMP
Modules linked in: md5 ipv6 parport_pc lp parport autofs4 sunrpc cpufreq_powersave loop joydev button battery ac ehci_hcd uhci_hcd i5000_edac edac_mc hw_random bnx2 dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod ata_piix libata cciss sd_mod scsi_mod
CPU: 4
EIP: 0060:[<c017aed1>] Not tainted VLI
EFLAGS: 00010202 (2.6.9-78.23.ELsmp)
EIP is at mpage_writepage+0xd0/0x56f
eax: 0000001c ebx: c140d560 ecx: 0000000c edx: 00000000
esi: d7a4ba70 edi: d7a4ba70 ebp: 00000000 esp: c8d8fdb4
ds: 007b es: 007b ss: 0068
Process fsstress (pid: 10153, threadinfo=c8d8f000 task=e1d72df0)
Stack: 00152afe 00000000 00000000 00000000 00000000 00000000 00000000 00000004
00000001 00000001 00000004 0000000a e341ae50 e341af00 c0196e79 c140d560
00000000 00000082 c203ad80 c011d31c 0000001b 00000089 0000022b 00000006
Call Trace:
[<c0196e79>] ext2_get_block+0x0/0x2cb
[<c011d31c>] try_to_wake_up+0x28e/0x299
[<c01c5356>] radix_tree_gang_lookup_tag+0x3d/0x53
[<c014187b>] find_get_pages_tag+0x28/0x64
[<c01418b0>] find_get_pages_tag+0x5d/0x64
[<c017b569>] mpage_writepages+0x1f9/0x314
[<c0196e79>] ext2_get_block+0x0/0x2cb
[<c0197ebb>] ext2_update_inode+0x30a/0x315
[<c01463fe>] do_writepages+0x19/0x27
[<c0179e70>] __sync_single_inode+0x5f/0x1bb
[<c017a22d>] sync_sb_inodes+0x1a7/0x274
[<c017a44a>] sync_inodes_sb+0x72/0x83
[<c02df172>] __cond_resched+0x14/0x39
[<c017a51c>] sync_inodes+0x16/0x6c
[<c015dd3a>] do_sync+0x11/0x55
[<c015dd88>] sys_sync+0xa/0xd
[<c02e0c43>] syscall_call+0x7/0xb
[<c02e007b>] __lock_text_end+0x60c/0x1071
Code: 8b 03 f6 c4 10 0f 84 f6 00 00 00 8b 03 f6 c4 10 75 08 0f 0b 9b 01 ec 60 2f c0 8b 44 24 3c 31 ed 8b 70 0c 89 f7 8b 07 a8 04 74 08 <0f> 0b a1 01 ec 60 2f c0 8b 07 a8 10 75 21 8b 07 a8 02 0f 85 1e
<0>Fatal exception: panic in 5 seconds
Kernel panic - not syncing: Fatal exception
That bug is in mpage_writepage:
if (page_has_buffers(page)) {
struct buffer_head *head = page_buffers(page);
struct buffer_head *bh = head;
/* If they're all mapped and dirty, do it */
page_block = 0;
do {
BUG_ON(buffer_locked(bh)); <==================
This could be related, but I'll have to think about it a bit more. I'll retry with a patched kernel to make sure the issue has been resolved.
I re-ran the test with a kernel that was patched with the above 3 commits and I still get this failure. I'll continue investigating. I opened up a separate bugzilla (480404) for the problem I ran into. Once that's fixed, hopefully I can reproduce the bug reported here. This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release. Committed in 89.44.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/ This is hard to reproduce, I followed steps in comment 0 and let it run for 8 hours and on different hosts, I can hit bug 480404 but not this one. Ran reproducer on 2.6.9-96 kernel over the weekend(about 60 hours) and no issue was found. linux-2.6.9-fs-buffer-__block_write_full_page-race-fix.patch linux-2.6.9-fs-buffer-__block_write_full_page-speedup-by-removin.pattch linux-2.6.9-fs-buffer-__block_write_full_page-simplification-by-.pattch are applied in 2.6.9-96 kernel correctly. Set it to SanityOnly An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2011-0263.html |
Description of problem: There is a race in 2.6.9's handling of bh submission in __block_write_full_page() that will allow more than one request to be submitted for a given bh at the same time. This causes end_buffer_async_write() to blow up on the following bug: 685 struct buffer_head *tmp; 686 struct page *page; 687 688 BUG_ON(!buffer_async_write(bh)); 689 This happens in the 2nd invocation of end_buffer_async_write() as the 1st caller has already cleared the bit later on in the same function: 705 spin_lock_irqsave(&page_uptodate_lock, flags); 706 clear_buffer_async_write(bh); 707 unlock_buffer(bh); 708 tmp = bh->b_this_page; Version-Release number of selected component (if applicable): Originally reported on 2.6.9-34.EL but confirmed to also affect later kernels. How reproducible: Difficult. Seen in production at a couple of sites every few months. Steps to Reproduce: Nick Piggin posted a reproducer to LKML but we've been unable to trigger the problem reliably with this test case: "When running fsstress -v -d $DIR/tmp -n 1000 -p 1000 -l 2 on an ext2 filesystem with 1024 byte block size, on SMP i386 with 4096 byte page size over loopback to an image file on a tmpfs filesystem, I would very quickly hit BUG_ON(!buffer_async_write(bh)); in fs/buffer.c:end_buffer_async_write" Actual results: BUG Expected results: No BUG. Additional info: Discussed on LKML here: http://lkml.org/lkml/2005/4/24/145 [[patch] __block_write_full_page bug] http://lkml.org/lkml/2005/4/27/85 [[patch] fix the 2nd buffer race properly] http://lkml.org/lkml/2005/4/30/181 [[patch] alternative fix for VFS race]