Bug 199667
Summary: ext3 file system crashed in my IA64 box

Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.0
Hardware: ia64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: medium
Reporter: bibo,mao <bibo.mao>
Assignee: Eric Sandeen <esandeen>
QA Contact: Brian Brock <bbrock>
CC: andrew.patterson, bibo.mao, bjorn.helgaas, jarod, jbaker, jbaron, lori.carlson, lwang, nicholas.dokos, rick.hester
Fixed In Version: RHEL4U4
Doc Type: Bug Fix
Last Closed: 2006-09-20 19:35:45 UTC
Description (bibo,mao, 2006-07-21 09:18:01 UTC)
I debugged this problem by examining register and memory contents. It appears that in journal_dirty_metadata() the statement

    struct journal_head *jh = bh2jh(bh);

can yield a NULL jh. I wrote a patch, and with it the LTP stress test passes, but I am not familiar with the ext3 filesystem, so I do not know whether this is the root cause.

--- linux-2.6.9/fs/jbd/transaction.c.orig	2006-06-30 14:05:58.000000000 +0800
+++ linux-2.6.9/fs/jbd/transaction.c	2006-07-07 02:56:32.000000000 +0800
@@ -1104,13 +1104,15 @@ int journal_dirty_metadata(handle_t *han
 {
 	transaction_t *transaction = handle->h_transaction;
 	journal_t *journal = transaction->t_journal;
-	struct journal_head *jh = bh2jh(bh);
+	struct journal_head *jh;
 
-	jbd_debug(5, "journal_head %p\n", jh);
-	JBUFFER_TRACE(jh, "entry");
 	if (is_handle_aborted(handle))
 		goto out;
 
+	jh = journal_add_journal_head(bh);
+	jbd_debug(5, "journal_head %p\n", jh);
+	JBUFFER_TRACE(jh, "entry");
+
 	jbd_lock_bh_state(bh);
 
 	/*
@@ -1154,6 +1156,7 @@ int journal_dirty_metadata(handle_t *han
 		spin_unlock(&journal->j_list_lock);
 out_unlock_bh:
 	jbd_unlock_bh_state(bh);
+	journal_put_journal_head(jh);
 out:
 	JBUFFER_TRACE(jh, "exit");
 	return 0;

I tested on RHEL4-U3; I will also test it on RHEL4-U5.

Created attachment 134829 [details]
nat consumption panic
Created attachment 134830 [details]
another nat consumption panic
This and the previous attachment are panics on a large HP ia64
system (64 socket, dual-core Montecito, 1TB RAM), running the
RHEL4 U3 largesmp kernel and an HP internal I/O stress test.
We can reproduce this failure reliably (within 12 hours), and
it is a serious problem for shipping Montecito servers.
This is the same problem reported in Issue Tracker 100177. That sighting was on a much smaller ia64 system (2-socket dual-core Montecito) running the normal RHEL4 U3 kernel.

Created attachment 134833 [details]
upstream patch to fix JBD race in t_forget list handling

Although this should certainly not lead to a kernel panic, do you know the root cause of the I/O errors reported in the log in comment #7 and comment #8?

    SCSI error : <11 0 4 1> return code = 0x6000000
    end_request: I/O error, dev sdeo, sector 31437897

Does the attached patch help?

I don't know the root cause of the I/O errors preceding the panic. I will put the patch into RHEL4 U3 and try to reproduce the problem. We should have some results by tomorrow. It sounds like any customer-shippable kernel with this fix would be a post-U4 kernel, so we'll want that before doing testing beyond this specific issue.

I have seen the I/O errors before; perhaps the test case performs so many file system operations that the disk hits errors. The I/O errors did not appear again after I switched to a new disk. I applied this patch, but the bug still occurred.

There are two different test cases here, right? The bug was opened for an LTP stress test, while the HP folks are using an internal test. Bibo,Mao had earlier reported that the patch did not help in the LTP case, but it at least seems worth a shot in the HP case. It also looks like different kernel versions are in play here. They may well share the same root cause, but it's not immediately clear yet. (Hm, interesting, though, that all three pasted oopses are down the sys_mkdir path....) Thanks, -Eric

Yes, HP is seeing the problem when using a different test case. HP is using "hazard", an internal I/O stress test. Intel is using an LTP stress test.

Bibo, you reported seeing the problem even with the patch. Can you confirm that the patch you're testing is the one in comment #10 (not the one in comment #2)? How long does it take you to reproduce the problem? The LKML mail here: http://lkml.org/lkml/2006/7/25/61 mentions three days, but it must happen sooner sometimes, if you've already seen it with the comment #10 patch.

HP is currently testing the comment #10 patch. Without the patch, we were seeing failures after 12 hours or so of hazard testing.

Created attachment 134946 [details]
another nat consumption panic, on rx2620 with montecito
We saw this on an rx2620 with two Montecitos, threads enabled, 32GB, about four hours into the rhr2-2.0.3 memory cert test. Note that the backtrace is slightly different (it came through open/create, not mkdir), but the offset into journal_dirty_metadata and the NULL pointer in r18 are identical.
Bjorn, can you grab a vmcore from the 2620?

Bjorn, is the oops in comment #15 on a stock kernel, or patched?

The crash in comment #15 (rx2620 with two Montecitos) is with stock RHEL4 U3. We don't have a vmcore for that crash. When testing RHEL4 U3 + the comment #10 patch on SandDune with 64 Montecitos, we saw the same crash again after about 12 hours of hazard.

Ok, thanks for trying it, anyway. Suggestion: try the following patch from RHEL4U4: Patch1049: linux-2.6.9-ext3-jbd-race.patch. It is supposed to eliminate some races that *might* help with this problem. I've already suggested it to Bjorn, but I thought I'd add a note here to see if Mao Bibo could test as well. He did mention that he would test on U5 (which presumably has the patch - I have not checked), but I don't think he reported results. Out of curiosity, has an upstream kernel been checked with this testcase? (2.6.17, or 2.6.18-rcX?) That might be a lot to ask, but it might help to know whether we should be looking for a problem that has already been fixed.

Bibo, two questions:
1. Do you have a way of reproducing this problem in a short period of time (<<12 hours)?
2. Can you provide an update on how your testing with U4 (for this problem) is progressing?
Regards, Ron

Created attachment 135093 [details]
ext3 patch from RHEL4 U4

Just for convenience, here's the RHEL4U4 patch referred to in comment #23. It's from this upstream LKML discussion: http://lkml.org/lkml/2005/3/8/147

I'm having a bit of a hard time finding my way through the twisty maze of inlines in journal_dirty_metadata, but I believe that the problem at journal_dirty_metadata+0x221/0x540 is around this part of the code:

    if (jh->b_transaction == transaction &&
        jh->b_jlist == BJ_Metadata) {

We oops at:

    3000: 0b 90 80 46 00 21 [MMI] adds r18=32,r35;;

Where did r35 come from? Backing up to the top of the function:

    2df6: f0 00 86 00 42 c0       adds r15=64,r33

r33 was the buffer_head passed in (2nd arg); b_private is at offset 64, and b_private is the jh. Later we do:

    2e06: 30 02 3c 30 20 00       ld8 r35=[r15]

so now the jh is in r35, and if we're oopsing at:

    3000: 0b 90 80 46 00 21 [MMI] adds r18=32,r35;;

we're trying to look at offset 32 from jh, which is jh->b_transaction.

Now, the patch mentioned in comment #23 above and attached in comment #29 had this in Stephen's original analysis: "A truncate running in parallel can lead to journal_unmap_buffer() destroying the jh if it occurs between these two calls." If jh were destroyed and NULL, then this would match what is seen in the oops. We oops just after this line in assembly:

    3000: 0b 90 80 46 00 21 [MMI] adds r18=32,r35;;

and r18 in the oops is:

    r18 : 0000000000000020

b_transaction in a journal head is at offset 0x20 (32). So if "jh" were NULL, and we tried to read jh->b_transaction, we'd try to read memory address 0x20, which would cause this panic. So I think there's a good case to be made that we are seeing a NULL jh here, and if the patch from RHEL4U4 seems to fix it, it probably is indeed the right fix for this case.

Created attachment 135099 [details]
ext3 jbd crashed on RHEL4.3

I tested on RHEL4-U4; the system crashed in a different place.
This time it seems to be the same as the bug reported in https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=158363. I will apply the patch attached there to RHEL4-U4 and test again. I will also modify the LTP test case to check whether it can shorten the time to reproduce this bug.

(In reply to comment #26)
> Bibo,
> Two questions:
> 1. Do you have a way of reproducing this problem in a short period of time (<<12 hours)?

Now I am modifying the LTP test script to check whether it can shorten the time to reproduce this problem.

> 2. Can you provide an update on how your testing with U4 (for this problem) is progressing?

I am testing RHEL4-U4 with the patch attached in https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=158363. I tested on a Madison machine with 4 physical CPUs; the system did not crash. On RHEL4.3, when I compiled the ext3 filesystem built-in and without the CONFIG_DEBUG_SPINLOCK and CONFIG_DEBUG_SPINLOCK_SLEEP options, the system did not crash. But when I compiled the kernel with the default options from the SRPM package on a Montecito with 4 physical dual-core CPUs and hyperthreading enabled, the system crashed, generally within 48 hours.

Just ran into a similar crash on an unmodified RHEL4 U3. This crash is in fs/jbd/transaction.c though:

    diskfs[21164]: bugcheck! 0 [1]
    Assertion failure in do_get_write_access() at fs/jbd/transaction.c:608: "jh->b_next_transaction == ((void *)0)"
    kernel BUG at fs/jbd/transaction.c:608!
    Modules linked in: md5 ipv6 parport_pc lp parport dev_acpi(U) autofs4 sunrpc ds yenta_socket
     pcmcia_core scsi_dump diskdump zlib_deflate mptctl(U) lpfcdfc(U) vfat fat dm_mod button
     ohci_hcd ehci_hcd shpchp tg3 e1000 s2io sg sr_mod ext3 jbd qla2400(U) qla2300(U) qla2xxx(U)
     qla2xxx_conf(U) lpfc(U) scsi_transport_fc cciss mptspi(U) mptscsih(U) mptbase(U) sd_mod scsi_mod

    Pid: 21164, CPU 10, comm: diskfs
    psr : 0000101008126010 ifs : 8000000000000794 ip : [<a0000002000f0b90>] Not tainted
    ip is at do_get_write_access+0xbb0/0x11c0 [jbd]
    unat: 0000000000000000 pfs : 0000000000000794 rsc : 0000000000000003
    rnat: 0000000000000158 bsps: 0000073b2cea0f24 pr  : 0000001805659959
    ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f
    csd : 0000000000000000 ssd : 0000000000000000
    b0  : a0000002000f0b90 b6  : a00000010006ff00 b7  : a000000100259f40
    f6  : 1003e0000000000001200 f7  : 1003e8080808080808081
    f8  : 1003e00000000000023dc f9  : 1003e000000000e58b20e
    f10 : 1003e000000003571d99d f11 : 1003e44b831eee7285baf
    r1  : a0000001009add90 r2  : 0000000000000001 r3  : 0000000000100000
    r8  : 0000000000000028 r9  : 0000000000000001 r10 : e00007000002510c
    r11 : 0000000000000003 r12 : e000070fe4537c50 r13 : e000070fe4530000
    r14 : 0000000000004000 r15 : a000000100742ac0 r16 : a000000100742ac8
    r17 : e000070ff405fde8 r18 : e000070ff405802c r19 : e000070000025100
    r20 : e0000700000247c0 r21 : 0000000000000002 r22 : 0000000000000001
    r23 : e000070ff4058040 r24 : e000070000025860 r25 : e000070000025858
    r26 : e000070000025838 r27 : 0000000000000074 r28 : 0000000000000074
    r29 : 00000000ffffffff r30 : e000070ff4058080 r31 : 0000000000000000

    Call Trace:
     [<a000000100016b20>] show_stack+0x80/0xa0 sp=e000070fe45377c0 bsp=e000070fe4531628
     [<a000000100017430>] show_regs+0x890/0x8c0 sp=e000070fe4537990 bsp=e000070fe45315d8
     [<a00000010003dbf0>] die+0x150/0x240 sp=e000070fe45379b0 bsp=e000070fe4531598
     [<a00000010003dd20>] die_if_kernel+0x40/0x60 sp=e000070fe45379b0 bsp=e000070fe4531568
     [<a00000010003dec0>] ia64_bad_break+0x180/0x600 sp=e000070fe45379b0 bsp=e000070fe4531540
     [<a00000010000f540>] ia64_leave_kernel+0x0/0x260 sp=e000070fe4537a80 bsp=e000070fe4531540
     [<a0000002000f0b90>] do_get_write_access+0xbb0/0x11c0 [jbd] sp=e000070fe4537c50 bsp=e000070fe45314a0
     [<a0000002000f1660>] journal_get_write_access+0x60/0xa0 [jbd] sp=e000070fe4537cb0 bsp=e000070fe4531460
     [<a00000020018ca70>] add_dirent_to_buf+0x4f0/0x7c0 [ext3] sp=e000070fe4537cb0 bsp=e000070fe45313d8
     [<a00000020018cee0>] ext3_add_entry+0x1a0/0x1240 [ext3] sp=e000070fe4537cc0 bsp=e000070fe45312c0
     [<a00000020018e2b0>] ext3_add_nondir+0x30/0x100 [ext3] sp=e000070fe4537d90 bsp=e000070fe4531288
     [<a00000020018e5b0>] ext3_create+0x230/0x240 [ext3] sp=e000070fe4537d90 bsp=e000070fe4531238
     [<a000000100146000>] vfs_create+0x260/0x380 sp=e000070fe4537da0 bsp=e000070fe45311d8
     [<a0000001001475a0>] open_namei+0xe20/0xf00 sp=e000070fe4537da0 bsp=e000070fe4531150
     [<a00000010011e380>] filp_open+0x80/0x140 sp=e000070fe4537db0 bsp=e000070fe4531110
     [<a00000010011e9d0>] sys_open+0xd0/0x1a0 sp=e000070fe4537e30 bsp=e000070fe4531090
     [<a00000010000f3e0>] ia64_ret_from_syscall+0x0/0x20 sp=e000070fe4537e30 bsp=e000070fe4531090
     [<a000000000010640>] 0xa000000000010640 sp=e000070fe4538000 bsp=e000070fe4531090

From comment #32 and comment #33, it's my understanding that RHEL4U4 resolves the original issue, thanks to the linux-2.6.9-ext3-jbd-race.patch in U4. However, it sounds like testing then runs into the bug in 158363. Can HP do testing with an RHEL4U4 kernel, with default configuration options, plus the patch in comment #26 in bug 158363*, to see if the test runs reliably with that set of code? Thanks, -Eric
*https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=130424

Sorry, I don't follow you, Eric. Comment 32 says RHEL4U4 didn't crash on Madison. Comment 33 says RHEL4U3 crashed with a different panic (I think this update should probably be on a different bugzilla entry).
Neither one tells me whether RHEL4U4 fixes the original issue on Montecito. That said, my opinion is that you are right, and the patch in comment #29 (also included in RHEL4U4) probably DOES resolve the problem. We've started testing RHEL4U4, and we haven't seen the problem yet. Of course, we are continuing that testing, and our confidence will increase with time. But I would feel a lot better if the author of the patch (Stephen, I think you said) looked at this crash and confirmed that it looks like a manifestation of the bug he fixed. I just don't know enough about the filesystem code to have an opinion.

Bibo, can you report back the following:
1. Which processor you used when you originally reported this bug (Madison, Montecito, or both)
2. Whether you verified U4+patch on Madison, Montecito, or both. Comment #32 (thank you for the response) indicates Madison.

Ugh, sorry, I meant to reference comment #31 and comment #32. Comment #31 says that RHEL4U4 crashed in a -different- place, likely due to the other bug referenced. That's why I was interested in the combination of RHEL4U4 plus the patch from the other bug. Comment #32 says:
1) The stock U4 kernel did not crash on Madison.
2) U3 rebuilt with more debugging options did not crash (on what?) (although the debugging probably changes timing, and could well change or avoid racey problems).
3) Rebuilt "kernel with default option in SRPM package in Montecito with 4 physical cpu, dual-core and hyperthread function, system crashed."
I can't tell what kernel we're talking about in 3), or what hardware was tested. Also, I believe that my analysis in comment #30 makes a very strong case that you are seeing a NULL journal head, which is addressed by the patch in U4.

(In reply to comment #36)
> Bibo,
> Can you report back the following:
> 1. Which processor you used when you originally reported this bug (Madison,
> Montecito or both)

Sorry for the confusion. Originally this bug was produced on a Montecito machine, and the test case passed on a Madison machine. Previously I mainly tested the RHEL4.3 kernel with the default configuration options from the SRPM: Montecito failed and Madison passed. When I changed the kernel options, Montecito also passed; I did not test the changed configuration options on Madison.

> 2. If you verified U4+patch on Madison, Montecito or both. Comment #32 (Thank
> you for the response) indicates Madison

Now I am verifying U4 + patch on a Montecito machine. Comment #32 indicates Montecito; it hit another bug. I did not test the U4 kernel on the Madison machine, because my Madison machine cannot hit this bug; lately I always run the LTP test cases on the Montecito machine.

(In reply to comment #37)
> comment #32 says:
> 1) stock U4 kernel did not crash on madison

Sorry, I did not state it clearly: the stock U3 kernel did not crash on Madison. I did not verify the U4 kernel on Madison.

> 2) Rebuilt U3 with more debugging options did not crash (on what?)
> (although the debugging probably changes timing, and could well change or
> avoid racey problems)

U3 rebuilt with the tiger_defconfig options did not crash on Montecito; lately, in order to reproduce this bug, I always test the stock kernel with default options on Montecito.

> 3) Rebuilt "kernel with default option in SRPM package in Montecito with 4
> physical cpu, dual-core and hyperthread function, system crashed."
> I can't tell what kernel we're talking about in 3), or what hardware was tested.
> Also, I believe that my analysis in comment #30 makes a very strong case that
> you are seeing a null journal head, which is addressed by the patch in U4.

Yes, your analysis in comment #30 is right. I have debugged this NaT consumption problem, and the jh pointer is empty. But I do not know whether the patch in U4 can fix this problem. Now I am testing U4+patch on a Montecito machine.
It seems that U4+patch passed on my Montecito machine; I will test again.

Bibo, thanks for your testing! This is good news. Regards, Ron

We've completed a 96-hour hazard run on RHEL4 U4 with no u320 drives (see note below), with no problem found. This same configuration on RHEL4 U3 typically crashed in 12 hours or so. I'm pretty confident that the patch you identified is the fix, Eric. Can you double-check with Stephen that a null pointer dereference in journal_dirty_metadata() is one manifestation of the bug that he fixed?

When running hazard on RHEL4 U4 with u320 drives, we did see a panic due to a null pointer dereference at scsi_request_fn+0x730. This occurred during a "task abort", and we believe it is related to this issue tracker: https://enterprise.redhat.com/issue-tracker/?module=issues&action=view&tid=93312

Bjorn/Rick, any updates on the ETA for the driver for 93312? Regards, Ron

Re: 93312, we have not identified a root cause of the MPT Fusion driver problems yet.

I ran the LTP stress test cases twice against the kernel rpm package from http://people.redhat.com/~jbaron/rhel4, three days of stress testing each time, and both runs passed. The bug at https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=158363 also did not occur with this kernel version.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Just to make sure everyone is on the same page - I think the consensus is that this particular bug is -already- fixed in RHEL4 U4. HP folks, do you concur?

> Just to make sure everyone is on the same page - I think the consensus
> is that this particular bug is -already- fixed in RHEL4 U4. HP folks,
> do you concur?
Yes.
Closing based on customer feedback that this is resolved in RHEL4U4.