From Bugzilla Helper:
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.3; Linux) (KHTML, like Gecko)

Description of problem:
Using IBM's "pounder" test (sorry, it's not distributable), I've been seeing hangs w/ RHEL4 Beta2. So far it looks like the hangs have occurred only on an IBM x360 and possibly an IBM x440 (although a different x440 passed without a problem). Looking over vmstat/slabinfo/meminfo logs from the test run, I don't see any lowmem exhaustion (lowmem sticks around 2 megs).

The hangs are somewhat odd. The system responds to pings but is not ssh'able. In at least one case the keyboard numlock worked and the mouse would still move in X, but the GDM application would not respond to keypresses.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-1.648_EL

How reproducible:
Sometimes

Steps to Reproduce:
1. Install RHEL4 beta2
2. Run pounder test case
3. Wait about 2-3 hours

Actual Results:
System hangs, but is pingable. X even responds to mouse movement, but applications don't seem to respond (GDM doesn't get keypresses and displays the time of the hang, ssh doesn't work).

Expected Results:
System continues running w/o issue.

Additional info:
I'm working to narrow down the issue to a single test I can distribute.
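The lowmem check mentioned above can be automated when reviewing captured meminfo logs. The following is only a sketch: it assumes each capture is the text of /proc/meminfo from a 32-bit highmem kernel, which exposes a LowFree field, and the function name and sample values are made up for illustration.

```python
def low_free_kb(meminfo_text):
    """Return the LowFree value in kB from /proc/meminfo text, or None if absent."""
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == "LowFree":
            # The value is the first token after the colon, e.g. "2048 kB".
            return int(rest.split()[0])
    return None


# Hypothetical capture, not taken from the logs in this report.
sample = "MemTotal:  4000000 kB\nLowFree:      2048 kB\n"
print(low_free_kb(sample))
```

Running this over each snapshot in a log gives a quick series to scan for a downward trend before a hang.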
Created attachment 106613
console panic image

Yes, I am the king of annoying bug reports! Here is a camera capture of the panic seen on the system. I'm going to try to get a netconsole dump to lessen my lameness.
More usable text-based output captured from the service processor:

esi: c97fb6fc edi: 00000000 ebp: f7e8a600 esp: c349ade4
ds: 007b es: 007b ss: 0068
Process cp (pid: 27211, threadinfo=c349a000 task=f5115770)
Stack: cc115828 00000000 c97fb6fc 00000000 ed3390f8 f88ec9df c97fb6fc f7e92200
       00000000 cc115828 ffffff86 00000000 00000007 f4e0f8f4 ce664000 ce663020
       f88ec755 ce663000 00000000 ed3390b4 ed339028 00000007 00000001 00001000
Call Trace:
 [<f88ec9df>] ext3_xattr_set_handle2+0x23d/0x417 [ext3]
 [<f88ec755>] ext3_xattr_set_handle+0x6db/0x728 [ext3]
 [<f88ecc03>] ext3_xattr_set+0x4a/0x83 [ext3]
 [<f88ee132>] ext3_xattr_security_set+0x3c/0x83 [ext3]
 [<c016f716>] generic_setxattr+0x48/0x50
 [<c019e0ae>] post_create+0x1b7/0x203
 [<c0161116>] vfs_create+0xe7/0xef
 [<c01614af>] open_namei+0x177/0x5b8
 [<c0153e8d>] filp_open+0x23/0x3c
 [<c02bdaa4>] __cond_resched+0x14/0x39
 [<c01b5a5a>] direct_strncpy_from_user+0x3e/0x5d
 [<c015419f>] sys_open+0x31/0x7d
 [<c02bf487>] syscall_call+0x7/0xb
 [<c02b007b>] cookie_v4_check+0xd9/0x3ca
Code: 04 8b 2b 0f 85 32 01 00 00 f6 45 00 02 0f 85 28 01 00 00 eb 0b f3 90 8b 06 a9 00 00 08 00 75 f5 f0 0f ba 2e 13 19 c0 85 c0 75 ec <39> 5f 14 75 37 83 7f 08 02 75 31 3b 5d 38 0f 84 e6 00 00 00 68
Created attachment 106732
netconsole logs from x360 that hung

Here are the netconsole logs. It looks like there are two oopses somewhat tangled together.
Created attachment 106733
netconsole logs from x440 that hung

This is from a different box that has seen the hangs as well. This log is somewhat different, however. No panic, but lots of ext3 errors.
There _is_ a panic in that last report --- but it seems to be from netconsole. Might be worth opening that in a separate bugzilla, it's clearly distinct from the ext3 problems. I've seen this xattr bug reported only once before, against FC3, but adding debug code to the kernel there did not help me get any further with it: https://bugzilla.redhat.com/beta2/show_bug.cgi?id=137237 This is exactly the same footprint. Would it be possible for you to capture a netdump for this, please?
Also, could you please characterise the workload that you're using to recreate this? Thanks.
Does the attachment for comment #3 not have what you're asking for? As far as the workload goes, we're basically doing lots of disk->disk copies, NFS->disk copies, and running a large number of dd processes.
No, the attachment is just an oops log --- I'd like a complete vmcore if possible, please.
Ok, I'll need to read up on how to capture netdumps (sorry for the confusion). The systems need to be reloaded because they've moved on to testing other distro releases, so I'll probably not have this for you till next week.
*** Bug 137237 has been marked as a duplicate of this bug. ***
Andrew Tridgell has reported this under Samba stress loads, and has added an xattr test option to his dbench stress tool.

  cvs -d :pserver:cvs.org:/cvsroot co dbench

and run dbench with the "-x" option.

With this, I was able to reproduce the problem within a dozen or so dbench cycles. This greatly reduces the pressure for external help --- with a local reproducer I should be able to get further into the problem.
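For a rough idea of what an xattr-heavy workload looks like, here is a small Python sketch. It is not dbench and is not known to reproduce the oops. The attribute name user.stress, the file naming, and both function names are invented for illustration. It creates files, sets the same attribute value on each, and deletes them again from several threads at once. Identical values are used on the assumption that ext3 may share one attribute block between files with identical attributes, which is the sharing path discussed in this report.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def xattr_churn(directory, worker_id, iterations, value=b"same-value"):
    """Create, tag, and delete files; return (tagged, unsupported) counts."""
    tagged = unsupported = 0
    for i in range(iterations):
        path = os.path.join(directory, f"w{worker_id}-{i}")
        with open(path, "wb") as f:
            f.write(b"x")
        try:
            # Every file gets the same value (hypothetical attribute name).
            os.setxattr(path, "user.stress", value)
            tagged += 1
        except (OSError, AttributeError):
            # Filesystem or platform without user xattr support.
            unsupported += 1
        os.unlink(path)
    return tagged, unsupported


def run_stress(workers=4, iterations=50, directory=None):
    """Run xattr_churn in parallel inside a scratch directory under `directory`."""
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(
                pool.map(lambda w: xattr_churn(scratch, w, iterations), range(workers))
            )
    return sum(t for t, _ in results), sum(u for _, u in results)
```

To exercise a particular filesystem, pass a path on that mount as `directory`; a non-zero second count means the attribute calls were rejected and nothing useful was tested.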
We've got a candidate fix for this. I'm reviewing that now, and will start testing on it shortly. Fortunately, the original problem case is fairly easy to reproduce, so targeted testing should minimise the risk from this fix.
The initial fix had an easily fixed flaw. The second version exposed a bug elsewhere in the jbd journaling buffer-release mechanism (because it allowed the existing xattr code to make much higher use of the buffer-release code than it was doing before.)

The buffer-release code has a relatively simple fix too, and that fix is also needed for a couple of other cases including handling races when allocating on a heavily fragmented filesystem. But we've never seen a report of that case in practice, so in reality it's probably just going to be needed for this xattr case.

The buffer-release fix looks obviously correct, but could conceivably trigger other problems, and could also have some performance impact. I believe the risk is low, but it deserves testing.

Anyway, the combined fix survived two 50-process xattr-enabled dbench runs on two separate disks in parallel for over 13 hours last night, so the fix is definitely an improvement over the existing code. Reassuringly, dbench throughput was not affected in the slightest by the buffer-release fix.
Created attachment 108362
Fix for mbcache/xattr races

This is the fix currently being tested. It is identical to the version tested last night for 14 hours except for one critical fix found during code review, plus one other fix for an error path that cannot be hit except when encountering on-disk corruption.
Created attachment 108363
Fix possible transaction overflow in journal_release_buffer()

Using the previous patch, Andrew Tridgell was able to trigger a latent problem in ext3's jbd layer. journal_release_buffer() is used by the xattr code to deal with a race condition --- if a process is looking to share an attribute block but by the time it has acquired journaling rights the attr block has been deleted, it would release the buffer again.

But journal_release_buffer() was not safe in all cases. If you take write access to a buffer, then other processes attempting to take write access to the same buffer were allowed to piggy-back on the original process's credits, and would not take their own journal buffer credit. So if the initial process did a journal_release_buffer(), we'd end up with *no* credits outstanding against a buffer being journaled, and that mis-accounting can lead to overflowing the journal.

The fix is to always take a buffer credit in do_get_write_access(), unless the buffer is already part of the running transaction *AND* is dirty. The latter part of that condition was not tested for previously.

The risk is that if you have many processes modifying the same buffer and not doing journal_release_buffer(), we end up accounting for the same buffer multiple times; so the pessimistic accounting may cause us to create shorter transactions, impacting performance.
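The credit mis-accounting described here can be illustrated with a toy model. This is a sketch in Python, not the jbd code: the classes, the `fixed` flag, and the one-credit-per-buffer simplification are all assumptions made for illustration, and the function names only loosely mirror do_get_write_access() and journal_release_buffer().

```python
class Buffer:
    """Toy journal buffer: tracks only the two states the fix tests for."""
    def __init__(self):
        self.in_transaction = False
        self.dirty = False


class Handle:
    """Toy journal handle holding a pool of buffer credits."""
    def __init__(self, credits):
        self.initial = credits
        self.credits = credits


def get_write_access(handle, buf, fixed):
    """Charge a credit unless piggy-backing is allowed; return credits charged."""
    if fixed:
        # New rule: piggy-back only if already in the transaction AND dirty.
        piggyback = buf.in_transaction and buf.dirty
    else:
        # Old rule: piggy-back whenever the buffer is in the transaction.
        piggyback = buf.in_transaction
    buf.in_transaction = True
    if piggyback:
        return 0
    handle.credits -= 1
    return 1


def release_buffer(handle, charged):
    """Give back whatever this handle was charged for the buffer."""
    handle.credits += charged


def credits_backing_buffer(fixed):
    """Replay the race: A and B take access, A backs out, B dirties the buffer."""
    a, b = Handle(5), Handle(5)
    buf = Buffer()
    charged_a = get_write_access(a, buf, fixed)
    get_write_access(b, buf, fixed)
    release_buffer(a, charged_a)  # A finds the attr block gone and releases
    buf.dirty = True              # B goes on to modify the buffer
    return (a.initial - a.credits) + (b.initial - b.credits)


print(credits_backing_buffer(fixed=False))  # 0: dirty buffer, nothing reserved
print(credits_backing_buffer(fixed=True))   # 1: B took its own credit
```

In the same model, several handles that take write access before anyone dirties the buffer are each charged under the fixed rule, which is the over-accounting risk noted above.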
Created attachment 108382
Patch committed to cvs for this bug
*** Bug 143020 has been marked as a duplicate of this bug. ***
IBM, can this issue be closed out?
I'd say yes, it can be closed out. I've not seen this issue for a while.
Fixed in final RC. There has been a bit more work done on the patch upstream: the patch here allowed for the xattr code to avoid the use of ext3's journal_release code entirely, which AG has done; and it has been combined with extra patches to allow for in-inode xattr storage. The combined patch set has been merged into the -mm tree for later inclusion in 2.6 mainline.