Red Hat Bugzilla – Bug 138951
[RHEL4 beta2] System occasionally hangs while under heavy load on IBM x360
Last modified: 2007-11-30 17:07:14 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.3; Linux) (KHTML,
Description of problem:
Using IBM's "pounder" test (sorry, it's not distributable), I've been
seeing hangs w/ RHEL4 Beta2. So far it looks like the hangs have
occurred only on an IBM x360 and possibly an IBM x440 (although a
different x440 passed without a problem).
Looking over vmstat/slabinfo/meminfo logs from the test run, I don't
see any lowmem exhaustion (lowmem sticks around 2 megs).
The hangs are somewhat odd. The system responds to pings but is not
ssh'able. In at least one case the keyboard numlock worked and the
mouse still would move in X, however the GDM application would not
respond to keypresses.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install RHEL4 beta2
2. Run pounder test case
3. Wait about 2-3 hours
Actual Results: System hangs, but is pingable. X even responds to
mouse movement, but applications don't seem to respond (GDM doesn't
get keypresses and displays time of the hang, ssh doesn't work).
Expected Results: System continues running w/o issue.
I'm working to narrow down the issue to a single test I can
Created attachment 106613 [details]
console panic image
Yes, I am the king of annoying bug reports! Here is a camera capture of the
panic seen on the system. I'm going to try to get a netconsole dump to lessen
More usable text based output captured from the service processor:
esi: c97fb6fc edi: 00000000 ebp: f7e8a600 esp: c349ade4
ds: 007b es: 007b ss: 0068
Process cp (pid: 27211, threadinfo=c349a000 task=f5115770)
Stack: cc115828 00000000 c97fb6fc 00000000 ed3390f8 f88ec9df c97fb6fc
00000000 cc115828 ffffff86 00000000 00000007 f4e0f8f4 ce664000
f88ec755 ce663000 00000000 ed3390b4 ed339028 00000007 00000001
[<f88ec9df>] ext3_xattr_set_handle2+0x23d/0x417 [ext3]
[<f88ec755>] ext3_xattr_set_handle+0x6db/0x728 [ext3]
[<f88ecc03>] ext3_xattr_set+0x4a/0x83 [ext3]
[<f88ee132>] ext3_xattr_security_set+0x3c/0x83 [ext3]
Code: 04 8b 2b 0f 85 32 01 00 00 f6 45 00 02 0f 85 28 01 00 00 eb 0b f3 90 8b 06 a9 00 00 08 00 75 f5 f0 0f ba 2e 13 19 c0 85 c0 75 ec <39> 5f 14 75 37 83 7f 08 02 75 31 3b 5d 38 0f 84 e6 00 00 00 68
Created attachment 106732 [details]
netconsole logs from x360 that hung
Here are the netconsole logs. Looks like there's two oopses somewhat tangled
Created attachment 106733 [details]
netconsole logs from x440 that hung
This is from a different box that has seen the hangs as well. This log is
somewhat different, however. No panic, but lots of ext3 errors.
There _is_ a panic in that last report --- but it seems to be from
netconsole. Might be worth opening that in a separate bugzilla, it's
clearly distinct from the ext3 problems.
I've seen this xattr bug reported only once before, against FC3, but
adding debug code to the kernel there did not help me get any further
This is exactly the same footprint.
Would it be possible for you to capture a netdump for this, please?
Also, could you please characterise the workload that you're using to
recreate this? Thanks.
Does the attachment for comment #3 not have what you're asking for?
As far as the workload goes, we're basically doing lots of disk->disk
copies, NFS->disk copies, and running a large number of dd processes.
No, the attachment is just an oops log --- I'd like a complete vmcore if
Ok, I'll need to read up on how to capture netdumps (sorry for the
confusion). The systems need to be reloaded because they've moved on
to testing other distro releases, so I'll probably not have this for
you till next week.
*** Bug 137237 has been marked as a duplicate of this bug. ***
Andrew Tridgell has reported this under Samba stress loads, and has
added an xattr test option to his dbench stress tool.
cvs -d :pserver:firstname.lastname@example.org:/cvsroot co dbench
and run dbench with the "-x" option.
With this, I was able to reproduce the problem within a dozen or so
dbench cycles. This greatly reduces the pressure for external help
--- with a local reproducer I should be able to get further into the
We've got a candidate fix for this. I'm reviewing that now, and will start
testing on it shortly.
Fortunately, the original problem case is fairly easy to reproduce, so targeted
testing should minimise the risk from this fix.
The initial fix had an easily fixed flaw.
The second version exposed a bug elsewhere in the jbd journaling
buffer-release mechanism (because it allowed the existing xattr code
to make much higher use of the buffer-release code than it was doing
The buffer-release code has a relatively simple fix too, and that fix
is also needed for a couple of other cases including handling races
when allocating on a heavily fragmented filesystem. But we've never
seen a report of that case in practice, so in reality it's probably
just going to be needed for this xattr case. The buffer-release fix
looks obviously correct, but could conceivably trigger other problems,
and could also have some performance impact. I believe the risk is
low, but it deserves testing.
Anyway, the combined fix survived two 50-process xattr-enabled dbench
runs on two separate disks in parallel for over 13 hours last night,
so the fix is definitely an improvement over the existing code.
Reassuringly, dbench throughput was not affected in the slightest by
the buffer-release fix.
Created attachment 108362 [details]
Fix for mbcache/xattr races
This is the fix currently being tested. It is identical to the version tested
last night for 14 hours except for one critical fix found during code review,
plus one other fix for an error path that cannot be hit except when
encountering on-disk corruption.
Created attachment 108363 [details]
Fix possible transaction overflow in journal_release_buffer()
Using the previous patch, Andrew Tridgell was able to trigger a latent problem
in ext3's jbd layer.
journal_release_buffer() is used by the xattr code to deal with a race
condition --- if a process is looking to share an attribute block but by the
time it has acquired journaling rights the attr block has been deleted, it
would release the buffer again.
But journal_release_buffer() was not safe in all cases. If you take write
access to a buffer, then other processes attempting to take write access to the
same buffer were allowed to piggy-back on the original process's credits, and
would not take their own journal buffer credit. So if the initial process did a
journal_release_buffer(), we'd end up with *no* credits outstanding against a
buffer being journaled, and that mis-accounting can lead to overflowing the
The fix is to always take a buffer credit in do_get_write_access(), unless the
buffer is already part of the running transaction *AND* is dirty. The latter
part of that condition was not tested for previously.
The risk is that if you have many processes modifying the same buffer and not
doing journal_release_buffer(), we end up accounting for the same buffer
multiple times; so the pessimistic accounting may cause us to create shorter
transactions, impacting performance.
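To make the accounting argument above concrete, here is a toy model in C. It is not the jbd code and not the attached patch: every name in it (toy_buffer, toy_transaction, race_scenario and so on) is hypothetical, and it only tracks one counter for one buffer. It is meant as a rough sketch of the two rules described above, assuming a writer is charged at most one credit per call.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_buffer {
    bool in_running_transaction; /* buffer has joined the running transaction */
    bool dirty;                  /* buffer was dirtied within that transaction */
};

struct toy_transaction {
    int credits_outstanding;     /* credits currently charged for this buffer */
};

typedef void (*access_fn)(struct toy_transaction *, struct toy_buffer *);

/* Old rule: later writers piggy-back on the first writer's credit. */
static void get_write_access_old(struct toy_transaction *t, struct toy_buffer *b)
{
    if (!b->in_running_transaction)
        t->credits_outstanding++;
    b->in_running_transaction = true;
}

/* Fixed rule: skip the credit only if the buffer is in the running
 * transaction AND already dirty. */
static void get_write_access_fixed(struct toy_transaction *t, struct toy_buffer *b)
{
    if (!(b->in_running_transaction && b->dirty))
        t->credits_outstanding++;
    b->in_running_transaction = true;
}

/* The releasing writer hands its credit back. */
static void release_buffer(struct toy_transaction *t)
{
    assert(t->credits_outstanding > 0);
    t->credits_outstanding--;
}

/* Writer A takes access, writer B takes access, A releases, B journals
 * the buffer. Returns the credits still charged for the buffer. */
static int race_scenario(bool use_fixed_rule)
{
    access_fn take = use_fixed_rule ? get_write_access_fixed : get_write_access_old;
    struct toy_transaction t = { 0 };
    struct toy_buffer b = { false, false };

    take(&t, &b);        /* writer A */
    take(&t, &b);        /* writer B */
    release_buffer(&t);  /* writer A backs out */
    b.dirty = true;      /* writer B goes on to journal the buffer */
    return t.credits_outstanding;
}

/* The performance risk: n writers take access, none releases, and the
 * buffer is not dirtied in between. Returns the credits charged. */
static int writers_without_release(int n, bool use_fixed_rule)
{
    access_fn take = use_fixed_rule ? get_write_access_fixed : get_write_access_old;
    struct toy_transaction t = { 0 };
    struct toy_buffer b = { false, false };

    for (int i = 0; i < n; i++)
        take(&t, &b);
    return t.credits_outstanding;
}
```

In this model race_scenario(false) ends with 0 credits against a buffer that is being journaled, which is the mis-accounting described above, while race_scenario(true) ends with 1. The cost shows up in writers_without_release(4, true), which charges 4 credits where the old rule charged 1.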
Created attachment 108382 [details]
Patch committed to cvs for this bug
*** Bug 143020 has been marked as a duplicate of this bug. ***
IBM, can this issue be closed out?
I'd say yes, it can be closed out. I've not seen this issue for a while.
Fixed in final RC.
There has been a bit more work done on the patch upstream: the patch
here allowed for the xattr code to avoid the use of ext3's
journal_release code entirely, which AG has done; and it has been
combined with extra patches to allow for in-inode xattr storage. The
combined patch set has been merged into the -mm tree for later
inclusion in 2.6 mainline.