Red Hat Bugzilla – Bug 221619
FC6 reliably hangs with heavy xfs filesystem activity on a LUKS volume
Last modified: 2007-11-30 17:11:52 EST
Description of problem:
The whole system freezes and requires a power cycle when there is heavy disk
activity on an xfs filesystem that is on an encrypted LUKS volume.
Version-Release number of selected component (if applicable):
How reproducible:
Force heavy filesystem activity on xfs over LUKS.
Steps to Reproduce:
1. Create and mount an xfs filesystem on an encrypted LUKS volume.
2. Rsync about 50GB of large and small files to the filesystem over the network
on a machine with 512MB RAM.
3. Log in and out several times both on the console and via ssh while the rsync
is running.
Actual results:
Eventually the system hangs every time.
Expected results:
Slow response due to heavy disk activity but eventually the rsync finishes and
the system returns to normal.
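For anyone retrying this, the reproduction steps can be sketched roughly as the
script below. It is only an illustration: the device, mapping name, mount point
and rsync source are placeholders that do not come from the report, and by
default the script just prints the commands instead of running them.

```shell
#!/bin/sh
# Rough sketch of the reproduction steps. Every name below is a placeholder,
# not a value taken from the report.
DEV=${DEV:-/dev/sdX1}          # hypothetical spare partition; luksFormat destroys its contents
NAME=${NAME:-cryptxfs}         # hypothetical device-mapper name
MNT=${MNT:-/mnt/cryptxfs}      # hypothetical mount point
SRC=${SRC:-otherhost:/data/}   # hypothetical rsync source holding about 50GB of mixed files
DRY_RUN=${DRY_RUN:-1}          # 1 = only print the commands, anything else = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

reproduce() {
    run cryptsetup luksFormat "$DEV"          # step 1: create the LUKS volume
    run cryptsetup luksOpen "$DEV" "$NAME"    #         and unlock it
    run mkfs.xfs "/dev/mapper/$NAME"          #         create the xfs filesystem
    run mkdir -p "$MNT"
    run mount "/dev/mapper/$NAME" "$MNT"      #         mount it
    run rsync -a "$SRC" "$MNT/"               # step 2: heavy write load over the network
}

reproduce
```

Running it with DRY_RUN=0 as root would execute the commands for real. Step 3
(logging in and out on the console and via ssh) still has to be done by hand
while the rsync is running.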
If there's anything you can do to capture the system state when this happens,
that would be good - either a crashdump, or sysrq-type info (m for memory, t for
tasks).
Also can you say whether this is specific to xfs? Might be worth testing on
ext3 as well, just for comparison. If it looks xfs specific, maybe I can get
the sgi guys to take a look.
There does not appear to be any way to capture additional state. The system
just halts and refuses to accept input. A power cycle is necessary to do
anything with it.
I tried reiserfs a while back and that fails in a different way. It just goes
into hours and hours of constant disk activity as if it is in an infinite loop.
There is something fundamental failing, and the way the filesystems react to
that is a different problem.
Hmm so the sysrq key doesn't work either? Pity... need more info to see what's
going on. Any messages on the console?
Have you tried it on ext3? It's Red Hat's favorite filesystem after all. :)
Nothing gets written to the screen. The screen doesn't change, so it looks the
same as it does when X windows is running, but there is no response to any
attempt at input and no disk activity.
Never tried it on ext3; that filesystem is too slow when not freshly created to
meet my needs.
Ok; I guess my only other suggestions (other than finding me some time to
reproduce it!) would be to try reproducing it on a text console rather than X,
so that you can see any messages, or perhaps set up a serial console to capture
messages, and try the sysrq key from either of those to gather system state or
initiate a crashdump...
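As a rough sketch of what that preparation could look like: the functions below
enable the magic SysRq key and request the memory (m) and task (t) dumps through
/proc. The function names are made up for illustration, and the path variables
are overridable only so the functions can be tried against scratch files. This
needs root, and it only helps while a shell still responds; once the machine is
hung, the key combination on the console is the only option.

```shell
#!/bin/sh
# Standard procfs locations; overridable only for trying the functions on scratch files.
SYSRQ_ENABLE=${SYSRQ_ENABLE:-/proc/sys/kernel/sysrq}
SYSRQ_TRIGGER=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}

# Enable all magic SysRq functions (needs root). Do this before starting the rsync.
enable_sysrq() {
    echo 1 > "$SYSRQ_ENABLE"
}

# Ask the kernel for a memory summary (m) and a task list (t).
# The output goes to the kernel log and the console, not to stdout.
dump_state() {
    for key in m t; do
        echo "$key" > "$SYSRQ_TRIGGER"
    done
}
```

Calling enable_sysrq and then dump_state before the hang is a quick check that
the dumps actually reach the console. From the keyboard the same dumps are
Alt+SysRq+m and Alt+SysRq+t; on a serial console it is a BREAK followed by the
key. A serial console is usually set up with kernel boot parameters along the
lines of console=ttyS0,115200 console=tty0 (assuming the first serial port).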
Without X, there would probably have to be something else with a large working
set in memory, so that may take some doing to reproduce the problem.
I don't have the same system configuration available anymore since I gave up on
FC6 and went back to FC5.
Reporter cannot test fixes because he has gone back to FC5.
Current FC6 kernel is 2.6.20-1.2933.fc6...
Reporter will test FC6 if a fix becomes available.
It is possible that this bug is related to bug 221621. In that case, a likely
explanation is that the system hangs in certain situations when there is a queue
of pages to be flushed to disk. This can happen when there is a large amount of
disk activity, or when a large proportion of main memory is in use.
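If that explanation is right, the size of the flush queue should be visible in
/proc/meminfo shortly before a hang. A minimal sketch for watching it follows;
the function name and the MEMINFO override are made up for illustration, and
this is only a way to observe the counters, not a confirmed diagnostic for this
bug.

```shell
#!/bin/sh
# Standard location; overridable only so the function can be tried on a sample file.
MEMINFO=${MEMINFO:-/proc/meminfo}

# Print the amount of memory waiting to be written back (Dirty)
# and currently being written back (Writeback).
show_flush_queue() {
    awk '/^(Dirty|Writeback):/ { print $1, $2, $3 }' "$MEMINFO"
}
```

Running something like "while true; do show_flush_queue; sleep 5; done" on a
text or serial console during the rsync would leave the last values on screen
if the system stops responding.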
First you need to test kernel 2.6.20-1.2933.fc6, which is going into testing
tomorrow. There is no way to know in advance if the 6000+ changes that
went into that kernel fixed the problem unless you test.
I will not always be able to do this when a fix has not been attempted,
especially on such short notice, so please plan ahead.
I have reinstalled FC6 and updated to the latest kernel, which yum claims is
2.6.20-1.2925.fc6. Please advise how to get the 2933 version.
2933 is in updates-testing:
yum --enablerepo=updates-testing install kernel
You might have better luck using RPM to install it manually, though.
Thanks. Updating with yum worked fine.
This bug had been very easy to reproduce. Now with 1 1/2 days of testing, I
cannot make it fail. It appears to be fixed as of 2.6.20-1.2933.fc6. Good job.