Red Hat Bugzilla – Bug 214495
Possible mmap dirty blocks lost in kernel-2.6.18-1.2200.fc5 ?
Last modified: 2007-11-30 17:11:48 EST
Description of problem:
Our application is an RDF store running on a loose cluster of x86-64 FC5
machines. It uses mmap() for some potentially very large files. Until recently
everything worked as we expected, but upon upgrading to kernel-2.6.18-1.2200.fc5
we experienced a very frustrating data loss problem. Our investigation is
ongoing, but I think it's time to write this up as a bug.
We have found a way to reproduce this problem on our development system, but
unfortunately it requires importing several gigabytes of RDF (which takes an
hour or two) into our custom software.
The noticeable data loss (in such a large volume of data it is hard to be sure
if this is the only problem) is that our "header" structure in the first block
of an mmap'd file sometimes reverts to an earlier version after we munmap() and
close() the file. We added a timestamp to the header which is updated by our
software, and in our debug logs we see that the timestamp has some value (e.g.
1162918168) and then later when the same file is re-opened its timestamp has
rewound somehow (e.g. to 1162918165) and the other metadata in this header
structure has also reverted to earlier values that are no longer appropriate.
Therefore we believe that a dirty block is somehow lost by the kernel, instead
of being written to disk.
Our application does a consistency check when re-opening the file so we notice
this error immediately.
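
A minimal sketch of the kind of timestamp check described above. It is not the store's actual code: the header layout, the `RDFSTORE` magic value and all function names are assumptions made for illustration. The header is updated through a shared mapping, flushed, unmapped and closed, and the file is then re-opened with ordinary reads to confirm the timestamp has not rewound.

```python
import mmap
import os
import struct

# Hypothetical header layout (assumption): 8-byte magic + 64-bit timestamp.
HEADER = struct.Struct("<8sQ")
MAGIC = b"RDFSTORE"
FILE_SIZE = mmap.PAGESIZE  # the header lives in the first block of the file


def write_header(path, timestamp):
    """Store the timestamp via a shared mapping, then flush, unmap and close."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, FILE_SIZE)
        with mmap.mmap(fd, FILE_SIZE, access=mmap.ACCESS_WRITE) as m:
            HEADER.pack_into(m, 0, MAGIC, timestamp)
            m.flush()  # msync() on Linux
        # Leaving the with-block unmaps the region (munmap).
    finally:
        os.close(fd)


def read_header(path):
    """Re-open the file with ordinary reads and return the stored timestamp."""
    with open(path, "rb") as f:
        magic, timestamp = HEADER.unpack(f.read(HEADER.size))
    if magic != MAGIC:
        raise ValueError("bad magic in header")
    return timestamp


def header_is_consistent(path, expected_timestamp):
    """True if the header on re-open carries the timestamp last written."""
    return read_header(path) == expected_timestamp
```

On a healthy kernel `header_is_consistent(path, t)` should hold right after `write_header(path, t)`. A read-back that follows immediately is likely to be served from the page cache, so a lost dirty page may only show up later, once the page has been evicted and re-read from disk.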
If we revert the kernel on the cluster to kernel-2.6.17-1.2187_FC5, the previous
update, we are unable to reproduce this problem even after many hours and many
gigabytes of imported data. Indeed, our production system is currently using
kernel-2.6.17-1.2187_FC5 to store well over a billion RDF triples.
Version-Release number of selected component (if applicable):
How reproducible:
We can reliably reproduce this, but unfortunately we have failed to narrow it
down to a test case that can be uploaded to your Bugzilla.
Steps to Reproduce:
1. Reboot to kernel-2.6.18-1.2200.fc5
2. Run our test import of the US TIGER geographic database in RDF form
Actual results:
Somewhere during the import, usually by the halfway point, the store complains
that consistency checks have failed. The exact point where the data loss starts
to occur varies somewhat, perhaps depending on memory pressure. Afterwards the
logs show that data written to mmap'd memory, and then msync'd and munmap'd,
seems to have mysteriously vanished.
Expected results:
The import should run to completion, and data written to mmap'd memory should
appear identical on disk.
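
The expected behaviour can be expressed as a small page-stamping check. This is only a sketch under assumptions: the `(generation, page)` stamp layout and the function names are invented for illustration and are far simpler than a real import. Each page of a file is stamped through a shared mapping, the mapping is flushed and torn down, and the file is then read back with plain file I/O to list any page that still carries an older stamp.

```python
import mmap
import os
import struct

STAMP = struct.Struct("<QQ")  # (generation, page index); hypothetical layout


def stamp_pages(path, num_pages, generation):
    """Write a (generation, page) stamp at the start of every page via mmap."""
    size = num_pages * mmap.PAGESIZE
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)
        with mmap.mmap(fd, size, access=mmap.ACCESS_WRITE) as m:
            for page in range(num_pages):
                STAMP.pack_into(m, page * mmap.PAGESIZE, generation, page)
            m.flush()  # msync before the mapping is torn down
    finally:
        os.close(fd)


def stale_pages(path, num_pages, generation):
    """Read the file back with plain file I/O; list pages with a wrong stamp."""
    stale = []
    with open(path, "rb") as f:
        for page in range(num_pages):
            f.seek(page * mmap.PAGESIZE)
            if STAMP.unpack(f.read(STAMP.size)) != (generation, page):
                stale.append(page)
    return stale
```

Calling `stamp_pages(path, n, g)` with increasing `g` and then `stale_pages(path, n, g)` should always return an empty list. To have any chance of exposing a lost dirty page, the file would probably need to be much larger than RAM, or the check repeated under memory pressure, so that pages are actually evicted and re-read from disk.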
Additional info:
Are there any known problems or suspected problems in the upstream Linux kernel
that could be related? It will probably be very hard to arrange for you to
reproduce this on a machine under Red Hat's control, for two reasons: Firstly
you'd need our software and a set of identical x86-64 machines to run it on, and
secondly you'd need the gigabytes of test data (or we would have to find some
way to generate fictitious data from a pseudo-random seed, verify that it
triggers the bug and then send you the generator and seed).
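
A rough sketch of what such a seeded generator might look like. The `example.org` URIs, the function names and the triple shape are all assumptions for illustration; nothing here has been checked against the real test data or shown to trigger the bug. The digest gives both sides a cheap way to confirm they generated byte-identical input from the same seed.

```python
import hashlib
import random


def fake_triples(seed, count):
    """Yield `count` reproducible N-Triples-style lines for a given seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    for _ in range(count):
        s = rng.getrandbits(32)
        p = rng.getrandbits(4)   # few predicates, many subjects and objects
        o = rng.getrandbits(32)
        yield (f"<http://example.org/s/{s}> "
               f"<http://example.org/p/{p}> "
               f"<http://example.org/o/{o}> .\n")


def dataset_digest(seed, count):
    """SHA-256 over the generated lines, to compare datasets across machines."""
    h = hashlib.sha256()
    for line in fake_triples(seed, count):
        h.update(line.encode("ascii"))
    return h.hexdigest()
```

Only the generator script, the seed and the triple count would need to be exchanged; comparing `dataset_digest(seed, count)` on both ends confirms the inputs match before an import is attempted.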
Since we don't yet have a support contract we can't expect you to go to such
lengths. I will update this bug if we discover anything further, if a future FC5
kernel release does or does not fix our symptoms, or if we are able to reduce
our testcase to something that can be uploaded to Bugzilla.
Maybe there is an existing Linux test suite for this sort of bug? If so please
point me to it.
Today I found a small change (in our code) which seems to eliminate our
symptoms. A much larger test is running and I am awaiting more information from
the author of this part of the code, but I suspect that we were depending on
some semantics for multiple mmap() calls that are not promised by the kernel or
libc manual pages, and the actual Linux behaviour changed in 2.6.18.
If so, this bug is invalid and I will close it as such in the next few days.
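
For reference, a small sketch of the multiple-mapping behaviour that can be relied on. It does not reproduce the change in our code, which is not shown in this report; the function name and byte patterns are made up for illustration. Two shared mappings of the same file see each other's writes, while a write to a private (copy-on-write) mapping never reaches the file. Whether a private mapping sees later changes to the file is left unspecified by POSIX, so the sketch deliberately does not test that.

```python
import mmap
import os


def mapping_visibility(path):
    """Return (shared_write_seen, private_write_kept_out_of_file)."""
    size = mmap.PAGESIZE
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)
        shared_a = mmap.mmap(fd, size, access=mmap.ACCESS_WRITE)  # shared
        shared_b = mmap.mmap(fd, size, access=mmap.ACCESS_WRITE)  # shared
        private = mmap.mmap(fd, size, access=mmap.ACCESS_COPY)    # private
        try:
            # A store through one shared mapping is visible in the other.
            shared_a[0:4] = b"AAAA"
            shared_write_seen = shared_b[0:4] == b"AAAA"

            # A store through the private mapping stays in that mapping only.
            private[8:12] = b"PPPP"
            shared_a.flush()
            os.lseek(fd, 8, os.SEEK_SET)
            private_write_kept_out = os.read(fd, 4) != b"PPPP"
        finally:
            private.close()
            shared_b.close()
            shared_a.close()
    finally:
        os.close(fd)
    return shared_write_seen, private_write_kept_out
```

On a freshly created file this is expected to return `(True, True)`. Code that depends on anything beyond these guarantees is relying on implementation behaviour that may differ between kernel versions.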
We had some further unexplained problems, where seemingly data was written via
an mmap() and then an earlier version was found on disk.
Forcing an upgrade to the 188.8.131.52-based 2.6.19-1.2895 from the FC6 updates
eliminated all symptoms on our test cluster. Thus it seems likely that all along
we were seeing another example of the famous ext3 + mmap bug Linus fixed in 184.108.40.206.
Can FC5 users get a backport of 220.127.116.11, or at least of Linus' ext3 mmap fix, for
Fedora Core 5? I will probably try to do my own RPM rebuild of the FC6 source
RPM, but we probably won't be the only ones affected by this, so it makes sense
to try to have this as an update for other users too.
FC5 update to upstream kernel 18.104.22.168 is in the works.
*** Bug 227194 has been marked as a duplicate of this bug. ***
*** Bug 211254 has been marked as a duplicate of this bug. ***
Our rebuild of 22.214.171.124 works; we have not seen any data corruption in the six
weeks or so since we upgraded production systems to this version. I have no
reason to assume it's any different for the version Red Hat shipped and no
desire to disrupt production machines by changing to an almost identical kernel.
Hence resolving as fixed by ERRATA.