214495 – Possible mmap dirty blocks lost in kernel-2.6.18-1.2200.fc5 ?

Summary: Possible mmap dirty blocks lost in kernel-2.6.18-1.2200.fc5 ?

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	5
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Duplicates (2):	211254 227194
Depends On:
Blocks:

Reported:	2006-11-07 21:07 UTC by Nick Lamb
Modified:	2007-11-30 22:11 UTC
CC List:	6 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2007-03-22 13:50:28 UTC
Type:	---
Embargoed:
Dependent Products:

Description Nick Lamb 2006-11-07 21:07:32 UTC

Description of problem:

Our application is an RDF store running on a loose cluster of x86-64 FC5
machines. It uses mmap() for some potentially very large files. Until recently
everything worked as we expected, but upon upgrading to kernel-2.6.18-1.2200.fc5
we experienced a very frustrating data loss problem. Our investigation is
ongoing, but I think it's time to write this up as a bug.

We have found a way to reproduce this problem on our development system, but
unfortunately it requires importing several gigabytes of RDF (which takes an
hour or two) into our custom software.

The noticeable data loss (in such a large volume of data it is hard to be sure
if this is the only problem) is that our "header" structure in the first block
of an mmap'd file sometimes reverts to an earlier version after we munmap() and
close() the file. We added a timestamp to the header which is updated by our
software. In our debug logs we see that the timestamp has some value (e.g.
1162918168), and later, when the same file is re-opened, its timestamp has
rewound somehow (e.g. to 1162918165) and the other metadata in this header
structure has also reverted to earlier, no longer appropriate values.
Therefore we believe that a dirty block is somehow lost by the kernel, instead
of being written to disk.

Our application does a consistency check when re-opening the file so we notice
this error immediately.
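
For illustration only, the header round trip described above can be sketched in
a few lines of Python. This is not the store's actual code: the header layout (a
single 64-bit timestamp), the one-page file size, and the function names
write_header, read_header and check_round_trip are all assumptions made for the
example. It only shows the shape of the check: update the header through a
shared mapping, msync, munmap, close, then re-open and compare.

```python
import mmap
import os
import struct

HEADER_FMT = "<Q"          # hypothetical header: one 64-bit timestamp
HEADER_SIZE = struct.calcsize(HEADER_FMT)
FILE_SIZE = mmap.PAGESIZE  # the header lives in the first block of the file


def write_header(path, timestamp):
    """Update the header via a shared mapping, then msync, munmap and close."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, FILE_SIZE)
        m = mmap.mmap(fd, FILE_SIZE)  # shared, read-write mapping by default
        try:
            m[:HEADER_SIZE] = struct.pack(HEADER_FMT, timestamp)
            m.flush()   # msync()
        finally:
            m.close()   # munmap()
    finally:
        os.close(fd)


def read_header(path):
    """Re-open the file and read the timestamp back through a new mapping."""
    with open(path, "rb") as f:
        m = mmap.mmap(f.fileno(), FILE_SIZE, access=mmap.ACCESS_READ)
        try:
            (timestamp,) = struct.unpack(HEADER_FMT, m[:HEADER_SIZE])
        finally:
            m.close()
    return timestamp


def check_round_trip(path, timestamp):
    """Return True if the timestamp written is the timestamp read back."""
    write_header(path, timestamp)
    return read_header(path) == timestamp
```

On a healthy kernel this check should always pass. Reading the file back
immediately is normally served from the page cache, so this sketch on its own
would not be expected to reproduce the reported loss, which only showed up
during a large import.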

If we revert the kernel on the cluster to kernel-2.6.17-1.2187_FC5, the previous
update, we are unable to reproduce this problem even after many hours and many
gigabytes of data imported. Indeed, our production system is currently using
kernel-2.6.17-1.2187_FC5 to store well over a billion RDF triples.


Version-Release number of selected component (if applicable):
kernel-2.6.18-1.2200.fc5

How reproducible:
We can reliably reproduce this, but unfortunately we have failed to narrow it
down to a test case that can be uploaded to your Bugzilla.

Steps to Reproduce:
1. Reboot to kernel-2.6.18-1.2200.fc5
2. Run our test import of the US TIGER geographic database in RDF form
  
Actual results:
Somewhere during the import, usually by the halfway point, the store complains
that consistency checks have failed. The exact point where the data loss starts
to occur seems to vary somewhat, perhaps depending on memory pressure.
Afterwards, logs show that data written to mmap'd memory and then msync'd and
munmap'd seems to have mysteriously vanished.

Expected results:
Import should run to completion, and data written to mmap'd memory should appear
identical on disk.

Additional info:
Are there any known problems or suspected problems in the upstream Linux kernel
that could be related? It will probably be very hard to arrange for you to
reproduce this on a machine under Red Hat's control, for two reasons: Firstly
you'd need our software and a set of identical x86-64 machines to run it on, and
secondly you'd need the gigabytes of test data (or we would have to find some
way to generate fictitious data from a pseudo-random seed, verify that it
triggers the bug and then send you the generator and seed).

Since we don't yet have a support contract we can't expect you to go to such
lengths. I will update this bug if we discover anything further, if a future FC5
kernel release does or does not fix our symptoms, or if we are able to reduce
our testcase to something that can be uploaded to Bugzilla.

Maybe there is an existing Linux test suite for this sort of bug? If so please
point me to it.
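
As a rough idea of what a self-contained generator-plus-seed test case might
look like, here is a minimal Python sketch. It is not an existing test suite and
has not been shown to trigger this bug; the names page_payload, write_pages and
find_stale_pages and the way the seed is mixed with the page index are
assumptions for illustration. It dirties every page of a file through a shared
mapping in shuffled order, then re-reads the file with ordinary read() calls and
reports any page that does not match what was written.

```python
import mmap
import os
import random

PAGE = mmap.PAGESIZE


def page_payload(seed, index):
    """Deterministic contents for one page, derived from (seed, page index)."""
    rng = random.Random(seed * 1_000_003 + index)
    return bytes(rng.getrandbits(8) for _ in range(PAGE))


def write_pages(path, seed, n_pages):
    """Dirty every page through a shared mapping, in a seeded shuffled order."""
    order = list(range(n_pages))
    random.Random(seed).shuffle(order)
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.ftruncate(fd, n_pages * PAGE)
        m = mmap.mmap(fd, n_pages * PAGE)  # shared, read-write by default
        try:
            for i in order:
                m[i * PAGE:(i + 1) * PAGE] = page_payload(seed, i)
            m.flush()   # msync()
        finally:
            m.close()   # munmap()
    finally:
        os.close(fd)


def find_stale_pages(path, seed, n_pages):
    """Re-read with plain read() and list the pages that differ from the seed."""
    bad = []
    with open(path, "rb") as f:
        for i in range(n_pages):
            if f.read(PAGE) != page_payload(seed, i):
                bad.append(i)
    return bad
```

Calling write_pages(path, seed, n) followed by find_stale_pages(path, seed, n)
should return an empty list on a correct kernel. To have any chance of
exercising writeback under memory pressure, n would presumably need to be large
enough that the file does not fit in RAM, which is an assumption rather than
something verified here.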

Comment 1 Nick Lamb 2006-11-08 14:37:51 UTC

Today I found a small change (in our code) which seems to eliminate our
symptoms. A much larger test is running and I am awaiting more information from
the author of this part of the code, but I suspect that we were depending on
some semantics for multiple mmap() calls that are not promised by the kernel or
libc manual pages, and the actual Linux behaviour changed in 2.6.18.

If so, this bug is invalid and I will close it as such in the next few days.

Comment 2 Nick Lamb 2007-02-04 23:21:55 UTC

We had some further unexplained problems, where seemingly data was written via
an mmap() and then an earlier version was found on disk.

Forcing an upgrade to the 2.6.19.2-based 2.6.19-1.2895 from the FC6 updates
eliminated all symptoms on our test cluster. Thus it seems likely that all along
we were seeing another example of the famous ext3 + mmap bug Linus fixed in
2.6.19.2.

Can FC5 users get a backport of 2.6.19.2, or at least of Linus' ext3 mmap fix?
I will probably try to do my own RPM rebuild of the FC6 source RPM, but we
probably won't be the only ones affected by this, so it makes sense to try to
have this as an update for other users too.

Comment 3 Chuck Ebbert 2007-02-05 17:09:53 UTC

FC5 update to upstream kernel 2.6.19.3 is in the works.

Comment 4 Chuck Ebbert 2007-02-05 17:11:21 UTC

*** Bug 227194 has been marked as a duplicate of this bug. ***

Comment 5 Axel Thimm 2007-02-05 17:29:35 UTC

*** Bug 211254 has been marked as a duplicate of this bug. ***

Comment 6 Nick Lamb 2007-03-22 13:50:28 UTC

Our rebuild of 2.6.19.2 works; we have not seen any data corruption in the six
weeks or so since we upgraded production systems to this version. I have no
reason to assume it's any different for the version Red Hat shipped and no
desire to disrupt production machines by changing to an almost identical kernel.
Hence resolving as fixed by ERRATA.
