If you read from the snapshot at the same time as the origin is being written
to, the data you get back may be contaminated by what is being written to the
origin.
Neither the origin nor the snapshot is actually corrupted on disk by this bug.
Rather it is read() that can return the wrong blocks of data, so anything
reading the snapshot device (such as a backup program) may see corruption.
The correct read path is fast compared to the time taken by the code in the path
that can race against it - I think this explains why occurrences of the problem
are fortunately rare and hard to reproduce. (If you re-read the corrupt data,
it'll always be correct - the race can't happen twice on the same data.)
This might be partly responsible for bug 174742 and others.
	/*
	 * FIXME: this read path scares me because we
	 * always use the origin when we have a pending
	 * exception.  However I can't think of a
	 * situation where this is wrong - ejt.
	 */

	/* Do reads */

	/* See if it has been remapped */
	e = lookup_exception(&s->complete, chunk);
	if (e)
		remap_exception(s, e, bio);
	else
		bio->bi_bdev = s->origin->bdev;
As the FIXME suggests, there are situations where what it does is wrong.
1. A write to the origin creates a pending exception (pe).
2. A read from the snapshot gets mapped to the origin.
3. The pe processing completes and the write to the origin completes.
4. The snapshot read happens, but it sees what was just written to the origin.
A new dm target, rwsplit, was written to make it straightforward to reproduce
this bug.
This target sends reads to one device and writes to another. The test script
attached uses this to suspend reads to a device whilst allowing writes to
proceed. The script takes 3 copies of the snapshot device. They should all be
identical as nothing is writing to the snapshot. But the middle one gets
contaminated with the data written to the origin. This happens every time on my
test machine.
A fix will have to add code to track the I/O through the snapshot target -
either directly (as elsewhere in dm targets) or possibly by extending the
pending exception code.
Created attachment 122288 [details]
snapshot flaw demonstration script
Clearing flags for 4.6...
re-asserting pm-ack for 4.7
Mikulas is working on a solution. Unclear as yet whether or not this will have
to be kABI-breaking.
Tom Coughlan: Fix for upstream exists (and doesn't break kABI). I don't know if
Alasdair wants to backport it to 4.7 so soon. Probably defer to 4.8.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
release.
Updating PM score.
Created attachment 326893 [details]
A patch for this bug
A patch for this bug, backported from upstream and RHEL 5.
Created attachment 326898 [details]
An updated patch.
The previous patch was wrong: it was only half of the patch. This is the updated, correct patch.
*** Bug 444049 has been marked as a duplicate of this bug. ***
Committed in 81.EL. RPMS are available at http://people.redhat.com/vgoyal/rhel4/
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.