Bug 175830 - dm-snap.c: Data read from snapshot may be corrupt if origin is being written to simultaneously
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.3
Hardware: All
OS: Linux
Priority: medium
Severity: high
Assigned To: Mikulas Patocka
QA Contact: Brian Brock
Duplicates: 444049
Depends On:
Blocks: 176344 430698 459337 461304
Reported: 2005-12-15 11:18 EST by Alasdair Kergon
Modified: 2009-05-18 15:32 EDT
Doc Type: Bug Fix
Last Closed: 2009-05-18 15:32:26 EDT

Attachments
snapshot flaw demonstration script (1.34 KB, text/plain)
2005-12-15 11:18 EST, Alasdair Kergon
update for detection script (3.90 KB, text/plain)
2005-12-15 17:59 EST, Jonathan Earl Brassow
A patch for this bug (6.06 KB, patch)
2008-12-14 19:57 EST, Mikulas Patocka
An updated patch. (7.07 KB, patch)
2008-12-14 21:15 EST, Mikulas Patocka

Description Alasdair Kergon 2005-12-15 11:18:52 EST
If you read from the snapshot at the same time as the origin is being written
to, the data you get back may be contaminated by what is being written to the
origin.

Neither the origin nor the snapshot is actually corrupted on disk by this bug.
Rather it is read() that can return the wrong blocks of data, so anything
reading the snapshot device (such as a backup program) may see corruption.

The correct read path is fast compared to the time taken by the code in the path
that can race against it - I think this explains why occurrences of the problem
are fortunately rare and hard to reproduce.  (If you re-read the corrupt data,
it'll always be correct - the race can't happen twice on the same data.)

This might be partly responsible for bug 174742 and others.



Explanation
-----------
dm-snap.c:

                /*
                 * FIXME: this read path scares me because we
                 * always use the origin when we have a pending
                 * exception.  However I can't think of a
                 * situation where this is wrong - ejt.
                 */
                                                                                
                /* Do reads */
                down_read(&s->lock);
                                                                                
                /* See if it it has been remapped */
                e = lookup_exception(&s->complete, chunk);
                if (e)
                        remap_exception(s, e, bio);
                else
                        bio->bi_bdev = s->origin->bdev;
                                                                                
                up_read(&s->lock);
                                                                                

As the FIXME suggests, there are situations where what it does is wrong.

Consider:
  A write to the origin creates a pending exception (pe).
  A read from the snapshot gets mapped to the origin.
  The pe processing completes and the write to the origin completes.
  The snapshot read then happens, but sees what was just written to the origin.
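
The sequence above can be replayed in a small userspace model. This is only a sketch under stated assumptions: the toy_* names, the chunk count and the string payloads are all hypothetical and are not taken from dm-snap.c. The one thing it keeps from the quoted read path is that the mapping decision consults only the completed exceptions, and that the actual I/O happens later.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NCHUNKS   4
#define CHUNK_LEN 8

/* Hypothetical toy model; not the dm-snap.c data structures. */
struct toy_snapshot {
    char origin[NCHUNKS][CHUNK_LEN]; /* live origin device */
    char cow[NCHUNKS][CHUNK_LEN];    /* copy-on-write store */
    bool complete[NCHUNKS];          /* completed exceptions only */
};

enum read_target { READ_ORIGIN, READ_COW };

static void toy_init(struct toy_snapshot *s, const char *fill)
{
    memset(s, 0, sizeof(*s));
    for (int i = 0; i < NCHUNKS; i++)
        strncpy(s->origin[i], fill, CHUNK_LEN - 1);
}

/* Map step: like the quoted code, a pending exception is not consulted. */
static enum read_target toy_map_read(const struct toy_snapshot *s, int chunk)
{
    assert(chunk >= 0 && chunk < NCHUNKS);
    return s->complete[chunk] ? READ_COW : READ_ORIGIN;
}

/* I/O step: runs some time after the map step. */
static const char *toy_do_read(const struct toy_snapshot *s, int chunk,
                               enum read_target t)
{
    return t == READ_COW ? s->cow[chunk] : s->origin[chunk];
}

/* Start of an origin write: the pe copies the old chunk to the COW store. */
static void toy_begin_origin_write(struct toy_snapshot *s, int chunk)
{
    if (!s->complete[chunk])
        memcpy(s->cow[chunk], s->origin[chunk], CHUNK_LEN);
}

/* pe completes, then the held-back write reaches the origin. */
static void toy_finish_origin_write(struct toy_snapshot *s, int chunk,
                                    const char *data)
{
    s->complete[chunk] = true;
    memset(s->origin[chunk], 0, CHUNK_LEN);
    strncpy(s->origin[chunk], data, CHUNK_LEN - 1);
}

/* Replays the four steps and returns what the snapshot reader sees. */
static const char *toy_racy_read(struct toy_snapshot *s, int chunk,
                                 const char *new_data)
{
    toy_begin_origin_write(s, chunk);               /* step 1 */
    enum read_target t = toy_map_read(s, chunk);    /* step 2 */
    toy_finish_origin_write(s, chunk, new_data);    /* step 3 */
    return toy_do_read(s, chunk, t);                /* step 4 */
}
```

In this model the racy read returns the newly written origin data, while mapping and reading the same chunk again afterwards goes to the COW store and returns the old data, which matches the observation that a re-read is correct.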


A new dm target, rwsplit, was written to make it straightforward to reproduce
this scenario.  

http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-rwsplit.patch

This target sends reads to one device and writes to another.  The test script
attached uses this to suspend reads to a device whilst allowing writes to
proceed.  The script takes 3 copies of the snapshot device.  They should all be
identical as nothing is writing to the snapshot.  But the middle one gets
contaminated with the data written to the origin.  This happens every time on my
test machine.
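
The idea behind the target and the three-copy check can be sketched in userspace as follows. This is not the rwsplit patch and does not reproduce its table syntax; the struct, the function names and the device names used in the usage note are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_dir { TOY_READ, TOY_WRITE };

/* Hypothetical stand-in for a target that routes I/O by direction. */
struct toy_rwsplit {
    const char *read_dev;  /* where reads are sent (can be suspended) */
    const char *write_dev; /* where writes are sent (keeps running) */
};

/* Pick the backing device purely from the I/O direction. */
static const char *toy_rwsplit_map(const struct toy_rwsplit *t,
                                   enum toy_dir dir)
{
    assert(t != NULL);
    return dir == TOY_WRITE ? t->write_dev : t->read_dev;
}

/*
 * Compare successive copies of the snapshot against the first one.
 * Returns the index of the first copy that differs, or -1 if all match.
 */
static int toy_first_mismatch(const char *copies[], size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (strcmp(copies[0], copies[i]) != 0)
            return (int)i;
    return -1;
}
```

With a hypothetical pair such as { "held_dev", "live_dev" }, reads map to held_dev and writes to live_dev. For three copies where only the middle one picked up origin data, toy_first_mismatch() returns 1; for three identical copies it returns -1.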


A fix will have to add code to track the I/O through the snapshot target -
either directly (as elsewhere in dm targets) or possibly by extending the
pending exception code.
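
One possible shape for the "track the I/O directly" option is sketched below as a userspace model. It is not taken from the attached patches: all trk_* names are hypothetical, and the per-chunk counter is just an assumption about how tracking could look. The point it illustrates is that an origin write is held back while a snapshot read that was mapped to the origin is still in flight on that chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NCHUNKS   4
#define CHUNK_LEN 8

/* Hypothetical toy model with per-chunk read tracking. */
struct trk_snapshot {
    char origin[NCHUNKS][CHUNK_LEN];
    char cow[NCHUNKS][CHUNK_LEN];
    bool complete[NCHUNKS];        /* completed exceptions */
    int  reads_in_flight[NCHUNKS]; /* snapshot reads mapped to the origin */
};

static void trk_init(struct trk_snapshot *s, const char *fill)
{
    memset(s, 0, sizeof(*s));
    for (int i = 0; i < NCHUNKS; i++)
        strncpy(s->origin[i], fill, CHUNK_LEN - 1);
}

/* Map a snapshot read; returns true (and tracks it) if it goes to the origin. */
static bool trk_start_read(struct trk_snapshot *s, int chunk)
{
    assert(chunk >= 0 && chunk < NCHUNKS);
    if (s->complete[chunk])
        return false;
    s->reads_in_flight[chunk]++;
    return true;
}

/* Perform the read into out (CHUNK_LEN bytes) and drop the tracking entry. */
static void trk_end_read(struct trk_snapshot *s, int chunk, bool from_origin,
                         char *out)
{
    memcpy(out, from_origin ? s->origin[chunk] : s->cow[chunk], CHUNK_LEN);
    if (from_origin)
        s->reads_in_flight[chunk]--;
}

/*
 * Try to complete an origin write. Returns false, changing nothing, while
 * tracked reads are outstanding on the chunk; the caller retries later.
 */
static bool trk_try_origin_write(struct trk_snapshot *s, int chunk,
                                 const char *data)
{
    assert(chunk >= 0 && chunk < NCHUNKS);
    if (s->reads_in_flight[chunk] > 0)
        return false;
    if (!s->complete[chunk]) {
        memcpy(s->cow[chunk], s->origin[chunk], CHUNK_LEN);
        s->complete[chunk] = true;
    }
    memset(s->origin[chunk], 0, CHUNK_LEN);
    strncpy(s->origin[chunk], data, CHUNK_LEN - 1);
    return true;
}
```

In this model a write attempted between trk_start_read() and trk_end_read() is refused, so the reader gets the old data; once the read has finished the write goes through and later snapshot reads are served from the COW store. A real implementation would also need locking and a way to wait rather than retry, which this sketch leaves out.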
Comment 1 Alasdair Kergon 2005-12-15 11:18:52 EST
Created attachment 122288
snapshot flaw demonstration script
Comment 19 Linda Wang 2007-01-25 09:57:16 EST
Clearing flags for 4.6...
Comment 22 Rob Kenna 2007-08-23 14:29:20 EDT
re-asserting pm-ack for 4.7
Comment 23 Alasdair Kergon 2008-02-28 18:11:44 EST
Mikulas is working on a solution.  Unclear as yet whether or not this will have
to be kABI-breaking.
Comment 25 Mikulas Patocka 2008-03-25 17:00:00 EDT
Tom Coughlan: Fix for upstream exists (and doesn't break kABI). I don't know if
Alasdair wants to backport it to 4.7 so soon. Probably defer to 4.8.
Comment 26 RHEL Product and Program Management 2008-03-26 10:39:12 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 28 RHEL Product and Program Management 2008-09-03 09:15:01 EDT
Updating PM score.
Comment 29 Mikulas Patocka 2008-12-14 19:57:27 EST
Created attachment 326893
A patch for this bug

A patch for this bug. Backported from upstream and RHEL 5.
Comment 30 Mikulas Patocka 2008-12-14 21:15:52 EST
Created attachment 326898
An updated patch.

The previous patch was wrong; it was only half of the patch. This is the updated, correct patch.
Comment 31 Mikulas Patocka 2009-01-08 04:47:13 EST
*** Bug 444049 has been marked as a duplicate of this bug. ***
Comment 32 Vivek Goyal 2009-02-12 10:34:45 EST
Committed in 81.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
Comment 36 errata-xmlrpc 2009-05-18 15:32:26 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html
