Bug 621301

Summary: Data corruption on primary device failure in a cluster mirror (cmirror)
Product: Red Hat Enterprise Linux 6 Reporter: Jonathan Earl Brassow <jbrassow>
Component: lvm2 Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED CURRENTRELEASE QA Contact: Corey Marthaler <cmarthal>
Severity: medium Docs Contact:
Priority: low    
Version: 6.0 CC: agk, dwysocha, heinzm, jbrassow, joe.thornber, mbroz, prajnoha, prockai, syeghiay
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: lvm2-2.02.72-4.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-11-10 21:08:38 UTC Type: ---

Description Jonathan Earl Brassow 2010-08-04 17:51:14 UTC
Running the test helter_skelter/kill_primary_synced_2_legs on cluster mirrors elicits an easily reproducible data corruption bug.

When the primary device is removed during the repair operation, the linear device that remains does not contain a valid file system: there are many points of metadata and data corruption.

Comment 2 Jonathan Earl Brassow 2010-08-04 18:13:27 UTC
After a few days of debugging, it has boiled down to a misunderstanding of the return value of 'dm_bit'.  'dm_bit' is only ever used as a boolean operation within LVM, but it can return a range of values.  If the bit is set, a power of 2 is returned.  If the bit is unset, 0 is returned.
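To illustrate the return-value behavior described above, here is a minimal sketch of a mask-based bit test. It is not the libdevmapper source: 'sketch_test_bit', 'SKETCH_BITS_PER_WORD' and the plain word array are hypothetical stand-ins, assumed only to mirror the "0 or a power of 2" behavior reported for 'dm_bit'.

```c
#include <stdint.h>

/* Hypothetical stand-in for a dm_bit-style test; NOT the libdevmapper code. */
#define SKETCH_BITS_PER_WORD 32

static uint32_t sketch_test_bit(const uint32_t *words, int bit)
{
    /* Select the word holding the bit and build a single-bit mask. */
    uint32_t mask = UINT32_C(1) << (bit % SKETCH_BITS_PER_WORD);

    /* Masking yields 0 (bit unset) or the mask itself (bit set),
     * i.e. a power of 2, which is only equal to 1 for bit 0 of a word. */
    return words[bit / SKETCH_BITS_PER_WORD] & mask;
}
```

A caller that only asks "is the result non-zero?" is unaffected; a caller that passes the value on as if it were 0 or 1 is not.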

'log_test_bit' (a function in the cluster mirror log daemon code) has switched to using the dm bit operations in RHEL 6.  There are two places in the daemon code where 'log_test_bit' is not used merely as a boolean; instead, its return value is passed back as the return value of the log functions 'is_clean' and 'in_sync', on the assumption that 'dm_bit' returned only 0 or 1.

One place the 'in_sync' function is utilized is in 'dm_rh_get_state' - a function that informs the mirroring code how to treat I/O and which devices to read from and write to.  'dm_rh_get_state' was checking if the return value of 'in_sync' was 1 to determine if the region was DM_RH_CLEAN.  Since 'dm_bit' (and by extension 'log_test_bit' and 'in_sync') was returning powers of 2, DM_RH_CLEAN was rarely being reported when it should have been.  Thinking the region was out-of-sync, the mirroring code would write only to the primary device.  When the primary device then failed, all of those writes were lost - leaving the entire mirror corrupted.
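The failure mode in the state check can be sketched as follows. The names 'sketch_get_state', 'sketch_normalize' and the 'SKETCH_*' states are hypothetical and exist only for illustration; the one thing taken from the analysis above is the shape of the comparison, an equality test against 1 applied to a value that may be any power of 2.

```c
/* Hypothetical names; only the "== 1" comparison follows the report. */
enum sketch_state { SKETCH_NOSYNC = 0, SKETCH_CLEAN = 1 };

/* Caller that treats the in-sync result as strictly 0 or 1. */
static enum sketch_state sketch_get_state(int in_sync_result)
{
    /* A raw mask value such as 8 is "true" but is not equal to 1,
     * so the region is misreported as out-of-sync. */
    return (in_sync_result == 1) ? SKETCH_CLEAN : SKETCH_NOSYNC;
}

/* Collapsing any non-zero value to 1 restores the expected contract. */
static int sketch_normalize(int raw)
{
    return raw ? 1 : 0;
}
```

With a raw value of 8, 'sketch_get_state(8)' reports the region as not in sync, while 'sketch_get_state(sketch_normalize(8))' reports it as clean. A raw value of exactly 1 (bit 0 of a word) passes either way, which is consistent with the clean state being reported rarely rather than never.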

After much debugging, the patch is simple (and in userspace) :(
 static int log_test_bit(dm_bitset_t bs, int bit)
 {
-       return dm_bit(bs, bit);
+       return dm_bit(bs, bit) ? 1 : 0;
 }

Comment 4 Corey Marthaler 2010-08-13 21:40:44 UTC
The helter_skelter test case kill_primary_synced_2_legs runs without any corruption issues. Marking this bug verified in the latest build.

2.6.32-59.1.el6.x86_64

lvm2-2.02.72-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
lvm2-libs-2.02.72-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
lvm2-cluster-2.02.72-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
udev-147-2.22.el6    BUILT: Fri Jul 23 07:21:33 CDT 2010
device-mapper-1.02.53-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
device-mapper-libs-1.02.53-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
device-mapper-event-1.02.53-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
device-mapper-event-libs-1.02.53-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
cmirror-2.02.72-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010

Comment 5 releng-rhel@redhat.com 2010-11-10 21:08:38 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.