Bug 870246

Summary: LVM RAID: Images that are reintroduced into an array are not synced
Product: Red Hat Enterprise Linux 6
Reporter: Corey Marthaler <cmarthal>
Component: kernel
Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 6.4
CC: agk, dwysocha, heinzm, jbrassow, mcsontos, msnitzer, prajnoha, prockai, thornber, zkabelac
Target Milestone: rc
Keywords: Regression
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: kernel-2.6.32-340.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-02-21 06:53:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Corey Marthaler 2012-10-25 22:21:40 UTC
Description of problem:
This test case worked in RHEL6.3 but fails with the RHEL6.4 RPMs.

# ON RHEL6.3 RPMS
./split_image -l /home/msp/cmarthal/work/sts/sts-root -o taft-01 -r /usr/tests/sts-rhel6.4 -e split_w_tracking_io_merge

SCENARIO - [split_w_tracking_io_merge]
Create a 3-way raid1 with fs data, verify data, split image with tracking, change data on raid vol, merge split image data back, verify origin data
taft-01: lvcreate --type raid1 -m 2 -n split_tracking -L 1G split_image
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 38.51% )
   0/1 mirror(s) are fully synced: ( 82.21% )
   1/1 mirror(s) are fully synced: ( 100.00% )

Placing an ext filesystem on raid1 volume
mke2fs 1.41.12 (17-May-2010)
Mounting raid1 volume

Writing files to /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingA.31142 -n 500
Checking files on /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingA.31142 -v

Issuing a sync to force data to disk
splitting off leg from raid with tracking...
taft-01: lvconvert --splitmirrors 1 --trackchanges split_image/split_tracking

+++ Mounting and verifying split image data +++
mount: block device /dev/mapper/split_image-split_tracking_rimage_2 is write-protected, mounting read-only
Checking files on /mnt/split_tracking_rimage_2
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking_rimage_2 -f /tmp/split_trackingA.31142 -v

Writing new data to the raid and then merging back the split off image
Writing files to /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingB.31142 -n 500
Checking files on /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingB.31142 -v
Writing files to /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingC.31142 -n 500
Checking files on /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingC.31142 -v
Checking files on /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingA.31142 -v

Issuing a sync to force data to disk
Merge split off image split_image/split_tracking_rimage_2 back into the raid
lvconvert --merge split_image/split_tracking_rimage_2
Waiting until all mirror|raid volumes become fully syncd...
   1/1 mirror(s) are fully synced: ( 100.00% )
Issuing a sync to force data to disk
AGAIN, splitting off leg from raid with tracking...
taft-01: lvconvert --splitmirrors 1 --trackchanges split_image/split_tracking

+++ Mounting and verifying split image data +++
Checking files on /mnt/split_tracking_rimage_2
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking_rimage_2 -f /tmp/split_trackingA.31142 -v
Checking files on /mnt/split_tracking_rimage_2
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking_rimage_2 -f /tmp/split_trackingB.31142 -v
Checking files on /mnt/split_tracking_rimage_2
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking_rimage_2 -f /tmp/split_trackingC.31142 -v

Merge split off image split_image/split_tracking_rimage_2 back so it can be deleted
lvconvert --merge split_image/split_tracking_rimage_2
Deactivating mirror split_tracking... and removing



# ON RHEL6.4 RPMS
SCENARIO - [split_w_tracking_io_merge]
Create a 3-way raid1 with fs data, verify data, split image with tracking, change data on raid vol, merge split image data back, verify origin data
taft-02: lvcreate --type raid1 -m 2 -n split_tracking -L 1G split_image
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 35.21% )
   0/1 mirror(s) are fully synced: ( 76.63% )
   1/1 mirror(s) are fully synced: ( 100.00% )

Placing an ext filesystem on raid1 volume
mke2fs 1.41.12 (17-May-2010)
Mounting raid1 volume

Writing files to /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingA.31772 -n 500
Checking files on /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingA.31772 -v

Issuing a sync to force data to disk
splitting off leg from raid with tracking...
taft-02: lvconvert --splitmirrors 1 --trackchanges split_image/split_tracking

+++ Mounting and verifying split image data +++
mount: block device /dev/mapper/split_image-split_tracking_rimage_2 is write-protected, mounting read-only
Checking files on /mnt/split_tracking_rimage_2
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking_rimage_2 -f /tmp/split_trackingA.31772 -v

Writing new data to the raid and then merging back the split off image
Writing files to /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingB.31772 -n 500
Checking files on /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingB.31772 -v
Writing files to /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingC.31772 -n 500
Checking files on /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingC.31772 -v
Checking files on /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingA.31772 -v

Issuing a sync to force data to disk
Merge split off image split_image/split_tracking_rimage_2 back into the raid
lvconvert --merge split_image/split_tracking_rimage_2
Waiting until all mirror|raid volumes become fully syncd...
   1/1 mirror(s) are fully synced: ( 100.00% )
Issuing a sync to force data to disk
AGAIN, splitting off leg from raid with tracking...
taft-02: lvconvert --splitmirrors 1 --trackchanges split_image/split_tracking

+++ Mounting and verifying split image data +++
mount: block device /dev/mapper/split_image-split_tracking_rimage_2 is write-protected, mounting read-only
Checking files on /mnt/split_tracking_rimage_2
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking_rimage_2 -f /tmp/split_trackingA.31772 -v
Checking files on /mnt/split_tracking_rimage_2
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking_rimage_2 -f /tmp/split_trackingB.31772 -v
checkit starting with:
VERIFY
Verify XIOR Stream: /tmp/split_trackingB.31772
Working dir:        /mnt/split_tracking_rimage_2
Can not stat nvehtgqcgkswuwcdtwepckvumwmsatlnppxmyqqhbj: No such file or directory
checkit verify failed

** NONE OF THE NEW DATA EXISTS **
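
checkit is an internal test tool and its source is not part of this report. For anyone who wants to repeat the data check without it, a rough stand-in could look like the sketch below. It assumes plain cksum checksums are good enough for the purpose, and the function names record_files and verify_files are made up for illustration. Like the checkit runs above, it only requires that every recorded file is present and unchanged; extra files in the directory are ignored.

```shell
#!/bin/sh
# Hypothetical stand-in for "checkit -n" / "checkit -v" (not the real tool).

# record_files DIR MANIFEST
# Store "checksum size path" for every regular file under DIR.
record_files() {
    ( cd "$1" && find . -type f -exec cksum {} \; ) > "$2"
}

# verify_files DIR MANIFEST
# Succeed only if every file listed in MANIFEST exists under DIR
# with the same checksum and size.
verify_files() {
    while read -r sum size path; do
        if [ ! -f "$1/$path" ]; then
            echo "missing: $path"
            return 1
        fi
        actual=$(cd "$1" && cksum "$path")
        if [ "$actual" != "$sum $size $path" ]; then
            echo "content differs: $path"
            return 1
        fi
    done < "$2"
    echo "verify ok"
}
```

With these, one would record a manifest on the mounted raid volume after each write pass, then run verify_files against the read-only mount of the split image. On an affected kernel the second verification would be expected to report a missing file, much like the "Can not stat" failure above.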

Version-Release number of selected component (if applicable):
2.6.32-330.el6.x86_64

lvm2-2.02.98-2.el6    BUILT: Tue Oct 16 05:15:59 CDT 2012
lvm2-libs-2.02.98-2.el6    BUILT: Tue Oct 16 05:15:59 CDT 2012
lvm2-cluster-2.02.98-2.el6    BUILT: Tue Oct 16 05:15:59 CDT 2012
udev-147-2.43.el6    BUILT: Thu Oct 11 05:59:38 CDT 2012
device-mapper-1.02.77-2.el6    BUILT: Tue Oct 16 05:15:59 CDT 2012
device-mapper-libs-1.02.77-2.el6    BUILT: Tue Oct 16 05:15:59 CDT 2012
device-mapper-event-1.02.77-2.el6    BUILT: Tue Oct 16 05:15:59 CDT 2012
device-mapper-event-libs-1.02.77-2.el6    BUILT: Tue Oct 16 05:15:59 CDT 2012
cmirror-2.02.98-2.el6    BUILT: Tue Oct 16 05:15:59 CDT 2012


How reproducible:
Every time
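
For reference, the LVM side of the scenario boils down to the command sequence below, taken from the log above. This is a sketch only: the run helper, the RUN switch and the VG/LV variables are illustrative and not part of the test suite, and the filesystem, mount and data steps are left as comments. The image name ending in _rimage_2 is simply what the log shows for this 3-way raid1. By default the script only prints the commands; setting RUN=1 would execute them and needs root plus a scratch volume group.

```shell
#!/bin/sh
# Condensed reproduction sketch. Dry run unless RUN=1 is set.
VG=${VG:-split_image}
LV=${LV:-split_tracking}

# Print the command in dry-run mode, execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

repro_steps() {
    run lvcreate --type raid1 -m 2 -n "$LV" -L 1G "$VG"
    # (wait for the initial sync, make a filesystem, mount, write data set A)

    run lvconvert --splitmirrors 1 --trackchanges "$VG/$LV"
    # (write data sets B and C to the raid volume while the image is split off)

    run lvconvert --merge "$VG/${LV}_rimage_2"
    # (wait until the volume reports 100% synced)

    run lvconvert --splitmirrors 1 --trackchanges "$VG/$LV"
    # (mount the split image read-only and verify A, B and C;
    #  on the affected kernel B and C are missing)

    run lvconvert --merge "$VG/${LV}_rimage_2"
}

repro_steps
```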

Comment 1 Jonathan Earl Brassow 2012-11-01 22:00:46 UTC
Upstream kernel doesn't have this problem.  It could be something that has been recently pulled in...

Comment 2 Jonathan Earl Brassow 2012-11-01 22:24:45 UTC
The problem is worse than just 'splitmirrors'; it affects all RAID types.  If a RAID5 LV has a transient failure, the device will not be sync'ed when it comes back either.  This bug applies to any re-introduction of an image into a RAID LV.

These problems are not in the upstream kernel, which means that the problem is in the generic code - not the personality code.

Comment 3 Jonathan Earl Brassow 2012-11-02 21:30:54 UTC
The upstream kernel and the rhel6.4 kernel handle this situation differently: the upstream kernel writes the array superblocks after a failure is detected, and the rhel kernel does not.  As a result, when the transiently failed device is reintroduced, nothing records that it had failed, so no recovery is performed.

I am not yet sure why the superblocks are not being written.
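
To make the reasoning above concrete, here is a toy model of the decision. It is not kernel code and does not reflect the actual superblock layout; the function name reintroduce_image and the single "recorded as failed" flag are simplifications invented for illustration. The only point it shows is that recovery is decided from what the metadata recorded, not from whether the image is actually stale.

```shell
#!/bin/sh
# Toy model only: does a reintroduced image get resynced?
# $1 = 1 if superblocks are written when the failure is detected, else 0.
reintroduce_image() {
    superblocks_written=$1
    recorded_failed=0

    # The image drops out of the array.
    if [ "$superblocks_written" -eq 1 ]; then
        recorded_failed=1   # the metadata now remembers the failure
    fi

    # Writes land on the remaining images in the meantime, so the absent
    # image is stale whether or not the metadata says so.

    # The image comes back; recovery is decided from the metadata alone.
    if [ "$recorded_failed" -eq 1 ]; then
        echo "recovery scheduled"
    else
        echo "no recovery, image stays stale"
    fi
}

reintroduce_image 1   # behaviour described for the upstream kernel
reintroduce_image 0   # behaviour described for the rhel6.4 kernel
```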

Comment 4 RHEL Program Management 2012-11-02 21:51:09 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 6 Marian Csontos 2012-11-08 18:41:11 UTC
*** Bug 869003 has been marked as a duplicate of this bug. ***

Comment 7 Jarod Wilson 2012-11-12 18:21:44 UTC
Patch(es) available on kernel-2.6.32-340.el6

Comment 9 Marian Csontos 2012-11-13 08:58:51 UTC
Thanks, that fixed the problem. Running more tests using the build.

Comment 11 Corey Marthaler 2012-12-06 20:47:50 UTC
Marking verified in the latest kernel.

2.6.32-343.el6.x86_64

lvm2-2.02.98-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
lvm2-libs-2.02.98-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
lvm2-cluster-2.02.98-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
udev-147-2.43.el6    BUILT: Thu Oct 11 05:59:38 CDT 2012
device-mapper-1.02.77-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
device-mapper-libs-1.02.77-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
device-mapper-event-1.02.77-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
device-mapper-event-libs-1.02.77-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
cmirror-2.02.98-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012


SCENARIO - [split_w_tracking_io_merge]
Create a 3-way raid1 with fs data, verify data, split image with tracking, change data on raid vol, merge split image data back, verify origin data
taft-01: lvcreate --type raid1 -m 2 -n split_tracking -L 1G split_image
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 30.63% )
   0/1 mirror(s) are fully synced: ( 68.17% )
   1/1 mirror(s) are fully synced: ( 100.00% )

Placing an ext filesystem on raid1 volume
mke2fs 1.41.12 (17-May-2010)
Mounting raid1 volume

Writing files to /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingA.14962 -n 500
checkit starting with:
CREATE
Num files:          500
Random Seed:        3592
Verify XIOR Stream: /tmp/split_trackingA.14962
Working dir:        /mnt/split_tracking

Checking files on /mnt/split_tracking
/usr/tests/sts-rhel6.4/bin/checkit -w /mnt/split_tracking -f /tmp/split_trackingA.14962 -v
checkit starting with:
VERIFY
Verify XIOR Stream: /tmp/split_trackingA.14962
Working dir:        /mnt/split_tracking


Issuing a sync to force data to disk
splitting off leg from raid with tracking...
taft-01: lvconvert --splitmirrors 1 --trackchanges split_image/split_tracking

Comment 13 errata-xmlrpc 2013-02-21 06:53:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0496.html