Bug 817130

Summary: Add warning when using snapshots of mirrors
Product: Red Hat Enterprise Linux 6 Reporter: Jonathan Earl Brassow <jbrassow>
Component: lvm2 Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED ERRATA QA Contact: Cluster QE <mspqa-list>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.2 CC: agk, cmarthal, dwysocha, heinzm, jbrassow, mbroz, msnitzer, prajnoha, prockai, thornber, zkabelac
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: lvm2-2.02.95-9.el6 Doc Type: Bug Fix
Doc Text:
LVM now has two implementations for creating mirrored logical volumes - the "mirror" segment type and the "raid1" segment type. The "raid1" segment type contains design improvements over the "mirror" segment type that are useful to its operation with snapshots. As a result, users who employ snapshots of mirrored volumes are encouraged to use the "raid1" segment type rather than the "mirror" segment type. Users who continue to use the "mirror" segment type as the origin LV for snapshots should plan for the possibility of the following disruptions.

When a snapshot is created or resized, it forces I/O through the underlying origin. The operation will not complete until this occurs. If a device failure occurs to a mirrored logical volume (of "mirror" segment type) that is the origin of the snapshot being created or resized, the mirror will delay I/O until it is reconfigured. The mirror cannot be reconfigured until the snapshot operation completes, but the snapshot operation cannot complete unless the mirror releases the I/O. Again, the problem can manifest itself when the mirror suffers a failure simultaneously with a snapshot creation or resize.

There is no current solution to this problem beyond converting the mirror from the "mirror" segment type to the "raid1" segment type. To convert an existing mirror from the "mirror" segment type to the "raid1" segment type, perform the following action:
~> lvconvert --type raid1 <VG>/<mirrored LV>
This operation can only be undone via 'vgcfgrestore'.

With the current version of LVM2, if the "mirror" segment type is used to create a new mirror LV, a warning message is issued about possible problems, and it suggests using the "raid1" segment type instead.
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-06-20 15:03:46 UTC Type: Bug

Description Jonathan Earl Brassow 2012-04-27 19:07:52 UTC
Snapshots of mirrors are not supported.  They will never be supported for the 'mirror' segment type.  We should disallow the creation in the code.

Snapshots of LVM RAID segment types will be supported. (Be careful when implementing the check not just to simply check for lv_is_mirrored(), as this may inadvertently disallow "raid1" segment type.)
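
The caveat about lv_is_mirrored() can be illustrated with a minimal sketch. This is not the lvm2 code (which is C); the `LogicalVolume` class and both function names are hypothetical, chosen only to show why a generic "is mirrored" test would also catch "raid1" while a check on the segment type name would not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalVolume:
    """Hypothetical stand-in for an LV record; not an lvm2 structure."""
    name: str
    segtype: str  # e.g. "linear", "mirror", "raid1"


# Both segment types provide mirroring, so a generic test matches both.
MIRRORING_SEGTYPES = {"mirror", "raid1"}


def is_mirrored(lv: LogicalVolume) -> bool:
    """Generic test: true for the legacy mirror and for raid1 alike."""
    return lv.segtype in MIRRORING_SEGTYPES


def snapshot_origin_allowed(lv: LogicalVolume) -> bool:
    """Reject only the legacy "mirror" segment type as a snapshot origin."""
    return lv.segtype != "mirror"


legacy = LogicalVolume("mirror", "mirror")
raid = LogicalVolume("data", "raid1")  # LV name is illustrative
```

Here `is_mirrored()` is true for both volumes, so using it alone as the rejection test would block `raid`; `snapshot_origin_allowed()` blocks only `legacy`.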

Comment 1 Jonathan Earl Brassow 2012-05-01 19:57:51 UTC
Upstream code has been modified to disallow snapshots of mirrors.  This will not affect existing snapshots of mirrors, but users will not be allowed to create new ones.

commit 8322deefee572850d487d4f8582555d5235474ee
Author: Jonathan Earl Brassow <jbrassow>
Date:   Tue May 1 19:21:24 2012 +0000

    Disallow snapshots of mirror segment types.
    
    Snapshots of RAID logical volumes are allowed (including "raid1").  However,
    snapshots of "mirror" logical volumes has been disallowed due to unsolvable
    issues inherent to the design.  The fact that mirroring (dm-raid1.c) must
    stop all I/O as the result of a failure and wait for userspace intervention
    can lead to a circular dependency if userspace is simultaneously waiting for
    snapshots (on mirrors) to make an I/O update before proceeding.
    
    Various snapshot on mirror tests have been removed as a result.

Comment 5 Peter Rajnoha 2012-05-02 12:08:02 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Snapshots of mirror logical volumes are not supported with the exception of volumes using RAID mirror segment type where snapshots are still possible.

Comment 8 Corey Marthaler 2012-05-02 16:26:20 UTC
Fix verified in the latest rpms.


2.6.32-269.el6.x86_64
lvm2-2.02.95-7.el6    BUILT: Wed May  2 05:14:03 CDT 2012
lvm2-libs-2.02.95-7.el6    BUILT: Wed May  2 05:14:03 CDT 2012
lvm2-cluster-2.02.95-7.el6    BUILT: Wed May  2 05:14:03 CDT 2012
udev-147-2.41.el6    BUILT: Thu Mar  1 13:01:08 CST 2012
device-mapper-1.02.74-7.el6    BUILT: Wed May  2 05:14:03 CDT 2012
device-mapper-libs-1.02.74-7.el6    BUILT: Wed May  2 05:14:03 CDT 2012
device-mapper-event-1.02.74-7.el6    BUILT: Wed May  2 05:14:03 CDT 2012
device-mapper-event-libs-1.02.74-7.el6    BUILT: Wed May  2 05:14:03 CDT 2012
cmirror-2.02.95-7.el6    BUILT: Wed May  2 05:14:03 CDT 2012


[root@taft-01 ~]# lvs -a -o +devices
 LV                VG    Attr     LSize   Log         Copy%  Devices
 mirror            taft  mwi-a-m- 100.00m mirror_mlog 100.00 mirror_mimage_0(0),mirror_mimage_1(0),mirror_mimage_2(0)
 [mirror_mimage_0] taft  iwi-aom- 100.00m                    /dev/sdb1(0)
 [mirror_mimage_1] taft  iwi-aom- 100.00m                    /dev/sdc1(0)
 [mirror_mimage_2] taft  iwi-aom- 100.00m                    /dev/sdd1(0)
 [mirror_mlog]     taft  lwi-aom-   4.00m                    /dev/sdh1(0)

[root@taft-01 ~]# lvcreate -s taft/mirror -n snap -L 50M
 Rounding up size to full physical extent 52.00 MiB
 Snapshots of "mirror" segment types are not supported

Comment 9 Alasdair Kergon 2012-05-03 09:36:18 UTC
    Technical note updated.
    
    Diffed Contents:
@@ -1 +1,3 @@
+[STILL BEING EDITED]
+
 Snapshots of mirror logical volumes are not supported with the exception of volumes using RAID mirror segment type where snapshots are still possible.

Comment 10 Alasdair Kergon 2012-05-03 10:39:48 UTC
See also:
https://www.redhat.com/archives/dm-devel/2010-August/msg00145.html


    * From: Mikulas Patocka <mpatocka redhat com>
    * To: Alasdair G Kergon <agk redhat com>
    * Cc: dm-devel redhat com
    * Subject: [dm-devel] snapshots of mirror problems
    * Date: Wed, 18 Aug 2010 10:50:34 -0400 (EDT)

The problem is this:

Creating the first snapshot:
----------------------------

- preloads -cow, -real devices and origin and snapshot targets

- suspends the underlying lv (mirror in this case) without 
DM_SUSPEND_NOFLUSH_FLAG and with DM_SUSPEND_LOCKFS_FLAG. This waits for 
all bios to drain and calls a filesystem driver to bring it to consistent 
state.

- swap table with origin targets

- resumes the underlying lv, the snapshot target and the origin target

Handling a mirror failure:
--------------------------

- preload the new table with linear volume or a mirror with reduced number 
of legs or a mirror with new legs allocated according to the allocation 
policy

- suspend the mirror with the "noflush" flag; "noflush" causes failing 
bios to be queued in device mapper

- swap table with the new one

- resume the mirror, queued bios are dequeued and passed to the new device


Now, the problem:
-----------------

1. If you say that these two operations are independent, two processes 
will race with suspend and resume on the same device. Bad.

2. If you put lock around, it changes into deadlock possibility: if during 
bio draining or filesystem cleanup dm-raid1 suffers a failure, the failure 
can't be recovered.

3. If you are suspending without DM_SUSPEND_NOFLUSH_FLAG, DM_ENDIO_REQUEUE 
is not allowed and requests returned with DM_ENDIO_REQUEUE are returned 
with -EIO (see function dec_pending). So if mirror leg or log failure 
happens, dm-raid1 returns DM_ENDIO_REQUEUE and the I/O is incorrectly 
finished with -EIO. If you remove this DM_ENDIO_REQUEUE->-EIO logic from 
dec_pending, go to case 2 above (deadlock).


As for the possibility "it is very improbable" --- I think there is one 
case where the probability may be more than minimal. If the user has a 
mounted filesystem and doesn't use it for a long time, the disk may have 
failed (or be unplugged) and the system doesn't notice it because the disk 
isn't used. Now, if the user creates a snapshot of mirror and it starts 
cleaning up filesystem journal, it may be the point where the disk error 
is detected. But it can't be repaired.

I think it isn't easy to fix (see those 3 points above), the only possible 
ways to fix it would be:

- make the mirror self-sufficient (integrate md) 
or
- attach dummy dm-linear (or snapshot-origin) passthrough target on the 
top of each mirror. If we do it, snapshot creation could suspend this 
dummy passthrough target and simultaneously dmeventd could suspend the 
underlying mirror and there would be no race or deadlock.

Mikulas
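
The circular wait in case 2 of the email can be sketched as a small wait-for graph. This is only a model under stated assumptions: the node names (`snapshot_create`, `bio_drain`, `mirror_repair`, `passthrough_drain`) are labels invented for illustration, not device-mapper identifiers, and the second graph is one reading of the proposed dummy passthrough target.

```python
def find_cycle(waits_for):
    """Return the nodes of one cycle in a wait-for graph, or None if acyclic."""
    visiting = []   # current depth-first path
    done = set()    # nodes already proven cycle-free

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Reached a node already on the path: everything from it onward is a cycle.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in waits_for:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Case 2: a lock serializes snapshot creation and mirror repair.
locked = {
    "snapshot_create": ["bio_drain"],      # suspend without noflush waits for bios to drain
    "bio_drain": ["mirror_repair"],        # a failed mirror holds bios until reconfigured
    "mirror_repair": ["snapshot_create"],  # repair needs the lock held by the snapshot operation
}

# Proposed fix: the snapshot suspends a passthrough target layered on the
# mirror, so mirror repair no longer waits on the snapshot operation.
layered = {
    "snapshot_create": ["passthrough_drain"],
    "passthrough_drain": ["mirror_repair"],
    "mirror_repair": [],
}
```

`find_cycle(locked)` reports the loop through all three nodes, while `find_cycle(layered)` finds none, which matches the email's claim that the passthrough layout removes the deadlock.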

Comment 11 Jonathan Earl Brassow 2012-05-07 15:11:01 UTC
    Technical note updated.
    
    Diffed Contents:
@@ -1,3 +1 @@
-[STILL BEING EDITED]
+LVM now has two implementations for creating mirrored logical volumes - the "mirror" segment type and the "raid1" segment type.  The "raid1" segment type contains design improvements over the "mirror" segment type that are useful to its operation with snapshots.  As a result, snapshots of "mirror" logical volumes will not be supported with the exception of logical volumes created using the "raid1" segment type.  For more information on creating RAID logical volumes in LVM, refer to the LVM Guide, "Logical Volume Manager Administration".
-
-Snapshots of mirror logical volumes are not supported with the exception of volumes using RAID mirror segment type where snapshots are still possible.

Comment 12 Alasdair Kergon 2012-05-07 19:22:21 UTC
    Technical note updated.
    
    Diffed Contents:
@@ -1 +1,3 @@
+[STILL BEING EDITED]
+
 LVM now has two implementations for creating mirrored logical volumes - the "mirror" segment type and the "raid1" segment type.  The "raid1" segment type contains design improvements over the "mirror" segment type that are useful to its operation with snapshots.  As a result, snapshots of "mirror" logical volumes will not be supported with the exception of logical volumes created using the "raid1" segment type.  For more information on creating RAID logical volumes in LVM, refer to the LVM Guide, "Logical Volume Manager Administration".

Comment 15 Jonathan Earl Brassow 2012-05-07 21:36:59 UTC
    Technical note updated.
    
    Diffed Contents:
@@ -1,3 +1,9 @@
 [STILL BEING EDITED]
 
-LVM now has two implementations for creating mirrored logical volumes - the "mirror" segment type and the "raid1" segment type.  The "raid1" segment type contains design improvements over the "mirror" segment type that are useful to its operation with snapshots.  As a result, snapshots of "mirror" logical volumes will not be supported with the exception of logical volumes created using the "raid1" segment type.  For more information on creating RAID logical volumes in LVM, refer to the LVM Guide, "Logical Volume Manager Administration".
+LVM now has two implementations for creating mirrored logical volumes - the "mirror" segment type and the "raid1" segment type.  The "raid1" segment type contains design improvements over the "mirror" segment type that are useful to its operation with snapshots.  As a result, users who employ snapshots of mirrored volumes are encouraged to use the "raid1" segment type rather than the "mirror" segment type.  Users who continue to use the "mirror" segment type as the origin LV for snapshots should plan for the possibility of the following disruptions.
+
+When a snapshot is created or resized, it forces I/O through the underlying origin.  The operation will not complete until this occurs.  If a device failure occurs to a mirrored logical volume (of "mirror" segment type) that is the origin of the snapshot being created or resized, it will delay I/O until it is reconfigured.  The mirror cannot be reconfigured until the snapshot operation completes, but the snapshot operation cannot complete unless the mirror releases the I/O.  Again, the problem can manifest itself when the mirror suffers a failure simultaneously with a snapshot creation or resize.
+
+There is no current solution to this problem beyond converting the mirror from the "mirror" segment type to the "raid1" segment type.  In order to convert an existing mirror from the "mirror" segment type to the "raid1" segment type, perform the following action:
+~> lvconvert --type raid1 <VG>/<mirrored LV>
+This operation can only be undone via 'vgcfgrestore'.
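
Before running the conversion above, it may help to list which LVs still use the "mirror" segment type. The following is a minimal sketch that parses text assumed to come from `lvs --noheadings -o vg_name,lv_name,segtype`; the function name and the sample text are illustrative (only taft/mirror appears in this report), and the parser assumes exactly three whitespace-separated columns per line.

```python
def legacy_mirrors(lvs_output: str) -> list[str]:
    """Return "VG/LV" names whose segment type is exactly "mirror".

    Expects one LV per line with three columns: vg_name, lv_name, segtype.
    Lines that do not have exactly three fields are skipped.
    """
    found = []
    for line in lvs_output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        vg, lv, segtype = fields
        if segtype == "mirror":
            found.append(f"{vg}/{lv}")
    return found


# Illustrative sample, not captured output.
sample = """
  taft mirror mirror
  taft data   raid1
  taft home   linear
"""

candidates = legacy_mirrors(sample)
```

Each name in `candidates` is in the `<VG>/<mirrored LV>` form that the lvconvert command takes.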

Comment 19 Corey Marthaler 2012-05-17 22:18:31 UTC
Verified new snapshot of mirrors warning as well as ran basic snapshot of mirrors regression tests. Marking verified in the latest rpms.

2.6.32-269.el6.x86_64
lvm2-2.02.95-9.el6    BUILT: Wed May 16 10:34:14 CDT 2012
lvm2-libs-2.02.95-9.el6    BUILT: Wed May 16 10:34:14 CDT 2012
lvm2-cluster-2.02.95-9.el6    BUILT: Wed May 16 10:34:14 CDT 2012
udev-147-2.41.el6    BUILT: Thu Mar  1 13:01:08 CST 2012
device-mapper-1.02.74-9.el6    BUILT: Wed May 16 10:34:14 CDT 2012
device-mapper-libs-1.02.74-9.el6    BUILT: Wed May 16 10:34:14 CDT 2012
device-mapper-event-1.02.74-9.el6    BUILT: Wed May 16 10:34:14 CDT 2012
device-mapper-event-libs-1.02.74-9.el6    BUILT: Wed May 16 10:34:14 CDT 2012
cmirror-2.02.95-9.el6    BUILT: Wed May 16 10:34:14 CDT 2012

Comment 20 Jonathan Earl Brassow 2012-05-22 18:46:15 UTC
    Technical note updated.
    
    Diffed Contents:
@@ -1,5 +1,3 @@
-[STILL BEING EDITED]
-
 LVM now has two implementations for creating mirrored logical volumes - the "mirror" segment type and the "raid1" segment type.  The "raid1" segment type contains design improvements over the "mirror" segment type that are useful to its operation with snapshots.  As a result, users who employ snapshots of mirrored volumes are encouraged to use the "raid1" segment type rather than the "mirror" segment type.  Users who continue to use the "mirror" segment type as the origin LV for snapshots should plan for the possibility of the following disruptions.
 
 When a snapshot is created or resized, it forces I/O through the underlying origin.  The operation will not complete until this occurs.  If a device failure occurs to a mirrored logical volume (of "mirror" segment type) that is the origin of the snapshot being created or resized, it will delay I/O until it is reconfigured.  The mirror cannot be reconfigured until the snapshot operation completes, but the snapshot operation cannot complete unless the mirror releases the I/O.  Again, the problem can manifest itself when the mirror suffers a failure simultaneously with a snapshot creation or resize.

Comment 21 Peter Rajnoha 2012-05-23 08:00:21 UTC
    Technical note updated.
    
    Diffed Contents:
@@ -4,4 +4,6 @@
 
 There is no current solution to this problem beyond converting the mirror from the "mirror" segment type to the "raid1" segment type.  In order to convert an existing mirror from the "mirror" segment type to the "raid1" segment type, perform the following action:
 ~> lvconvert --type raid1 <VG>/<mirrored LV>
-This operation can only be undone via 'vgcfgrestore'.
+This operation can only be undone via 'vgcfgrestore'.
+
+With current version of LVM2, if the "mirror" segment type is used to create a new mirror LV, a warning message is issued to warn a user about possible problems and it suggests using the "raid1" segment type instead.

Comment 23 errata-xmlrpc 2012-06-20 15:03:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0962.html