Bug 836381 - upconverting raid volumes can fail due to suspend issues
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jonathan Earl Brassow
QA Contact: Cluster QE
URL:
Whiteboard:
Duplicates: 836376
Depends On:
Blocks:
 
Reported: 2012-06-28 21:19 UTC by Corey Marthaler
Modified: 2013-02-21 08:10 UTC (History)

Fixed In Version: lvm2-2.02.98-1.el6
Doc Type: Bug Fix
Doc Text:
The kernel does not allow images to be added to a RAID logical volume while the array is not in-sync. Previously, the LVM RAID code did not check whether the logical volume was in-sync and could issue the invalid request, which led to errors. LVM now checks for this condition and informs the user that the operation cannot take place until the array is in-sync.
Clone Of:
Environment:
Last Closed: 2013-02-21 08:10:57 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2013:0501 normal SHIPPED_LIVE lvm2 bug fix and enhancement update 2013-02-20 21:30:45 UTC

Description Corey Marthaler 2012-06-28 21:19:21 UTC
Description of problem:
Raid upconvert attempts can fail from time to time due to suspend errors.

./raid_sanity -o hayes-01 -l /home/msp/cmarthal/work/sts/sts-root -r /usr/tests/sts-rhel6.3 -e mirror_up_converts -t raid1 -i 3

============================================================
Iteration 1 of 3 started at Thu Jun 28 14:55:24 CDT 2012
============================================================
SCENARIO (raid1) - [mirror_up_converts]
Create a mirror and then attempt to up convert it
hayes-01: lvcreate --type raid1 -m 1 -n mirror_up_converts -L 100M raid_sanity
Upconvert to 2 redundant legs
raid convert (lvconvert --type raid1 -m 2 raid_sanity/mirror_up_converts)
Upconvert to 3 redundant legs
raid convert (lvconvert --type raid1 -m 3 raid_sanity/mirror_up_converts)
Upconvert to 4 redundant legs
raid convert (lvconvert --type raid1 -m 4 raid_sanity/mirror_up_converts)
Deactivating raid mirror_up_converts... and removing


============================================================
Iteration 2 of 3 started at Thu Jun 28 14:55:38 CDT 2012
============================================================
SCENARIO (raid1) - [mirror_up_converts]
Create a mirror and then attempt to up convert it
hayes-01: lvcreate --type raid1 -m 1 -n mirror_up_converts -L 100M raid_sanity
Upconvert to 2 redundant legs
raid convert (lvconvert --type raid1 -m 2 raid_sanity/mirror_up_converts)
Upconvert to 3 redundant legs
raid convert (lvconvert --type raid1 -m 3 raid_sanity/mirror_up_converts)
Upconvert to 4 redundant legs
raid convert (lvconvert --type raid1 -m 4 raid_sanity/mirror_up_converts)
Deactivating raid mirror_up_converts... and removing


============================================================
Iteration 3 of 3 started at Thu Jun 28 14:55:55 CDT 2012
============================================================
SCENARIO (raid1) - [mirror_up_converts]
Create a mirror and then attempt to up convert it
hayes-01: lvcreate --type raid1 -m 1 -n mirror_up_converts -L 100M raid_sanity
Upconvert to 2 redundant legs
raid convert (lvconvert --type raid1 -m 2 raid_sanity/mirror_up_converts)
  device-mapper: reload ioctl on  failed: Invalid argument
  Failed to suspend raid_sanity/mirror_up_converts before committing changes
  Device '/dev/etherd/e1.1p5' has been left open.
  Device '/dev/etherd/e1.1p10' has been left open.
  Device '/dev/etherd/e1.1p6' has been left open.
  Device '/dev/etherd/e1.1p7' has been left open.
  Device '/dev/etherd/e1.1p6' has been left open.
  Device '/dev/etherd/e1.1p1' has been left open.
  Device '/dev/etherd/e1.1p2' has been left open.
  Device '/dev/etherd/e1.1p5' has been left open.
  Device '/dev/etherd/e1.1p9' has been left open.
  Device '/dev/etherd/e1.1p8' has been left open.
  Device '/dev/etherd/e1.1p3' has been left open.
  Device '/dev/etherd/e1.1p4' has been left open.
  Device '/dev/etherd/e1.1p7' has been left open.
  Device '/dev/etherd/e1.1p4' has been left open.
  Device '/dev/etherd/e1.1p2' has been left open.
  Device '/dev/etherd/e1.1p8' has been left open.
  Device '/dev/etherd/e1.1p9' has been left open.
  Device '/dev/etherd/e1.1p3' has been left open.
  Device '/dev/etherd/e1.1p10' has been left open.
  Device '/dev/etherd/e1.1p1' has been left open.
should have been able to up convert mirror


Version-Release number of selected component (if applicable):
2.6.32-278.el6.x86_64
lvm2-2.02.95-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
lvm2-libs-2.02.95-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
lvm2-cluster-2.02.95-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
udev-147-2.41.el6    BUILT: Thu Mar  1 13:01:08 CST 2012
device-mapper-1.02.74-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-libs-1.02.74-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-event-1.02.74-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-event-libs-1.02.74-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
cmirror-2.02.95-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012

Comment 1 Corey Marthaler 2012-06-28 21:22:37 UTC
after failure:

[root@hayes-01 bin]# lvs -a -o +devices
  LV                            VG          Attr     LSize   Copy%  Devices
  mirror_up_converts            raid_sanity rwi-a-m- 100.00m 100.00 mirror_up_converts_rimage_0(0),mirror_up_converts_rimage_1(0)
  [mirror_up_converts_rimage_0] raid_sanity iwi-aor- 100.00m        /dev/etherd/e1.1p9(1)
  [mirror_up_converts_rimage_1] raid_sanity iwi-aor- 100.00m        /dev/etherd/e1.1p8(1)
  mirror_up_converts_rimage_2   raid_sanity -wi-a--- 100.00m        /dev/etherd/e1.1p7(1)
  [mirror_up_converts_rmeta_0]  raid_sanity ewi-aor-   4.00m        /dev/etherd/e1.1p9(0)
  [mirror_up_converts_rmeta_1]  raid_sanity ewi-aor-   4.00m        /dev/etherd/e1.1p8(0)
  mirror_up_converts_rmeta_2    raid_sanity -wi-a---   4.00m        /dev/etherd/e1.1p7(0)


[root@hayes-01 bin]# dmsetup ls
raid_sanity-mirror_up_converts_rimage_2 (253:9)
raid_sanity-mirror_up_converts_rimage_1 (253:6)
raid_sanity-mirror_up_converts_rimage_0 (253:4)
raid_sanity-mirror_up_converts  (253:7)
raid_sanity-mirror_up_converts_rmeta_2  (253:8)
raid_sanity-mirror_up_converts_rmeta_1  (253:5)
raid_sanity-mirror_up_converts_rmeta_0  (253:3)

Comment 2 Peter Rajnoha 2012-07-18 11:13:26 UTC
*** Bug 836376 has been marked as a duplicate of this bug. ***

Comment 3 Jonathan Earl Brassow 2012-09-10 21:32:00 UTC
Any information from the system log on why the suspend failed?

Comment 4 Jonathan Earl Brassow 2012-09-10 21:35:16 UTC
Also, is the array in-sync when the convert is issued?
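
One way to answer that from the device-mapper side is to look at the sync ratio in the raid target's status line. The sketch below is only an illustration: the helper name raid_status_in_sync is made up, and it assumes the dm-raid status layout "<start> <len> raid <type> <#devs> <health> <done>/<total>" as printed by `dmsetup status <name>` for a single device.

```shell
#!/bin/sh
# Hypothetical helper (not part of LVM or dmsetup): decide from one
# `dmsetup status <name>` line whether a dm-raid target is fully synced.
# Assumed layout: <start> <len> raid <type> <#devs> <health> <done>/<total>
# Returns 0 = in-sync, 1 = still syncing, 2 = not a parsable raid line.
raid_status_in_sync() {
    # Split the status line into positional parameters.
    set -- $1
    [ "$3" = "raid" ] || return 2
    ratio=$7
    case $ratio in
        */*) ;;             # expected "<done>/<total>" form
        *) return 2 ;;
    esac
    done_sectors=${ratio%/*}
    total_sectors=${ratio#*/}
    # In-sync when every sector has been synchronized.
    [ -n "$done_sectors" ] && [ "$done_sectors" = "$total_sectors" ]
}

# Example use (not run here):
#   raid_status_in_sync "$(dmsetup status raid_sanity-mirror_up_converts)" \
#       && echo "in-sync" || echo "not in-sync"
```

The Copy% column of `lvs` carries the same information at the LVM level.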

Comment 5 Jonathan Earl Brassow 2012-09-10 22:06:43 UTC
This never managed to hit the problem:
[root@hayes-01 ~]# while lvcreate -m1 --type raid1 -L 100M -n lv vg && sleep 20 && lvconvert --type raid1 -m 2 vg/lv; do lvremove -ff vg; done

But this did (and right away):
[root@hayes-01 ~]# while lvcreate -m1 --type raid1 -L 100M -n lv vg && lvconvert --type raid1 -m 2 vg/lv; do lvremove -ff vg; done
  Logical volume "lv" created
  device-mapper: reload ioctl on  failed: Invalid argument
  Failed to suspend vg/lv before committing changes
  Device '/dev/etherd/e1.1p5' has been left open.
  Device '/dev/etherd/e1.1p6' has been left open.
  Device '/dev/etherd/e1.1p7' has been left open.
  Device '/dev/etherd/e1.1p6' has been left open.
  Device '/dev/etherd/e1.1p1' has been left open.
  Device '/dev/etherd/e1.1p2' has been left open.
  Device '/dev/etherd/e1.1p5' has been left open.
  Device '/dev/etherd/e1.1p8' has been left open.
  Device '/dev/etherd/e1.1p3' has been left open.
  Device '/dev/etherd/e1.1p4' has been left open.
  Device '/dev/etherd/e1.1p7' has been left open.
  Device '/dev/etherd/e1.1p4' has been left open.
  Device '/dev/etherd/e1.1p2' has been left open.
  Device '/dev/etherd/e1.1p8' has been left open.
  Device '/dev/etherd/e1.1p3' has been left open.
  Device '/dev/etherd/e1.1p1' has been left open.

The difference is the 'sleep 20' which I added to make sure the array was 'in-sync' before attempting the convert.

From the system log:
Sep 10 17:00:44 hayes-01 kernel: device-mapper: raid: 'rebuild' specified while array is not in-sync

The LVM code should not allow converts while the array is not in-sync; otherwise, the kernel will simply reject it.
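
Until such a check exists in LVM, a script can poll for sync instead of relying on a fixed 'sleep 20'. This is a rough sketch, not a tested tool: wait_for_sync and copy_percent_is_full are hypothetical names, the 60 second default timeout is arbitrary, and it assumes `lvs` prints Copy% as "100.00" (a locale with a decimal comma would need adjusting).

```shell
#!/bin/sh
# Sketch only: poll Copy% until the RAID LV is in-sync, then upconvert.
# Both function names below are illustrative, not part of LVM.

# True when a Copy% value as printed by lvs (e.g. "  100.00") is full.
copy_percent_is_full() {
    pct=$(printf '%s' "$1" | tr -d ' ')
    [ "$pct" = "100.00" ]
}

# wait_for_sync VG/LV [TIMEOUT_SECONDS]
# Returns 0 once the LV reports 100.00, 1 on timeout or lvs failure.
wait_for_sync() {
    lv=$1
    remaining=${2:-60}
    while [ "$remaining" -gt 0 ]; do
        out=$(lvs --noheadings -o copy_percent "$lv") || return 1
        copy_percent_is_full "$out" && return 0
        sleep 1
        remaining=$((remaining - 1))
    done
    return 1
}

# Example use (not run here):
#   lvcreate -m1 --type raid1 -L 100M -n lv vg \
#       && wait_for_sync vg/lv 60 \
#       && lvconvert --type raid1 -m 2 vg/lv
```

Polling keeps the loop as fast as the resync allows and fails explicitly on timeout rather than racing the kernel.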

Comment 6 Jonathan Earl Brassow 2012-09-10 22:20:17 UTC
commit cdb0339319c89b8d1e5611537e9775a8c3ce5844
Author: Jonathan Brassow <jbrassow@redhat.com>
Date:   Mon Sep 10 17:15:20 2012 -0500

    RAID:  Disallow addition of RAID images while array is not in-sync
    
    We cannot add images to a RAID array while it is not in-sync.  The
    kernel will simply reject the table, saying:
        'rebuild' specified while array is not in-sync
    Now we check to ensure the LV is in-sync before attempting image
    additions.

Comment 7 Jonathan Earl Brassow 2012-09-10 22:21:40 UTC
Unit test:

[root@hayes-01 ~]# while lvcreate -m1 --type raid1 -L 100M -n lv vg && lvconvert --type raid1 -m 2 vg/lv; do lvremove -ff vg; done
  Logical volume "lv" created
  Unable to add RAID images until lv is in-sync
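
For automated runs, the captured lvconvert output can be sorted into the old and new behaviour by matching on the two messages quoted in this bug. A minimal sketch, assuming the message wording stays as shown here; classify_upconvert_output and its three labels are invented for illustration.

```shell
#!/bin/sh
# Hypothetical classifier for captured lvconvert output.
# Prints one of: guarded, kernel-rejected, other.
classify_upconvert_output() {
    case $1 in
        # New behaviour: LVM refuses before touching the kernel table.
        *"Unable to add RAID images until"*"is in-sync"*)
            echo guarded ;;
        # Old behaviour: the kernel rejects the reloaded table.
        *"reload ioctl on"*"failed: Invalid argument"*)
            echo kernel-rejected ;;
        *)
            echo other ;;
    esac
}

# Example use (not run here):
#   out=$(lvconvert --type raid1 -m 2 vg/lv 2>&1)
#   classify_upconvert_output "$out"
```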

Comment 10 Corey Marthaler 2012-11-14 21:23:14 UTC
Fix verified in the latest rpms.

2.6.32-339.el6.x86_64
lvm2-2.02.98-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
lvm2-libs-2.02.98-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
lvm2-cluster-2.02.98-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
udev-147-2.43.el6    BUILT: Thu Oct 11 05:59:38 CDT 2012
device-mapper-1.02.77-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
device-mapper-libs-1.02.77-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
device-mapper-event-1.02.77-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
device-mapper-event-libs-1.02.77-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012
cmirror-2.02.98-3.el6    BUILT: Mon Nov  5 06:45:48 CST 2012


============================================================
Iteration 40 of 40 started at Wed Nov 14 15:22:26 CST 2012
============================================================
SCENARIO (raid1) - [raid_up_converts]
Create a mirror and then attempt to up convert it
taft-01: lvcreate --type raid1 -m 1 -n raid_up_converts -L 100M raid_sanity
Upconvert to 2 redundant legs
raid convert (lvconvert --type raid1 -m 2 raid_sanity/raid_up_converts)
Upconvert to 3 redundant legs
raid convert (lvconvert --type raid1 -m 3 raid_sanity/raid_up_converts)
Upconvert to 4 redundant legs
raid convert (lvconvert --type raid1 -m 4 raid_sanity/raid_up_converts)
Deactivating raid raid_up_converts... and removing

Comment 11 errata-xmlrpc 2013-02-21 08:10:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0501.html

