Bug 1386184
| Summary: | Activation of RAID4 fails with latest kernel on conversion from striped/raid0[_meta] | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Heinz Mauelshagen <heinzm> | |
| Component: | lvm2 | Assignee: | Heinz Mauelshagen <heinzm> | |
| lvm2 sub component: | Mirroring and RAID | QA Contact: | cluster-qe <cluster-qe> | |
| Status: | CLOSED ERRATA | Docs Contact: | Milan Navratil <mnavrati> | |
| Severity: | urgent | |||
| Priority: | high | CC: | agk, cmarthal, heinzm, jbrassow, msnitzer, mthacker, pasik, prajnoha, prockai, rbednar, slevine, yizhan, zkabelac | |
| Version: | 7.3 | Keywords: | ZStream | |
| Target Milestone: | rc | |||
| Target Release: | --- | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | lvm2-2.02.169-1.el7 | Doc Type: | Bug Fix | |
| Doc Text: | New RAID4 volumes, and existing RAID4 or RAID10 logical volumes after a system upgrade, are now correctly activated. After creating RAID4 logical volumes on Red Hat Enterprise Linux 7.3, or after upgrading a system that has existing RAID4 or RAID10 logical volumes to version 7.3, the system sometimes failed to activate these volumes. With this update, the system activates these volumes successfully. | Story Points: | --- | |
| Clone Of: | 1385149 | |||
| : | 1388962 1395562 (view as bug list) | Environment: | ||
| Last Closed: | 2017-08-01 21:49:49 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1385149, 1386194 | |||
| Bug Blocks: | 1388962, 1395562 | |||
Description
Heinz Mauelshagen
2016-10-18 11:23:51 UTC
Shifting the striped/raid0_meta devices up by one, in order to move the new parity device into segment area 0 when converting from raid0_meta to raid4, leads to superblocks with raid0_meta roles 0..N positioned in raid disks 1..N+1, thus failing in the target constructor whilst validating proper roles. That problem doesn't apply to the raid4 -> raid0_meta conversion, and it wouldn't apply to raid0_meta -> future raid5_n/raid6_n_6 conversions either, because those don't need to shift the parity device(s).

In a preliminary patch, I remove the raid0_meta MetaLVs in that conversion so that they are cleared when allocated anew, which avoids the aforementioned validation failure because the kernel validates a new RaidLV with cleared superblocks. That causes overhead and may lead to different new MetaLV allocations. Other options would be to detach the MetaLVs, clear them and reattach them, or to activate raid0_meta as raid0 (i.e. without metadata devices) to be able to clear them. Looking into that...

Upstream commit de78e8eae73c, fixing the issue by positioning parity SubLVs properly on conversions from/to raid4, pushed.

Corey, what to expect to test: previously "lvconvert --type raid4 LV" from "striped" failed when it should work. The fix allows any conversion between striped, raid0, raid0_meta and raid4. Conversion from raid4 -> striped/raid0/raid0_meta now comes with a yes/no question which reminds the user that he is going to lose all resilience.

Upstream commit e84f527cd37f for lvconvert reverts de78e8eae73c to only let raid4 through to lv_raid_convert().

Corey, testing shall cover:

```
# lvcreate -i2 --ty striped -L64 -nlv vg
# lvconvert --ty raid0 vg/lv
# lvconvert --ty raid0_meta vg/lv
# lvconvert --ty raid4 vg/lv
# lvconvert --ty raid0 vg/lv
# lvconvert --ty striped vg/lv
```

Essentially any conversion (as mentioned in comment #5) between striped, raid0, raid0_meta and raid4 shall be possible, starting out with any of these raid types. Please test data patterns (e.g. mkfs, fsck) to prove the mapping is proper on conversion of striped/raid0/raid0_meta to and from raid4.

Any kernel with dm-raid target version 1.[89].0 (e.g. kernel-3.10.0-515.el7 and older) has a wrong raid4 mapping (DD...P rather than P...DD) as explained in the initial description of this bz. An "lvconvert --ty raid4 vg/lv" on such a kernel causes data corruption (activation of raid4 on such bogus raid4 mappings gets rejected by the fix as of https://bugzilla.redhat.com/show_bug.cgi?id=1388962).

Corey, the problem occurs when a raid4 is created or converted to with dm-raid target 1.[89].0 _and_ that raid4 mapping is activated on target version < 1.8.0 or > 1.9.0 (rhel7 kernels < 436 or > 515). Example of the bogus mapping with bogus userspace (DDP) working fine, but causing data corruption on the raid4 LV when upgrading to a kernel > 515:

```
[root@rhel-7-3 ~]# lvm version
  LVM version:     2.02.166(2)-RHEL7 (2016-09-28)
  Library version: 1.02.135-RHEL7 (2016-09-28)
  Driver version:  4.34.0
[root@rhel-7-3 ~]# uname -r
3.10.0-513.el7.x86_64
[root@rhel-7-3 ~]# lvcreate --ty raid0 -i2 -L64 -nr ssd
  WARNING: Not using lvmetad because config setting use_lvmetad=0.
  WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
  Using default stripesize 64.00 KiB.
WARNING: ext4 signature detected on /dev/ssd/r at offset 1080. Wipe it? [y/n]: y
  Wiping ext4 signature on /dev/ssd/r.
  Logical volume "r" created.
[root@rhel-7-3 ~]# mkfs -t ext4 /dev/ssd/r
mke2fs 1.42.9 (28-Dec-2013)
...
[root@rhel-7-3 ~]# fsck -fn /dev/ssd/r
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/ssd-r: 11/16384 files (9.1% non-contiguous), 7465/65536 blocks
[root@rhel-7-3 ~]# lvs -ao+segtype,devices ssd
  WARNING: Not using lvmetad because config setting use_lvmetad=0.
  WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
  LV           VG  Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Type   Devices
  r            ssd rwi-a-r--- 64.00m                                                    raid0  r_rimage_0(0),r_rimage_1(0)
  [r_rimage_0] ssd iwi-aor--- 32.00m                                                    linear /dev/sda(0)
  [r_rimage_1] ssd iwi-aor--- 32.00m                                                    linear /dev/sdb(0)
[root@rhel-7-3 ~]# lvconvert --ty raid4 ssd/r
  WARNING: Not using lvmetad because config setting use_lvmetad=0.
  WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
  Using default stripesize 64.00 KiB.
  Logical volume ssd/r successfully converted.
[root@rhel-7-3 ~]# fsck -fn /dev/ssd/r
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/ssd-r: 11/16384 files (9.1% non-contiguous), 7465/65536 blocks
[root@rhel-7-3 ~]# lvs -ao+segtype,devices ssd
  WARNING: Not using lvmetad because config setting use_lvmetad=0.
  WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
  LV           VG  Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Type   Devices
  r            ssd rwi-a-r--- 64.00m                                    100.00          raid4  r_rimage_0(0),r_rimage_1(0),r_rimage_2(0)
  [r_rimage_0] ssd iwi-aor--- 32.00m                                                    linear /dev/sda(0)
  [r_rimage_1] ssd iwi-aor--- 32.00m                                                    linear /dev/sdb(0)
  [r_rimage_2] ssd iwi-aor--- 32.00m                                                    linear /dev/sdc(1)
  [r_rmeta_0]  ssd ewi-aor---  4.00m                                                    linear /dev/sda(8)
  [r_rmeta_1]  ssd ewi-aor---  4.00m                                                    linear /dev/sdb(8)
  [r_rmeta_2]  ssd ewi-aor---  4.00m                                                    linear /dev/sdc(0)
```

Corey, installing kernel 522 and booting the config as of comment #10 results in data corruption:

```
[root@rhel-7-3 ~]# uname -r
3.10.0-522.el7.x86_64
[root@rhel-7-3 ~]# vgchange -ay ssd
  1 logical volume(s) in volume group "ssd" now active
[root@rhel-7-3 ~]# lvs -ao+segtype,devices ssd
  LV           VG  Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Type   Devices
  r            ssd rwi-a-r--- 64.00m                                    100.00          raid4  r_rimage_0(0),r_rimage_1(0),r_rimage_2(0)
  [r_rimage_0] ssd iwi-aor--- 32.00m                                                    linear /dev/sda(0)
  [r_rimage_1] ssd iwi-aor--- 32.00m                                                    linear /dev/sdh(0)
  [r_rimage_2] ssd iwi-aor--- 32.00m                                                    linear /dev/sdg(1)
  [r_rmeta_0]  ssd ewi-aor---  4.00m                                                    linear /dev/sda(8)
  [r_rmeta_1]  ssd ewi-aor---  4.00m                                                    linear /dev/sdh(8)
  [r_rmeta_2]  ssd ewi-aor---  4.00m                                                    linear /dev/sdg(0)
[root@rhel-7-3 ~]# fsck -fn /dev/ssd/r
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
ext2fs_open2: Bad magic number in super-block
fsck.ext2: Superblock invalid, trying backup blocks...
Superblock has an invalid journal (inode 8).
Clear? no
fsck.ext2: Illegal inode number while checking ext3 journal for /dev/mapper/ssd-r
/dev/mapper/ssd-r: ********** WARNING: Filesystem still has errors **********
```

Verified. LVM now rejects conversion from raid0_meta to raid4 to prevent data corruption when used with incompatible raid module versions.
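The version rule above (raid4 mappings written by dm-raid target 1.8.0 or 1.9.0 must not be activated on any other target version) can be sketched as a small shell helper. The function name `raid_target_has_bogus_raid4` and the idea of parsing `dmsetup targets` output are illustrative assumptions, not part of the lvm2 fix:

```shell
# Sketch: classify a dm-raid target version string with respect to the
# bogus raid4 mapping (DD...P instead of P...DD) described above.
# Targets 1.8.0 and 1.9.0 (rhel7 kernels roughly 436..515) wrote the
# wrong parity position; activating such a mapping elsewhere corrupts data.
raid_target_has_bogus_raid4() {
    case "$1" in
        1.8.0|1.9.0) return 0 ;;  # wrong raid4 mapping
        *)           return 1 ;;  # parity device in position 0, as expected
    esac
}

# On a live system the running target version could be read with e.g.:
#   ver=$(dmsetup targets | awk '$1 == "raid" { sub(/^v/, "", $2); print $2 }')
for v in 1.7.0 1.8.0 1.9.0 1.9.1; do
    if raid_target_has_bogus_raid4 "$v"; then
        echo "raid target v$v: raid4 mapping is DD...P (bogus)"
    else
        echo "raid target v$v: raid4 mapping is P...DD (correct)"
    fi
done
```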
Tested as specified in https://bugzilla.redhat.com/show_bug.cgi?id=1395562#c5 (part A+C).

1) target < 1.9.1 + "old" lvm tools lvm2-2.02.166-1.el7.x86_64

```
# dmsetup targets | grep raid
raid             v1.9.0
# lvcreate -i2 --ty striped -L64 -nlv vg
  Using default stripesize 64.00 KiB.
WARNING: ext4 signature detected on /dev/vg/lv at offset 1080. Wipe it? [y/n]: y
  Wiping ext4 signature on /dev/vg/lv.
  Logical volume "lv" created.
# mkfs -t ext4 /dev/vg/lv
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
Stride=64 blocks, Stripe width=128 blocks
16384 inodes, 65536 blocks
3276 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=33685504
8 block groups
8192 blocks per group, 8192 fragments per group
2048 inodes per group
Superblock backups stored on blocks:
        8193, 24577, 40961, 57345

Allocating group tables: done
Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done
# fsck -fn /dev/vg/lv
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/vg-lv: 11/16384 files (9.1% non-contiguous), 7465/65536 blocks
# lvconvert --ty raid0 vg/lv
  Using default stripesize 64.00 KiB.
  Logical volume vg/lv successfully converted.
# fsck -fn /dev/vg/lv
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/vg-lv: 11/16384 files (9.1% non-contiguous), 7465/65536 blocks
# lvconvert --ty raid0_meta vg/lv
  Using default stripesize 64.00 KiB.
  Logical volume vg/lv successfully converted.
# fsck -fn /dev/vg/lv
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/vg-lv: 11/16384 files (9.1% non-contiguous), 7465/65536 blocks
# lvconvert --ty raid4 vg/lv
  Using default stripesize 64.00 KiB.
  Logical volume vg/lv successfully converted.
# fsck -fn /dev/vg/lv
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/vg-lv: 11/16384 files (9.1% non-contiguous), 7465/65536 blocks
# lvconvert --ty striped vg/lv
  Logical volume vg/lv successfully converted.
# fsck -fn /dev/vg/lv
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/vg-lv: 11/16384 files (9.1% non-contiguous), 7465/65536 blocks
```

2) target 1.9.0 + new lvm tools lvm2-2.02.169-3.el7.x86_64

```
# dmsetup targets | grep raid
raid             v1.9.0
# lvcreate -i2 --ty striped -L64 -nlv vg
  Using default stripesize 64.00 KiB.
WARNING: ext4 signature detected on /dev/vg/lv at offset 1080. Wipe it? [y/n]: y
  Wiping ext4 signature on /dev/vg/lv.
  Logical volume "lv" created.
# mkfs -t ext4 /dev/vg/lv
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
Stride=64 blocks, Stripe width=128 blocks
16384 inodes, 65536 blocks
3276 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=33685504
8 block groups
8192 blocks per group, 8192 fragments per group
2048 inodes per group
Superblock backups stored on blocks:
        8193, 24577, 40961, 57345

Allocating group tables: done
Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done
# fsck -fn /dev/vg/lv
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/vg-lv: 11/16384 files (9.1% non-contiguous), 7465/65536 blocks
# lvconvert --ty raid0 vg/lv
  Using default stripesize 64.00 KiB.
  Logical volume vg/lv successfully converted.
# fsck -fn /dev/vg/lv
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/vg-lv: 11/16384 files (9.1% non-contiguous), 7465/65536 blocks
# lvconvert --ty raid0_meta vg/lv
  Using default stripesize 64.00 KiB.
  Logical volume vg/lv successfully converted.
# fsck -fn /dev/vg/lv
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/vg-lv: 11/16384 files (9.1% non-contiguous), 7465/65536 blocks

(Conversion fails here as expected)
# lvconvert --ty raid4 vg/lv
  Using default stripesize 64.00 KiB.
  RAID module does not support RAID4.
  Cannot convert raid0_meta LV vg/lv to raid4.
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2222
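As a footnote to the verification above, the conversion matrix from the test plan (striped -> raid0 -> raid0_meta -> raid4 -> raid0 -> striped, with an fsck after every step to prove the data mapping survived) can be driven by a small script. The `conversion_chain` helper below is a hypothetical sketch, not part of lvm2; it only echoes the commands, since running them for real requires root, a scratch VG and a fixed kernel, and raid4 -> striped/raid0 conversions additionally prompt for confirmation:

```shell
# Dry-run sketch: print the lvconvert/fsck sequence for a given LV and
# list of target segment types (conversion_chain is a hypothetical name).
conversion_chain() {
    lv=$1; shift
    for ty in "$@"; do
        echo "lvconvert --type $ty $lv"
        echo "fsck -fn /dev/$lv"
    done
}

conversion_chain vg/lv raid0 raid0_meta raid4 raid0 striped
```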