Bug 1358532

Summary:	raid6 --nosync flag create regression
Product:	Red Hat Enterprise Linux 7	Reporter:	Corey Marthaler <cmarthal>
Component:	lvm2	Assignee:	Heinz Mauelshagen <heinzm>
lvm2 sub component:	Mirroring and RAID	QA Contact:	cluster-qe <cluster-qe>
Status:	CLOSED ERRATA	Docs Contact:
Severity:	low
Priority:	unspecified	CC:	agk, heinzm, jbrassow, msnitzer, prajnoha, prockai, zkabelac
Version:	7.3	Keywords:	Regression
Target Milestone:	rc
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	lvm2-2.02.163-1.el7	Doc Type:	Bug Fix
Doc Text:
	Cause: dm-raid needs to reject the 'nosync' flag on raid6 LVs, because the MD raid6 personality used to drive such RaidLVs does read-modify-write and needs a full initial synchronization to work properly.
	Consequence: Without initial synchronization there'll be potential data loss in case of device failures, because user data will be reconstructed from non-initialized parity blocks (P-/Q-Syndrome blocks).
	Fix: Prohibit the 'nosync' flag in the dm-raid target on raid6 mappings as already done in upstream/RHEL7.
	Result: Userspace regression as of $Subject of this bz.
Story Points:	---
Clone Of:		Environment:
Last Closed:	2016-11-04 04:16:04 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Corey Marthaler 2016-07-20 23:02:48 UTC

Description of problem:
This used to work in earlier rhel7.3 kernels. It works in 3.10.0-419. I'm seeing these failures in 3.10.0-468 and 3.10.0-467.


[root@host-075 ~]# pvcreate /dev/sd[abcdefgh]1
  Physical volume "/dev/sda1" successfully created.
  Physical volume "/dev/sdb1" successfully created.
  Physical volume "/dev/sdc1" successfully created.
  Physical volume "/dev/sdd1" successfully created.
  Physical volume "/dev/sde1" successfully created.
  Physical volume "/dev/sdf1" successfully created.
  Physical volume "/dev/sdg1" successfully created.
  Physical volume "/dev/sdh1" successfully created.
[root@host-075 ~]# vgcreate test /dev/sd[abcdefgh]1
  Volume group "test" successfully created

[root@host-075 ~]# lvcreate  --nosync --type raid1 -n one -L 2G test
  WARNING: New raid1 won't be synchronised. Don't read what you didn't write!
  Logical volume "one" created.

[root@host-075 ~]# lvcreate  --nosync --type raid4 -n four -L 2G test
  Rounding size 2.00 GiB (512 extents) up to stripe boundary size 2.02 GiB (518 extents).
  WARNING: New raid4 won't be synchronised. Don't read what you didn't write!
  Logical volume "four" created.

[root@host-075 ~]# lvcreate  --nosync --type raid5 -n five -L 2G test
  Rounding size 2.00 GiB (512 extents) up to stripe boundary size 2.02 GiB (518 extents).
  WARNING: New raid5 won't be synchronised. Don't read what you didn't write!
  Logical volume "five" created.

[root@host-075 ~]# lvcreate  --nosync --type raid6 -n six -L 2G test
  Rounding size 2.00 GiB (512 extents) up to stripe boundary size 2.02 GiB (516 extents).
  WARNING: New raid6 won't be synchronised. Don't read what you didn't write!
  device-mapper: reload ioctl on (253:57) failed: Invalid argument
  Failed to activate new LV.

Jul 20 17:53:19 host-075 kernel: device-mapper: table: 253:57: raid: Invalid flags combination
Jul 20 17:53:19 host-075 kernel: device-mapper: ioctl: error adding target to table


[root@host-075 ~]# lvcreate  --type raid6 -n six2 -L 2G test
  Rounding size 2.00 GiB (512 extents) up to stripe boundary size 2.02 GiB (516 extents).
  Logical volume "six2" created.


Version-Release number of selected component (if applicable):
3.10.0-468.el7.x86_64

lvm2-2.02.161-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
lvm2-libs-2.02.161-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
lvm2-cluster-2.02.161-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
device-mapper-1.02.131-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
device-mapper-libs-1.02.131-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
device-mapper-event-1.02.131-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
device-mapper-event-libs-1.02.131-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
device-mapper-persistent-data-0.6.2-1.el7    BUILT: Mon Jul 11 04:32:34 CDT 2016
cmirror-2.02.161-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016
sanlock-3.4.0-1.el7    BUILT: Fri Jun 10 11:41:03 CDT 2016
sanlock-lib-3.4.0-1.el7    BUILT: Fri Jun 10 11:41:03 CDT 2016
lvm2-lockd-2.02.161-2.el7    BUILT: Wed Jul 20 07:48:14 CDT 2016


How reproducible:
Every time

Comment 3 Heinz Mauelshagen 2016-08-04 00:44:14 UTC

Neil Brown emphasized last year that a raid6 array always has to be initially synced to ensure proper P+Q syndromes, or there'll be a potential for data corruption in degraded arrays.
I am clarifying whether that's correct.
If so, I'll add a patch to lvm2 to ignore the nosync flag on raid6.

Comment 4 Heinz Mauelshagen 2016-08-08 10:10:32 UTC

Due to following statement from the current maintainer,
that issue got addressed. Preparing dm-raid kernel patch to allow
for raid6 nosync flag to be processed.



Hi,
Since 2014, raid6 supports read-modify-write, so no initial sync could
make data lost.

Thanks,
Shaohua

2016-08-03 17:41 GMT-07:00 Heinz Mauelshagen <heinzm>:
>
> Hi Shaohua,
>
> Neil told me last year, that raid6 P+Q syndromes needs to be created, i.e.
> the array always initially synchronized.
>
> Is that true and raid6 may cause data corruption in case of no initial sync
> or does it always recreate both P+Q syndromes on a stripe update?
>
> Heinz
>

Comment 5 Heinz Mauelshagen 2016-08-08 10:36:07 UTC

(In reply to Heinz Mauelshagen from comment #4)
> Due to following statement from the current maintainer,
> that issue got addressed. Preparing dm-raid kernel patch to allow
> for raid6 nosync flag to be processed.

I misread Shaohua's comment. Neil's statement is still correct, and we need
to keep rejecting the 'nosync' flag for raid6, as is done now, so that correct
data can be computed in case of 1-2 device failures.

This means that any kernel supporting MD raid6 read-modify-write used to drive an lvm nosync raid6 LV is potentially subject to data loss in case of a device failure unless e.g. "lvchange --resync RaidLV" was run successfully.
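
A minimal sketch of how candidate LVs for the resync mentioned above could be listed. The helper name `list_raid6_lvs` is hypothetical, and it assumes `lvs` output with the columns vg_name, lv_name and segtype. It cannot tell whether an LV was originally created with --nosync; it only picks out raid6 LVs, and it does not run any resync itself.

```shell
#!/bin/sh
# Hypothetical helper: print "vg/lv" for every raid6 LV found in `lvs`
# output read from stdin.
# Assumed input columns: vg_name lv_name segtype (one LV per line).
list_raid6_lvs() {
    awk '$3 ~ /^raid6/ { print $1 "/" $2 }'
}

# Typical use (requires root and LVM; not run here):
#   lvs --noheadings -o vg_name,lv_name,segtype | list_raid6_lvs
# and then, for each LV that was created with --nosync:
#   lvchange --resync test/six
```

The resync itself is left as a manual step so that each LV can be reviewed before it is touched.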

The LVM2 userspace lvcreate command needs to be enhanced to fail early when 'nosync' is requested for raid6 LVs and to provide a failure message, rather than failing late in the kernel.
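
The kind of early check described here can be sketched as a shell wrapper. This is an illustration only, not the actual lvm2 change: the function name `nosync_allowed` is hypothetical, and matching on `raid6*` assumes that every raid6 segment type name starts with that prefix. The message text is the one printed by the fixed lvcreate later in this report.

```shell
#!/bin/sh
# Hypothetical helper: decide whether --nosync may be combined with a
# given RAID segment type. Returns 0 if allowed, 1 if it must be rejected.
nosync_allowed() {
    segtype=$1
    case "$segtype" in
        raid6*)
            # raid6 needs an initial sync so that the P/Q syndromes are valid
            echo "nosync option prohibited on RAID6" >&2
            return 1
            ;;
        *)
            return 0
            ;;
    esac
}

# Typical use before calling lvcreate (not run here):
#   nosync_allowed raid6 || exit 1
```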


Comment 6 Heinz Mauelshagen 2016-08-08 12:48:04 UTC

LVM2 upstream commit 7d6cf12554b0b6cbbee9db2f4da9439d6492a2b3 rejects the nosync option on raid6 creation. It includes a manual page update/fix and a test script.

Comment 8 Corey Marthaler 2016-08-10 20:57:34 UTC

Verified that --nosync is no longer allowed with raid6.

[root@host-083 ~]# lvcreate  --nosync --type raid6 -n six -L 2G test
  Using default stripesize 64.00 KiB.
  nosync option prohibited on RAID6
  Run `lvcreate --help' for more information.

3.10.0-489.el7.x86_64
lvm2-2.02.163-1.el7    BUILT: Wed Aug 10 06:53:21 CDT 2016
lvm2-libs-2.02.163-1.el7    BUILT: Wed Aug 10 06:53:21 CDT 2016
lvm2-cluster-2.02.163-1.el7    BUILT: Wed Aug 10 06:53:21 CDT 2016
device-mapper-1.02.133-1.el7    BUILT: Wed Aug 10 06:53:21 CDT 2016
device-mapper-libs-1.02.133-1.el7    BUILT: Wed Aug 10 06:53:21 CDT 2016
device-mapper-event-1.02.133-1.el7    BUILT: Wed Aug 10 06:53:21 CDT 2016
device-mapper-event-libs-1.02.133-1.el7    BUILT: Wed Aug 10 06:53:21 CDT 2016
device-mapper-persistent-data-0.6.3-1.el7    BUILT: Fri Jul 22 05:29:13 CDT 2016
cmirror-2.02.163-1.el7    BUILT: Wed Aug 10 06:53:21 CDT 2016
sanlock-3.4.0-1.el7    BUILT: Fri Jun 10 11:41:03 CDT 2016
sanlock-lib-3.4.0-1.el7    BUILT: Fri Jun 10 11:41:03 CDT 2016
lvm2-lockd-2.02.163-1.el7    BUILT: Wed Aug 10 06:53:21 CDT 2016
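
The verification above could be automated roughly as follows. This is a sketch under assumptions, not the test script shipped with the upstream commit: the helper name `nosync_rejected` is hypothetical, and it only checks a captured exit status and output for the message shown above.

```shell
#!/bin/sh
# Hypothetical test helper: succeed only if an lvcreate attempt failed
# (non-zero exit status) and printed the rejection message.
#   $1 = exit status of lvcreate, $2 = its captured output
nosync_rejected() {
    status=$1
    output=$2
    [ "$status" -ne 0 ] || return 1
    printf '%s\n' "$output" | grep -q 'nosync option prohibited on RAID6'
}

# Typical use (requires root and a volume group named "test"; not run here):
#   out=$(lvcreate --nosync --type raid6 -n six -L 2G test 2>&1); rc=$?
#   nosync_rejected "$rc" "$out" && echo PASS
```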

Comment 10 errata-xmlrpc 2016-11-04 04:16:04 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1445.html