Bug 1399810 - RAID0 grow/add leaves md device in RAID4
Summary: RAID0 grow/add leaves md device in RAID4
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Nigel Croxon
QA Contact: guazhang@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-11-29 19:14 UTC by John Pittman
Modified: 2020-01-17 16:16 UTC
CC List: 7 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-19 21:01:56 UTC
Target Upstream Version:


Attachments: none


Links:
Red Hat Knowledge Base (Solution) 2785891 (last updated 2016-11-29 19:59:11 UTC)

Description John Pittman 2016-11-29 19:14:22 UTC
Description of problem:

When adding a device to a RAID0 md device and increasing its size, the array is switched to RAID4 for the reshape and is left at RAID4.

Version-Release number of selected component (if applicable):

kernel-3.10.0-514.el7.x86_64
mdadm-3.4-14.el7.x86_64

[root@localhost ~]# mdadm --version
mdadm - v3.4 - 28th January 2016

How reproducible:

[root@localhost ~]# mdadm --create --verbose /dev/md22 --level=0 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm: chunk size defaults to 512K
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md22 started.
 
[root@localhost ~]# mdadm --grow /dev/md22 --level=0 --raid-devices=3 --add /dev/sdc1
mdadm: level of /dev/md22 changed to raid4
mdadm: added /dev/sdc1
mdadm: Need to backup 3072K of critical section..
 
[root@localhost ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0]
md22 : active raid4 sdc1[3] sdb1[1] sda1[0]
      936960 blocks super 1.2 level 4, 512k chunk, algorithm 5 [4/3] [UUU_]
     
unused devices: <none>

Log snip from event:  https://paste.fedoraproject.org/493338/

Output from --detail and --examine:  https://paste.fedoraproject.org/493339/
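
For anyone reproducing this, a minimal sketch of how to watch the reshape and confirm the resulting level (assuming the /dev/md22 device name from the steps above; mdadm --wait simply blocks until any running resync/reshape finishes):

# Watch the reshape progress reported by the md driver
[root@localhost ~]# watch -n 5 cat /proc/mdstat

# Or block until the reshape completes, then check the level mdadm reports
[root@localhost ~]# mdadm --wait /dev/md22
[root@localhost ~]# mdadm --detail /dev/md22 | grep 'Raid Level'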

Actual results:

Raid device is left at raid4

Expected results:

The device should complete the reshape and return to raid0.

Comment 2 John Pittman 2016-11-29 20:23:42 UTC
The device can be converted back to raid0 successfully after the failing command completes.

[root@localhost ~]# mdadm --grow /dev/md33 --level=0
mdadm: level of /dev/md33 changed to raid0

[root@localhost ~]# mdadm --detail /dev/md33 
/dev/md33:
        Version : 1.2
  Creation Time : Tue Nov 29 13:58:58 2016
     Raid Level : raid0
     Array Size : 936960 (915.00 MiB 959.45 MB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Tue Nov 29 15:19:35 2016
          State : clean 
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 512K

           Name : localhost.localdomain:33  (local to host localhost.localdomain)
           UUID : fed78cd0:5e0c96eb:0c3fcfbb:823d2086
         Events : 25

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       3       8       33        2      active sync   /dev/sdc1

Comment 3 Nigel Croxon 2017-05-12 18:27:45 UTC
Hello John,

As you said:
mdadm --grow /dev/md33 --level=0
will change the current configuration to RAID 0.

A RAID 4 array with only 4 of its 5 members present is essentially a RAID 0 without a parity stripe, so no conversion is needed when you are in this situation.
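
A rough way to see this from the shell (a sketch assuming the /dev/md22 array from the original report): the missing slot is the parity device that was never added, so the kernel reports a degraded raid4 whose data layout matches a 3-disk raid0.

# Current personality and count of missing (degraded) members
[root@localhost ~]# cat /sys/block/md22/md/level
[root@localhost ~]# cat /sys/block/md22/md/degraded

# Same information from mdadm
[root@localhost ~]# mdadm --detail /dev/md22 | grep -E 'Raid Level|Raid Devices|State'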

This is not a bug.  Upstream code works the same way as RHEL.

-Nigel

Comment 4 John Pittman 2017-05-12 20:20:15 UTC
Hi Nigel, thanks for looking at this.  The issue is that, in my test case, there are only 3 devices.

I am going from 2 stripes to 3 stripes, so the result of the grow command should be a 3-stripe raid0, not a raid4 device.
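
Combining this with the workaround from comment 2, a sketch of the sequence that ends with the expected 3-stripe raid0 (assuming /dev/md22 and that the reshape is allowed to finish first):

# Let the raid4 intermediate reshape run to completion
[root@localhost ~]# mdadm --wait /dev/md22

# Then explicitly request the final level change back to raid0
[root@localhost ~]# mdadm --grow /dev/md22 --level=0
[root@localhost ~]# cat /proc/mdstat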

Comment 5 Nigel Croxon 2017-05-12 21:06:42 UTC
I tested your configuration. I believe my same answer applies.
But I will continue to investigate this.

-Nigel

Comment 7 Varun Vellanki 2019-03-17 19:04:02 UTC
Any update on this fix? I am still seeing the same issue on RHEL 7.5 with Kernel 3.10.0-862.27.1.
cat /etc/redhat-release

Red Hat Enterprise Linux Server release 7.5 (Maipo)
uname -r

3.10.0-862.27.1.el7.x86_64
mdadm --create --verbose /dev/md124 --level=0 --raid-devices=2 /dev/sdf1 /dev/sdg1

mdadm: chunk size defaults to 512K
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md124 started
mdadm --manage /dev/md124 --add /dev/sdh1

mdadm: add new device failed for /dev/sdh1 as 2: Invalid argument
mdadm --add /dev/md124 /dev/sdh1

mdadm: add new device failed for /dev/sdh1 as 2: Invalid argument
mdadm --grow /dev/md124 --level=0 --raid-devices=3 --add /dev/sdh1

mdadm: level of /dev/md124 changed to raid4
mdadm: added /dev/sdh1
cat /proc/mdstat

Personalities : [raid0] [raid6] [raid5] [raid4]
md124 : active raid4 sdh1[3] sdg1[1] sdf1[0]
      1073475584 blocks super 1.2 level 4, 512k chunk, algorithm 5 [4/3] [UU__]
      [=>...................]  reshape = 8.1% (43810936/536737792) finish=329.3min speed=24942K/sec

Comment 8 Varun Vellanki 2019-03-18 00:09:50 UTC
Never mind, I could see it as RAID 0 after the mdadm --grow completed. But the reshape took more than 6 hours. Let me do some OS tuning to speed things up and achieve good grow performance.

# cat /proc/mdstat
Personalities : [raid0] [raid6] [raid5] [raid4]
md124 : active raid0 sdh1[3] sdg1[1] sdf1[0]
      1610213376 blocks super 1.2 512k chunks

# mdadm --detail /dev/md124
/dev/md124:
           Version : 1.2
     Creation Time : Sun Mar 17 11:16:27 2019
        Raid Level : raid0
        Array Size : 1610213376 (1535.62 GiB 1648.86 GB)
      Raid Devices : 3
     Total Devices : 3
       Persistence : Superblock is persistent

       Update Time : Sun Mar 17 16:58:34 2019
             State : clean
    Active Devices : 3
   Working Devices : 3
    Failed Devices : 0
     Spare Devices : 0

        Chunk Size : 512K

Consistency Policy : none

              Name : 124
              UUID : ab2d157d:6f7f6511:c1ee426b:ce9f7744
            Events : 1713

    Number   Major   Minor   RaidDevice State
       0       8       81        0      active sync   /dev/sdf1
       1       8       97        1      active sync   /dev/sdg1
       3       8      113        2      active sync   /dev/sdh1
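
On the reshape-speed point above, a rough sketch of the usual md throttling knobs (the values below are only illustrative; dev.raid.speed_limit_min/max are the global per-device KB/s throttles, and stripe_cache_size is in pages, default 256, and only exists while the array runs a raid4/5/6 personality such as the intermediate reshape stage):

# sysctl -w dev.raid.speed_limit_min=50000
# sysctl -w dev.raid.speed_limit_max=500000
# echo 4096 > /sys/block/md124/md/stripe_cache_size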

