Bug 683605

Summary: raid creation doesn't do the right thing with --spares during kickstart
Product: Red Hat Enterprise Linux 6      Reporter: Jeff Vier <jv>
Component: anaconda                      Assignee: David Lehman <dlehman>
Status: CLOSED ERRATA                    QA Contact: Release Test Team <release-test-team-automation>
Severity: unspecified                    Docs Contact:
Priority: unspecified
Version: 6.0                             CC: jstodola, petef
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: anaconda-13.21.107-1   Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 683956 (view as bug list)        Environment:
Last Closed: 2011-05-19 12:38:48 UTC     Type: ---
Regression: ---                          Mount Type: ---
Documentation: ---                       CRM:
Verified Versions:                       Category: ---
oVirt Team: ---                          RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---                     Target Upstream Version:
Embargoed:
Bug Depends On: 690469
Bug Blocks:

Description Jeff Vier 2011-03-09 20:30:58 UTC
Description of problem:
This config (snippet from my actual kickstart config):
 part raid.0 --ondisk=sda --asprimary --fstype ext4 --size=10000 --grow
 part raid.1 --ondisk=sdb --asprimary --fstype ext4 --size=10000 --grow
 part raid.2 --ondisk=sdc --asprimary --fstype ext4 --size=10000 --grow
 raid /     --level=1 --device=md0 --spares=1  --fstype=ext4 raid.0 raid.1 raid.2

Should result in a two-device mirror + one spare.

Instead, it's a three-device mirror with no spare.
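
In other words, it should be the equivalent of a manual create along these lines (a rough sketch; the member partition names are just illustrative, not necessarily what anaconda assigns):

 # two active mirror members plus one hot spare
 mdadm --create /dev/md0 --level=1 --raid-devices=2 --spare-devices=1 \
       /dev/sda1 /dev/sdb1 /dev/sdc1

(and likewise --raid-devices=8 --spare-devices=1 for the RAID 10 example below).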


Also:
 raid /data --level=10 --device=md2 --spares=1 --fstype=ext4 raid.20 raid.21 raid.22 raid.23 raid.24 raid.25 raid.26 raid.27 raid.28

Results in a 9-device RAID 10 array.  I'm not even sure how that's possible :)


Version-Release number of selected component (if applicable):
uname -a:
Linux sync13.db.scl2.svc.mozilla.com 2.6.32-71.el6.x86_64 #1 SMP Wed Sep 1 01:33:01 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

mdadm is v1.1, if that matters.

How reproducible:
Every time.

Steps to Reproduce:
1. Configure kickstart as above
2. kickstart
3. mdadm --detail /dev/mdN

Note "Active Devices" vs "Spare Devices"

Further confirmed with `iostat -dx 1 | grep sd[abc]` during a heavy disk write: all three devices are in lockstep in their write load.
  
Actual results:
No spares -- all devices are "active".

Expected results:
The number of devices specified with --spares is set aside as spares.

Additional info:
Examples above are ext4, but I have reproduced with ext2, as well.

Comment 1 Jeff Vier 2011-03-09 20:41:24 UTC
Also, in the meantime, can you recommend a workaround to "convert" an active device to a spare?

Comment 3 David Lehman 2011-03-09 21:57:56 UTC
Look at the mdadm man page, in the section titled "MANAGE MODE" (you can search for that text). It gives an example of a single command to do exactly what you want done.
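
The example there is a single command along these lines (device names illustrative):

 mdadm /dev/md0 -f /dev/sdc2 -r /dev/sdc2 -a /dev/sdc2

i.e. mark the member faulty, hot-remove it, then add it back.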

Comment 4 Jeff Vier 2011-03-09 22:15:57 UTC
(In reply to comment #3)
> Look at the mdadm man page, in the section titled "MANAGE MODE" (you can search
> for that text). It gives an example of a single command to do exactly what you
> want done.

Yeah, I tried that -- it does not re-add the device as a spare; it just goes back in as an active member.

# mdadm /dev/md127 -f /dev/sdc2 -r /dev/sdc2 -a /dev/sdc2
mdadm: set /dev/sdc2 faulty in /dev/md127
mdadm: hot removed /dev/sdc2 from /dev/md127
mdadm: re-added /dev/sdc2

# mdadm --detail /dev/md127
/dev/md127:
        Version : 1.0
  Creation Time : Wed Feb  9 00:56:40 2011
     Raid Level : raid1
     Array Size : 102388 (100.01 MiB 104.85 MB)
  Used Dev Size : 102388 (100.01 MiB 104.85 MB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Wed Mar  9 14:14:39 2011
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

           Name : localhost.localdomain:1
           UUID : e3182825:0d34dd8a:a03bddb0:9006954f
         Events : 80

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2
       2       8       34        2      active sync   /dev/sdc2

Comment 5 David Lehman 2011-03-09 23:31:30 UTC
If you read the description of the --add command you'll see how that behavior is determined. It looks like you'd need to wipe the device of all metadata in order for it to be re-added as a spare instead of as an active member. The wipefs utility can be used to do that in between the remove and add operations. It would look like this:

  mdadm /dev/md127 -f /dev/sdc2
  mdadm /dev/md127 -r /dev/sdc2
  wipefs -a /dev/sdc2
  mdadm /dev/md127 -a /dev/sdc2

WARNING: the wipefs command will invalidate/destroy any data on /dev/sdc2

Comment 6 Jeff Vier 2011-03-09 23:44:24 UTC
Still no dice (I had actually attempted the same kind of thing earlier with dd, writing /dev/zero and then /dev/urandom onto /dev/sdc2).

Here's what I did just now (per your instructions):

 # mdadm --manage /dev/md127 --fail /dev/sdc2
mdadm: set /dev/sdc2 faulty in /dev/md127
 # mdadm --manage /dev/md127 --remove faulty
mdadm: hot removed 8:34 from /dev/md127
 # wipefs -a /dev/sdc2
2 bytes [53 ef] erased at offset 0x438 (ext2)
4 bytes [fc 4e 2b a9] erased at offset 0x63fe000 (linux_raid_member)
 # mdadm --manage /dev/md127 --add /dev/sdc2
mdadm: added /dev/sdc2
 # mdadm --detail /dev/md127
/dev/md127:
        Version : 1.0
  Creation Time : Wed Feb  9 00:56:40 2011
     Raid Level : raid1
     Array Size : 102388 (100.01 MiB 104.85 MB)
  Used Dev Size : 102388 (100.01 MiB 104.85 MB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Wed Mar  9 15:43:13 2011
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

           Name : localhost.localdomain:1
           UUID : e3182825:0d34dd8a:a03bddb0:9006954f
         Events : 220

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2
       3       8       34        2      active sync   /dev/sdc2

Comment 7 Jeff Vier 2011-03-10 19:36:05 UTC
Should *that* behavior (mdadm really, really wanting to drag the device back in as an active member, despite what the man page describes) be a separate bug?

Comment 8 David Lehman 2011-03-10 19:41:24 UTC
Yes -- it probably should.
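
In the meantime, one more thing that might get you the spare (a sketch only, I have not verified it on this setup): shrink the mirror to two active members before re-adding the third device.

  mdadm /dev/md127 --fail /dev/sdc2 --remove /dev/sdc2
  # drop the number of active mirror members from 3 to 2
  mdadm --grow /dev/md127 --raid-devices=2
  # with both active slots filled, the re-added device should land as a spare
  mdadm /dev/md127 --add /dev/sdc2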

Comment 9 Jeff Vier 2011-03-10 20:23:31 UTC
(In reply to comment #8)
> Yes -- it probably should.

done in bug 683976

Comment 10 RHEL Program Management 2011-03-16 16:30:01 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 13 Jan Stodola 2011-04-04 06:58:49 UTC
Tested on build RHEL6.1-20110330.2 with anaconda-13.21.108-1.el6, using a kickstart with the following snippet:

part raid.0 --ondisk=dasdc --asprimary --fstype ext4 --size=2000 --grow
part raid.1 --ondisk=dasdd --asprimary --fstype ext4 --size=2000 --grow
part raid.2 --ondisk=dasde --asprimary --fstype ext4 --size=2000 --grow
raid /      --level=1 --device=md0 --spares=1  --fstype=ext4 raid.0 raid.1 raid.2

After the installation, there was one spare device:

[root@rtt6 ~]# mdadm --detail /dev/md0 
/dev/md0:
        Version : 1.1
  Creation Time : Mon Apr  4 02:52:48 2011
     Raid Level : raid1
     Array Size : 2402952 (2.29 GiB 2.46 GB)
  Used Dev Size : 2402952 (2.29 GiB 2.46 GB)
   Raid Devices : 2
  Total Devices : 3
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Mon Apr  4 02:58:53 2011
          State : active
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1

           Name : 
           UUID : 9afb610e:50af809f:6e17fcbe:bbde0244
         Events : 32

    Number   Major   Minor   RaidDevice State
       0      94        1        0      active sync   /dev/dasda1
       1      94        5        1      active sync   /dev/dasdb1

       2      94        9        -      spare   /dev/dasdc1

Moving to VERIFIED.

Comment 14 errata-xmlrpc 2011-05-19 12:38:48 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0530.html