Bug 683605

Summary: raid creation doesn't do the right thing with --spares during kickstart
Product: Red Hat Enterprise Linux 6      Reporter: Jeff Vier <jv>
Component: anaconda                      Assignee: David Lehman <dlehman>
Status: CLOSED ERRATA                    QA Contact: Release Test Team <release-test-team-automation>
Severity: unspecified                    Docs Contact:
Priority: unspecified
Version: 6.0                             CC: jstodola, petef
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: anaconda-13.21.107-1   Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 683956 (view as bug list)        Environment:
Last Closed: 2011-05-19 12:38:48 UTC     Type: ---
Regression: ---                          Mount Type: ---
Documentation: ---                       CRM:
Verified Versions:                       Category: ---
oVirt Team: ---                          RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---                     Target Upstream Version:
Embargoed:
Bug Depends On: 690469
Bug Blocks:

Description Jeff Vier 2011-03-09 20:30:58 UTC
Description of problem:
This config (snippet from my actual kickstart config):
 part raid.0 --ondisk=sda --asprimary --fstype ext4 --size=10000 --grow
 part raid.1 --ondisk=sdb --asprimary --fstype ext4 --size=10000 --grow
 part raid.2 --ondisk=sdc --asprimary --fstype ext4 --size=10000 --grow
 raid /     --level=1 --device=md0 --spares=1  --fstype=ext4 raid.0 raid.1 raid.2

Should result in a two-device mirror + one spare.

Instead, it's a three-device mirror with no spare.
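
In other words, it should be the equivalent of a manual create along these lines (a rough sketch; the member partition names are just illustrative, not necessarily what anaconda assigns):

 # two active mirror members plus one hot spare
 mdadm --create /dev/md0 --level=1 --raid-devices=2 --spare-devices=1 \
       /dev/sda1 /dev/sdb1 /dev/sdc1

(and likewise --raid-devices=8 --spare-devices=1 for the RAID 10 example below).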


Also:
 raid /data --level=10 --device=md2 --spares=1 --fstype=ext4 raid.20 raid.21 raid.22 raid.23 raid.24 raid.25 raid.26 raid.27 raid.28

Results in a 9-device RAID 10 array.  I'm not even sure how that's possible :)


Version-Release number of selected component (if applicable):
uname -a:
Linux sync13.db.scl2.svc.mozilla.com 2.6.32-71.el6.x86_64 #1 SMP Wed Sep 1 01:33:01 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

mdadm is v1.1, if that matters.

How reproducible:
Every time.

Steps to Reproduce:
1. Configure kickstart as above
2. kickstart
3. mdadm --detail /dev/mdN

Note "Active Devices" vs "Spare Devices"

Further confirmed with `iostat -dx 1 | grep sd[abc]` during a heavy disk write: all three devices are in lockstep in their write load.
  
Actual results:
No spares -- all devices are "active".

Expected results:
The number of devices specified with --spares is set aside as spares.

Additional info:
Examples above are ext4, but I have reproduced with ext2, as well.

Comment 1 Jeff Vier 2011-03-09 20:41:24 UTC
Also, in the meantime, can you recommend a workaround to "convert" an active device to a spare?

Comment 3 David Lehman 2011-03-09 21:57:56 UTC
Look at the mdadm man page, in the section titled "MANAGE MODE" (you can search for that text). It gives an example of a single command to do exactly what you want done.
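
The example there is a single command along these lines (device names illustrative):

 mdadm /dev/md0 -f /dev/sdc2 -r /dev/sdc2 -a /dev/sdc2

i.e. mark the member faulty, hot-remove it, then add it back.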

Comment 4 Jeff Vier 2011-03-09 22:15:57 UTC
(In reply to comment #3)
> Look at the mdadm man page, in the section titled "MANAGE MODE" (you can search
> for that text). It gives an example of a single command to do exactly what you
> want done.

Yeah, I tried that -- it does not re-add the device as a spare; it just goes back in as an active member.

# mdadm /dev/md127 -f /dev/sdc2 -r /dev/sdc2 -a /dev/sdc2
mdadm: set /dev/sdc2 faulty in /dev/md127
mdadm: hot removed /dev/sdc2 from /dev/md127
mdadm: re-added /dev/sdc2

# mdadm --detail /dev/md127
/dev/md127:
        Version : 1.0
  Creation Time : Wed Feb  9 00:56:40 2011
     Raid Level : raid1
     Array Size : 102388 (100.01 MiB 104.85 MB)
  Used Dev Size : 102388 (100.01 MiB 104.85 MB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Wed Mar  9 14:14:39 2011
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

           Name : localhost.localdomain:1
           UUID : e3182825:0d34dd8a:a03bddb0:9006954f
         Events : 80

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2
       2       8       34        2      active sync   /dev/sdc2

Comment 5 David Lehman 2011-03-09 23:31:30 UTC
If you read the description of the --add command you'll see how that behavior is determined. It looks like you'd need to wipe the device of all metadata in order for it to be re-added as a spare instead of as an active member. The wipefs utility can be used to do that in between the remove and add operations. It would look like this:

  mdadm /dev/md127 -f /dev/sdc2
  mdadm /dev/md127 -r /dev/sdc2
  wipefs -a /dev/sdc2
  mdadm /dev/md127 -a /dev/sdc2

WARNING: the wipefs command will invalidate/destroy any data on /dev/sdc2

Comment 6 Jeff Vier 2011-03-09 23:44:24 UTC
Still no dice (I had actually attempted the same kind of thing earlier with dd, writing /dev/zero and then /dev/urandom onto /dev/sdc2).

Here's what I did just now (per your instructions):

 # mdadm --manage /dev/md127 --fail /dev/sdc2
mdadm: set /dev/sdc2 faulty in /dev/md127
 # mdadm --manage /dev/md127 --remove faulty
mdadm: hot removed 8:34 from /dev/md127
 # wipefs -a /dev/sdc2
2 bytes [53 ef] erased at offset 0x438 (ext2)
4 bytes [fc 4e 2b a9] erased at offset 0x63fe000 (linux_raid_member)
 # mdadm --manage /dev/md127 --add /dev/sdc2
mdadm: added /dev/sdc2
 # mdadm --detail /dev/md127
/dev/md127:
        Version : 1.0
  Creation Time : Wed Feb  9 00:56:40 2011
     Raid Level : raid1
     Array Size : 102388 (100.01 MiB 104.85 MB)
  Used Dev Size : 102388 (100.01 MiB 104.85 MB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Wed Mar  9 15:43:13 2011
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

           Name : localhost.localdomain:1
           UUID : e3182825:0d34dd8a:a03bddb0:9006954f
         Events : 220

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2
       3       8       34        2      active sync   /dev/sdc2

Comment 7 Jeff Vier 2011-03-10 19:36:05 UTC
Should *that* behavior (mdadm really, really wanting to drag the device back in as an active member, despite what the man page describes) be a separate bug?

Comment 8 David Lehman 2011-03-10 19:41:24 UTC
Yes -- it probably should.
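
In the meantime, one more thing that might get you the spare (a sketch only, I have not verified it on this setup): shrink the mirror to two active members before re-adding the third device.

  mdadm /dev/md127 --fail /dev/sdc2 --remove /dev/sdc2
  # drop the number of active mirror members from 3 to 2
  mdadm --grow /dev/md127 --raid-devices=2
  # with both active slots filled, the re-added device should land as a spare
  mdadm /dev/md127 --add /dev/sdc2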

Comment 9 Jeff Vier 2011-03-10 20:23:31 UTC
(In reply to comment #8)
> Yes -- it probably should.

done in bug 683976

Comment 10 RHEL Program Management 2011-03-16 16:30:01 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 13 Jan Stodola 2011-04-04 06:58:49 UTC
Tested on build RHEL6.1-20110330.2 with anaconda-13.21.108-1.el6, using a kickstart with the following snippet:

part raid.0 --ondisk=dasdc --asprimary --fstype ext4 --size=2000 --grow
part raid.1 --ondisk=dasdd --asprimary --fstype ext4 --size=2000 --grow
part raid.2 --ondisk=dasde --asprimary --fstype ext4 --size=2000 --grow
raid /      --level=1 --device=md0 --spares=1  --fstype=ext4 raid.0 raid.1 raid.2

After the installation, there was one spare device:

[root@rtt6 ~]# mdadm --detail /dev/md0 
/dev/md0:
        Version : 1.1
  Creation Time : Mon Apr  4 02:52:48 2011
     Raid Level : raid1
     Array Size : 2402952 (2.29 GiB 2.46 GB)
  Used Dev Size : 2402952 (2.29 GiB 2.46 GB)
   Raid Devices : 2
  Total Devices : 3
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Mon Apr  4 02:58:53 2011
          State : active
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1

           Name : 
           UUID : 9afb610e:50af809f:6e17fcbe:bbde0244
         Events : 32

    Number   Major   Minor   RaidDevice State
       0      94        1        0      active sync   /dev/dasda1
       1      94        5        1      active sync   /dev/dasdb1

       2      94        9        -      spare   /dev/dasdc1

Moving to VERIFIED.

Comment 14 errata-xmlrpc 2011-05-19 12:38:48 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0530.html