Bug 1417255

Summary: mdadm --examine --scan references a non-existent device when the array is created using --name
Product: Red Hat Enterprise Linux 7
Reporter: John Pittman <jpittman>
Component: mdadm
Assignee: Nigel Croxon <ncroxon>
Status: CLOSED NOTABUG
QA Contact: guazhang <guazhang>
Severity: low
Docs Contact:
Priority: unspecified
Version: 7.3
CC: dledford, jpittman, ncroxon, xni, yizhan
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-07-28 14:08:03 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description John Pittman 2017-01-27 17:14:15 UTC
Description of problem:

When an array is created with the --name option of the mdadm command, the device path printed by --examine --scan does not match any device that exists.

Version-Release number of selected component (if applicable):

kernel-3.10.0-514.6.1.el7.x86_64
mdadm-3.4-14.el7_3.1.x86_64

Steps to Reproduce:

[root@localhost ~]# mdadm --create --verbose /dev/md0 --level=0 --metadata=1.2 --raid-devices=2 --name=TEST_MD /dev/sdb /dev/sdc
mdadm: chunk size defaults to 512K
mdadm: array /dev/md0 started.

[root@localhost ~]# cat /proc/mdstat 
Personalities : [raid0] 
md0 : active raid0 sdb[0] sdc[1]
      1046528 blocks super 1.2 512k chunks
      
unused devices: <none>

[root@localhost ~]# mdadm --examine --scan
ARRAY /dev/md/TEST_MD  metadata=1.2 UUID=0ec65407:28a5bcb7:47e40269:3abbcf07 name=localhost.localdomain:TEST_MD
      ^^^^^^^^^^^^^^^
Actual results:

/dev/md/TEST_MD does not exist.  

Expected results:

Unsure, but we certainly shouldn't be referencing a device that doesn't exist.  In my tests, I ran 'ln -s /dev/md0 /dev/md/TEST_MD' and everything seemed to work fine.  Could we just add a symlink in the /dev/md directory when --name is used?

As a note, assembly using --name with no conf file worked fine.
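
The check described above can be scripted. The following is a minimal sketch, not part of mdadm: array_paths and missing_paths are hypothetical helper names, and the sketch assumes the device path is the second field of each ARRAY line, as in the output shown in this report.

```shell
#!/bin/sh
# Print the device path (second field) of every ARRAY line read from stdin.
# Assumption: the path is the second field and starts with "/".
array_paths() {
    awk '$1 == "ARRAY" && $2 ~ /^\// { print $2 }'
}

# Print each path read from stdin that does not exist on this system.
missing_paths() {
    while IFS= read -r path; do
        [ -e "$path" ] || printf '%s\n' "$path"
    done
}

# Example (needs root and a real array, so it is left commented out):
# mdadm --examine --scan | array_paths | missing_paths
```

Any path printed by the commented-out pipeline is one that --examine --scan reports but that is not present under /dev.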

Comment 1 Nigel Croxon 2017-06-21 18:43:24 UTC

The argument "--name" is only valid with "--update-subarray" in misc mode.

You are attaching it to a "--create" command, which is not valid.

Comment 2 John Pittman 2017-07-03 12:56:41 UTC
The man page says it's valid.  From 'man mdadm':

For create, build, or grow:
...
       -N, --name=
              Set a name for the array.  This is currently only effective when
              creating an array with a version-1 superblock, or an array in  a
              DDF  container.  The name is a simple textual string that can be
              used to identify array components when assembling.  If  name  is
              needed  but  not specified, it is taken from the basename of the
              device that is being created.  e.g. when  creating  /dev/md/home
              the name will default to home.
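
For reference, the name set this way is what appears in the name= token of the ARRAY line in the description, prefixed there with a host name and a colon. A minimal sketch for pulling it out follows; array_name is a hypothetical helper, and treating everything up to the first colon as a host prefix is an assumption based only on the output shown in this report.

```shell
#!/bin/sh
# Print the array name from the name= token of each ARRAY line on stdin.
# Assumption: an optional "host:" prefix precedes the name and is dropped.
array_name() {
    awk '$1 == "ARRAY" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^name=/) {
                n = substr($i, 6)        # text after "name="
                sub(/^[^:]*:/, "", n)    # drop "host:" if present
                print n
            }
    }'
}

# Example (needs root and a real array, so it is left commented out):
# mdadm --examine --scan | array_name
```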

Comment 3 Nigel Croxon 2017-07-13 16:31:06 UTC

/*
 * We need a new md device to assemble/build/create an array.
 * 'dev' is a name given us by the user (command line or mdadm.conf)
 * It might start with /dev or /dev/md and might end with a digit
 * string.
 * If it starts with just /dev, it must be /dev/mdX or /dev/md_dX
 * If it ends with a digit string, then it must be as above, or
 * 'trustworthy' must be 'METADATA' and the 'dev' must be
 *  /dev/md/'name'NN or 'name'NN
 * If it doesn't end with a digit string, it must be /dev/md/'name'
 * or 'name' or must be NULL.
 * If the digit string is present, it gives the minor number to use
 * If not, we choose a high, unused minor number.
 * If the 'dev' is a standard name, it decides whether 'md' or 'mdp'.
 * else if the name is 'd[0-9]+' then we use mdp
 * else if trustworthy is 'METADATA' we use md
 * else the choice depends on 'autof'.
 * If name is NULL it is assumed to match whatever dev provides.
 * If both name and dev are NULL, we choose a name 'mdXX' or 'mdpXX'
 *
 * If 'name' is given, and 'trustworthy' is 'foreign' and name is not
 * supported by 'dev', we add a "_%d" suffix based on the minor number
 * and use that.
 *
 * If udev is configured, we create a temporary device, open it, and
 * unlink it.
 * If not, we create the /dev/mdXX device, and if name is usable,
 * /dev/md/name
 * In any case we return /dev/md/name or (if that isn't available)
 * /dev/mdXX in 'chosen'.
 *
 * When we create devices, we use uid/gid/umask from config file.
 */
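
To make the naming rules in this comment easier to follow, here is a rough sketch that sorts a device argument into the cases it describes. It models only a subset of the rules (it ignores 'trustworthy', 'autof', bare names and NULL), and classify_md_dev and its labels are invented for illustration; this is not how mdadm itself is implemented.

```shell
#!/bin/sh
# Classify a device argument using a simplified subset of the quoted rules.
# The labels printed here are made up for this sketch.
classify_md_dev() {
    if printf '%s\n' "$1" | grep -Eq '^/dev/md[0-9]+$'; then
        echo "standard-md"     # /dev/mdX: the digits give the minor number
    elif printf '%s\n' "$1" | grep -Eq '^/dev/md_d[0-9]+$'; then
        echo "standard-mdp"    # /dev/md_dX: the digits give the minor number
    elif printf '%s\n' "$1" | grep -Eq '^/dev/md/[^/]*[^/0-9]$'; then
        echo "named"           # /dev/md/'name', no digits: high unused minor
    else
        echo "unclassified"    # cases this sketch does not model
    fi
}

classify_md_dev /dev/md0           # the form used in the original report
classify_md_dev /dev/md/TEST_MD    # the named form
```

Under these rules the command in the original report falls in the first case, so the array is created as /dev/md0, while the name TEST_MD is only recorded in the metadata.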

Comment 4 Nigel Croxon 2017-07-13 19:49:32 UTC
John,

As stated above, do you have udev configured?

If udev is configured, we create a temporary device, open it, and unlink it.

Comment 5 Nigel Croxon 2017-07-13 20:22:10 UTC
You can avoid specifying /dev/md0 on the command line by naming the array device directly:

mdadm --create --verbose /dev/md/TEST_MD --level=0 --metadata=1.2 --raid-devices=2 /dev/sdb /dev/sdc

This creates a symlink /dev/md/TEST_MD to /dev/md127.
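
A quick way to confirm what the symlink points to is sketched below. md_link_target is a hypothetical helper name; the sketch relies on readlink -f, which is available in GNU coreutils.

```shell
#!/bin/sh
# Print the fully resolved target of a symlink, or report that none exists.
md_link_target() {
    if [ -L "$1" ]; then
        readlink -f "$1"
    else
        echo "no symlink at $1"
        return 1
    fi
}

# Example (the path only exists once the array has been created):
# md_link_target /dev/md/TEST_MD
```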

Comment 6 John Pittman 2017-07-24 12:58:39 UTC
Nigel,

Thanks for the command, it does work.

However, even with that alternative, 'mdadm --examine --scan' still references a device path that does not exist when the array is created as shown below, which is the basis for this bug.  My customer was searching for the device and it was not there.

[root@localhost ~]# mdadm --create --verbose /dev/md0 --level=0 --metadata=1.2 --raid-devices=2 --name=TEST_MD /dev/sdb /dev/sdc
mdadm: chunk size defaults to 512K
mdadm: array /dev/md0 started.

[root@localhost ~]# mdadm --examine --scan
ARRAY /dev/md/TEST_MD  metadata=1.2 UUID=4d1b6824:cbf47c38:e6dc34d4:5686c61d name=localhost.localdomain:TEST_MD

'/dev/md/TEST_MD' does not exist.  

If the command usage or structure is incorrect then it should fail or produce a warning.

We cannot reference paths that do not exist.  This will point the customer at the wrong path.

Comment 7 Nigel Croxon 2017-07-28 13:39:26 UTC
John,

Lots of testing this week.

1) Reboot, and the symlink will be there.
or
2) Stop the device and reassemble it with -I.
This is how the udev rules do it.
Look at the rules in /usr/lib/udev/rules.d/:

01-md-raid-creating.rules
63-md-raid-arrays.rules
64-md-raid-assembly.rules
65-md-incremental.rules
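
Option 2 can be sketched as the following sequence. reassemble_incremental and the DRY_RUN variable are hypothetical names introduced here; the device names are the ones from this report, and the real commands need root. With DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# Stop an array, then hand each member back with incremental assembly (-I).
# With DRY_RUN=1 the commands are printed instead of being executed.
reassemble_incremental() {
    array=$1
    shift
    run() {
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "$*"
        else
            "$@"
        fi
    }
    run mdadm --stop "$array" || return 1
    for member in "$@"; do
        run mdadm --incremental "$member" || return 1
    done
}

# Print the commands for the array in this report without running them.
DRY_RUN=1 reassemble_incremental /dev/md0 /dev/sdb /dev/sdc
```

After a real run, listing /dev/md/ should show whether the named symlink is now present.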

This can be closed. No code changes are needed.

-Nigel