Bug 429604

Summary: [PATCH] udev rules to automatically assemble devices
Product: [Fedora] Fedora
Reporter: Bill Nottingham <notting>
Component: mdadm
Assignee: Doug Ledford <dledford>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Priority: low
Version: rawhide
CC: bruno, davidz, dcantrell, harald, kay.sievers, rvokal
Keywords: Reopened
Hardware: All
OS: Linux
Whiteboard: NeedsRetesting
Fixed In Version: 2.6.4-4.fc9
Doc Type: Bug Fix
Last Closed: 2008-04-22 14:05:48 EDT
Bug Blocks: 235706
Attachments:
Description: rules that incrementally assemble arrays on insertion
Flags: none

Description Bill Nottingham 2008-01-21 16:52:35 EST
Description of problem:

The attached udev rules (70-mdadm.rules) allow for automatic assembly of RAID
devices on insertion. It means we can get rid of always running mdadm on boot.

Not sure whether these should go in mdadm or udev proper. I suppose
udev already has 64-md-raid.rules, but I could see packaging these rules with
mdadm (it's not like they're going to work without it.)

Version-Release number of selected component (if applicable):

udev-118-1.fc9
mdadm-2.6.4-1.fc8
Comment 1 Bill Nottingham 2008-01-21 16:52:35 EST
Created attachment 292417
rules that incrementally assemble arrays on insertion
Comment 2 Harald Hoyer 2008-01-22 05:27:41 EST
mdadm would be fine, so that I can remove 64-md-raid.rules
Comment 3 Bill Nottingham 2008-01-22 11:31:28 EST
64-md-raid.rules is different; that's for persistent paths. These rules are for
actually getting them activated.
Comment 4 Bill Nottingham 2008-02-01 17:04:33 EST
Added in 2.6.4-3.fc9.
Comment 5 David Zeuthen 2008-02-23 18:37:54 EST
This is a nice approach. Unfortunately it's not working for me on a fully
updated rawhide system; I have to manually run 'mdadm --assemble --scan' to
start my array (using the old config file). 

What log files / information do you need?
Comment 6 David Zeuthen 2008-02-23 18:49:22 EST
Here's some info

My old config file

[root@hook ~]# cat /etc/mdadm.conf
DEVICE /dev/disk/by-id/scsi-SATA_WDC_WD5000AAJS-_WD-WCAPW0295169
DEVICE /dev/disk/by-id/scsi-SATA_WDC_WD5000AAJS-_WD-WCAPW0594845
DEVICE /dev/disk/by-id/scsi-SATA_WDC_WD5000AAKS-_WD-WCAPW0493929
DEVICE /dev/disk/by-id/scsi-SATA_WDC_WD5000KS-00_WD-WCANU1914880
DEVICE /dev/disk/by-id/scsi-SATA_WDC_WD5000KS-00_WD-WCANU2209762

ARRAY /dev/Fusion500P uuid=c5fc394c:f33279ab:7630b1ba:099a5109 auto=md

These are unpartitioned disks

[root@hook ~]# ls -l /dev/disk/by-id/scsi-SATA_WDC_WD5000*
lrwxrwxrwx 1 root root 9 2008-02-23 17:44 /dev/disk/by-id/scsi-SATA_WDC_WD5000AAJS-_WD-WCAPW0295169 -> ../../sdb
lrwxrwxrwx 1 root root 9 2008-02-23 17:44 /dev/disk/by-id/scsi-SATA_WDC_WD5000AAJS-_WD-WCAPW0594845 -> ../../sdi
lrwxrwxrwx 1 root root 9 2008-02-23 17:44 /dev/disk/by-id/scsi-SATA_WDC_WD5000AAKS-_WD-WCAPW0493929 -> ../../sdk
lrwxrwxrwx 1 root root 9 2008-02-23 17:44 /dev/disk/by-id/scsi-SATA_WDC_WD5000KS-00_WD-WCANU1914880 -> ../../sdh
lrwxrwxrwx 1 root root 9 2008-02-23 17:44 /dev/disk/by-id/scsi-SATA_WDC_WD5000KS-00_WD-WCANU2209762 -> ../../sdj

Those are all RAID members; here's one of them

[root@hook ~]# /lib/udev/vol_id /dev/sdk 
ID_FS_USAGE=raid
ID_FS_TYPE=linux_raid_member
ID_FS_VERSION=0.90.0
ID_FS_UUID=c5fc394c:f33279ab:7630b1ba:099a5109
ID_FS_UUID_ENC=c5fc394c:f33279ab:7630b1ba:099a5109
ID_FS_LABEL=
ID_FS_LABEL_ENC=
ID_FS_LABEL_SAFE=

However

[root@hook ~]# udevinfo --query all --name /dev/sdk
P: /devices/pci0000:00/0000:00:09.0/host9/target9:0:0/9:0:0:0/block/sdk
N: sdk
S: disk/by-id/scsi-SATA_WDC_WD5000AAKS-_WD-WCAPW0493929
S: disk/by-id/ata-WDC_WD5000AAKS-00TMA0_WD-WCAPW0493929
S: disk/by-path/pci-0000:00:09.0-scsi-3:0:0:0
E: ID_VENDOR=ATA
E: ID_MODEL=WDC_WD5000AAKS-0
E: ID_REVISION=12.0
E: ID_SERIAL=SATA_WDC_WD5000AAKS-_WD-WCAPW0493929
E: ID_SERIAL_SHORT=WD-WCAPW0493929
E: ID_TYPE=disk
E: ID_BUS=scsi
E: ID_ATA_COMPAT=WDC_WD5000AAKS-00TMA0_WD-WCAPW0493929
E: ID_PATH=pci-0000:00:09.0-scsi-3:0:0:0

So it seems like udev isn't running vol_id on unpartitioned disks. That's
probably a bug. Adding upstream udev maintainer to the Cc. Kay?
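The comparison made in this comment (vol_id reports a RAID member, but the udev database has no ID_FS_TYPE entry) can be scripted. The following is only a sketch: the helper name is_raid_member is made up for illustration, and it assumes the output formats shown above, that is, plain KEY=value lines from vol_id and "E: KEY=value" lines from udevinfo.

```shell
#!/bin/sh
# Hypothetical helper: read udev properties on stdin and return 0 if they
# mark the device as an MD RAID member.
# Accepts plain "KEY=value" lines (vol_id style) and "E: KEY=value" lines
# (udevinfo style).
is_raid_member() {
    awk '
        { sub(/^E: /, "") }                              # strip the udevinfo prefix
        $0 == "ID_FS_TYPE=linux_raid_member" { found = 1 }
        END { exit !found }                              # 0 if seen, 1 otherwise
    '
}
```

On a real system this might be used as follows (needs the same tools and device as above):

  /lib/udev/vol_id /dev/sdk | is_raid_member && echo "vol_id sees a RAID member"
  udevinfo --query all --name /dev/sdk | is_raid_member || echo "udev database lacks ID_FS_TYPE"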
Comment 7 David Zeuthen 2008-02-23 18:53:31 EST
Also, two things

 1. the homehost feature in mdadm is pretty broken - some attempt to protect the
user against himself. As such, we should disable it as it will break hotplugging
arrays. With hotpluggable buses such as USB, Firewire, eSATA this is important.
Myself? I have a 5-disk tower using RAID5 that I happily plug into different
machines all the time. With /etc/mdadm.conf this worked great (using /dev/disk/*
persistent symlinks), now it's broken.

 2. at the same time we need a way for e.g. a forensic live cd (one used to
inspect a system without touching it) to disable things like autoassembly. There
should probably be a boot option e.g. 'noassembly'. 70-mdadm.rules should then
check this and avoid assembly if it's set. Similarly, in the future LVM assembly
should check the same variable.
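A check for such a boot option could look roughly like the sketch below. Note that 'noassembly' is only the option name proposed in this comment, not an existing kernel or mdadm option, and the function name assembly_disabled is made up for illustration. The command line is read from /proc/cmdline unless a string is passed in.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the kernel command line contains the
# word "noassembly" (the option name proposed in this comment).
# $1: command line to inspect; defaults to the contents of /proc/cmdline.
assembly_disabled() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    # Pad with spaces so only a whole word matches, not a substring.
    case " $cmdline " in
        *" noassembly "*) return 0 ;;
    esac
    return 1
}
```

A helper program run from the rules file could then skip the mdadm call whenever assembly_disabled succeeds.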
Comment 8 Kay Sievers 2008-02-24 10:56:10 EST
True:
  ENV{DEVTYPE}=="partition", IMPORT{program}="vol_id --export $tempnode"
We can't run it, because of missing events for media changes.

The next kernel should have it:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=285e9670d91cdeb6b6693729950339cb45410fdc

but during the merge, it seems something broke it; at least I don't see the
events now. I need to find out how to fix it. Then we can call vol_id, and
properly update the values/symlinks at media changes.
Comment 9 David Zeuthen 2008-02-24 15:19:36 EST
(In reply to comment #8)
> True:
>   ENV{DEVTYPE}=="partition", IMPORT{program}="vol_id --export $tempnode"
> We can't run it, because of missing events for media changes.

Presumably we can run this if removable==0 yes? Might be a good interim fix, at
least as a vendor patch.

> The next kernel should have it:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=285e9670d91cdeb6b6693729950339cb45410fdc
> 
> but during the merge, it seems something broke it; at least I don't see the
> events now. I need to find out how to fix it. Then we can call vol_id, and
> properly update the values/symlinks at media changes.

Sounds good that this will get fixed properly in the shiny future. Thanks.
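The interim idea in this comment (only probe whole disks when removable==0) depends on the removable attribute that sysfs exposes for each block device. A sketch of that test in shell, with a made-up function name is_fixed_disk and an optional sysfs root parameter added here purely so it can be tried outside a live system:

```shell
#!/bin/sh
# Hypothetical helper: return 0 if sysfs reports the disk as non-removable.
# $1: kernel name of the disk, e.g. sdk
# $2: sysfs mount point (assumption for testing; defaults to /sys)
is_fixed_disk() {
    attr="${2:-/sys}/block/$1/removable"
    # The attribute contains "0" for fixed media and "1" for removable media.
    [ -r "$attr" ] && [ "$(cat "$attr")" = "0" ]
}
```

For example, 'is_fixed_disk sdk && /lib/udev/vol_id /dev/sdk' would run vol_id only on a disk the kernel considers fixed.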
Comment 10 Bill Nottingham 2008-02-25 12:11:26 EST
(In reply to comment #7)
> Also, two things
> 
>  1. the homehost feature in mdadm is pretty broken - some attempt to protect the
> user against himself. As such, we should disable it as it will break hotplugging
> arrays. With hotpluggable buses such as USB, Firewire, eSATA this is important.
> Myself? I have a 5-disk tower using RAID5 that I happily plug into different
> machines all the time. With /etc/mdadm.conf this worked great (using /dev/disk/*
> persistent symlinks), now it's broken.

Is the conf file still around when this is failing?

>  2. at the same time we need a way for e.g. a forensic live cd (one used to
> inspect a system without touching it) to disable things like autoassembly. There
> should probably be a boot option e.g. 'noassembly'. 70-mdadm.rules should then
> check this and avoid assembly if it's set. Similarly, in the future LVM assembly
> should check the same variable.

If the livecd doesn't have a mdadm.conf, I don't *think* anything will be assembled.
Comment 11 David Zeuthen 2008-02-25 19:03:05 EST
(In reply to comment #10)
> Is the conf file still around when this is failing?

It fails both when the config file is present and when it's gone.

> If the livecd doesn't have a mdadm.conf, I don't *think* anything will be
> assembled.

The point is that we want hotplug of RAID arrays to Just Work(tm).
Comment 12 Bill Nottingham 2008-02-25 20:23:36 EST
The discussion with upstream (which admittedly happened outside of this bug) was
that they did not want auto-assembly of arrays that were not related at least in
some way to the system (whether via config file or homehost (but not both)).

This is due to concerns they have about 1) array name collisions (what happens
when you have multiple arrays whose superblocks claim to be md0) 2) fiber
devices on a SAN which could be seen by multiple machines.

#1 probably needs to be tested to see what we do now. #2 seems like it's a bug
if you configure your SAN that way.
Comment 13 David Zeuthen 2008-03-17 22:42:37 EDT
So I partitioned the components of my array. Now I'm running into this.

# /sbin/mdadm --incremental /dev/sdg1 
mdadm: failed to open /dev/md/d0: No such file or directory.

So I played around

 # mkdir /dev/md
 # ln -s /dev/md0 /dev/md/d0
 # ls -l /dev/md/d0 
 lrwxrwxrwx 1 root root 8 2008-03-17 22:36 /dev/md/d0 -> /dev/md0
 # /sbin/mdadm --incremental /dev/sdb1
 mdadm: failed to open /dev/md/d0: File exists.

Hmm...

 # rm -f /dev/md/d0

And now

 # /sbin/mdadm --incremental /dev/sdb1
 mdadm: /dev/sdb1 attached to /dev/md/d0, not enough to start (1).
 # /sbin/mdadm --incremental /dev/sdg1
 mdadm: /dev/sdg1 attached to /dev/md/d0, not enough to start (2).
 # /sbin/mdadm --incremental /dev/sdh1
 mdadm: /dev/sdh1 attached to /dev/md/d0, not enough to start (3).
 # /sbin/mdadm --incremental /dev/sdi1
 mdadm: /dev/sdi1 attached to /dev/md/d0, not enough to start safely.
 # /sbin/mdadm --incremental /dev/sdj1
 mdadm: /dev/sdj1 attached to /dev/md/d0, which has been started.

So mdadm needs the directory /dev/md to exist. It needs to be empty too.

Btw.

 # ls -l /dev/md/d0 
 brw------- 1 root root 254, 0 2008-03-17 22:37 /dev/md/d0
 # udevinfo -q all --name /dev/md/d0 
 node name not found

Sigh. It's 2008 and we still have things like mdadm creating its own device
nodes. That's just not maintainable. (Of course ironically it refuses to create
_directories_).

 # rpm -q udev mdadm
 udev-118-5.fc9.x86_64
 mdadm-2.6.4-3.fc9.x86_64
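Until mdadm or the package creates the directory itself, the workaround found in this comment can be wrapped in a few lines of shell. This is only a sketch: the function names ensure_md_dir and incremental_add are made up for illustration, and it does not clean up stale entries such as the /dev/md/d0 symlink that caused the "File exists" error above.

```shell
#!/bin/sh
# Hypothetical helper: create the directory for md device nodes if missing.
# $1: directory to create; defaults to /dev/md
ensure_md_dir() {
    dir="${1:-/dev/md}"
    [ -d "$dir" ] || mkdir -p "$dir"
}

# Hypothetical wrapper: make sure /dev/md exists, then hand one component
# device to mdadm for incremental assembly, as done by hand above.
# $1: component device, e.g. /dev/sdb1
incremental_add() {
    ensure_md_dir /dev/md && /sbin/mdadm --incremental "$1"
}
```

Calling 'incremental_add /dev/sdb1' as root would then correspond to the manual 'mkdir /dev/md' followed by 'mdadm --incremental' shown in this comment.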
Comment 14 David Zeuthen 2008-03-18 11:58:51 EDT
Talking to notting on IRC, we decided that the course of action right now is to
patch mdadm so it creates the /dev/md/ directory if it's not there. Doug, can
you look into this? Thanks.
Comment 15 Jesse Keating 2008-04-01 16:33:19 EDT
I'm a bit confused as to what the status is here.  Is the feature in, but
broken, or is the feature not in yet?  Should we have a new bug for the
brokenness and close this one out?  Help me out here guys, time on Fedora 9 is
ticking away.
Comment 16 David Zeuthen 2008-04-01 17:02:47 EDT
Yes, someone needs to spend a few seconds making the mdadm package provide the
/lib/udev/devices/dm directory. Then this bug will go away.
Comment 17 Bill Nottingham 2008-04-01 17:10:44 EDT
I'm confused. Why that directory instead of /dev/md?
Comment 18 David Zeuthen 2008-04-01 17:21:01 EDT
Oh, I meant /lib/udev/devices/md .. md.. not dm. I always mix it up.
Comment 19 David Zeuthen 2008-04-04 14:07:07 EDT
(In reply to comment #16)
> Yes, someone needs to spend a few seconds making the mdadm package provide the
> /lib/udev/devices/dm directory. Then this bug will go away.

Of course /sbin/start_udev will start complaining that this is "deprecated" but
see bug 440962. Can we add this? Thanks.
Comment 20 Bill Nottingham 2008-04-16 19:18:06 EDT
Please try the packages from:
http://koji.fedoraproject.org/koji/taskinfo?taskID=569782

Comment 21 Bill Nottingham 2008-04-17 12:28:02 EDT
Built as 2.6.4-4.fc9.