Bug 650803 - mdadm locking is broken
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: mdadm
Version: 14
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Doug Ledford
QA Contact: Fedora Extras Quality Assurance
Reported: 2010-11-08 05:10 UTC by Maciej Żenczykowski
Modified: 2011-03-07 04:34 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-03-07 04:34:23 UTC

Description Maciej Żenczykowski 2010-11-08 05:10:03 UTC
# rpm -q mdadm
mdadm-3.1.3-0.git20100722.2.fc14.x86_64

mdadm locking appears to be *badly* broken. I suspect some sort of race in the creation (or deletion?) of the two lock files.

(a) Make sure the lock files aren't there:

# rm -f /var/run/mdadm.map.lock /dev/md/md-device-map.lock 
# ls -al /var/run/mdadm.map.lock /dev/md/md-device-map.lock 
ls: cannot access /var/run/mdadm.map.lock: No such file or directory
ls: cannot access /dev/md/md-device-map.lock: No such file or directory

(b) Insert the external FireWire drive used for backups. It has 3 partitions, which should be auto-added to 3 existing software mirror RAIDs on the internal drive.

This results in the simultaneous spawning of:
- /sbin/mdadm -I /dev/sdb2
- /sbin/mdadm -I /dev/sdb3
- /sbin/mdadm -I /dev/sdb4

(c) Two of these succeed. The third one spins. While it is spinning (one CPU at 100%), both lock files exist:

# ls -al /var/run/mdadm.map.lock /dev/md/md-device-map.lock 
-rw-------. 1 root root 0 Nov  7 21:03 /dev/md/md-device-map.lock
-rw-------. 1 root root 0 Nov  7 21:03 /var/run/mdadm.map.lock

...
mkdir("/dev/md", 0755)                  = -1 EEXIST (File exists)
open("/dev/md/md-device-map.lock", O_RDWR|O_CREAT|O_EXCL, 0600) = -1 EEXIST (File exists)
open("/var/run/mdadm.map.lock", O_RDWR|O_CREAT|O_EXCL, 0600) = -1 EEXIST (File exists)
mkdir("/dev/md", 0755)                  = -1 EEXIST (File exists)
...and again...
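
For illustration, the trace above looks like an exclusive-create lock being retried in a tight loop with no delay, which would explain the 100% CPU. Below is a minimal Python sketch of that pattern with a sleep and a timeout added. It is not mdadm's code; `acquire_lock`, `release_lock`, and their parameters are hypothetical names chosen for this example.

```python
import os
import time


def acquire_lock(path, timeout=5.0, interval=0.05):
    """Create `path` exclusively. Return True on success, False on timeout.

    Hypothetical helper: mirrors the open(..., O_RDWR|O_CREAT|O_EXCL, 0600)
    call seen in the trace, but sleeps between attempts and gives up
    after `timeout` seconds instead of spinning forever.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            # Someone else holds the lock (or left a stale file behind).
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
            continue
        os.close(fd)
        return True


def release_lock(path):
    """Remove the lock file; a missing file is not treated as an error."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
```

With a timeout, a stale lock file left by another process turns into a reportable failure rather than a process that spins until it is killed.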

# ps auxww | egrep mdadm
root 5760 95.3 0.0 8896 824 ? R 21:03 1:18 /sbin/mdadm -I /dev/sdb3


Nothing short of killing the spinner or deleting the lockfiles 'fixes' it.

# rm -f /var/run/mdadm.map.lock /dev/md/md-device-map.lock 

However, even after deleting the lock files, when the previously spinning process (mdadm -I /dev/sdb3) finishes, a lock file is still left over:

# ls -al /var/run/mdadm.map.lock /dev/md/md-device-map.lock 
ls: cannot access /dev/md/md-device-map.lock: No such file or directory
-rw-------. 1 root root 0 Nov  7 21:06 /var/run/mdadm.map.lock
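
To make the repeated `ls -al` checks easier to script while reproducing this, here is a small Python sketch that reports which of the two lock files exist and how old they are. The two paths are the ones from this report; `lock_status` is a hypothetical helper name, and the script only inspects the files, it never removes them.

```python
import os
import time

# Lock file paths taken from this report.
LOCK_PATHS = ("/var/run/mdadm.map.lock", "/dev/md/md-device-map.lock")


def lock_status(paths=LOCK_PATHS, now=None):
    """Return {path: age in seconds, or None if the file does not exist}."""
    now = time.time() if now is None else now
    status = {}
    for path in paths:
        try:
            status[path] = max(0.0, now - os.stat(path).st_mtime)
        except FileNotFoundError:
            status[path] = None
    return status


if __name__ == "__main__":
    for path, age in lock_status().items():
        if age is None:
            print(f"{path}: absent")
        else:
            print(f"{path}: present, last modified {age:.0f}s ago")
```

A lock file that is still present long after every `mdadm -I` process has exited is a leftover like the one shown above.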


Side note: Could this be related to the presence of another software RAID with partitions?

# ls -al /dev/md
lrwxrwxrwx.  1 root root    8 Nov  7 20:09 NIKE:win -> ../md127
lrwxrwxrwx.  1 root root   10 Nov  7 20:10 NIKE:win1 -> ../md127p1

Comment 1 Maciej Żenczykowski 2010-11-08 05:18:18 UTC
Possibly related to bug 621524

Comment 2 Maciej Żenczykowski 2010-11-08 05:22:50 UTC
This does appear to be fixed by:

sudo rpm -hvU /home/maze/Download/mdadm-3.1.3-0.git20100804.2.fc14.x86_64.rpm 

The behaviour has changed, though:

# mdadm -I /dev/sdb3
mdadm: not adding /dev/sdb3 to active array (without --run) /dev/md0

Comment 3 Maciej Żenczykowski 2011-03-07 04:34:23 UTC
Fixed in Fedora 14 at mdadm-3.1.3-0.git20100804.2.fc14.x86_64
