Bug 621524 - dangling md-device-map.lock
Summary: dangling md-device-map.lock
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: mdadm   
Version: 13
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Doug Ledford
QA Contact: Fedora Extras Quality Assurance
 
Reported: 2010-08-05 11:20 UTC by Michal Schmidt
Modified: 2010-12-07 20:14 UTC (History)

Fixed In Version: mdadm-3.1.3-0.git20100804.2.fc13
Doc Type: Bug Fix
Last Closed: 2010-12-07 20:12:46 UTC



Description Michal Schmidt 2010-08-05 11:20:29 UTC
Description of problem:
After a failed attempt to add a device to an active array, /dev/md/md-device-map.lock is left dangling. mdadm processes executed later then spin at 100% CPU, waiting for the lock to be released.

A reproducer:

for i in {0..2}; do
        dd if=/dev/zero of=/tmp/testmd$i bs=64K count=2048
        losetup /dev/loop$i /tmp/testmd$i
done
mdadm --create /dev/md/testraid --level=5 --metadata=0.90 --raid-devices=3 /dev/loop{0..2}
# give it no time to resync:
mdadm --stop /dev/md/testraid
# now try incremental reassembling:
for i in {0..2}; do
        mdadm -I /dev/loop$i
        ls -l /dev/md/*.lock
done


Version-Release number of selected component (if applicable):
mdadm-3.1.3-0.git20100722.2.fc13.x86_64

How reproducible:
always

Steps to Reproduce:
1. Run the reproducer script
  
Actual results:
mdadm: array /dev/md/testraid started.
mdadm: stopped /dev/md/testraid
mdadm: /dev/loop0 attached to /dev/md/127, not enough to start (1).
ls: cannot access /dev/md/*.lock: No such file or directory
mdadm: /dev/loop1 attached to /dev/md/127, which has been started.
ls: cannot access /dev/md/*.lock: No such file or directory
mdadm: not adding /dev/loop2 to active array (without --run) /dev/md/127
-rw-------. 1 root root 0 Aug  5 13:15 /dev/md/md-device-map.lock

After the failed attempt to add /dev/loop2, the lock file was left behind.

Expected results:
The lock file must be gone after mdadm exits.

Comment 1 Doug Ledford 2010-08-05 14:16:06 UTC
This is a known issue fixed in the mdadm-3.1.3-0.git20100804.1 and later builds.  A push of this later build to testing is forthcoming.

Comment 2 Fedora Update System 2010-08-05 14:25:19 UTC
mdadm-3.1.3-0.git20100804.2.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/mdadm-3.1.3-0.git20100804.2.fc13

Comment 3 Fedora Update System 2010-08-05 14:25:55 UTC
mdadm-3.1.3-0.git20100804.2.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/mdadm-3.1.3-0.git20100804.2.fc12

Comment 4 Fedora Update System 2010-08-05 14:26:34 UTC
mdadm-3.1.3-0.git20100804.2.fc14 has been submitted as an update for Fedora 14.
http://admin.fedoraproject.org/updates/mdadm-3.1.3-0.git20100804.2.fc14

Comment 5 Michal Schmidt 2010-08-05 15:01:22 UTC
With mdadm-3.1.3-0.git20100804.2.fc13 I can still see that the file /dev/md/md-device-map.lock is present after the test is over.
But it no longer prevents a follow-up "mdadm -S /dev/md127" from completing successfully (and deleting the lock file afterwards).

Not sure if this is exactly the expected behaviour, but it is usable.

Comment 6 Doug Ledford 2010-08-05 15:21:33 UTC
It is the expected behaviour.  There is nothing we can do about a dangling lock file on an interrupted command (think of a segv or similar: on fatal errors like that the signal handler is never run, so the lock file gets left behind whether or not we have a handler that should clean it up).  So subsequent runs must be able to deal with a dangling lock.  The new code does exactly that.
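To illustrate the idea of tolerating a dangling lock, here is a minimal shell sketch. It is not taken from the mdadm source and does not claim to show how the fixed build works; it only demonstrates one common approach, in which the lock file records the holder's PID and a later run treats the lock as stale when that process no longer exists. The path /tmp/demo-map.lock and the function names acquire_lock and release_lock are hypothetical.

```shell
#!/bin/bash
# Hypothetical lock path for illustration only (not the real mdadm lock file).
LOCK="${LOCK:-/tmp/demo-map.lock}"

# Try to take the lock.
# Returns 0 on success, 1 if a live process still holds it.
acquire_lock() {
    local owner attempt
    for attempt in 1 2; do
        # With noclobber the redirect fails if the file already exists,
        # so the existence check and the creation happen in one step.
        if ( set -o noclobber; echo "$$" > "$LOCK" ) 2>/dev/null; then
            return 0
        fi
        owner=$(cat "$LOCK" 2>/dev/null)
        # kill -0 sends no signal; it only tests whether the PID exists.
        if [ -n "$owner" ] && kill -0 "$owner" 2>/dev/null; then
            return 1    # holder is alive: report busy instead of spinning
        fi
        # Holder is gone or the file is empty: treat the lock as stale.
        rm -f "$LOCK"
    done
    return 1            # could not take the lock after removing a stale one
}

release_lock() {
    rm -f "$LOCK"
}
```

A caller would run "acquire_lock || exit 1", do its work, and then call release_lock. The sketch has known limits: there is a small race between removing a stale file and another process creating a new one, and kill -0 also fails for a live process owned by a different user, so it is only reasonable when all users of the lock run as the same user (for example root).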

Comment 7 Fedora Update System 2010-08-05 23:29:33 UTC
mdadm-3.1.3-0.git20100804.2.fc12 has been pushed to the Fedora 12 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update mdadm'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/mdadm-3.1.3-0.git20100804.2.fc12

Comment 8 Fedora Update System 2010-08-05 23:53:05 UTC
mdadm-3.1.3-0.git20100804.2.fc13 has been pushed to the Fedora 13 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update mdadm'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/mdadm-3.1.3-0.git20100804.2.fc13

Comment 9 Fedora Update System 2010-08-10 01:30:09 UTC
mdadm-3.1.3-0.git20100804.2.fc14 has been pushed to the Fedora 14 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update mdadm'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/mdadm-3.1.3-0.git20100804.2.fc14

Comment 10 Mike Gahagan 2010-11-22 21:55:33 UTC
Anaconda, when installing from a USB live image, seems to trigger this behavior when looking for storage devices. When installing to a system which has a 3-disk RAID 5 array (0.90 MD on-disk format), I had to kill the mdadm process manually to get the installer to continue (the RAID array contains data only, so it isn't needed for booting at all).


I updated to mdadm-3.1.3-0.git20100804.2.fc14 post-install and it seemed to fix all issues I had post-install (mostly related to bz 650803 I believe).

Comment 11 Fedora Update System 2010-12-07 20:12:04 UTC
mdadm-3.1.3-0.git20100804.2.fc14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 12 Fedora Update System 2010-12-07 20:14:01 UTC
mdadm-3.1.3-0.git20100804.2.fc13 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.

