Description of problem:
After a failed attempt to add a device to an active array, /dev/md/md-device-map.lock is left dangling. mdadm processes executed later then spin at 100% CPU waiting for the lock to be freed.

A reproducer:

    for i in {0..2}; do
        dd if=/dev/zero of=/tmp/testmd$i bs=64K count=2048
        losetup /dev/loop$i /tmp/testmd$i
    done
    mdadm --create /dev/md/testraid --level=5 --metadata=0.90 --raid-devices=3 /dev/loop{0..2}
    # give it no time to resync:
    mdadm --stop /dev/md/testraid
    # now try incremental reassembling:
    for i in {0..2}; do
        mdadm -I /dev/loop$i
        ls -l /dev/md/*.lock
    done

Version-Release number of selected component (if applicable):
mdadm-3.1.3-0.git20100722.2.fc13.x86_64

How reproducible:
always

Steps to Reproduce:
1. Run the reproducer script

Actual results:

    mdadm: array /dev/md/testraid started.
    mdadm: stopped /dev/md/testraid
    mdadm: /dev/loop0 attached to /dev/md/127, not enough to start (1).
    ls: cannot access /dev/md/*.lock: No such file or directory
    mdadm: /dev/loop1 attached to /dev/md/127, which has been started.
    ls: cannot access /dev/md/*.lock: No such file or directory
    mdadm: not adding /dev/loop2 to active array (without --run) /dev/md/127
    -rw-------. 1 root root 0 Aug 5 13:15 /dev/md/md-device-map.lock

After the failed attempt to add /dev/loop2, the lock file was left behind.

Expected results:
The lock file must be gone after mdadm exits.
This is a known issue, fixed in mdadm-3.1.3-0.git20100804.1 and later builds. A push of a later build to testing is forthcoming.
mdadm-3.1.3-0.git20100804.2.fc13 has been submitted as an update for Fedora 13. http://admin.fedoraproject.org/updates/mdadm-3.1.3-0.git20100804.2.fc13
mdadm-3.1.3-0.git20100804.2.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/mdadm-3.1.3-0.git20100804.2.fc12
mdadm-3.1.3-0.git20100804.2.fc14 has been submitted as an update for Fedora 14. http://admin.fedoraproject.org/updates/mdadm-3.1.3-0.git20100804.2.fc14
With mdadm-3.1.3-0.git20100804.2.fc13 I can still see that /dev/md/md-device-map.lock is present after the test is over. But it no longer prevents a follow-up "mdadm -S /dev/md127" from completing successfully (and deleting the lock file afterwards). I am not sure whether this is exactly the expected behaviour, but it is usable.
It is the expected behaviour. There is nothing we can do about a dangling lock file left by an interrupted command. Think of a segv or similar: on fatal errors like that the signal handler is never run, so the lock file gets left behind whether or not we have a handler that should clean it up. Subsequent runs must therefore be able to deal with a dangling lock, and the new code does exactly that.
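To illustrate why a leftover lock file need not block later runs, here is a minimal shell sketch. It is not taken from the mdadm source. It assumes the lock is a kernel advisory lock of the flock(2) kind, uses the flock utility from util-linux, and works on a throwaway temporary file (the name demo-map.lock is hypothetical) instead of the real /dev/md/md-device-map.lock.

```shell
#!/bin/sh
# Hypothetical demo file; this is NOT mdadm's real lock path.
lock=$(mktemp /tmp/demo-map.lock.XXXXXX)

# First "command": takes an exclusive lock on fd 9, then dies from
# SIGKILL, so no cleanup code of any kind gets a chance to run.
sh -c 'exec 9>"$1"; flock -x 9; kill -KILL $$' sh "$lock" || true

# The file itself is still there after the abrupt exit ...
[ -e "$lock" ] && echo "lock file left behind"

# ... but the kernel released the advisory lock when the process died,
# so a non-blocking attempt by a later command succeeds immediately.
if flock -n -x "$lock" true; then
    echo "lock acquired despite leftover file"
    result=ok
else
    echo "lock still held"
    result=blocked
fi

rm -f "$lock"
```

Under these assumptions the presence of the file says nothing about whether the lock is held; only the attempt to lock it does. A scheme that treats the mere existence of the file as "locked" would instead spin forever, as in the original report.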
mdadm-3.1.3-0.git20100804.2.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update mdadm'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/mdadm-3.1.3-0.git20100804.2.fc12
mdadm-3.1.3-0.git20100804.2.fc13 has been pushed to the Fedora 13 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update mdadm'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/mdadm-3.1.3-0.git20100804.2.fc13
mdadm-3.1.3-0.git20100804.2.fc14 has been pushed to the Fedora 14 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update mdadm'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/mdadm-3.1.3-0.git20100804.2.fc14
Anaconda seems to trigger this behavior when it looks for storage devices while installing from a USB live image. When installing to a system which has a 3-disk RAID 5 array (0.90 MD on-disk format), I had to kill the mdadm process manually to get the installer to continue (the RAID array contains data only, so it isn't needed for booting at all). After the install I updated to mdadm-3.1.3-0.git20100804.2.fc14, and that seemed to fix all the remaining issues I had (mostly related to bz 650803, I believe).
mdadm-3.1.3-0.git20100804.2.fc14 has been pushed to the Fedora 14 stable repository. If problems still persist, please make note of it in this bug report.
mdadm-3.1.3-0.git20100804.2.fc13 has been pushed to the Fedora 13 stable repository. If problems still persist, please make note of it in this bug report.