From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040312

Description of problem:
mdadm --monitor --scan keeps all raid devices open (presumably in order to catch events?), but this has the ugly side effect that one has to stop mdmonitor in order to be able to stop a raid device that it's monitoring. At least mdadm --stop should somehow tell mdmonitor to let the raid device go, but it does no such thing.

Version-Release number of selected component (if applicable):
mdadm-1.5.0-3

How reproducible:
Always

Steps to Reproduce:
1. Create a raid device
2. Restart mdmonitor
3. Try to stop the raid device

Actual Results:
It's reported as busy

Expected Results:
It should stop

Additional info:
This will need to be worked upstream.
*** Bug 121076 has been marked as a duplicate of this bug. ***
This has been brought to the attention of the upstream maintainer. The maintainer is planning an update to the mdadm package in the near future, and I suspect this will be fixed then.
The fix for this has been identified. I added a patch to prevent a file descriptor leak in the --scan mode of --monitor for mdadm.

[root@test dledford]# ls /proc/4805/fd/ -l
total 0
lr-x------ 1 root root 64 May 22 16:55 0 -> /dev/null
lrwx------ 1 root root 64 May 22 16:55 1 -> /dev/console
lrwx------ 1 root root 64 May 22 16:55 2 -> /dev/console
lr-x------ 1 root root 64 May 22 16:55 3 -> /etc/mdadm.conf
lr-x------ 1 root root 64 May 22 16:55 4 -> /dev/md0
lr-x------ 1 root root 64 May 22 16:55 5 -> /dev/md2
lr-x------ 1 root root 64 May 22 16:55 6 -> /dev/md1
[root@test dledford]# rpm -q mdadm
mdadm-1.5.0-3
[root@test dledford]# rpm -Uvh /tmp/mdadm-1.5.0-8.i386.rpm
Preparing... ########################################### [100%]
1:mdadm ########################################### [100%]
[root@test dledford]# ps axf | grep mdadm
 5802 pts/0 S 0:00 \_ grep mdadm
 5774 ? S 0:00 mdadm --monitor --scan -f
[root@test dledford]# ls /proc/5774/fd/ -l
total 0
l--------- 1 root root 64 May 22 16:56 0 -> /dev/null
l--------- 1 root root 64 May 22 16:56 1 -> /dev/null
l--------- 1 root root 64 May 22 16:56 2 -> /dev/null
lr-x------ 1 root root 64 May 22 16:56 3 -> /etc/mdadm.conf
[root@test dledford]#
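For anyone who would rather script this check than read the ls output by eye, here is a minimal Python sketch. It is not part of mdadm or of the patch. The function name md_devices_held and the proc_root parameter are made up for illustration, and it assumes the arrays use plain /dev/mdN names as in the listing above.

```python
import os
import re

# Assumption: arrays are named /dev/md0, /dev/md1, ... as in this report.
MD_DEVICE = re.compile(r"^/dev/md\d+$")


def md_devices_held(pid, proc_root="/proc"):
    """Return a sorted list of /dev/mdN paths that process `pid` has open.

    `proc_root` is a hypothetical knob so the function can be pointed at a
    fake directory tree for testing; on a real system leave it at /proc.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    held = set()
    for name in os.listdir(fd_dir):
        try:
            # Each entry in /proc/<pid>/fd is a symlink to the open file.
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if MD_DEVICE.match(target):
            held.add(target)
    return sorted(held)
```

Called with the pid of the monitor process, an empty list would correspond to the second ls listing above (only /dev/null and /etc/mdadm.conf open), while a non-empty list would correspond to the first one.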
Correction, the 1.5.0-8 tag was already in use. I bumped this one to 1.5.0-9.
testing with RHEL3-U2 AS product for ia64 and ppc, this still fails with mdadm-1.5.0-9:

> # mdadm --stop /dev/md0
> mdadm: fail to stop array /dev/md0: Device or resource busy
> # ps afxw | grep $(fuser /dev/md0 \
> | awk ' { print $2;}')
> 2880 ? S 0:00 mdadm --monitor --scan
> #

(it does work properly for i386, x86_64, s390, s390x though ...)
Please verify that this isn't a transient error. IIRC, mdadm --monitor reopens each device once every 15 seconds, checks its status, then closes the device, so it may just have been bad luck that the stop attempt landed while the device was still open. If it isn't transient, then can you please attach the output of:

ls -l /proc/<pid_of_mdadm>/fd/
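One way to tell a transient hold from a leaked descriptor is to sample the device several times across more than one polling interval. The following Python sketch does that. The name classify_hold and its parameters are hypothetical, and the default of 4 samples 5 seconds apart assumes the 15-second polling interval recalled above is correct.

```python
import time


def classify_hold(is_held, samples=4, interval=5.0, sleep=time.sleep):
    """Sample `is_held()` several times and classify the result.

    `is_held` is any zero-argument callable returning True while the
    device is open by another process (how it checks is up to the caller).
    Returns "free", "transient", or "persistent".
    """
    results = []
    for i in range(samples):
        results.append(bool(is_held()))
        if i < samples - 1:
            sleep(interval)  # wait before taking the next sample
    if not any(results):
        return "free"
    if all(results):
        return "persistent"  # held on every sample: likely a leaked fd
    return "transient"  # held on some samples only: likely the poll window
```

A "persistent" result across a window longer than the polling interval would point to a leaked descriptor, whereas "transient" would match a stop attempt that merely collided with a status check.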
this appears to have been a transient error ... rebooting the machines and trying the tests again worked without a problem ...
Confirmed fixed in mdadm-1.5.0-10, thanks.
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-226.html