From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031023

Description of problem:
mdadm fails to ignore the Event: line present in the kernel's /proc/mdstat output, which looks like this:

Personalities : [raid1]
read_ahead 1024 sectors
Event: 10
md11 : active raid1 [dev 08:01][2] hdc1[0]

The Event: line seems to be specific to Red Hat Enterprise Linux, because I can't see it in the kernels of other OSs such as Red Hat Linux or Fedora Core. If RAID monitoring is enabled, mdadm prints a warning about this line every few minutes to the tty from which the mdmonitor service was started.

Version-Release number of selected component (if applicable):
mdadm-1.0.1-1
kernel-2.4.21-4.EL

How reproducible:
Always

Steps to Reproduce:
1. service mdmonitor start

Actual Results:
Starting mdmonitor: mdadm: bad /proc/mdstat line starts: Event:

Expected Results:
This line should probably be ignored, although it could be used to skip re-checking if it hasn't changed since the last check.

Additional info:
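To illustrate the expected behaviour, here is a minimal Python sketch of a /proc/mdstat reader that tolerates the Event: line instead of rejecting it. It is not mdadm's actual parser (mdadm is written in C); the function name `parse_mdstat` and the returned tuple are assumptions made for this example only. It also keeps the counter, so a caller could skip re-checking when the value has not changed since the last poll.

```python
import re


def parse_mdstat(text):
    """Return (event_counter, array_names) from /proc/mdstat-style text.

    Hypothetical helper for illustration; not part of mdadm.
    """
    event = None
    arrays = []
    for line in text.splitlines():
        if line.startswith("Event:"):
            # Header line, not an array: record the counter and move on
            # rather than reporting a bad line.
            event = int(line.split(":", 1)[1])
        elif re.match(r"md\d+ :", line):
            # Array lines start with the md device name.
            arrays.append(line.split()[0])
        # Anything else (Personalities, read_ahead, ...) is ignored here.
    return event, arrays


SAMPLE = """Personalities : [raid1]
read_ahead 1024 sectors
Event: 10
md11 : active raid1 [dev 08:01][2] hdc1[0]
"""

event, arrays = parse_mdstat(SAMPLE)
print(event, arrays)
```

A monitor built this way could compare `event` with the value from its previous poll and return early when the two are equal.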
I am seeing the same problem, which makes monitoring less than helpful. I just checked the mdadm-1.4.0 code and found nothing in it that handles this Event: line. Also, the example mdadm.conf refers to /usr/sbin/handle-mdadm-events, but no such program exists. --Larry
The combination of kernel-2.4.21-6.EL and mdadm-1.4.0-1, as found in the RHEL3-Update-Beta1 channel, still seems to exhibit this bug. I am particularly worried that this has escaped notice somewhere, because the mdadm-1.4.0 build date is 17Nov03, roughly 3 weeks after this bug was filed. It may have slipped past RH9 QA because the problem occurs only when running on the enterprise kernel (standard Red Hat kernels do not have the Event: line). I would suggest setting SEVERITY to HIGH, because mdmonitor does NOT WORK AT ALL due to this problem.
The problem still persists with kernel-2.4.21-9.EL and mdadm-1.4.0-1 from Quarterly Update #1.
Slight correction to Mario Lorenz: mdadm --monitor *does* work in spite of this warning. Basically, this is an annoying cosmetic bug, it does *not* keep things from working. Case in point:

[root@dledford root]# cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
Event: 52
md1 : active raid5 sdf1[2] sde1[3] sdg1[1] sdd1[4] sdc1[0]
      1638144 blocks level 5, 64k chunk, algorithm 0 [5/5] [UUUUU]

md0 : active raid5 sde2[7] sdf2[4] sdg2[3] sdd2[6] sdc2[2] sdb2[1] sda2[0]
      104196864 blocks level 5, 64k chunk, algorithm 0 [7/6] [UUUUU_U]
      [===>.................]  recovery = 17.0% (2966244/17366144) finish=119.2min speed=2011K/sec
unused devices: <none>
[root@dledford root]#

Notice the mdadm rpm version and the presence of the Event line in the /proc/mdstat file. Here's the email I got from mdadm this morning:

From: mdadm monitoring <root@dledford>
To: dledford
Subject: Fail event on /dev/md0:dledford
Date: Mon, 23 Feb 2004 05:47:57 -0500

This is an automatically generated mail message from mdadm running on dledford

A Fail event had been detected on md device /dev/md0.

It could be related to component device /dev/sde2.

Faithfully yours, etc.

So, just to set everyone at ease, this is *not* a functional problem, just cosmetic, so priority doesn't need to be HIGH.

As far as the Event line is concerned, that may only be in Red Hat kernels in the 2.4 kernel series, but it's also in 2.6 kernels. It was an upstream change that came from the md code maintainer. In any case, mdadm-1.5.0-1 (which solves the Event issue) has been built. I'll submit it for possible inclusion in the next update. If it doesn't go through, then I'll make it available elsewhere.
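For readers who want to check the same condition by hand, the sketch below shows how the `[7/6] [UUUUU_U]` status field in the output above can be read as "7 devices configured, 6 active". It is an illustration only and not how mdadm --monitor is implemented; the function name `degraded_arrays` and the returned dictionary are assumptions made for this example. Note that the Event: line needs no special handling, since it simply matches neither pattern.

```python
import re

# "[total/active] [U_ map]" as it appears on the "blocks" line of each array.
STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(text):
    """Map array name -> (total, active) for arrays with missing devices.

    Hypothetical helper for illustration; not part of mdadm.
    """
    result = {}
    current = None
    for line in text.splitlines():
        name = re.match(r"(md\d+) :", line)
        if name:
            current = name.group(1)
            continue
        status = STATUS.search(line)
        if status and current:
            total, active = int(status.group(1)), int(status.group(2))
            if active < total:
                result[current] = (total, active)
    return result


SAMPLE = """Personalities : [raid5]
read_ahead 1024 sectors
Event: 52
md1 : active raid5 sdf1[2] sde1[3] sdg1[1] sdd1[4] sdc1[0]
      1638144 blocks level 5, 64k chunk, algorithm 0 [5/5] [UUUUU]

md0 : active raid5 sde2[7] sdf2[4] sdg2[3] sdd2[6] sdc2[2] sdb2[1] sda2[0]
      104196864 blocks level 5, 64k chunk, algorithm 0 [7/6] [UUUUU_U]
      [===>.................]  recovery = 17.0% (2966244/17366144) finish=119.2min speed=2011K/sec
unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

With the sample above, only md0 is reported, which matches the Fail event mail quoted in this comment.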
In case anybody is interested, I made my own build of version 1.5.0. This build is based on the RHEL3 errata package (mdadm-1.4.0 plus a patch that fixes a problem with the recovery thread sleeping in mdmpd): ftp://ftp.vslib.cz/pub/local/milan.kerslager/RHEL-3/RPMS/ The release number is zero, so that RH's mdadm-1.5.0-1 (their next possible update through RHN) can update this package in the regular way.
I stand corrected. Yes, it does indeed work, provided mdadm.conf lists the correct devices, and not some /dev/loop devices I used for earlier tests...
Hmm, my issue is different then. mdadm fails in cases where mdadm.conf is not necessary. It works on RH9 and FC1, but not RHEL3. Hmm...
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2004-201.html