108580 – mdadm --monitor does not work

Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	mdadm
Version:	9
Hardware:	i686
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	---
Assignee:	Doug Ledford
QA Contact:	David Lawrence

Reported:	2003-10-30 11:21 UTC by Need Real Name
Modified:	2006-03-11 04:05 UTC
CC List:	4 users
Doc Type:	Bug Fix
Last Closed:	2004-05-22 21:42:15 UTC

Description Need Real Name 2003-10-30 11:21:15 UTC

mdadm --monitor does not work in any way.

mdadm --monitor does not mail or run a program on status changes in md-arrays.

It makes no difference whether it is started manually or by /etc/init.d/mdmonitor.

Neither the MAILADDR nor the PROGRAM option in mdadm.conf works.

Reproducibility
Every time.

Steps to reproduce
1. Create an md array.
2. Add "MAILADDR root" to /etc/mdadm.conf.
3. Run /etc/init.d/mdmonitor start.
4. Take one disk in the md array offline.
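
Steps 3 and 4 above can be sketched as a small shell helper. This is only an illustration under stated assumptions: `run` and `reproduce` are hypothetical names, and marking a member faulty with `mdadm ARRAY --fail MEMBER` is one assumed way to take a disk "offline" (the report does not say how it was done). Try it only on a scratch array.

```sh
#!/bin/sh
# Hypothetical helper: echo each command, and execute it only when
# DRY_RUN is unset or empty.
run() {
    echo "+ $*"
    [ -n "${DRY_RUN:-}" ] || "$@"
}

# Steps 3 and 4 of the report for the array and member given as arguments.
# Assumption: --fail stands in for "offline one disk"; it degrades the array.
reproduce() {
    array=$1
    member=$2
    run /etc/init.d/mdmonitor start
    run mdadm "$array" --fail "$member"
    run cat /proc/mdstat
}

# Usage (dry run, prints the commands without executing them):
#   DRY_RUN=1 reproduce /dev/md0 /dev/hde1
```

With DRY_RUN set, the helper only prints the commands, so the sequence can be reviewed before anything touches a real array.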

Actual results
Nothing.

Expected Results 
Mail should be sent alerting admin to status change.

Comment 1 Need Real Name 2003-10-30 11:22:42 UTC

Last tested configuration (have tested several):
---------------------------------
$ cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hde1[0] hdg1[1]
      180939968 blocks [2/2] [UU]

unused devices: <none>
------------------------------
Example of failure not reported by mdadm --monitor :
------------------------------
Oct 17 13:23:23 xxx smartd: Device: /dev/hdg, S.M.A.R.T. Attribute: 1 Changed -2
Oct 17 13:53:23 xxx smartd: Device: /dev/hde, S.M.A.R.T. Attribute: 1 Changed 1
Oct 17 13:53:24 xxx smartd: Device: /dev/hdg, S.M.A.R.T. Attribute: 1 Changed 2
Oct 17 13:58:32 xxx kernel: hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 17 13:58:32 xxx kernel: hde: dma_intr: error=0x40 { UncorrectableError }, LBAsect=10458, high=0, low=10458, sector=10392
Oct 17 13:58:32 xxx kernel: end_request: I/O error, dev 21:01 (hde), sector 10392
Oct 17 13:58:32 xxx kernel: raid1: Disk failure on hde1, disabling device.
Oct 17 13:58:32 xxx kernel: ^IOperation continuing on 1 devices
Oct 17 13:58:32 xxx kernel: raid1: hde1: rescheduling block 10392
Oct 17 13:58:32 xxx kernel: md: updating md0 RAID superblock on device
Oct 17 13:58:32 xxx kernel: md: (skipping faulty hde1 )
Oct 17 13:58:32 xxx kernel: md: hdg1 [events: 0000004a]<6>(write) hdg1's sb offset: 180939968
Oct 17 13:58:32 xxx kernel: md: recovery thread got woken up ...
Oct 17 13:58:32 xxx kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Oct 17 13:58:32 xxx kernel: raid1: hdg1: redirecting sector 10392 to another mirror
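
As a cross-check that does not depend on mdadm --monitor, the status field in /proc/mdstat can be inspected directly. In the healthy listing above it reads [UU]; after a member is disabled, md normally shows an underscore in that member's position. The following is a minimal sketch, assuming the /proc/mdstat layout shown in this comment; `degraded_arrays` is a hypothetical name, not part of mdadm.

```sh
#!/bin/sh
# Print the name of every md array whose status field (for example [_U])
# shows a missing member. Reads /proc/mdstat-style text on standard input.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }      # remember which array is being described
        /blocks/ && name != "" {
            status = $NF                 # last field, e.g. [UU] or [_U]
            if (status ~ /^\[[U_]+\]$/ && status ~ /_/) print name
            name = ""
        }
    '
}

# Usage:
#   degraded_arrays < /proc/mdstat
```

Run from cron, a check like this could serve as a stopgap while the monitor itself is being debugged; it prints nothing when every array is complete.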

Comment 2 Christian Hofmann 2004-01-05 12:30:12 UTC

When will this be fixed? Three mdadm bugs have been open for two months...
that's not quite the enterprise support I expect...

Comment 3 Arjan van de Ven 2004-01-05 12:34:11 UTC

Christian Hofmann:
Bugzilla is not a support mechanism. If you need or expect support, you
should use RH support, not Bugzilla.

Comment 4 Suzanne Hillman 2004-02-11 20:13:05 UTC

Christian - did you end up calling support? Do you need more
information on how to do that?

Comment 5 Doug Ledford 2004-02-23 14:18:18 UTC

mdadm --monitor is working fine here.  The current version is 1.4.0-1, so
make sure you are up to date with the current version, and then make sure
that you are running some sort of SMTP daemon so that the program can
successfully email the address listed in the MAILADDR line.
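
The two preconditions above (a MAILADDR line and a working mailer) can be checked with a short script. This is a sketch under assumptions, not something taken from mdadm itself: `check_mail_path` is a hypothetical name, the default paths /etc/mdadm.conf and /usr/sbin/sendmail are assumed locations, and the pattern matches only the full MAILADDR keyword.

```sh
#!/bin/sh
# Report whether the pieces needed for mail alerts appear to be in place.
# $1: config file (assumed default /etc/mdadm.conf)
# $2: mailer binary (assumed default /usr/sbin/sendmail)
check_mail_path() {
    conf=${1:-/etc/mdadm.conf}
    mailer=${2:-/usr/sbin/sendmail}
    status=0
    # Look for a MAILADDR line that actually names a recipient.
    if ! grep -Eiq '^[[:space:]]*MAILADDR[[:space:]]+[^[:space:]]' "$conf" 2>/dev/null; then
        echo "no MAILADDR line in $conf"
        status=1
    fi
    # Check that the mailer exists and is executable.
    if [ ! -x "$mailer" ]; then
        echo "no executable mailer at $mailer"
        status=1
    fi
    return $status
}

# Usage:
#   check_mail_path && echo "mail path looks OK"
```

A passing check does not prove that mail is delivered. On mdadm releases that support the --test and --oneshot monitor options (not verified for the version discussed here), running `mdadm --monitor --scan --oneshot --test` should send a test alert for each array.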

Comment 6 Doug Ledford 2004-05-22 21:42:15 UTC

Closing this bug since mdadm --monitor works in recent versions.
