Bug 171938

Summary: mdmonitor doesn't monitor raid 4 arrays
Product: [Fedora] Fedora
Reporter: Alexandre Oliva <oliva>
Component: mdadm
Assignee: Doug Ledford <dledford>
Status: CLOSED ERRATA
QA Contact:
Severity: high
Docs Contact:
Priority: medium    
Version: rawhide   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 2.6.2-4.fc7
Doc Type: Bug Fix
Last Closed: 2007-07-20 19:35:55 UTC
Bug Blocks: 158504    

Description Alexandre Oliva 2005-10-27 22:27:13 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8b5) Gecko/20051008 Fedora/1.5-0.5.0.beta2 Firefox/1.4.1

Description of problem:
mdmonitor incessantly polls raid 4 arrays that, according to /proc/mdstat, need no attention at all.  Left alone, it consumes all available CPU issuing syscalls, apart from the CPU time the md kernel threads spend responding to its ioctls.
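
For reference, a minimal sketch of how "needs no attention" can be read from /proc/mdstat. The sample text below is made up for illustration and is not taken from the affected machine; `array_health` is a hypothetical helper, and the regular expressions assume the common `[wanted/active] [UU_]` status layout, so inactive arrays or unusual state strings are not handled.

```python
import re

# Illustrative /proc/mdstat excerpt (invented for this sketch, not from the report).
SAMPLE_MDSTAT = """\
Personalities : [raid4]
md17 : active raid4 sdc1[2] sdb1[1] sda1[0]
      2096896 blocks level 4, 64k chunk, algorithm 0 [3/3] [UUU]

md18 : active raid4 sdf1[2] sde1[1]
      2096896 blocks level 4, 64k chunk, algorithm 0 [3/2] [_UU]

unused devices: <none>
"""


def array_health(mdstat_text):
    """Map each md device name to (level, is_healthy)."""
    health = {}
    current = None  # (name, level) of the array whose status line we expect next
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+) : \w+ (\S+)", line)
        if header:
            current = (header.group(1), header.group(2))
            continue
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            name, level = current
            wanted, active = int(status.group(1)), int(status.group(2))
            # Healthy means every wanted member is active and none is marked "_".
            health[name] = (level, wanted == active and "_" not in status.group(3))
            current = None
    return health


if __name__ == "__main__":
    for name, (level, ok) in array_health(SAMPLE_MDSTAT).items():
        print(name, level, "ok" if ok else "degraded")
```

An array in the state reported here would look like md17 in the sample: all members present and nothing for a monitor to act on.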

Version-Release number of selected component (if applicable):
mdadm-1.11.0-4.fc4

How reproducible:
Always

Steps to Reproduce:
1. Create a raid 4 array
2. Add it to /etc/mdadm.conf
3. Start mdmonitor


Actual Results:  It eats all cpu.  strace shows it's repeatedly running:

open("/dev/md17", O_RDONLY)             = 3
ioctl(3, 0x80480911, 0xbfcafb50)        = 0
close(3)                                = 0

If there are multiple raid 4 arrays, it will cycle over them all, also incessantly.
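
The request number in the strace output can be decoded to see what is being polled. The sketch below assumes the generic Linux ioctl encoding (8-bit number, 8-bit type, 14-bit size, 2-bit direction), which is what x86 uses; `decode_ioctl` is a hypothetical helper name. The result (read direction, type 9, number 0x11, 72-byte argument) appears to match GET_ARRAY_INFO from the kernel's md headers, though that reading is an inference and not something stated in this report.

```python
def decode_ioctl(request):
    """Split a Linux ioctl request number into its four fields.

    Assumes the generic layout: bits 0-7 number, 8-15 type,
    16-29 argument size, 30-31 direction.
    """
    return {
        "nr": request & 0xFF,
        "type": (request >> 8) & 0xFF,
        "size": (request >> 16) & 0x3FFF,
        "dir": (request >> 30) & 0x3,  # 2 means the kernel writes data back to the caller
    }


if __name__ == "__main__":
    fields = decode_ioctl(0x80480911)
    print({name: hex(value) for name, value in fields.items()})
```

Type 9 is the md block device major number, so each iteration of the loop is a read-only query of array state rather than a command that changes anything.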

Expected Results:  It shouldn't keep polling arrays that are fine.

Additional info:

Comment 1 Doug Ledford 2005-10-31 20:17:50 UTC
This is most likely related to the raid arrays being level 4.  That's a rarely
used and mostly untested level as far as things like mdadm monitoring are
concerned.  I'll probably upgrade to the latest mdadm and see if it still
happens then.

Comment 2 Christian Iseli 2007-01-22 10:14:08 UTC
This report targets the FC3 or FC4 products, which have now been EOL'd.

Could you please check that it still applies to a current Fedora release, and
either update the target product or close it?

Thanks.

Comment 3 Doug Ledford 2007-07-03 16:37:02 UTC
The problem of eating all CPU time should be solved in later versions; however,
that doesn't mean raid4 devices were properly monitored.  Investigating this bug
report led me to the fact that raid4 devices were actually ignored entirely in
the current code.  I've modified the current mdadm to no longer ignore raid4
devices (in fact, it was ignoring raid10 and raid6 devices as well).  With as
many redundant levels as there are now, it's become much safer to configure
mdadm's monitor setup to ignore non-redundant raid types than it is to
specifically select the redundant types.  This correction will show up in
mdadm-2.6.2-2 or later.
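
The difference between the two selection strategies can be sketched as follows. This is only an illustration of the idea, not mdadm's actual code: the level names follow /proc/mdstat, and both sets (`OLD_MONITORED_LEVELS`, `NON_REDUNDANT_LEVELS`) as well as the function names are assumptions made up for this example.

```python
# Hypothetical level sets for illustration; not copied from mdadm's source.
OLD_MONITORED_LEVELS = {"raid1", "raid5"}   # assumed allowlist of redundant levels
NON_REDUNDANT_LEVELS = {"linear", "raid0"}  # assumed denylist of non-redundant levels


def monitored_by_allowlist(level):
    """Old approach: monitor only levels explicitly listed as redundant."""
    return level in OLD_MONITORED_LEVELS


def monitored_by_denylist(level):
    """New approach: monitor everything except known non-redundant levels."""
    return level not in NON_REDUNDANT_LEVELS


if __name__ == "__main__":
    for level in ("raid0", "raid1", "raid4", "raid5", "raid6", "raid10"):
        print(level, monitored_by_allowlist(level), monitored_by_denylist(level))
```

With an allowlist, every newly added redundant level is silently skipped until someone remembers to list it; with a denylist, new levels are monitored by default.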

Comment 4 Fedora Update System 2007-07-05 19:12:08 UTC
mdadm-2.6.2-2.fc7 has been pushed to the Fedora 7 testing repository.  If problems still persist, please make note of it in this bug report.

Comment 5 Fedora Update System 2007-07-09 15:47:48 UTC
mdadm-2.6.2-3.fc7 has been pushed to the Fedora 7 testing repository.  If problems still persist, please make note of it in this bug report.

Comment 6 Fedora Update System 2007-07-10 06:42:17 UTC
mdadm-2.6.2-4.fc7 has been pushed to the Fedora 7 testing repository.  If problems still persist, please make note of it in this bug report.

Comment 7 Fedora Update System 2007-07-20 19:35:19 UTC
mdadm-2.6.2-4.fc7 has been pushed to the Fedora 7 stable repository.  If problems still persist, please make note of it in this bug report.