Bug 1300579

Summary: Unable to assign hot spare while running IO on Degraded MD Array
Product: Red Hat Enterprise Linux 7
Reporter: Nanda Kishore Chinnaram <nanda_kishore_chinna>
Component: mdadm
Assignee: Jes Sorensen <Jes.Sorensen>
Status: CLOSED ERRATA
QA Contact: Zhang Yi <yizhan>
Severity: urgent
Docs Contact: Milan Navratil <mnavrati>
Priority: unspecified
Version: 7.2
CC: crose, dledford, jshortt, kasmith, linux-bugs, mnavrati, nanda_kishore_chinna, narendra_k, prabhakar_pujeri, sreekanth_reddy, xni, yizhan
Target Milestone: rc   
Target Release: 7.3   
Hardware: x86_64   
OS: Linux   
Whiteboard: dell_server dell_mustfix_7.3
Fixed In Version: mdadm-3.4-2.el7
Doc Type: Bug Fix
Doc Text:
Using *mdadm* to assign a hot spare to a degraded array while running I/O operations no longer fails

Previously, assigning a hot spare to a degraded array while running I/O operations on the MD array could fail, and the *mdadm* utility returned error messages such as:

  mdadm: /dev/md1 has failed so using --add cannot work and might destroy
  mdadm: data on /dev/sdd1. You should stop the array and re-assemble it

A patch has been applied to fix this bug, and adding a hot spare to a degraded array now completes as expected in the described situation.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-04 00:08:03 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1273351    
Bug Blocks: 1274397, 1304407, 1313485    

Description Nanda Kishore Chinnaram 2016-01-21 08:27:27 UTC
Description of problem:
A system has four drives (sda, sdb, sdc, sdd). A RAID1 array is created with mdadm using partitions sdb1 and sdc1, and I/O is started on the array. The array becomes degraded during I/O. When partition sdd1 is then added as a hot spare, mdadm fails with the error "/dev/md1 has failed so using --add cannot work and might destroy".

This issue is already fixed upstream. Fix details: https://github.com/neilbrown/mdadm/commit/d180d2aa2a1770af1ab8520d6362ba331400512f

Version-Release number of selected component (if applicable):
mdadm-3.3.2-7.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create an MD RAID1 array: "mdadm -C /dev/md1 --metadata=1.2 -l1 -n2 /dev/sdb1 /dev/sdc1".
2. Wait until the initial resync is completed.
3. Mount the MD array.
4. Run I/O on the MD array.
5. Degrade the array by pulling out the sdb drive.
6. Add sdd1 as a hot spare: "mdadm --manage /dev/md1 --add /dev/sdd1" (a consolidated sketch of these steps follows the list).
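
The steps above can be collapsed into the following sketch. The device names come from this report; the filesystem choice, the mount point (/mnt/md1), and the dd-based I/O load are assumptions for illustration, and the drive pull can be simulated with "mdadm --fail" if no physical access is available.

  # Create the RAID1 array and wait for the initial resync to finish
  mdadm -C /dev/md1 --metadata=1.2 -l1 -n2 /dev/sdb1 /dev/sdc1
  while grep -q resync /proc/mdstat; do sleep 5; done

  # Put a filesystem on the array, mount it, and start background I/O
  mkfs.xfs /dev/md1
  mkdir -p /mnt/md1 && mount /dev/md1 /mnt/md1
  dd if=/dev/zero of=/mnt/md1/io.dat bs=1M count=100000 &

  # Degrade the array: physically pull sdb, or simulate the failure with
  #   mdadm --manage /dev/md1 --fail /dev/sdb1
  # then attempt to add the hot spare
  mdadm --manage /dev/md1 --add /dev/sdd1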

Actual results:
The command fails with the following error:
"mdadm: /dev/md1 has failed so using --add cannot work and might destroy
 mdadm: data on /dev/sdd1. You should stop the array and re-assemble it"

Expected results:
The partition should be added as a hot spare successfully.
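
With the fix applied, the add succeeds and recovery onto the spare begins. The output below is an illustration of the expected behavior, not captured from the affected system:

  # mdadm --manage /dev/md1 --add /dev/sdd1
  mdadm: added /dev/sdd1
  # cat /proc/mdstat
  md1 : active raid1 sdd1[2] sdc1[1]
        ... recovery progress is shown here while the spare rebuilds ...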

Additional info:
Kernel Version: 3.10.0-327.el7.x86_64
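
To check whether a given system already carries the fix, the installed package version can be compared against the Fixed In Version above (mdadm-3.4-2.el7); these are standard query commands, not part of the original report:

  rpm -q mdadm    # expect mdadm-3.4-2.el7 or later
  uname -r        # running kernel, for comparison with the version above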

Comment 2 Jes Sorensen 2016-01-22 18:13:46 UTC
I plan to update to mdadm-3.3.4 for 7.3, which will include this fix.

Comment 3 Jes Sorensen 2016-06-09 15:21:58 UTC
This was resolved via bz#1273351, which updated mdadm to 3.4.

Comment 5 Nanda Kishore Chinnaram 2016-07-04 16:12:32 UTC
Hi Jes, 
Can you provide access to bz#1273351?

Comment 6 Jes Sorensen 2016-07-20 11:29:10 UTC
(In reply to Nanda Kishore Chinnaram from comment #5)
> Hi Jes, 
> Can you provide access to bz#1273351?

Nanda,

I cannot add you myself, but I have requested that you be given access to it.

Cheers,
Jes

Comment 7 Nanda Kishore Chinnaram 2016-08-09 22:14:26 UTC
Verified the issue with the RHEL 7.3 Alpha1 build. It is resolved.

Comment 8 Zhang Yi 2016-08-17 09:14:10 UTC
Passed the regression test with the packages in [1]; the patch from comment 1 is present in mdadm-3.4-10.el7.
Changing status to VERIFIED.

[1]
kernel-3.10.0-489.el7.x86_64.rpm 
mdadm-3.4-9.el7.x86_64.rpm 

Thanks
Yi

Comment 10 errata-xmlrpc 2016-11-04 00:08:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2182.html