Bug 166541 - mdadm --grow infinite resync
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assigned To: Doug Ledford
QA Contact: Brian Brock
Depends On:
Blocks: 181409 185624
Reported: 2005-08-23 01:00 EDT by Wendy Cheng
Modified: 2007-11-30 17:07 EST

Fixed In Version: RHSA-2006-0575
Doc Type: Bug Fix
Last Closed: 2006-08-10 17:15:48 EDT

Attachments
upstream patch that fixes the issue. (454 bytes, patch)
2005-08-23 01:00 EDT, Wendy Cheng

Description Wendy Cheng 2005-08-23 01:00:03 EDT
Description of problem:

Customer reports "mdadm --grow" command goes into an infinite loop of resync.

Version-Release number of selected component (if applicable):
2.6.9-16.EL

Steps to Reproduce:
0. Create 4 partitions, say sdc1, sdc2, sdc7, and sdc8, where sdc7 and sdc8 are
larger than sdc1 and sdc2 (the customer uses 19 GB vs. 53 GB; I use 800 MB vs. 1.5 GB).
1. mdadm -Cv /dev/md7 -l1 -n2 /dev/sdc1 /dev/sdc2 
2. mkfs.ext3 /dev/md7
3. mkdir /mnt/tmp 
4. mount /dev/md7 /mnt/tmp
5. mdadm /dev/md7 -f /dev/sdc1 (fail the device)
6. mdadm /dev/md7 -r /dev/sdc1 (remove the device)
7. mdadm /dev/md7 -a /dev/sdc7 (mirror to the bigger device, wait for sync to
complete)
8. mdadm /dev/md7 -f /dev/sdc2 (fail device)
9. mdadm /dev/md7 -r /dev/sdc2 (remove device)
10. mdadm /dev/md7 -a /dev/sdc8 (wait for sync)
11. mdadm --grow /dev/md7 -z size (say 200KB, make sure sdc7/8 has enough space)
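
The steps above can be collected into a small script. This is only a sketch under
stated assumptions, not a tested reproducer: the variable names (MD, SMALL1, SMALL2,
BIG1, BIG2, MNT, SIZE_KB), the DRY_RUN switch, and the /proc/mdstat polling in
wait_for_sync are added here for illustration; the mdadm invocations are the ones
listed in the steps. By default it only prints the commands. Set DRY_RUN=0 to run
them, as root, on scratch partitions only.

```shell
#!/bin/sh
# Illustrative defaults taken from the steps above; override via environment.
MD=${MD:-/dev/md7}
SMALL1=${SMALL1:-/dev/sdc1}
SMALL2=${SMALL2:-/dev/sdc2}
BIG1=${BIG1:-/dev/sdc7}
BIG2=${BIG2:-/dev/sdc8}
MNT=${MNT:-/mnt/tmp}
SIZE_KB=${SIZE_KB:-200}      # "say 200KB" in step 11; adjust for your partitions
DRY_RUN=${DRY_RUN:-1}        # 1 = print commands only (assumed helper switch)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Assumption: a running rebuild shows up as "resync" or "recovery" in
# /proc/mdstat. This checks all arrays, not just $MD.
wait_for_sync() {
    [ "$DRY_RUN" = "1" ] && return 0
    while grep -Eq 'resync|recovery' /proc/mdstat; do
        sleep 5
    done
}

# Steps 5-7 and 8-10: fail and remove the old member, add the bigger one.
swap_member() {
    run mdadm "$MD" -f "$1"
    run mdadm "$MD" -r "$1"
    run mdadm "$MD" -a "$2"
    wait_for_sync
}

reproduce() {
    run mdadm -Cv "$MD" -l1 -n2 "$SMALL1" "$SMALL2"   # step 1
    run mkfs.ext3 "$MD"                               # step 2
    run mkdir "$MNT"                                  # step 3
    run mount "$MD" "$MNT"                            # step 4
    swap_member "$SMALL1" "$BIG1"                     # steps 5-7
    swap_member "$SMALL2" "$BIG2"                     # steps 8-10
    run mdadm --grow "$MD" -z "$SIZE_KB"              # step 11
}

reproduce
```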

/proc/mdstat then shows that the resync hangs.
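
One way to check for this from a shell is to sample the resync position in
/proc/mdstat twice and compare. The sketch below is an illustration only: the
function names resync_pos and check_resync are hypothetical, and it assumes the
usual progress-line layout, "resync = N% (done/total)". An unchanged position is
consistent with the hang described here but does not prove it.

```shell
#!/bin/sh
# Print the "done" counter of the resync/recovery line for one array.
# $1 = array name (e.g. md7); reads mdstat-formatted text on stdin.
# Prints nothing if that array has no resync or recovery in progress.
resync_pos() {
    awk -v dev="$1" '
        $1 == dev && $2 == ":" { in_dev = 1; next }   # start of our array
        in_dev && /^[^ \t]/    { in_dev = 0 }         # next section begins
        in_dev && /resync|recovery/ {
            if (match($0, /\([0-9]+\/[0-9]+\)/)) {
                s = substr($0, RSTART + 1, RLENGTH - 2)
                split(s, a, "/")
                print a[1]
                exit
            }
        }'
}

# Sample the position twice, $2 seconds apart (default 10), and report.
# Returns 1 if the position did not move between the two samples.
check_resync() {
    dev=$1
    interval=${2:-10}
    first=$(resync_pos "$dev" < /proc/mdstat)
    if [ -z "$first" ]; then
        echo "$dev: no resync in progress"
        return 0
    fi
    sleep "$interval"
    second=$(resync_pos "$dev" < /proc/mdstat)
    if [ -z "$second" ]; then
        echo "$dev: resync finished"
    elif [ "$first" = "$second" ]; then
        echo "$dev: resync position unchanged at $first"
        return 1
    else
        echo "$dev: resync advancing ($first -> $second)"
    fi
}
```

For example, after step 11, "check_resync md7 30" compares two samples taken 30
seconds apart.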

Additional info:
Issue also discussed in:

http://www.issociate.de/board/post/233625/RAID_5_Grow.html
Comment 1 Wendy Cheng 2005-08-23 01:00:03 EDT
Created attachment 117988
upstream patch that fixes the issue.
Comment 4 Wendy Cheng 2005-08-23 14:36:00 EDT
Acknowledgment goes to Tom Callahan (the customer), who brought this issue
(and the patch) to Red Hat.
Comment 8 Doug Ledford 2006-03-22 04:13:01 EST
The patch posted here is not what was finally accepted upstream.  I'm making a
new patch that handles Stephen's questions and matches upstream.  Once testing
is complete, I'll post for review.
Comment 9 Doug Ledford 2006-03-23 17:15:18 EST
I've completed my testing, and both the problem and another related problem
are now fixed.  I'm submitting the revised patch internally for review and
inclusion in the next update release.
Comment 10 Doug Ledford 2006-04-03 00:07:34 EDT
The one-line change to the fit variable was accepted upstream, so this patch now
very closely mirrors the final upstream version and has also been integrated
into the latest kernel builds.
Comment 11 Jason Baron 2006-04-03 13:50:06 EDT
Committed in stream U4 build 34.11. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/
Comment 13 Bob Johnson 2006-04-11 12:53:08 EDT
This issue is on Red Hat Engineering's list of planned work items 
for the upcoming Red Hat Enterprise Linux 4.4 release.  Engineering 
resources have been assigned and barring unforeseen circumstances, Red 
Hat intends to include this item in the 4.4 release.
Comment 17 Red Hat Bugzilla 2006-08-10 17:15:54 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html
