Bug 166541 - mdadm --grow infinite resync
Summary: mdadm --grow infinite resync
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Doug Ledford
QA Contact: Brian Brock
Depends On:
Blocks: 181409 185624
Reported: 2005-08-23 05:00 UTC by Wendy Cheng
Modified: 2007-11-30 22:07 UTC (History)

Clone Of:
Last Closed: 2006-08-10 21:15:48 UTC

Attachments
upstream patch that fixes the issue. (454 bytes, patch)
2005-08-23 05:00 UTC, Wendy Cheng

External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2006:0575
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 4
Last Updated: 2006-08-10 04:00:00 UTC

Description Wendy Cheng 2005-08-23 05:00:03 UTC
Description of problem:

Customer reports "mdadm --grow" command goes into an infinite loop of resync.

Version-Release number of selected component (if applicable):

Steps to Reproduce:
0. Create four partitions, say sdc1, sdc2, sdc7, and sdc8, where sdc7 and sdc8 are
larger than sdc1 and sdc2 (the customer uses 53 GB vs. 19 GB; I use 1.5 GB vs. 800 MB).
1. mdadm -Cv /dev/md7 -l1 -n2 /dev/sdc1 /dev/sdc2 
2. mkfs.ext3 /dev/md7
3. mkdir /mnt/tmp 
4. mount /dev/md7 /mnt/tmp
5. mdadm /dev/md7 -f /dev/sdc1 (fail the device)
6. mdadm /dev/md7 -r /dev/sdc1 (remove the device)
7. mdadm /dev/md7 -a /dev/sdc7 (mirror to the bigger device, wait for sync)
8. mdadm /dev/md7 -f /dev/sdc2 (fail device)
9. mdadm /dev/md7 -r /dev/sdc2 (remove device)
10. mdadm /dev/md7 -a /dev/sdc8 (wait for sync)
11. mdadm --grow /dev/md7 -z size (say 200KB; make sure sdc7 and sdc8 have enough space)

Afterwards, /proc/mdstat shows that the resync hangs.
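
The reproduction steps could be scripted roughly as follows. This is a sketch and was not part of the original report. The helper names md_sync_active, wait_for_sync, and reproduce are hypothetical, as are the 5-second pauses and the 600-second timeout. The parsing assumes the usual /proc/mdstat layout of one stanza per array, terminated by a blank line. Device names are the ones used in the steps, and the value for -z must be supplied by the caller. reproduce destroys data on the listed partitions, so the script only defines it and never calls it.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; device names follow the report.

# Succeeds if DEV's stanza in an mdstat-format file shows a resync or recovery.
# Usage: md_sync_active DEV [FILE]   (FILE defaults to /proc/mdstat)
md_sync_active() {
    awk -v dev="$1" '
        $2 == ":" { in_dev = ($1 == dev); next }  # stanza header, e.g. "md7 : active raid1 ..."
        NF == 0   { in_dev = 0 }                  # a blank line ends the stanza
        in_dev && /resync|recovery/ { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mdstat}"
}

# Polls until DEV is idle; fails if it is still syncing after TIMEOUT seconds.
# Usage: wait_for_sync DEV [TIMEOUT] [FILE]
wait_for_sync() {
    dev=$1
    timeout=${2:-600}
    file=${3:-/proc/mdstat}
    waited=0
    while md_sync_active "$dev" "$file"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 5
        waited=$((waited + 5))
    done
}

# Steps 1-11 from the report. DESTRUCTIVE: run as root on scratch partitions only.
# Usage: reproduce SIZE   (SIZE is passed to "mdadm --grow -z")
reproduce() {
    size=$1
    mdadm -Cv /dev/md7 -l1 -n2 /dev/sdc1 /dev/sdc2
    mkfs.ext3 /dev/md7
    mkdir -p /mnt/tmp
    mount /dev/md7 /mnt/tmp
    mdadm /dev/md7 -f /dev/sdc1        # fail the first small device
    mdadm /dev/md7 -r /dev/sdc1        # remove it
    mdadm /dev/md7 -a /dev/sdc7        # mirror to the bigger device
    sleep 5                            # give the recovery a moment to start
    wait_for_sync md7 || return 1
    mdadm /dev/md7 -f /dev/sdc2
    mdadm /dev/md7 -r /dev/sdc2
    mdadm /dev/md7 -a /dev/sdc8
    sleep 5
    wait_for_sync md7 || return 1
    mdadm --grow /dev/md7 -z "$size"
    sleep 5
    wait_for_sync md7 600 ||
        echo "resync on md7 still running after 600s; possibly hung" >&2
}
```

On an affected kernel the final wait_for_sync call would be expected to time out, since the resync triggered by the grow never finishes.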

Additional info:
Issue also discussed in:

http://www.issociate.de/board/post/233625/RAID_5_Grow.html

Comment 1 Wendy Cheng 2005-08-23 05:00:03 UTC
Created attachment 117988 [details]
upstream patch that fixes the issue.

Comment 4 Wendy Cheng 2005-08-23 18:36:00 UTC
Acknowledgment goes to Tom Callahan (the customer), who brought this issue
(and the patch) to Red Hat.

Comment 8 Doug Ledford 2006-03-22 09:13:01 UTC
The patch posted here is not what was finally accepted upstream.  I'm making a
new patch that handles Stephen's questions and matches upstream.  Once testing
is complete, I'll post for review.

Comment 9 Doug Ledford 2006-03-23 22:15:18 UTC
I've completed my testing, and the problem, along with another related problem,
is now fixed.  I'm submitting the revised patch internally for review/inclusion
in the next update release.

Comment 10 Doug Ledford 2006-04-03 04:07:34 UTC
The one-line change to the fit variable was accepted upstream, so this patch now
very closely mirrors the final upstream version and has also been integrated into
the latest kernel builds.

Comment 11 Jason Baron 2006-04-03 17:50:06 UTC
Committed in stream U4 build 34.11. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/

Comment 13 Bob Johnson 2006-04-11 16:53:08 UTC
This issue is on Red Hat Engineering's list of planned work items 
for the upcoming Red Hat Enterprise Linux 4.4 release.  Engineering 
resources have been assigned and barring unforeseen circumstances, Red 
Hat intends to include this item in the 4.4 release.

Comment 17 Red Hat Bugzilla 2006-08-10 21:15:54 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

