Description of problem:
Customer reports "mdadm --grow" command goes into an infinite loop of resync.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
0. Create 4 partitions, say sdc1, sdc2, sdc7, sdc8, where sdc7 and sdc8 are
larger than sdc1 and sdc2 (customer uses 19GB vs. 53GB, I use 800MB vs. 1.5GB).
1. mdadm -Cv /dev/md7 -l1 -n2 /dev/sdc1 /dev/sdc2
2. mkfs.ext3 /dev/md7
3. mkdir /mnt/tmp
4. mount /dev/md7 /mnt/tmp
5. mdadm /dev/md7 -f /dev/sdc1 (fail the device)
6. mdadm /dev/md7 -r /dev/sdc1 (remove the device)
7. mdadm /dev/md7 -a /dev/sdc7 (mirror to the bigger device, wait for sync to complete)
8. mdadm /dev/md7 -f /dev/sdc2 (fail device)
9. mdadm /dev/md7 -r /dev/sdc2 (remove device)
10. mdadm /dev/md7 -a /dev/sdc8 (wait for sync)
11. mdadm --grow /dev/md7 -z size (say 200KB, make sure sdc7/8 has enough space)
Afterwards, /proc/mdstat shows that the resync hangs.
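
The steps above can be sketched as a small shell script. This is only an illustration, not something from the original report: the helper names (run, wait_for_sync, swap_member, resync_percent, reproduce) and the DRY_RUN switch are my own, the device names are the examples from the steps, and the grow size is left as a parameter because the report only says "size". By default it just prints the commands.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Commands are only printed unless DRY_RUN=0.
# Device names are the examples from the report; adjust before a real run.
MD=${MD:-/dev/md7}
SMALL_A=${SMALL_A:-/dev/sdc1}
SMALL_B=${SMALL_B:-/dev/sdc2}
LARGE_A=${LARGE_A:-/dev/sdc7}
LARGE_B=${LARGE_B:-/dev/sdc8}
MNT=${MNT:-/mnt/tmp}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Read /proc/mdstat-style text on stdin and print the first
# resync/recovery progress percentage found, if any.
resync_percent() {
    sed -nE 's/.*(resync|recovery) *= *([0-9.]+)%.*/\2/p' | head -n 1
}

# Poll /proc/mdstat until no resync or recovery is reported (any array).
wait_for_sync() {
    if [ "$DRY_RUN" = "1" ]; then
        return 0
    fi
    while grep -Eq 'resync|recovery' /proc/mdstat; do
        sleep 5
    done
}

# Steps 5-7 and 8-10: fail and remove the old member, add the larger one.
swap_member() {
    run mdadm "$MD" -f "$1"
    run mdadm "$MD" -r "$1"
    run mdadm "$MD" -a "$2"
    wait_for_sync
}

# Steps 1-11. The argument is the value passed to "mdadm --grow -z".
reproduce() {
    if [ -z "${1:-}" ]; then
        echo "usage: reproduce SIZE" >&2
        return 1
    fi
    run mdadm -Cv "$MD" -l1 -n2 "$SMALL_A" "$SMALL_B"
    run mkfs.ext3 "$MD"
    run mkdir -p "$MNT"
    run mount "$MD" "$MNT"
    swap_member "$SMALL_A" "$LARGE_A"
    swap_member "$SMALL_B" "$LARGE_B"
    run mdadm --grow "$MD" -z "$1"
}
```

Calling "reproduce SIZE" in the default dry-run mode prints the eleven commands in order. With DRY_RUN=0 (as root) the commands are executed and will destroy data on the listed partitions. After the grow step, sampling "resync_percent < /proc/mdstat" a few times and seeing a value that never advances would be one way to confirm the hang.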
Issue also discussed in:
Created attachment 117988
upstream patch that fixes the issue.
Acknowledgment goes to Tom Callahan (the customer) who brought up this issue
(and patch) to Red Hat.
The patch posted here is not what was finally accepted upstream. I'm making a
new patch that handles Stephen's questions and matches upstream. Once testing
is complete, I'll post for review.
I've completed my testing, and both the problem and another related problem
are now fixed. I'm submitting the revised patch internally for review/inclusion
in the next update release.
The one-line change to the fit variable was accepted upstream, so this patch now
very closely mirrors the final upstream version and has also been integrated
into the latest kernel builds.
Committed in stream U4 build 34.11. A test kernel with this patch is available
This issue is on Red Hat Engineering's list of planned work items
for the upcoming Red Hat Enterprise Linux 4.4 release. Engineering
resources have been assigned and barring unforeseen circumstances, Red
Hat intends to include this item in the 4.4 release.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.