Bug 805648 - mdadm reshape raid1->raid5 1TB is slow (10 days)
Summary: mdadm reshape raid1->raid5 1TB is slow (10 days)
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 16
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-03-21 18:09 UTC by Jan Kratochvil
Modified: 2012-11-14 20:30 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-11-14 20:30:42 UTC
Type: ---



Description Jan Kratochvil 2012-03-21 18:09:09 UTC
Description of problem:
Reshaping is far too slow; it is described on many sites as a known bug.

Version-Release number of selected component (if applicable):
kernel-3.3.0-2.fc16.x86_64

How reproducible:
Tried on two devices.

Steps to Reproduce:

I want to do:
# mdadm -G /dev/md125 -z7143424 -l5 -n3 -a /dev/sdc3
mdadm: cannot change component size at the same time as other changes.
   Change size first, then check data is intact before making other changes.
The -z option is there to align the device sizes, allowing a larger chunk size.

That is not possible, so I do it in steps:
md125 : active raid1 sdb3[1] sda3[2]
      7166964 blocks super 1.2 [2/2] [UU]
# mdadm -G /dev/md125 -l5 -n3 -a /dev/sdc3
mdadm: level of /dev/md125 changed to raid5
mdadm: added /dev/sdc3
mdadm: Need to backup 128K of critical section..
but this way the chunk size is only 8 KB - could that be the problem?
md125 : active raid5 sdc3[3] sdb3[1] sda3[2]
      7166964 blocks super 1.2 level 5, 4k chunk, algorithm 2 [3/3] [UUU]
      [>....................]  reshape =  0.8% (63492/7166964) finish=5.6min speed=21121K/sec
later:
# mdadm -G /dev/md125 -z7143424 
mdadm: component size of /dev/md125 has been set to 7143424K
# mdadm -G /dev/md125 -c512
mdadm: /dev/md125: Cannot grow - need backup-file
# mdadm -G /dev/md125 -c512 --backup-file=/tmp/backup
later:
md125 : active raid5 sdc3[3] sdb3[1] sda3[2]
      14286848 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
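
For completeness, one way to confirm the new geometry once the reshape finishes is mdadm's detail output (just a sketch; the device name is the one from above and the grep pattern is only illustrative):
# mdadm --detail /dev/md125 | grep -E 'Raid Level|Array Size|Raid Devices|Chunk Size'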
  
This is bearable for an 8 GB partition, but for a 1 TB partition it takes ~10 days:
      968566648 blocks super 1.2 level 5, 8k chunk, algorithm 2 [3/3] [UUU]
      [>....................]  reshape =  3.9% (38187304/968566648) finish=15607.7min speed=993K/sec

# iostat -cdmx 60 2
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.04    5.93   13.61    0.00   80.42
Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda            1281.77   307.42  891.05   16.17     8.49     1.26    22.01     2.54    2.80    2.53   17.49   0.67  60.99
sdb            1278.83   309.45  892.45   14.13     8.48     1.26    22.01     2.50    2.76    2.51   18.80   0.70  63.73
sdc               0.00   531.47    0.10   93.08     0.00    14.69   322.82    10.03   96.92    9.00   97.01   1.76  16.38
(system is otherwise idle)

When the whole disk can be read in 1-3 hours (I do not remember exactly), there must be a way to reshape it in less than a day, shouldn't there?
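
For reference, a minimal sketch of the generic md speed knobs I would expect to matter here (the paths are the standard kernel ones; the values below are illustrative assumptions, not measured or recommended settings):
# cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
# sysctl -w dev.raid.speed_limit_min=50000
# sysctl -w dev.raid.speed_limit_max=200000
The limits are in KB/s; raising the floor on an otherwise idle box only removes throttling, it does not by itself fix an inefficient reshape.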

Comment 1 Jan Kratochvil 2012-03-22 07:49:49 UTC
It got faster today; it seems it is just slow in its initial phase.
md126 : active raid5 sdc4[3] sdb4[1] sda4[2]
      968566648 blocks super 1.2 level 5, 8k chunk, algorithm 2 [3/3] [UUU]
      [====>................]  reshape = 23.5% (227826304/968566648) finish=490.2min speed=25183K/sec

Comment 2 Jan Kratochvil 2012-03-24 17:10:44 UTC
This is still inefficient. The later command
  mdadm -G /dev/md126 -c512 --backup-file=/fs-on-sdc/backup
takes about 4 days, and this speed (or rather slowness) does not change the whole time:

md126 : active raid5 sdc4[3] sdb4[1] sda4[2]
      1937113088 blocks super 1.2 level 5, 8k chunk, algorithm 2 [3/3] [UUU]
      [================>....]  reshape = 83.8% (812577280/968556544) finish=578.1min speed=4496K/sec
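
A side note in case it helps others hitting the same slowness: for RAID5/6 arrays the stripe cache is another knob commonly reported to affect reshape throughput (sketch only; md126 is the array from above, and 8192 is an illustrative value, not a recommendation):
# cat /sys/block/md126/md/stripe_cache_size
# echo 8192 > /sys/block/md126/md/stripe_cache_size
Larger values cost RAM (roughly stripe_cache_size x 4 KB per member device), so this is a trade-off rather than a free speed-up.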

Comment 3 Dave Jones 2012-10-23 15:41:43 UTC
# Mass update to all open bugs.

Kernel 3.6.2-1.fc16 has just been pushed to updates.
This update is a significant rebase from the previous version.

Please retest with this kernel, and let us know if your problem has been fixed.

In the event that you have upgraded to a newer release and the bug you reported
is still present, please change the version field to the newest release you have
encountered the issue with.  Before doing so, please ensure you are testing the
latest kernel update in that release and attach any new and relevant information
you may have gathered.

If you are not the original bug reporter and you still experience this bug,
please file a new report, as it is possible that you may be seeing a
different problem. 
(Please don't clone this bug; a fresh bug referencing this bug in a comment is sufficient.)

Comment 4 Justin M. Forbes 2012-11-14 20:30:42 UTC
With no response, we are closing this bug under the assumption that it is no longer an issue. If you still experience this bug, please feel free to reopen the bug report.

