Bug 817535

Summary: MD raid devices inactive after reboot done during reshape
Product: [Fedora] Fedora
Component: mdadm
Version: 16
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Jes Sorensen <Jes.Sorensen>
Assignee: Jes Sorensen <Jes.Sorensen>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: agk, dledford, Jes.Sorensen, lukasz.dorau, mbroz
Doc Type: Bug Fix
Type: Bug
Clone Of: 817522
Bug Depends On: 817522
Last Closed: 2012-06-07 02:52:58 UTC

Description Jes Sorensen 2012-04-30 12:24:34 UTC
+++ This bug was initially created as a clone of Bug #817522 +++

Description of problem:
After a reboot performed during a reshape, the MD RAID devices are inactive
and the volume cannot be assembled.

Version-Release number of selected component (if applicable):
004-53.el6

How reproducible:
Always

Steps to Reproduce:
1. Issue the following commands:
export MDADM_EXPERIMENTAL=1
mdadm -Ss
mdadm --zero-superblock /dev/sd[a-d]
mdadm -C /dev/md/imsm0 -amd -e imsm -n 2 /dev/sda /dev/sdb -R
mdadm -C /dev/md/r1d2n3s0-5 -amd -l1  --size 5G -n 2 /dev/sda /dev/sdb -R -f
# wait until the resync finishes
mdadm /dev/md/imsm0 --add /dev/sdc
mdadm /dev/md/imsm0 --add /dev/sdd
rm -f /tmp/backup
mdadm -G /dev/md/r1d2n3s0-5 -l0 --backup-file=/tmp/backup  --force
sleep 1
rm -f /tmp/backup
mdadm -G /dev/md/imsm0 -n 4 --backup-file=/tmp/backup --force
2. Wait until the reshape progress reaches about 3%
3. Reboot the system
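
For step 2, a minimal sketch of one way to read the reshape progress from /proc/mdstat instead of watching it by hand. The helper name reshape_percent and the polling loop are assumptions for illustration and were not part of the original reproduction; the pattern relies on the "reshape = N%" text that mdstat prints while a reshape is running.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): print the first
# reshape percentage found in mdstat-formatted text read from stdin.
# Prints nothing when no reshape line is present.
reshape_percent() {
    sed -n 's/.*reshape *= *\([0-9.]*\)%.*/\1/p' | head -n 1
}

# Possible use on the test machine (assumption: polling once per second is
# acceptable; the loop never ends if no reshape is running):
#
#   until awk -v p="$(reshape_percent < /proc/mdstat)" \
#             'BEGIN { exit !(p >= 3) }'; do
#       sleep 1
#   done
#   reboot
```

The helper reads stdin rather than /proc/mdstat directly, so it can also be tried against a saved copy of the mdstat output.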

Actual results:
After the reboot, the reshape process is stopped. The MD RAID devices are
inactive and the volume cannot be assembled.

Expected results:
After the reboot, the reshape continues and finishes. The volume can be assembled.

Additional info:
1. Kernel command-line:
kernel /boot/vmlinuz-2.6.32-131.0.15.el6.x86_64 ro
root=UUID=705f79ed-f2ca-4138-9de5-d33c763122ce rd_NO_LUKS rd_NO_LVM rd_NO_DM
LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=pl2
crashkernel=auto rdinfo rdinitdebug

2. /etc/fstab:
UUID=705f79ed-f2ca-4138-9de5-d33c763122ce  /          ext3    defaults        1 1
UUID=b9976a0b-8354-4c21-becc-2e1c753f11b3  swap       swap    defaults        0 0
UUID=c86463f2-90ae-42ea-9fa5-739ec9dff4aa  /mnt/sdf2  ext3    defaults        0 0
tmpfs      /dev/shm tmpfs   defaults        0 0
devpts      /dev/pts devpts  gid=5,mode=620  0 0
sysfs      /sys  sysfs   defaults        0 0
proc      /proc  proc    defaults        0 0

3. Device listing:
$ dmsetup ls --tree
No devices found

4. List of block device attributes:
/dev/sda: VERSION="1.2.01" TYPE="isw_raid_member" USAGE="raid" 
ID_FS_VERSION=1.2.01
ID_FS_TYPE=isw_raid_member
ID_FS_USAGE=raid
/dev/sdb: VERSION="1.2.01" TYPE="isw_raid_member" USAGE="raid" 
ID_FS_VERSION=1.2.01
ID_FS_TYPE=isw_raid_member
ID_FS_USAGE=raid
/dev/sdc: VERSION="1.2.01" TYPE="isw_raid_member" USAGE="raid" 
ID_FS_VERSION=1.2.01
ID_FS_TYPE=isw_raid_member
ID_FS_USAGE=raid
/dev/sdd: VERSION="1.2.01" TYPE="isw_raid_member" USAGE="raid" 
ID_FS_VERSION=1.2.01
ID_FS_TYPE=isw_raid_member
ID_FS_USAGE=raid

5. Output of `dmesg | grep dracut` command - in attachment ("dmesg-dracut"
file)

6. Output of `dmesg` command - in attachment ("dmesg-all" file)

7. /etc/dracut.conf file - in attachment

8. /proc/mdstat (reshape does not continue)
Personalities : [raid6] [raid5] [raid4] 
md124 : active raid4 sdb[3] sda[2] sdc[1] sdd[0]
      5242880 blocks super external:/md127/0 level 4, 64k chunk, algorithm 0 [5/4] [UUUU_]
      [=>...................]  reshape =  7.5% (397312/5242880) finish=127.7min speed=631K/sec

md127 : inactive sdc[3](S) sdd[2](S) sdb[1](S) sda[0](S)
      12612 blocks super external:imsm

unused devices: <none>

9. mdadm version:  v3.2.2 - 17th June 2011
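
The inactive container shown in the /proc/mdstat output of item 8 can also be picked out with a script. The following is only a sketch, assuming the usual "name : state ..." layout of mdstat array lines; the function name inactive_arrays is hypothetical and was not used in the original report.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): print the names of
# arrays that mdstat-formatted text on stdin reports as inactive.
# Array lines look like "md127 : inactive sdc[3](S) ...".
inactive_arrays() {
    awk '$2 == ":" && $3 == "inactive" { print $1 }'
}

# Possible use after the reboot:
#   inactive_arrays < /proc/mdstat
```

Run against the mdstat output in item 8, this would list md127 and skip md124, which is marked active even though its reshape is not progressing.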


Additional info:

Adam Kwolek has just sent a series of 12 patches to linux-raid. These patches
fix this issue:

FIX: Do not try to (continue) reshape using inactive array
FIX: restart reshape when reshape process is stopped just between 2 reshapes
imsm: FIX: Clear migration record when migration switches to next volume.
imsm: FIX: use md position to reshape restart
FIX: use md position to reshape restart
imsm: FIX: Chunk size migration problem
Flush mdmon before next reshape step during container operation
Fix: Sometimes mdmon throws core dump during reshape
imsm: FIX: imsm_get_allowed_degradation() doesn't count degradation for raid1
FIX: Array is not run when expansion disks are added
imsm: FIX: No new missing disks are allowed during general migration
FIX: NULL pointer to strdup() can be passed

Comment 1 Fedora Update System 2012-05-02 10:16:16 UTC
mdadm-3.2.3-9.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/mdadm-3.2.3-9.fc16

Comment 2 Fedora Update System 2012-05-03 07:27:57 UTC
Package mdadm-3.2.3-9.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing mdadm-3.2.3-9.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-7145/mdadm-3.2.3-9.fc16
then log in and leave karma (feedback).

Comment 3 Fedora Update System 2012-05-10 14:52:14 UTC
mdadm-3.2.4-2.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/mdadm-3.2.4-2.fc16

Comment 4 Fedora Update System 2012-05-15 16:43:24 UTC
mdadm-3.2.4-3.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/mdadm-3.2.4-3.fc16

Comment 5 Fedora Update System 2012-05-21 15:15:45 UTC
mdadm-3.2.5-1.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/mdadm-3.2.5-1.fc16

Comment 6 Fedora Update System 2012-06-07 02:52:58 UTC
mdadm-3.2.5-1.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.