| Summary: | Devices added to degraded md RAID10 array with o2 layout do not become active | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Alexander Murashkin <alexandermurashkin> |
| Component: | kernel | Assignee: | Jes Sorensen <Jes.Sorensen> |
| Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 16 | CC: | gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 802691 (view as bug list) | Environment: | |
| Last Closed: | 2012-03-19 14:29:24 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | | | |
| Bug Blocks: | 802691 | | |
Spent a bunch of time looking through this one. However, after consulting Neil, the upstream mdadm maintainer, I received the following explanation:

"o2 places data thus:

A B C D
D A B C

where columns are devices. You've created an array with no place to store B. mdadm really shouldn't let you do that. That is the bug."

In other words, this is not a kernel bug; the real issue is that mdadm shouldn't allow you to create such a setup in the first place.

Cheers,
Jes

> You've created an array with no place to store B.
> mdadm really shouldn't let you do that. That is the bug.
I can see that for the example below:
mdadm --create /dev/md25 --raid-devices=4 --chunk=512 --level=raid10 \
    --layout=o2 --assume-clean /dev/sdc1 missing missing /dev/sdf1
But in real life I created the RAID10 array using 4 disks. After 2 disks were disconnected, for whatever reason I was not able to add them back.
So, as I understand it, RAID10 with the o2 layout does not survive the loss of any two adjacent disks (1+2, 2+3, 3+4, 4+1), whereas with the n2 layout only 2 combinations are not survivable (1+2, 3+4).
So o2 is less reliable than n2. In my opinion, this should be mentioned in the documentation.
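(For reference, a minimal sketch of the above reasoning; it is mine, not code from md or mdadm. It models a 4-device array with devices numbered 0-3, assumes near=2 mirrors consecutive device pairs and offset=2 mirrors the chunk in column c onto devices c and (c+1) mod 4 as in the diagram quoted above, and enumerates which two-disk losses destroy both copies of some chunk.)

from itertools import combinations

NDEV = 4  # raid-devices=4

def copies_near2(chunk):
    # near=2: the two copies of a chunk sit on a consecutive device pair,
    # so the mirror pairs are (0,1) and (2,3).
    first = 2 * (chunk % (NDEV // 2))
    return {first, first + 1}

def copies_offset2(chunk):
    # offset=2: the copy row is shifted by one device, so the chunk in
    # column c has copies on devices c and (c+1) % NDEV.
    c = chunk % NDEV
    return {c, (c + 1) % NDEV}

def fatal_pairs(copies):
    fatal = []
    for lost in combinations(range(NDEV), 2):
        # A pair of lost devices is fatal if some chunk has both copies on them.
        if any(copies(chunk) <= set(lost) for chunk in range(NDEV)):
            fatal.append(lost)
    return fatal

print("near=2 fatal pairs:  ", fatal_pairs(copies_near2))    # [(0, 1), (2, 3)]
print("offset=2 fatal pairs:", fatal_pairs(copies_offset2))  # [(0, 1), (0, 3), (1, 2), (2, 3)]

Under these placement assumptions the output matches the combinations listed above: two fatal pairs for near=2 versus all four adjacent pairs for offset=2 (the list above uses 1-based disk numbers).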
Description of problem:

If an md RAID10 array with the o2 layout is degraded, there is no way to recover it. Devices added to such an array become spare devices and are not used by the kernel for recovery. Right after the devices are added, the kernel prints the syslog message "md/raid10:mdNN: insufficient working devices for recovery".

Version-Release number of selected component (if applicable):
kernel-3.2.5-3.fc16.x86_64

How reproducible:

Steps to Reproduce:
1. Create 4 identical partitions, for example /dev/sd[cdef]1
2. mdadm --create /dev/md25 --raid-devices=4 --chunk=512 --level=raid10 --layout=o2 --assume-clean /dev/sdc1 missing missing /dev/sdf1
3. mdadm /dev/md25 --add /dev/sdd1
4. mdadm /dev/md25 --add /dev/sde1
5. mdadm --detail /dev/md25
   ....
       0       8       33        0      active sync   /dev/sdc1
       4       8       49        -      spare         /dev/sdd1
       5       8       65        -      spare         /dev/sde1
       3       8       81        3      active sync   /dev/sdf1

Actual results:
Added devices do not become active. Recovery is started but fails immediately with the syslog message "insufficient working devices for recovery".

Expected results:
Added devices become active after successful recovery.
       0       8       33        0      active sync        /dev/sdc1
       4       8       49        1      active sync        /dev/sdd1
       5       8       65        2      spare rebuilding   /dev/sde1
       3       8       81        3      active sync        /dev/sdf1

Additional info:
I checked that this problem does not happen for the default RAID10 layout (near=2). The md25 array is a test array. I also have a degraded production array with the o2 layout that cannot be recovered.

-------- using layout offset=2 ----------------------

Syslog messages are at the bottom.

# mdadm --create /dev/md25 --raid-devices=4 --chunk=512 --level=raid10 --layout=o2 --assume-clean /dev/sdc1 missing missing /dev/sdf1
# mdadm /dev/md25 --add /dev/sdd1
# mdadm /dev/md25 --add /dev/sde1
# mdadm --detail /dev/md25
/dev/md25:
        Version : 1.2
  Creation Time : Tue Feb 14 01:38:52 2012
     Raid Level : raid10
     Array Size : 1054720 (1030.17 MiB 1080.03 MB)
  Used Dev Size : 527360 (515.09 MiB 540.02 MB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Tue Feb 14 01:39:55 2012
          State : clean, degraded
 Active Devices : 2
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 2

         Layout : offset=2
     Chunk Size : 512K

           Name : glaive.castle.aimk.com:25  (local to host glaive.castle.aimk.com)
           UUID : 72e4ed21:5ba59fbc:a4402111:62aa08db
         Events : 21

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       0        0        1      removed
       2       0        0        2      removed
       3       8       81        3      active sync   /dev/sdf1

       4       8       49        -      spare   /dev/sdd1
       5       8       65        -      spare   /dev/sde1

-------- using default layout near=2 ----------------------

# mdadm --create /dev/md25 --raid-devices=4 --chunk=512 --level=raid10 --assume-clean /dev/sdc1 missing missing /dev/sdf1
# mdadm /dev/md25 --add /dev/sdd1
# mdadm /dev/md25 --add /dev/sde1
# mdadm --detail /dev/md25
/dev/md25:
        Version : 1.2
  Creation Time : Tue Feb 14 01:37:17 2012
     Raid Level : raid10
     Array Size : 1055744 (1031.17 MiB 1081.08 MB)
  Used Dev Size : 527872 (515.59 MiB 540.54 MB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Tue Feb 14 01:37:37 2012
          State : clean, degraded, recovering
 Active Devices : 2
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 2

         Layout : near=2
     Chunk Size : 512K

 Rebuild Status : 35% complete

           Name : glaive.castle.aimk.com:25  (local to host glaive.castle.aimk.com)
           UUID : ccaca5de:69982fad:d64d233b:436c5618
         Events : 11

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       4       8       49        1      spare rebuilding   /dev/sdd1
       2       0        0        2      removed
       3       8       81        3      active sync   /dev/sdf1

       5       8       65        -      spare   /dev/sde1

[root@glaive md]# mdadm --detail /dev/md25
/dev/md25:
        Version : 1.2
  Creation Time : Tue Feb 14 01:37:17 2012
     Raid Level : raid10
     Array Size : 1055744 (1031.17 MiB 1081.08 MB)
  Used Dev Size : 527872 (515.59 MiB 540.54 MB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Tue Feb 14 01:37:53 2012
          State : clean, degraded, recovering
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : near=2
     Chunk Size : 512K

 Rebuild Status : 73% complete

           Name : glaive.castle.aimk.com:25  (local to host glaive.castle.aimk.com)
           UUID : ccaca5de:69982fad:d64d233b:436c5618
         Events : 40

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       4       8       49        1      active sync   /dev/sdd1
       5       8       65        2      spare rebuilding   /dev/sde1
       3       8       81        3      active sync   /dev/sdf1

[root@glaive md]# mdadm --detail /dev/md25
/dev/md25:
        Version : 1.2
  Creation Time : Tue Feb 14 01:37:17 2012
     Raid Level : raid10
     Array Size : 1055744 (1031.17 MiB 1081.08 MB)
  Used Dev Size : 527872 (515.59 MiB 540.54 MB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Tue Feb 14 01:37:53 2012
          State : clean, degraded, recovering
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : near=2
     Chunk Size : 512K

           Name : glaive.castle.aimk.com:25  (local to host glaive.castle.aimk.com)
           UUID : ccaca5de:69982fad:d64d233b:436c5618
         Events : 40

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       4       8       49        1      active sync   /dev/sdd1
       5       8       65        2      active sync   /dev/sde1
       3       8       81        3      active sync   /dev/sdf1

--- syslog for offset=2 ------------------------------------------------

Feb 14 01:39:50 glaive kernel: [ 5378.754962] md: bind<sde1>
Feb 14 01:39:50 glaive kernel: [ 5378.782013] RAID10 conf printout:
Feb 14 01:39:50 glaive kernel: [ 5378.782015] --- wd:2 rd:4
Feb 14 01:39:50 glaive kernel: [ 5378.782017] disk 0, wo:0, o:1, dev:sdc1
Feb 14 01:39:50 glaive kernel: [ 5378.782018] disk 1, wo:1, o:1, dev:sde1
Feb 14 01:39:50 glaive kernel: [ 5378.782020] disk 3, wo:0, o:1, dev:sdf1
Feb 14 01:39:50 glaive kernel: [ 5378.782026] RAID10 conf printout:
Feb 14 01:39:50 glaive kernel: [ 5378.782027] --- wd:2 rd:4
Feb 14 01:39:50 glaive kernel: [ 5378.782029] disk 0, wo:0, o:1, dev:sdc1
Feb 14 01:39:50 glaive kernel: [ 5378.782030] disk 1, wo:1, o:1, dev:sde1
Feb 14 01:39:50 glaive kernel: [ 5378.782032] disk 2, wo:1, o:1, dev:sdd1
Feb 14 01:39:50 glaive kernel: [ 5378.782033] disk 3, wo:0, o:1, dev:sdf1
Feb 14 01:39:50 glaive kernel: [ 5378.786470] md: recovery of RAID array md25
Feb 14 01:39:50 glaive kernel: [ 5378.786472] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Feb 14 01:39:50 glaive kernel: [ 5378.786474] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Feb 14 01:39:50 glaive kernel: [ 5378.786477] md: using 128k window, over a total of 527360k.
Feb 14 01:39:50 glaive kernel: [ 5378.786560] md/raid10:md25: insufficient working devices for recovery.
Feb 14 01:39:50 glaive kernel: [ 5378.786573] md: md25: recovery done.
Feb 14 01:39:50 glaive kernel: [ 5378.869515] RAID10 conf printout:
Feb 14 01:39:50 glaive kernel: [ 5378.869518] --- wd:2 rd:4
Feb 14 01:39:50 glaive kernel: [ 5378.869521] disk 0, wo:0, o:1, dev:sdc1
Feb 14 01:39:50 glaive kernel: [ 5378.869523] disk 1, wo:1, o:1, dev:sde1
Feb 14 01:39:50 glaive kernel: [ 5378.869526] disk 2, wo:1, o:1, dev:sdd1
Feb 14 01:39:50 glaive kernel: [ 5378.869528] disk 3, wo:0, o:1, dev:sdf1
Feb 14 01:39:50 glaive kernel: [ 5378.869570] RAID10 conf printout:
Feb 14 01:39:50 glaive kernel: [ 5378.869573] --- wd:2 rd:4
Feb 14 01:39:50 glaive kernel: [ 5378.869575] disk 0, wo:0, o:1, dev:sdc1
Feb 14 01:39:50 glaive kernel: [ 5378.869578] disk 2, wo:1, o:1, dev:sdd1
Feb 14 01:39:50 glaive kernel: [ 5378.869580] disk 3, wo:0, o:1, dev:sdf1
Feb 14 01:39:50 glaive kernel: [ 5378.869585] RAID10 conf printout:
Feb 14 01:39:50 glaive kernel: [ 5378.869586] --- wd:2 rd:4
Feb 14 01:39:50 glaive kernel: [ 5378.869588] disk 0, wo:0, o:1, dev:sdc1
Feb 14 01:39:50 glaive kernel: [ 5378.869590] disk 2, wo:1, o:1, dev:sdd1
Feb 14 01:39:50 glaive kernel: [ 5378.869593] disk 3, wo:0, o:1, dev:sdf1
Feb 14 01:39:50 glaive kernel: [ 5378.869594] RAID10 conf printout:
Feb 14 01:39:50 glaive kernel: [ 5378.869596] --- wd:2 rd:4
Feb 14 01:39:50 glaive kernel: [ 5378.869598] disk 0, wo:0, o:1, dev:sdc1
Feb 14 01:39:50 glaive kernel: [ 5378.869605] disk 2, wo:1, o:1, dev:sdd1
Feb 14 01:39:50 glaive kernel: [ 5378.869606] disk 3, wo:0, o:1, dev:sdf1
Feb 14 01:39:50 glaive kernel: [ 5378.869608] RAID10 conf printout:
Feb 14 01:39:50 glaive kernel: [ 5378.869609] --- wd:2 rd:4
Feb 14 01:39:50 glaive kernel: [ 5378.869610] disk 0, wo:0, o:1, dev:sdc1
Feb 14 01:39:50 glaive kernel: [ 5378.869611] disk 2, wo:1, o:1, dev:sdd1
Feb 14 01:39:50 glaive kernel: [ 5378.869613] disk 3, wo:0, o:1, dev:sdf1
Feb 14 01:39:50 glaive kernel: [ 5378.869639] md: recovery of RAID array md25
Feb 14 01:39:50 glaive kernel: [ 5378.869641] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Feb 14 01:39:50 glaive kernel: [ 5378.869642] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Feb 14 01:39:50 glaive kernel: [ 5378.869645] md: using 128k window, over a total of 527360k.
Feb 14 01:39:50 glaive kernel: [ 5378.869796] md/raid10:md25: insufficient working devices for recovery.
Feb 14 01:39:50 glaive kernel: [ 5378.869809] md: md25: recovery done.
Feb 14 01:39:51 glaive kernel: [ 5378.907833] RAID10 conf printout:
Feb 14 01:39:51 glaive kernel: [ 5378.907836] --- wd:2 rd:4
Feb 14 01:39:51 glaive kernel: [ 5378.907839] disk 0, wo:0, o:1, dev:sdc1
Feb 14 01:39:51 glaive kernel: [ 5378.907841] disk 2, wo:1, o:1, dev:sdd1
Feb 14 01:39:51 glaive kernel: [ 5378.907843] disk 3, wo:0, o:1, dev:sdf1
Feb 14 01:39:51 glaive kernel: [ 5378.911009] RAID10 conf printout:
Feb 14 01:39:51 glaive kernel: [ 5378.911012] --- wd:2 rd:4
Feb 14 01:39:51 glaive kernel: [ 5378.911014] disk 0, wo:0, o:1, dev:sdc1
Feb 14 01:39:51 glaive kernel: [ 5378.911016] disk 3, wo:0, o:1, dev:sdf1
Feb 14 01:39:51 glaive kernel: [ 5378.911022] RAID10 conf printout:
Feb 14 01:39:51 glaive kernel: [ 5378.911023] --- wd:2 rd:4
Feb 14 01:39:51 glaive kernel: [ 5378.911025] disk 0, wo:0, o:1, dev:sdc1
Feb 14 01:39:51 glaive kernel: [ 5378.911027] disk 3, wo:0, o:1, dev:sdf1
Feb 14 01:39:51 glaive kernel: [ 5378.911029] RAID10 conf printout:
Feb 14 01:39:51 glaive kernel: [ 5378.911030] --- wd:2 rd:4
Feb 14 01:39:51 glaive kernel: [ 5378.911032] disk 0, wo:0, o:1, dev:sdc1
Feb 14 01:39:51 glaive kernel: [ 5378.911035] disk 3, wo:0, o:1, dev:sdf1
Feb 14 01:39:51 glaive md: RebuildFinished /dev/md25 [ clean, degraded ]
Feb 14 01:39:51 glaive kernel: [ 5378.960666] RAID10 conf printout:
Feb 14 01:39:51 glaive kernel: [ 5378.960668] --- wd:2 rd:4
Feb 14 01:39:51 glaive kernel: [ 5378.960670] disk 0, wo:0, o:1, dev:sdc1
Feb 14 01:39:51 glaive kernel: [ 5378.960672] disk 3, wo:0, o:1, dev:sdf1
Feb 14 01:39:51 glaive kernel: [ 5378.960673] RAID10 conf printout:
Feb 14 01:39:51 glaive kernel: [ 5378.960674] --- wd:2 rd:4
Feb 14 01:39:51 glaive kernel: [ 5378.960676] disk 0, wo:0, o:1, dev:sdc1
Feb 14 01:39:51 glaive kernel: [ 5378.960677] disk 3, wo:0, o:1, dev:sdf1
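(To relate the syslog above to the layout: a minimal sketch, under the same assumed offset=2 placement as in the earlier sketch, of the reproduced situation where raid roles 1 and 2 were created as "missing" and only /dev/sdc1 and /dev/sdf1 remain. It is a model of the data placement, not the kernel's recovery code.)

# Degraded offset=2 array from the reproduction steps:
# roles 0 (/dev/sdc1) and 3 (/dev/sdf1) exist, roles 1 and 2 are missing.
NDEV = 4
surviving = {0, 3}

for column in range(NDEV):
    copies = {column, (column + 1) % NDEV}   # assumed offset=2 mirroring
    alive = copies & surviving
    if alive:
        print(f"column {column}: a copy still exists on role(s) {sorted(alive)}")
    else:
        print(f"column {column}: both copies were on the missing roles -> nothing to rebuild from")

In this model, column 1 has both of its copies on the missing roles, so a newly added spare has no surviving device to copy from. That is consistent with the "md/raid10:md25: insufficient working devices for recovery" messages above and with the added devices remaining in the spare state.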