Description of problem:
Reshaping a 3-disk RAID5 to a 4-disk RAID6 hangs, and restore from the critical section is impossible.

Version-Release number of selected component (if applicable):
mdadm-4.1-4.el7.x86_64
kernel-3.10.0-1127.el7.x86_64

How reproducible:
always

Steps to Reproduce:
truncate -s 1G disk1
truncate -s 1G disk2
truncate -s 1G disk3
truncate -s 1G disk4
DEVS=($(losetup --find --show disk1))
DEVS+=($(losetup --find --show disk2))
DEVS+=($(losetup --find --show disk3))
ADD=$(losetup --find --show disk4)
mdadm --create /dev/md0 --level=5 --raid-devices=3 "${DEVS[@]}"
mdadm --wait /dev/md0
mdadm /dev/md0 --add "$ADD"
mdadm --grow /dev/md0 --level=6 --raid-devices=4 --backup-file=mdadm.backup

Actual results:
The reshape hangs at the very beginning of the migration:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 loop3[4] loop2[3] loop1[1] loop0[0]
      2093056 blocks super 1.2 level 6, 512k chunk, algorithm 18 [4/3] [UUU_]
      [>....................]  reshape =  0.0% (1/1046528) finish=0.0min speed=261632K/sec

unused devices: <none>

Expected results:
A RAID6 array with the previously existing data.

Additional info:
mdadm --stop /dev/md0
mdadm --assemble /dev/md0 "${DEVS[@]}" $ADD --backup-file=mdadm.backup
mdadm: Failed to restore critical section for reshape, sorry.
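A note for anyone reproducing this: a stalled reshape can be told apart from a merely slow one by polling the md sysfs counters. A minimal sketch, assuming the array is /dev/md0:

# sync_completed and reshape_position advance while the reshape makes
# progress; in the hung state reported above they stay pinned at the start.
for i in 1 2 3; do
    cat /sys/block/md0/md/sync_completed /sys/block/md0/md/reshape_position
    sleep 10
done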
Test version of mdadm: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=27903664
mdadm-4.1-njc.el7_8.src.rpm seems to have helped:

Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 loop3[4] loop2[3] loop1[1] loop0[0]
      2093056 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>
Issue #1
In the how-to-reproduce steps you must specify a full path on --backup-file= (e.g. --backup-file=/home/hkario/mdadm.backup).

Issue #2 that I am encountering is:

Apr 29 13:38:38 localhost setroubleshoot[3993]: SELinux is preventing /usr/sbin/mdadm from 'read, write' accesses on the file mdadm.backup. For complete SELinux messages run: sealert -l dbf13fb1-be67-46e0-b49a-4abfa57856a0
Apr 29 13:38:38 localhost platform-python[3993]: SELinux is preventing /usr/sbin/mdadm from 'read, write' accesses on the file mdadm.backup.

***** Plugin catchall (100. confidence) suggests **************************

If you believe that mdadm should be allowed read write access on the mdadm.backup file by default.
Then you should report this as a bug.
You can generate a local policy module to allow this access.
Do allow this access for now by executing:
# ausearch -c 'mdadm' --raw | audit2allow -M my-mdadm
# semodule -X 300 -i my-mdadm.pp

I executed:

ausearch -c 'mdadm' --raw | audit2allow -M my-mdadm
semodule -i my-mdadm.pp

and then followed the reproduce steps. And the array synchronized.
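Before loading a local policy module it is worth confirming that SELinux really is the blocker. Both commands below are standard and match the tooling already used above; the file name assumes the backup file from the reproduce steps:

ausearch -m avc -ts recent -c 'mdadm'
ls -Z mdadm.backup    # inspect the SELinux label the backup file ended up with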
I see no code changes needed to kernel MD nor userspace mdadm.

Steps to Reproduce (except change the backup sub-dir to your own):
truncate -s 1G disk1
truncate -s 1G disk2
truncate -s 1G disk3
truncate -s 1G disk4
DEVS=($(losetup --find --show disk1))
DEVS+=($(losetup --find --show disk2))
DEVS+=($(losetup --find --show disk3))
ADD=$(losetup --find --show disk4)
ausearch -c 'mdadm' --raw | audit2allow -M my-mdadm
semodule -i my-mdadm.pp
mdadm --create /dev/md0 --level=5 --raid-devices=3 "${DEVS[@]}"
mdadm --wait /dev/md0
mdadm /dev/md0 --add "$ADD"
mdadm --grow /dev/md0 --level=6 --raid-devices=4 --backup-file=/home/ncroxon/mdadm.backup
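For repeated attempts, a teardown step between runs avoids stale loop devices and leftover backup files. This is not part of the reported steps, just a convenience sketch using the same shell variables:

mdadm --stop /dev/md0
for dev in "${DEVS[@]}" "$ADD"; do losetup -d "$dev"; done
rm -f disk1 disk2 disk3 disk4 /home/ncroxon/mdadm.backup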
Turning off SELinux appears to work as well:
# setenforce 0
(In reply to Nigel Croxon from comment #4)
> Issue #1
> In the how-to-reproduce steps you must specify a full path on
> --backup-file= (e.g. --backup-file=/home/hkario/mdadm.backup).

Then why did it work with the mdadm-4.1-njc.el7_8 package? Passing a full path name doesn't change the behaviour either; it still hangs on the first block.

> Issue #2 that I am encountering is:
> Apr 29 13:38:38 localhost setroubleshoot[3993]: SELinux is preventing
> /usr/sbin/mdadm from 'read, write' accesses on the file mdadm.backup. For
> complete SELinux messages run: sealert -l dbf13fb1-be67-46e0-b49a-4abfa57856a0
> Apr 29 13:38:38 localhost platform-python[3993]: SELinux is preventing
> /usr/sbin/mdadm from 'read, write' accesses on the file mdadm.backup.
> [...]

Then it looks to me like mdadm should verify read/write access to the file before the start of the reshape/grow.

Also, I don't see any AVC denials:

[root@ci-vm-10-0-138-143 ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 loop3[4] loop2[3] loop1[1] loop0[0]
      2093056 blocks super 1.2 level 6, 512k chunk, algorithm 18 [4/3] [UUU_]
      [>....................]  reshape =  0.0% (1/1046528) finish=3.0min speed=5508K/sec

unused devices: <none>
[root@ci-vm-10-0-138-143 ~]# ausearch -m avc -ts today
<no matches>

> I executed:
> ausearch -c 'mdadm' --raw | audit2allow -M my-mdadm
> semodule -i my-mdadm.pp
> and then followed the reproduce steps.
>
> And the array synchronized.

> Turning off SELinux appears to work as well:
> # setenforce 0

Exactly consistent with the lack of AVC denials, disabling SELinux doesn't change anything; the process still hangs on the first block with the mdadm-4.1-5.el7.x86_64 package.
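A manual approximation of such a pre-flight check from the shell might look like the sketch below. The path is only an example; note this tests ordinary file permissions only, so an SELinux denial against the mdadm domain would still slip past it:

BACKUP=/root/mdadm.backup   # example path, substitute your own
touch "$BACKUP" && [ -r "$BACKUP" ] && [ -w "$BACKUP" ] || {
    echo "cannot read/write $BACKUP" >&2
    exit 1
}
rm -f "$BACKUP"   # let mdadm create the file itself during --grow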
mdadm-4.1-njc.el7_8 is a test package that I made. It does not contain a proper fix.

If SELinux is not enabled, then mdadm should have a clear path to read/write the FS.

Download the latest mdadm and retry:
http://download-node-02.eng.bos.redhat.com/nightly/RHEL-8.3.0-20200430.n.0/compose/BaseOS/x86_64/os/Packages/mdadm-4.1-13.el8.x86_64.rpm
(In reply to Nigel Croxon from comment #8)
> mdadm-4.1-njc.el7_8 is a test package that I made. It does not
> contain a proper fix.
>
> If SELinux is not enabled, then mdadm should have a clear path to
> read/write the FS.
>
> Download the latest mdadm and retry:
> http://download-node-02.eng.bos.redhat.com/nightly/RHEL-8.3.0-20200430.n.0/compose/BaseOS/x86_64/os/Packages/mdadm-4.1-13.el8.x86_64.rpm

That package cannot be installed on this RHEL 7 machine:

[root@ci-vm-10-0-139-17 ~]# rpm -ivh mdadm-4.1-13.el8.x86_64.rpm
warning: mdadm-4.1-13.el8.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID fd431d51: NOKEY
error: Failed dependencies:
	libc.so.6(GLIBC_2.27)(64bit) is needed by mdadm-4.1-13.el8.x86_64
	libc.so.6(GLIBC_2.28)(64bit) is needed by mdadm-4.1-13.el8.x86_64
	libreport-filesystem is needed by mdadm-4.1-13.el8.x86_64
	dracut < 034-1 conflicts with mdadm-4.1-13.el8.x86_64
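The dependency failure is expected, since the nightly package targets el8. A quick sketch for checking a package's release tag before attempting the install:

rpm -qp --queryformat '%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n' \
    mdadm-4.1-13.el8.x86_64.rpm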
A RHEL 7.x version for testing:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=28318976
Yes, kernel-3.10.0-1136.el7.x86_64 with mdadm-4.1-njc2.el7.x86_64 works as expected:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 loop3[4] loop2[3] loop1[1] loop0[0]
      2093056 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>
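The "previously existing data" expectation from the original report can also be checked end to end by writing a known pattern before the grow and comparing it afterwards. A minimal sketch (sizes are arbitrary, and /tmp/pattern is just a scratch file):

dd if=/dev/urandom of=/tmp/pattern bs=1M count=16
dd if=/tmp/pattern of=/dev/md0 bs=1M oflag=direct      # before the --grow
# ... run the grow, then wait for the reshape to finish:
mdadm --wait /dev/md0
cmp /tmp/pattern <(dd if=/dev/md0 bs=1M count=16 iflag=direct 2>/dev/null) \
    && echo "data intact"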
https://www.spinics.net/lists/raid/msg64347.html
https://marc.info/?l=linux-raid&m=159195299630680&w=2

Verified the above patch fixes the hang and allows the grow to proceed.