Bug 2127096

Summary: [INTEL 8.7 BUG] [VROC] Reshape is started with not allowed chunk size
Product: Red Hat Enterprise Linux 8
Reporter: kinga.tanska
Component: mdadm
Assignee: XiaoNi <xni>
Status: CLOSED ERRATA
QA Contact: Fine Fan <ffan>
Severity: unspecified
Priority: high
Version: 8.7
CC: blazej.kucman, ffan, kinga.tanska, lukasz.florczak, mariusz.tkaczyk, mateusz.grzonka, mateusz.kusiak, ncroxon, pragyansri.pathi, xni
Target Milestone: rc
Keywords: Triaged
Target Release: 8.8
Hardware: x86_64
OS: Linux
Fixed In Version: mdadm-4.2-7.el8
Last Closed: 2023-05-16 09:09:23 UTC
Type: Bug

Description kinga.tanska 2022-09-15 10:45:27 UTC
Description of problem:
The new version of mdadm that was added to Red Hat Enterprise Linux 8.7 contains a defect. We are working on a resolution. This bug is being opened to record that the defect was found and that a fix is in progress. Once the fix is ready, mdadm will need to be updated to a version that includes it; I will report that in a comment.

Version-Release number of selected component (if applicable):
mdadm - v4.2 - 2021-12-30 - 5

How reproducible:
always

Steps to Reproduce:
1. Create container and RAID
# mdadm -CR /dev/md/imsm -e imsm -n2 /dev/nvme0n1 /dev/nvme1n1
# mdadm -CR  /dev/md/vol -l0 --chunk=16 -n2 /dev/nvme0n1 /dev/nvme1n1
2. Wait for resync if required.
3. Try to start a chunk size migration with a chunk size that is not allowed by the platform
# mdadm --grow /dev/md/vol --chunk=256

Actual results:
- mdadm returns the expected error output, but a reshape is nevertheless started and runs at 0K/sec speed

- mdadm output:
mdadm: freesize and superblock must be set for autolayout, aborting
mdadm: level of /dev/md/r0d2s16_A_0 changed to raid4
mdadm: platform does not support a chunk size of: 256
mdadm: IMSM RAID geometry validation failed.  Array r0d2s16_A activation is blocked.

Expected results:
- mdadm returns output containing "mdadm: platform does not support a chunk size of: 256"
- reshape does not start
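
For reference, a minimal way to check whether a reshape was actually started after step 3. This assumes the volume is the /dev/md/vol created in the steps above; the kernel device name (md126 in the verification output later in this bug) may differ on other systems:

# cat /proc/mdstat
# mdadm --detail /dev/md/vol

With the defect present, /proc/mdstat shows a reshape in progress stuck at 0K/sec; with the expected behaviour, no reshape line appears there and mdadm --detail shows no reshape status.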

Comment 1 Nigel Croxon 2022-09-22 13:25:37 UTC
I have a fix, but I need more testing time on different configurations.

As a workaround:

After the command "mdadm -CR /dev/md/vol -l0 --chunk=16 -n2 /dev/nvme0n1 /dev/nvme1n1" is executed,
there will be a new file in the filesystem under "/sys/block/md12x/md/sync_max",
where "md12x" must be replaced with the device name the kernel assigned. My example uses md126.

The workaround is to echo "max" into this new file.
echo max > /sys/block/md126/md/sync_max
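
If you prefer not to look up the device name by hand, a minimal sketch that resolves it automatically (assuming /dev/md/vol is the volume symlink from the reproduction steps; readlink resolves it to the kernel name such as /dev/md126):

# echo max > /sys/block/$(basename "$(readlink -f /dev/md/vol)")/md/sync_max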

-Nigel

Comment 2 Nigel Croxon 2022-10-14 15:03:42 UTC
A fix will be coming via mdadm.

I have reassigned this bz to my coworker Xiao.
He will be the one pulling in upstream fixes for mdadm.

-Nigel

Comment 3 kinga.tanska 2022-10-20 12:01:19 UTC
Hi,

I've sent the fix for upstream review (https://marc.info/?l=linux-raid&m=166626666516136&w=2 ).
I will post a comment here when the patch is accepted upstream.

Kinga

Comment 4 XiaoNi 2022-11-15 08:57:20 UTC
(In reply to kinga.tanska from comment #3)
> Hi,
> 
> I've sent fix to upstream review
> (https://marc.info/?l=linux-raid&m=166626666516136&w=2 ).
> I will post comment here if patch will be taken into upstream.
> 
> Kinga

Hi Kinga

I can't open the link in comment 3.

Is this the right patch to fix this bug:

[PATCH v2] super-intel: make freesize not required for chunk size migration

This patch hasn't been merged upstream yet, so there is a risk that this bug
can't be fixed in time. From the description, it should not affect
normal use, right? If so, can we push this to 8.9?

Thanks
Xiao

Comment 5 kinga.tanska 2022-11-15 11:18:03 UTC
Hi,

yes, the patch you mentioned fixes it, and it has not yet been merged.
It affects a few reshape scenarios, but not normal use, so it is OK not to have it in 8.7.
I hope it will be merged soon and pushed to 8.8 and 8.9.

Regards
Kinga

Comment 9 XiaoNi 2023-01-06 13:50:01 UTC
*** Bug 2141042 has been marked as a duplicate of this bug. ***

Comment 10 Fine Fan 2023-01-19 09:24:41 UTC
mdadm-4.2-7.el8 has passed the sanity test.

Comment 14 Fine Fan 2023-02-10 03:34:40 UTC
[root@smicro-s110p-01 ~]# rpm -q mdadm
mdadm-4.2-7.el8.x86_64


[root@smicro-s110p-01 ~]#  mdadm -CR /dev/md/imsm -e imsm -n2 /dev/nvme0n1 /dev/nvme1n1
mdadm: /dev/nvme0n1 appears to be part of a raid array:
       level=container devices=0 ctime=Wed Dec 31 19:00:00 1969
mdadm: /dev/nvme1n1 appears to be part of a raid array:
       level=container devices=0 ctime=Wed Dec 31 19:00:00 1969
mdadm: container /dev/md/imsm prepared.


[root@smicro-s110p-01 ~]# mdadm -CR  /dev/md/vol -l0 --chunk=16 -n2 /dev/nvme0n1 /dev/nvme1n1
mdadm: /dev/nvme0n1 appears to be part of a raid array:
       level=container devices=0 ctime=Wed Dec 31 19:00:00 1969
mdadm: /dev/nvme1n1 appears to be part of a raid array:
       level=container devices=0 ctime=Wed Dec 31 19:00:00 1969
mdadm: Creating array inside imsm container md127
mdadm: array /dev/md/vol started.

[root@smicro-s110p-01 ~]#  mdadm --grow /dev/md/vol --chunk=256
mdadm: platform does not support a chunk size of: 256

[root@smicro-s110p-01 ~]# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [raid0] 
md126 : active raid0 nvme1n1[1] nvme0n1[0]
      1953513472 blocks super external:/md127/0 16k chunks
      
md127 : inactive nvme1n1[1](S) nvme0n1[0](S)
      2210 blocks super external:imsm
       
unused devices: <none>
[root@smicro-s110p-01 ~]#

Comment 16 errata-xmlrpc 2023-05-16 09:09:23 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (mdadm bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2998