Bug 1966712
| | | | |
|---|---|---|---|
| Summary: | mdadm erroneously reports incorrect checksum | | |
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Chris Moore <christopherm> |
| Component: | mdadm | Assignee: | XiaoNi <xni> |
| Status: | CLOSED ERRATA | QA Contact: | Fine Fan <ffan> |
| Severity: | high | Priority: | high |
| Version: | CentOS Stream | Keywords: | Triaged |
| Target Milestone: | beta | Target Release: | 8.5 |
| Hardware: | All | OS: | Linux |
| Fixed In Version: | mdadm-4.2-rc1_3.el8 | Type: | Bug |
| Last Closed: | 2021-11-09 20:02:50 UTC | | |
| CC: | alex.iribarren, bstinson, carl, daniel.vanderster, davide, dledford, ffan, jamien, janguyen, jdonohue, jwboyer, mharri, ncroxon, ngompa13, pcahyna, rmeggins, xni, yizhan | | |
Description
Chris Moore
2021-06-01 17:55:32 UTC
The patch is at https://marc.info/?l=linux-raid&m=162259662926315&w=2. It needs to wait for an ack from the maintainer.

---

I think I've found the issue. sb->bitmap_offset can be negative (in my case it's -16), but struct mdp_superblock_1 defines it as __u32, so it has to be cast to an int32_t before doing the math on it. mdadm version 4.1 had the cast, but it disappeared in version 4.2. The following change seems to fix the problem:

```
$ git diff super1.c
diff --git a/super1.c b/super1.c
index c05e6237..f7981e3d 100644
--- a/super1.c
+++ b/super1.c
@@ -2631,7 +2631,7 @@ static int locate_bitmap1(struct supertype *st, int fd, int node_num)
 	else
 		ret = -1;
-	offset = __le64_to_cpu(sb->super_offset) + __le32_to_cpu(sb->bitmap_offset);
+	offset = __le64_to_cpu(sb->super_offset) + (int32_t) __le32_to_cpu(sb->bitmap_offset);
 	if (node_num) {
 		bms = (bitmap_super_t*)(((char*)sb)+MAX_SB_SIZE);
 		bm_sectors_per_node = calc_bitmap_size(bms, 4096) >> 9;
```
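A minimal sketch of the arithmetic behind that cast, using shell arithmetic and illustrative values (the -16 matches the bitmap offset mentioned above; the super offset matches the reproducer output further down): read back as an unsigned 32-bit value, -16 becomes 0xfffffff0 and the computed bitmap location lands far past the end of the device instead of 16 sectors before the superblock.

```
# -16 stored in a __u32 field reads back as 0xfffffff0 (4294967280)
$ printf '0x%x\n' $(( 2**32 - 16 ))
0xfffffff0
# buggy 4.2 math: the u32 is zero-extended before adding to the 64-bit super offset
$ echo $(( 524272 + 4294967280 ))
4295491552
# fixed math: sign-extend via int32_t first, i.e. effectively add -16
$ echo $(( 524272 - 16 ))
524256
```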
---

Hi Chris

The patch is right. I've sent the link at comment 1, and I've made comment 1 not private. Sorry for this.

By the way, are you only testing with super 1.0, or do you use super 1.0 in production? If it's for production, why not use super 1.2? The reason I ask this question is that I want to know more responses from different people :)

Regards
Xiao

---

(In reply to XiaoNi from comment #3)
> By the way, are you only testing with super 1.0, or do you use super 1.0 in production? If it's for production, why not use super 1.2?

That's an interesting question. This RAID was set up by the RHEL Anaconda installer. On the installer GUI we create a 512 MiB /boot/efi partition, and the remainder of the disk is mounted to /. Both are created as RAID 1 by selecting "RAID" as the type in the GUI. I don't know of any place where we can select the superblock format. But it's odd that we get 1.0, since I think 1.2 is the default for mdadm --create.

---

(In reply to Chris Moore from comment #4)
> I don't know of any place where we can select the superblock format. But it's odd that we get 1.0, since I think 1.2 is the default for mdadm --create.

The installer automatically creates a RAID1 /boot partition as version 1.0 so that it can be read by a non-RAID-aware boot loader. That used to be a requirement before grub2 was the norm.

---

(In reply to Doug Ledford from comment #5)
> The installer automatically creates a RAID1 /boot partition as version 1.0 so that it can be read by a non-RAID-aware boot loader.

Sorry, misread your prior statement. We create /boot/efi as a 1.0-superblock array because the EFI partition must be BIOS readable, and the BIOS doesn't know how to skip a superblock at the beginning of the device. It is a hard requirement that an EFI partition be superblock 1.0 as a result. This doesn't change regardless of the grub version in use (although we also used to create /boot partitions as superblock 1.0 when grub 1 was in use too).
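The placement difference is easy to see on an existing member device (a minimal sketch; /dev/sda2 is an illustrative device name, not one from this report): metadata 1.0 puts the superblock at the end of the member, so the data area, and hence the ESP's FAT filesystem, starts at sector 0, while metadata 1.2 keeps the superblock 8 sectors from the start and reserves a data offset before the filesystem.

```
# 1.0 members report a Super Offset near the device size and no Data Offset;
# 1.2 members report a small Super Offset plus a reserved Data Offset.
$ mdadm -E /dev/sda2 | grep -E 'Version|Super Offset|Data Offset'
```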
---

Exactly the same issue here, and it causes Stream 8 anaconda installation to fail on our hardware. Our anaconda RAIDs are defined like this:

```
# partition table
%pre
#!/bin/sh
DISKS=$(lsblk -d -o name,rota,fstype --noheadings | grep ^sd | grep -v -i 'LVM2_member' | awk '{if ($2=='0') print $1}' | head -n 2)
ONE=$(echo ${DISKS} | cut -d ' ' -f 1)
TWO=$(echo ${DISKS} | cut -d ' ' -f 2)

# it is very important to only clearpart on the first two drives.
# there are often other drives sdc, etc., which can be OSD journals
# or OSD data disks. We must not overwrite those partition tables.
echo "ignoredisk --only-use=${ONE},${TWO}" > /tmp/part-include
echo "clearpart --all --initlabel --drives ${ONE},${TWO}" >> /tmp/part-include

# for /boot
echo "partition raid.01 --size 1024 --ondisk ${ONE}" >> /tmp/part-include
echo "partition raid.02 --size 1024 --ondisk ${TWO}" >> /tmp/part-include
# for /boot/efi
echo "partition raid.11 --size 256 --ondisk ${ONE}" >> /tmp/part-include
echo "partition raid.12 --size 256 --ondisk ${TWO}" >> /tmp/part-include
# for /
echo "partition raid.21 --size 1 --ondisk ${ONE} --grow" >> /tmp/part-include
echo "partition raid.22 --size 1 --ondisk ${TWO} --grow" >> /tmp/part-include

echo "raid /boot --level=1 --device=boot --fstype=xfs raid.01 raid.02" >> /tmp/part-include
echo "raid /boot/efi --level=1 --device=boot_efi --fstype=efi raid.11 raid.12" >> /tmp/part-include
echo "raid / --level=1 --device=root --fstype=xfs raid.21 raid.22" >> /tmp/part-include
%end

# use the partition table defined above and dumped to file
%include /tmp/part-include
```

Installation fails with:

```
dasbus.error.DBusError: Process reported exit code 1: mdadm: RUN_ARRAY failed: Invalid argument
```

dmesg shows:

```
md126: bitmap superblock UUID mismatch
md126: failed to create bitmap (-22)
```

And mdadm -E shows 1-bit checksum errors on the members of boot_efi, just like Chris posted.

---

(In reply to Doug Ledford from comment #6)

Hi Doug

Thanks for the explanation.

---

(In reply to Dan van der Ster from comment #7)
> Exactly the same issue here, and it causes Stream 8 anaconda installation to fail on our hardware.

Hi Dan

Could you try this patch: https://marc.info/?l=linux-raid&m=162259662926315&w=2

---

> Could you try this patch: https://marc.info/?l=linux-raid&m=162259662926315&w=2
First, a clear reproducer for you with 4.2-rc1_1, maybe to add to some test framework:
```
# rpm -q mdadm
mdadm-4.2-rc1_1.el8.x86_64
# dd if=/dev/zero of=a.dat bs=1M count=256
256+0 records in
256+0 records out
268435456 bytes (268 MB, 256 MiB) copied, 0.107383 s, 2.5 GB/s
# dd if=/dev/zero of=b.dat bs=1M count=256
256+0 records in
256+0 records out
268435456 bytes (268 MB, 256 MiB) copied, 0.108689 s, 2.5 GB/s
# losetup /dev/loop0 a.dat
# losetup /dev/loop1 b.dat
# mdadm --create /dev/md0 --level=1 --metadata=1.0 --bitmap=internal --raid-devices=2 /dev/loop0 /dev/loop1
mdadm: RUN_ARRAY failed: Invalid argument
# mdadm -E /dev/loop1
/dev/loop1:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x0
Array UUID : c5d9f7f7:53213a6c:90828f3b:c5f8ba22
Name : 0
Creation Time : Wed Jun 9 11:08:45 2021
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 524256 sectors (255.98 MiB 268.42 MB)
Array Size : 262080 KiB (255.94 MiB 268.37 MB)
Used Dev Size : 524160 sectors (255.94 MiB 268.37 MB)
Super Offset : 524272 sectors
Unused Space : before=0 sectors, after=104 sectors
State : active
Device UUID : d9ca4edc:50ee669f:c3b6dd74:f42d1b02
Update Time : Wed Jun 9 11:08:45 2021
Bad Block Log : 512 entries available at offset -8 sectors
Checksum : 1ff86d78 - expected 1ff86d77
Events : 0
Device Role : Active device 1
Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
# dmesg | tail
[161542.795797] md/raid1:md0: not clean -- starting background reconstruction
[161542.795798] md/raid1:md0: active with 2 out of 2 mirrors
[161542.795813] md0: invalid bitmap file superblock: bad magic
[161542.795815] md0: failed to create bitmap (-22)
[161542.795852] md: md0 stopped.
```
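On an affected build, the same symptom can be spot-checked on any suspect member without repeating the whole sequence (a minimal sketch; the device name follows the example above):

```
# A "Checksum : xxxxxxxx - expected yyyyyyyy" mismatch on a freshly created
# 1.0-metadata member with an internal bitmap is the failure shown above.
$ mdadm -E /dev/loop1 | grep -i checksum
```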
Now I built with that fix and it works:
```
# rpm -Uvh /tmp/mdadm/mdadm-4.2-rc1_2.el8.x86_64.rpm
Verifying... ################################# [100%]
Preparing... ################################# [100%]
Updating / installing...
1:mdadm-4.2-rc1_2.el8 ################################# [ 50%]
Cleaning up / removing...
2:mdadm-4.2-rc1_1.el8 ################################# [100%]
# dd if=/dev/zero of=/dev/loop0
dd: writing to '/dev/loop0': No space left on device
524289+0 records in
524288+0 records out
268435456 bytes (268 MB, 256 MiB) copied, 1.50245 s, 179 MB/s
# dd if=/dev/zero of=/dev/loop1
dd: writing to '/dev/loop1': No space left on device
524289+0 records in
524288+0 records out
268435456 bytes (268 MB, 256 MiB) copied, 1.65947 s, 162 MB/s
# mdadm --create /dev/md0 --level=1 --metadata=1.0 --bitmap=internal --raid-devices=2 /dev/loop0 /dev/loop1
mdadm: array /dev/md0 started.
# mdadm -E /dev/loop0
/dev/loop0:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x1
Array UUID : 719d639a:d16c5b0b:188ca971:cbb9e114
Name : 0
Creation Time : Wed Jun 9 11:50:04 2021
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 524256 sectors (255.98 MiB 268.42 MB)
Array Size : 262080 KiB (255.94 MiB 268.37 MB)
Used Dev Size : 524160 sectors (255.94 MiB 268.37 MB)
Super Offset : 524272 sectors
Unused Space : before=0 sectors, after=96 sectors
State : clean
Device UUID : e6aa2d8d:43097efd:40b6e6fc:b1922f80
Internal Bitmap : -16 sectors from superblock
Update Time : Wed Jun 9 11:50:05 2021
Bad Block Log : 512 entries available at offset -8 sectors
Checksum : 9ed9b9d9 - correct
Events : 17
Device Role : Active device 0
Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
# dmesg | tail
[164021.431804] md/raid1:md0: not clean -- starting background reconstruction
[164021.431805] md/raid1:md0: active with 2 out of 2 mirrors
[164021.433174] md0: detected capacity change from 0 to 268369920
[164021.433229] md: resync of RAID array md0
[164022.621793] md: md0: resync done.
```
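For completeness, the repaired array can also be checked from the running-md side and then torn down (a minimal sketch; device and file names follow the example above):

```
$ cat /proc/mdstat                          # md0 should be listed with a bitmap line
$ mdadm --detail /dev/md0 | grep -i bitmap  # "Intent Bitmap : Internal"
$ mdadm --stop /dev/md0                     # stop the test array when done
$ losetup -d /dev/loop0 /dev/loop1
$ rm a.dat b.dat
```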
So I assume anaconda will also work.
(In reply to Dan van der Ster from comment #10)
> First, a clear reproducer for you with 4.2-rc1_1, maybe to add to some test framework:

Hi Fine

Could you add the test case to our regression tests?

> Now I built with that fix and it works:

Thanks
Xiao

---

Recorded. Adding.

---

We're unable to install new machines with CS8 due to this bug. What is the ETA for a fix? I see @ncroxon set a target release of 8.6, which I'm not sure if that means a year...

---

I'll ping the upstream maintainer again.

---

I have sent the patch to upstream some days ago. We still have some time to fix this in 8.5. So change target to 8.5 again.

---

Hello, I have encountered the problem with creating 1.0 metadata on an EFI partition (from inside anaconda) too:

```
# mdadm --create /dev/md/boot_efi --run --level=raid1 --raid-devices=2 --metadata=1.0 --bitmap=internal --chunk=512 /dev/vdb1 /dev/vdb2
mdadm: RUN_ARRAY failed: Invalid argument
```

The kernel says:

```
[  119.232426] md/raid1:md127: not clean -- starting background reconstruction
[  119.232426] md/raid1:md127: active with 2 out of 2 mirrors
[  119.233852] md127: invalid bitmap file superblock: bad magic
[  119.233856] md127: failed to create bitmap (-22)
[  119.233942] md: md127 stopped.
```

With the updated mdadm-4.2-rc1_3.el8, I don't get this problem anymore, but I get another one:

```
# mdadm --create /dev/md/boot_efi --run --level=raid1 --raid-devices=2 --metadata=1.0 --bitmap=internal --chunk=512 /dev/vdb1 /dev/vdb2
mdadm: specifying chunk size is forbidden for this level
```

Note that this is not specific to 1.0 metadata; with 1.2 the same thing happens. In the previous version, mdadm-4.2-rc1_2.el8, I did not have this problem when using 1.2 metadata.

---

*** Bug 1917308 has been marked as a duplicate of this bug. ***

---

No new issue found on mdadm-4.2-rc1_3.el8.

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (mdadm bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4494

---

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.
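Regarding the "specifying chunk size is forbidden for this level" error reported above: chunk size has no meaning for RAID1, which does not stripe, so the likely adjustment (an assumption, not something stated in this report) is simply to drop --chunk from the create command:

```
# same invocation as in the comment above, minus --chunk=512 (device names as in the report)
# mdadm --create /dev/md/boot_efi --run --level=raid1 --raid-devices=2 --metadata=1.0 --bitmap=internal /dev/vdb1 /dev/vdb2
```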