Bug 2207726

Summary: Unable to boot if /boot is xfs filesystem with extent hole
Product: Red Hat Enterprise Linux 8
Reporter: Frank Sorenson <fsorenso>
Component: grub2
Assignee: Bootloader engineering team <bootloader-eng-team>
Status: POST
QA Contact: Release Test Team <release-test-team>
Severity: high
Priority: high
Version: 8.7
CC: mlewando, nfrayer, prjagtap, raravind, sbarcomb, sgardner
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Type: Bug

Description Frank Sorenson 2023-05-16 16:14:28 UTC
Description of problem:

If /boot is on an xfs filesystem, and the directory contains a hole in the extent allocation, booting fails with "error: not a correct xfs inode"

Version-Release number of selected component (if applicable):

2.02-142.el8_7.3.x86_64


How reproducible:

should be very easy


Steps to Reproduce:

1) have older kernel installed
2) create several hundred files with long names in /boot:

    # for i in {001..300} ; do
        touch /boot/filler_filename____________________________________${i}
      done

3) install new kernel version rpms
4) remove the filler files:

    # rm -f /boot/filler_filename*

5) verify that /boot has a hole:

    # xfs_bmap /boot
    boot:
	0: [0..7]: 748422104..748422111
	1: [8..15]: hole
	2: [16..23]: 748422080..748422087

6) reboot
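
The hole check in step 5 can be scripted. This is a minimal sketch, assuming the default (non-verbose) xfs_bmap output format shown in step 5, where a hole appears as a line ending in the word "hole". The name find_bmap_holes is a hypothetical helper, not part of xfsprogs:

```shell
# Hypothetical helper: reads plain `xfs_bmap` output on stdin and prints
# every extent line whose last field is "hole".
# Exit status is 0 if at least one hole was found, 1 otherwise.
find_bmap_holes() {
    awk '$NF == "hole" { print; found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage on a live system (assumes xfsprogs is installed and /boot is xfs):
#   xfs_bmap /boot | find_bmap_holes && echo "/boot directory has a hole"
```

Since the hole does not appear on every attempt, a check like this makes it easier to retry steps 2 through 4 until the directory actually has one before rebooting.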


Actual results:

booting fails with "error: not a correct xfs inode"


Expected results:

booting succeeds


Additional info:

bug also affects RHEL 7, but is fixed in RHEL 9

Comment 3 Marta Lewandowska 2023-05-23 09:27:04 UTC
Reproduced using 2.02-142.el8_7.3.x86_64. (wasn't actually that easy to get the hole; I had to try it a few times on different machines)

This patch is included in the latest RHEL-8 grub (2.02-148.el8) and is therefore in the released 8.8. Is this alright for you / the customer, or does it need to be backported into specific release(s)? Notably, 8.7.z is already EOL, so the patch won't make it there.

Comment 4 Marta Lewandowska 2023-05-23 11:49:43 UTC
Actually, the patch should be in grub2-2.02-123.el8_6.15 (most recent RHEL-8.6 grub) as well.

Comment 7 raravind 2023-07-06 10:50:08 UTC
Marta, I think the patch mentioned in comment#1 is not included in any of the RHEL 8 releases. I recently checked the 8.8 and 8.9 grub (grub2-2.02-148.el8 and grub2-2.02-150.el8) and couldn't find it there. However, it is included in RHEL 9. I couldn't reproduce the error on a recent RHEL 9.3.

Comment 9 Marta Lewandowska 2023-07-19 13:41:25 UTC
Reshmi's totally right in comment#7... not sure what I was seeing before..!

Ok, so that means we have a patch and we can do SanityOnly testing since it's difficult to reproduce.

Comment 10 Marta Lewandowska 2023-07-19 18:42:10 UTC
Today's a rough day for me! ;)
I have to take back what's in comment#9... I downloaded the src rpm, looked for the patch (xfs: Don't attempt to iterate over empty directory) and didn't find it, but then forgot to apply patches before looking at the code itself. Seems I knew to do this correctly in the past!
Turns out (+1 to nfrayer!) that it's in release-to-master.patch.
So now, also after more info from Nicolas, we think this patch is not enough to fix this issue, especially since it's not very reproducible...