Bug 2218020

Summary: e2fsck is unable to correct error reported by kernel that "No space for directory leaf checksum. Please run e2fsck -D."
Product: Red Hat Enterprise Linux 8
Component: e2fsprogs
Version: 8.7
Status: NEW
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Reporter: Frank Sorenson <fsorenso>
Assignee: Nobody <nobody>
QA Contact: Boyang Xue <bxue>
CC: casl, xzhou
Type: Bug

Description Frank Sorenson 2023-06-27 20:59:30 UTC
Description of problem:

The kernel is reporting "No space for directory leaf checksum. Please run e2fsck -D." for multiple filesystems, but 'e2fsck -D' is unable to correct the error.

As a result, the contents of the problematic directory are completely inaccessible; no file can be opened or removed.  The end result is the loss of all files in the directory.


Version-Release number of selected component (if applicable):

e2fsprogs-1.45.6-5.el8.x86_64
The more recent Fedora e2fsprogs-1.46.5-3.fc37 is also unable to correct the error.


How reproducible:

Unknown; however, the customer has 4 filesystems on one system, all of which report the error.


Steps to Reproduce:

unknown


Actual results:

'e2fsck -D' is unable to correct the error


Expected results:

'e2fsck -D' corrects the error, as described in the kernel error message


Additional info:

All of the customer's filesystems have a blocksize of 1024 bytes, which seems likely to be contributing to the issue.
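
To check whether another filesystem matches this profile (1024-byte blocks with metadata checksums enabled), the superblock summary from `dumpe2fs -h` can be filtered. The following is only a minimal sketch: the helper name `fs_summary` is hypothetical, and it assumes the usual `Block size:` and `Filesystem features:` lines in the `dumpe2fs -h` output.

```shell
# Hypothetical helper: reads `dumpe2fs -h` output on stdin and prints the
# block size and whether the metadata_csum feature is listed.
fs_summary() {
    awk -F': *' '
        BEGIN { bs = "unknown"; csum = "unknown" }
        /^Block size:/ { bs = $2 }
        /^Filesystem features:/ {
            # Match the whole word so metadata_csum_seed does not count.
            csum = ($2 ~ /(^| )metadata_csum( |$)/) ? "yes" : "no"
        }
        END { printf "block_size=%s metadata_csum=%s\n", bs, csum }
    '
}

# Usage (image name taken from this report):
#   dumpe2fs -h misc.e2i 2>/dev/null | fs_summary
```

A result showing a block size of 1024 together with metadata_csum enabled would match the configuration seen on the customer's filesystems; it does not by itself indicate that the directory error is present.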

Comment 2 Frank Sorenson 2023-06-27 21:13:50 UTC
Listing the problematic directory, or attempting to open or remove entries in it, fails with an EBADMSG error:

    ls: cannot access 'lastseen/by_mac/AP02cb72f947.csv': Bad message

    # rm AP02cb72f947.csv
    rm: cannot remove 'AP02cb72f947.csv': Bad message

Comment 3 Frank Sorenson 2023-06-27 21:15:08 UTC
Note: the e2fsck error when attempting to correct the issue is:

    Failed to optimize directory /lastseen/by_mac (30941743): Directory block does not have space for checksum

Comment 5 Frank Sorenson 2023-06-29 15:45:37 UTC
I can confirm that disabling checksums for the filesystem will make the filesystem usable again:

    # fsck.ext4 -f misc.e2i
    # tune2fs -ff -O^metadata_csum misc.e2i

Each step takes ages, but it works.


I also attempted to replicate the issue on a scratch filesystem, using the filenames of the entries in the problematic filesystem, but was unable to recreate the error.
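
For anyone retrying the replication, the attempt can be sketched roughly as follows. This is an illustration under assumptions, not the exact procedure used above: the helper name `populate_from_list`, the file `names.txt`, the image name `scratch.img`, and the mount point are all hypothetical, and the scratch filesystem is assumed to be created with 1024-byte blocks and metadata_csum to match the customer's configuration.

```shell
# Hypothetical helper: create one empty file per line of a name list
# inside the target directory, skipping blank lines.
populate_from_list() {
    target_dir=$1
    name_list=$2
    mkdir -p "$target_dir" || return 1
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        : > "$target_dir/$name" || return 1
    done < "$name_list"
}

# Usage sketch (requires root for the mount; all paths are assumptions):
#   truncate -s 1G scratch.img
#   mkfs.ext4 -q -F -b 1024 -O metadata_csum scratch.img
#   mkdir -p /mnt/scratch
#   mount -o loop scratch.img /mnt/scratch
#   populate_from_list /mnt/scratch/lastseen/by_mac names.txt
```

Creating the entries in a single pass like this did not trigger the error in my attempt, so the original sequence of creates, renames and deletes may matter.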