Bug 1717513 - fsck.gfs2 does not fix corruption that causes an error when moving or removing a file on a gfs2 filesystem
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: gfs2-utils
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: rc
Target Release: 8.4
Assignee: Andrew Price
QA Contact: cluster-qe
URL:
Whiteboard:
Depends On: 1721973
Blocks:
Reported: 2019-06-05 16:23 UTC by Shane Bradley
Modified: 2023-08-10 15:41 UTC
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-26 20:29:00 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
Checker script (831 bytes, text/plain)
2019-06-05 17:13 UTC, Robert Peterson


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 4197461 | 0 | None | None | None | 2019-06-05 16:23:08 UTC

Comment 1 Robert Peterson 2019-06-05 17:13:58 UTC
Created attachment 1577642
Checker script

I wrote this tiny script to compare the values of the formal inode
number in the directory entry versus the dinode. Example usage:

Parameters are: <device> <path from that device>

Run it on a known good file:

# /home/bob/tools/check_formal_ino.sh /dev/fsck/case02236799files /fecho/ioc_mestre/gis/eb_docs_02_gfg.sas7bdat
fecho  -> 		 0x3b3ed9d
ioc_mestre  -> 		 0x3b3eda2
gis  -> 		 0x3b40fe4
eb_docs_02_gfg.sas7bdat  -> 		 0x3b50e7c

Dinode: 0x344
Dirent: 0x344

Notice that the dinode agrees with the dirent that the formal inode
number is 0x344. Now run it on a known bad file:

# /home/bob/tools/check_formal_ino.sh /dev/fsck/case02236799files /fecho/ioc_mestre/gis_erro_sistema/eb_docs_02_gfg.sas7bdat
fecho  -> 		 0x3b3ed9d
ioc_mestre  -> 		 0x3b3eda2
gis_erro_sistema  -> 		 0x3b589b7
eb_docs_02_gfg.sas7bdat  -> 		 0x464011c

Dinode: 0x1f4d1
Dirent: 0x1f54b
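
A minimal sketch of how output like the above could be checked automatically. This is not the attached checker script: it assumes only the "Dinode:" and "Dirent:" line format shown in this comment, and compare_formal_ino is a hypothetical helper name.

```shell
# Hypothetical helper: reads checker-style output on stdin and compares
# the values on the "Dinode:" and "Dirent:" lines.
# The comparison is a plain string comparison, so both values are assumed
# to be printed in the same format (for example 0x344, not 0x0344).
compare_formal_ino() {
    local dinode=""
    local dirent=""
    local key value
    while read -r key value; do
        case "$key" in
            Dinode:) dinode=$value ;;
            Dirent:) dirent=$value ;;
        esac
    done
    if [ -z "$dinode" ] || [ -z "$dirent" ]; then
        echo "incomplete"
        return 2
    fi
    if [ "$dinode" = "$dirent" ]; then
        echo "match $dinode"
        return 0
    fi
    echo "MISMATCH dinode=$dinode dirent=$dirent"
    return 1
}
```

With the values from the known bad file, printf 'Dinode: 0x1f4d1\nDirent: 0x1f54b\n' | compare_formal_ino would report a mismatch and return 1, since the dinode and the dirent disagree about the formal inode number.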

Comment 4 Andrew Price 2019-06-13 14:11:07 UTC
Test case:

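#!/bin/bash
# Reproducer: corrupt a dinode's formal inode number, then check that
# fsck.gfs2 repairs it and leaves the filesystem clean.

DEV=/dev/foo	# scratch device; it will be reformatted
MNT=/mnt/test

mkfs.gfs2 -O -p lock_nolock "$DEV"
mount "$DEV" "$MNT"
touch "$MNT/foo"
# Record the inode number of the new file for gfs2_edit
INODE=$(ls -i "$MNT/foo" | cut -f1 -d' ')
umount "$MNT"
# Overwrite the dinode's formal inode number so it disagrees with the dirent
gfs2_edit -p "$INODE" field di_num.no_formal_ino 42 "$DEV"
# First pass: exit status 1 is expected (problems found and fixed)
fsck.gfs2 -y "$DEV"
if [ $? -ne 1 ]; then
	echo "FAIL: fsck.gfs2 did not find and fix the bad dentry" >&2
	exit 1
fi
# Second pass, read-only: exit status 0 is expected (filesystem clean)
fsck.gfs2 -n "$DEV"
if [ $? -ne 0 ]; then
	echo "FAIL: fsck.gfs2 found problems, fs should be clean" >&2
	exit 1
fi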
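
The two checks in the test case depend on the exit status of fsck.gfs2 (1 after the repairing run, 0 after the read-only run). A small sketch of decoding such a status, assuming fsck.gfs2 follows the bitmask convention documented in fsck(8); describe_fsck_status is a hypothetical helper name, not part of gfs2-utils.

```shell
# Hypothetical helper: turn an fsck-style exit status into readable text.
# Assumes the bit meanings from fsck(8); the status may combine several bits.
describe_fsck_status() {
    local status=$1
    local parts=""
    local bit desc
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    # Each line of the here-document is "<bit value> <description>"
    while read -r bit desc; do
        if [ $((status & bit)) -ne 0 ]; then
            parts="${parts:+$parts, }$desc"
        fi
    done <<'EOF'
1 errors corrected
2 reboot recommended
4 errors left uncorrected
8 operational error
16 usage or syntax error
32 cancelled by user request
128 shared library error
EOF
    echo "$parts"
}
```

In the test case this could be used for more informative failure messages, for example by running fsck.gfs2 -y "$DEV", saving $? in a variable, and printing describe_fsck_status for that value when it is not the expected 1.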

Comment 14 Dave Wysochanski 2020-09-10 20:38:25 UTC
Should this bug be closed or moved to RHEL8?

Comment 18 Dave Wysochanski 2022-09-26 20:29:00 UTC
It doesn't look like anyone has been working on this for 2 years, and the single case is closed.

Should this be closed, or what is expected?

Comment 20 Dave Wysochanski 2022-09-26 20:31:33 UTC
Sorry, didn't mean to close until owner and reporter agree on next steps.

Comment 21 Andrew Price 2022-09-27 11:20:12 UTC
I would like to keep this one open to track the remaining work. It's blocked behind the general fsck.gfs2 performance work (which itself is blocked at the moment) so it will likely take some time.

Comment 23 Dave Wysochanski 2023-03-20 20:33:23 UTC
(In reply to Andrew Price from comment #21)
> I would like to keep this one open to track the remaining work. It's blocked
> behind the general fsck.gfs2 performance work (which itself is blocked at
> the moment) so it will likely take some time.

Is this still in the works?  Can you give an update on it, and maybe add in depends on bugs as needed?

Comment 24 Andrew Price 2023-03-21 10:27:24 UTC
(In reply to Dave Wysochanski from comment #23)
> Is this still in the works?  Can you give an update on it, and maybe add in
> depends on bugs as needed?

Yes, this still needs to be addressed, and it's currently sitting on the backlog. Some groundwork for the general performance issues went into gfs2-utils 3.5.0, but we still need to find a way to implement the formal_ino check without the considerable performance hit, so the "Cond NAK: Design" is still valid. A "Depends On" of 1721973 is already set, so I think the bz state is up to date.

