Description of problem:
When the statfs system file is missing due to a GFS2 file system corruption, gfs2_fsck segfaults.

Version-Release number of selected component (if applicable):
gfs2-utils-0.1.62-1.el5

How reproducible:
Executing gfs2_fsck/fsck.gfs2 on a corrupt gfs2 file system with missing statfs system file

Steps to Reproduce:
1. gfs2_fsck /dev/path-to-gfs2-filesystem

Actual results:
Initializing fsck
Recovering journals (this may take a while)..
Journal recovery complete.
Segmentation fault

Expected results:
gfs2_fsck should rebuild/reconstruct the missing statfs file, and continue fixing the corrupted file system

Additional info:
log entry looks like:
Mar 11 08:30:54 server-name kernel: GFS2: fsid=share:directory.1: can't read in statfs inode: -2
Created attachment 402119
Preliminary patch

This patch fixed the problem and recreated their statfs file. I do not consider it ready to ship because this version cannot recreate a damaged rindex file, root file system or master system directory. Those are much more complex issues that will need some forethought and design work.
Requesting ack flags to get this into a release.
Created attachment 437675
Patch from 09 Aug 2010

The fix to repair and/or recreate missing system files was greatly enhanced upstream. This patch is a RHEL5 back-port of those enhancements, which includes statfs. It still needs some testing.

It should also be noted that I'm not done with this effort in general. Because of deadlines I had to push out the upstream patch into RHEL6 for bug #576330. This is basically a crosswrite of that effort. But after that bug went out, another one came in for RHEL5, bug #620384, in which it was discovered that missing journals cause a problem. So I think it's prudent to ship this one as is and continue the work under bug #620384 so I don't have too many patches outstanding. I'll push it as soon as I get it properly tested.
This patch is now properly tested on roth-01. I ran fsck.gfs2 against all the metadata sets I have that fit on roth-01's 775GB SAN. I pushed the patch to the RHEL56 branch of the cluster.git repository for inclusion into 5.6. SHA1 is c5311da. The patch is already upstream by virtue of the aforementioned bug records. Changing status to POST until this gets built into a formal package.
Build 2768496 successful. Changing status to Modified. This fix is in gfs2-utils-0.1.62-25.el5.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2011-0135.html