Bug 493727
| Summary: | GFS: gfs_fsck can delete everything in a corrupt file system | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Robert Peterson <rpeterso> |
| Component: | gfs-utils | Assignee: | Robert Peterson <rpeterso> |
| Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | 5.3 | CC: | edamato, hlawatschek, jkortus, rrottmann, sghosh |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2009-09-02 11:01:30 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Robert Peterson
2009-04-02 20:26:03 UTC
Created attachment 338068 [details]
First patch
Here's the culprit. This patch is cross-written from gfs2.
Duplicate block processing was not returning the proper number of
leaf entries. The "metablock" scanner took that to mean there were
no directory entries, and therefore, it should destroy the whole
root directory. I'm testing the fix now.
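The failure mode described above can be illustrated with a minimal sketch. This is not the attached patch and not gfs_fsck code: the function names (`count_leaf_entries_buggy`, `count_leaf_entries_fixed`, `scan_directory`), the data layout, and block 5000 are hypothetical, chosen only to show how an entry count that gets lost on the duplicate-block path can lead a scanner to conclude that a directory has no entries.

```python
# Hypothetical illustration only: names and data layout are assumptions,
# not taken from gfs_fsck or the attached patch.

def count_leaf_entries_buggy(leaf, duplicate_blocks):
    """Count entries in one directory leaf, losing the count on duplicates."""
    count = 0
    for _name, block in leaf:
        if block in duplicate_blocks:
            # Bug: the duplicate-handling path returns early and reports
            # zero entries instead of the number actually present.
            return 0
        count += 1
    return count


def count_leaf_entries_fixed(leaf, duplicate_blocks):
    """Count every entry in the leaf, whether or not it hits a duplicate."""
    count = 0
    for _name, _block in leaf:
        # A real checker would also record duplicates for later passes;
        # this sketch only counts, so duplicate_blocks is unused here.
        count += 1
    return count


def scan_directory(leaves, duplicate_blocks, count_leaf_entries):
    """Return 'destroy' if the scanner believes the directory is empty."""
    total = sum(count_leaf_entries(leaf, duplicate_blocks) for leaf in leaves)
    return "destroy" if total == 0 else "keep"


# One leaf of a root directory; each entry is (name, block it leads to).
root_leaves = [[("file-01", 88), ("file-02", 88), ("file-03", 5000)]]
dups = {88}

print(scan_directory(root_leaves, dups, count_leaf_entries_buggy))  # destroy
print(scan_directory(root_leaves, dups, count_leaf_entries_fixed))  # keep
```

The point of the sketch is that the scanner's decision depends entirely on the returned count, so any path that returns early has to return the correct count as well.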
Requesting ack flags--we have the fix in hand.

The attached patch has been pushed to the master branch of the gfs1-utils git tree, and the STABLE3, STABLE2 and RHEL5 branches of the cluster git tree for inclusion into 5.4. It has been tested on system roth-01. Changing status to Modified.

I tested a very simple FS corruption and it produced quite interesting results.

Corruption description:
1. A new GFS file system is created (mkfs.gfs -O -t a3cluster:a3gfs2 -p lock_nolock -j 2 -J 32 /dev/GFSVG/GFS).
2. The FS is mounted and three 10M files are created (file-01 file-02 file-03).
3. The FS is unmounted.
4. gfs2_edit is used to create a duplicate link from the first to the second file. The last link in the first section of the first file (the first link pointing to a block containing data) was made the same for the first and second files. In other words, the first data block of file-02 (or whatever the second file in the FS is) is the same as in the first file.

Note: the same scenario is usable for indirect links (links to another block of links) and for gfs2_fsck on GFS2.

Versions used:
x86_64: GFS fsck 0.1.19 (built May 4 2009 19:34:42)
ia64: GFS fsck 0.1.19 (built May 4 2009 19:35:05)

Then gfs_fsck -y was run. The corruption was fixed on ia64 but not on x86_64.

ia64:
    gfs_fsck -y /dev/sdc1
    Initializing fsck
    Clearing journals (this may take a while).
    Journals cleared.
    Starting pass1
    Pass1 complete
    Starting pass1b
    Found dup block at 88
    Block 88 has 2 inodes referencing it fora total of 2 duplicate references
    Inode (null) has 1 reference(s) to block 88
    Clearing...
    Found dup in inode "unknown name" (block #24) with block #88
    inode is in directory 0
    Pass1b complete
    Starting pass1c
    Pass1c complete
    Starting pass2
    Found directory entry 'file-02' in block 23 to something not a file or directory!
    Directory entry 'file-02' cleared
    Entries is 6 - should be 5 for 23
    Entries updated
    Pass2 complete
    Starting pass3
    Pass3 complete
    Starting pass4
    Pass4 complete
    Starting pass5
    ondisk and fsck bitmaps differ at block 24
    Succeeded.
    ondisk and fsck bitmaps differ at block 2648
    Succeeded.
    RG #1 free count inconsistent: is 20715 should be 20717
    RG #1 used inode count inconsistent: is 9 should be 8
    Resource group counts updated
    Pass5 complete
    Writing changes to disk

x86_64:
    gfs_fsck -y /dev/GFSVG/GFS
    Initializing fsck
    Clearing journals (this may take a while).
    Journals cleared.
    Starting pass1
    Pass1 complete
    Starting pass1b
    Found dup block at 88
    Block 88 has 2 inodes referencing it fora total of 2 duplicate references
    Inode (null) has 1 reference(s) to block 88
    Clearing...
    make: *** [checkgfs1] Segmentation fault (core dumped)

I will attach metadata of the corrupted FS (x86 version) and a backtrace from the core file.

Created attachment 347563 [details]
Core file backtrace
backtrace of x86_64 core from gfs_fsck
Created attachment 347564 [details]
metadata of FS with the corruption
Metadata containing the simple corruption described in the comments.
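The pass1b output quoted earlier ("Block 88 has 2 inodes referencing it") reflects a duplicate-reference check. A minimal sketch of that idea follows, assuming a simple map from each file to the blocks it references. The function name `find_duplicate_blocks` and every block number other than 88 are hypothetical; this is not how gfs_fsck stores its data internally.

```python
from collections import defaultdict


def find_duplicate_blocks(block_refs):
    """Return {block: [owners]} for blocks referenced by more than one owner.

    block_refs maps an owner (here a file name) to the list of block
    numbers it references. Hypothetical layout for illustration.
    """
    owners = defaultdict(list)
    for owner, blocks in block_refs.items():
        for block in blocks:
            owners[block].append(owner)
    return {block: who for block, who in owners.items() if len(who) > 1}


# Block 88 is taken from the log above; 89 and 2649 are made-up neighbours.
refs = {
    "file-01": [88, 89],
    "file-02": [88, 2649],
    "file-03": [5000],
}

print(find_duplicate_blocks(refs))  # {88: ['file-01', 'file-02']}
```

A checker that finds such a block then has to decide which owner keeps it, which is the step where the x86_64 run above crashed.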
Verified with gfs-utils-0.1.20-1.el5: gfs_fsck no longer deletes everything when cross-linked files are found in the root directory. The crosslink test passed on x86_64 and ia64. To fully fix the file system, gfs_fsck has to be run twice; the second run fixes the bitmap differences. This applies until bug 509225 is fixed.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1336.html
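The bitmap differences mentioned in the verification comment are of the kind pass5 reported in the ia64 run quoted earlier (blocks 24 and 2648, free count 20715 versus 20717, used inode count 9 versus 8). The sketch below reproduces that arithmetic under stated assumptions: bitmaps are modelled as plain dictionaries, block 24 is treated as an inode block (the log calls it an inode), and block 2648 is assumed to be a data block. The function names are hypothetical and this is not gfs_fsck code.

```python
def diff_bitmaps(ondisk, computed):
    """Return sorted block numbers whose state differs between two bitmaps.

    Each bitmap maps block number -> "free", "data", or "inode".
    Blocks missing from a bitmap are treated as "free".
    """
    blocks = ondisk.keys() | computed.keys()
    return sorted(
        b for b in blocks
        if ondisk.get(b, "free") != computed.get(b, "free")
    )


def corrected_counts(free, used_inodes, ondisk, computed):
    """Adjust resource-group counts by the bitmap differences."""
    for block in diff_bitmaps(ondisk, computed):
        old = ondisk.get(block, "free")
        new = computed.get(block, "free")
        if old != "free" and new == "free":
            free += 1          # block released by the checker
        elif old == "free" and new != "free":
            free -= 1          # block claimed by the checker
        if old == "inode" and new != "inode":
            used_inodes -= 1
        elif old != "inode" and new == "inode":
            used_inodes += 1
    return free, used_inodes


# Assumed states: 24 was an inode, 2648 a data block; fsck considers both free.
ondisk = {24: "inode", 2648: "data"}
computed = {24: "free", 2648: "free"}

print(diff_bitmaps(ondisk, computed))                   # [24, 2648]
print(corrected_counts(20715, 9, ondisk, computed))     # (20717, 8)
```

A real checker would more likely recount from the full bitmap than adjust incrementally; the incremental form is used here only because it makes the two-block difference in the log easy to follow.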