Bug 506550
Summary: | gfs2_fsck does not check EA dinodes properly, wrong ea_type cannot be repaired if EA<block_size
---|---
Product: | Red Hat Enterprise Linux 5
Component: | gfs2-utils
Version: | 5.4
Hardware: | All
OS: | Linux
Status: | CLOSED DUPLICATE
Severity: | medium
Priority: | high
Target Milestone: | rc
Reporter: | Jaroslav Kortus <jkortus>
Assignee: | Robert Peterson <rpeterso>
QA Contact: | Cluster QE <mspqa-list>
CC: | edamato
Doc Type: | Bug Fix
Last Closed: | 2009-07-15 22:49:01 UTC
I verified that this is indeed a bug, although this kind of corruption should be rare (if even possible) in the field. I haven't debugged it yet, but I verified it's not fixed by my latest extensive changes to fsck.gfs2 for bug #500483. It should (hopefully) take less than a day to debug this and write a fix. I recommend we fix it in RHEL5.5. Changing status to assigned and requesting ack flags.

I figured out the problem. This is actually a regression from this commit from June 2006:
http://git.fedoraproject.org/git/?p=cluster.git;a=commitdiff;h=b7a4317df0f9493d30aba84fd3451de61e506b89
However, the extended attribute processing code is so intertwined with my work for bug #500483 that I'm just going to roll the fix into that one. My latest patch for bug #500483 repairs the damage, so I'm going to close this bug as a duplicate. Good catch!

*** This bug has been marked as a duplicate of bug 500483 ***
Created attachment 348308: metadata of GFS2 filesystem with wrong ea_type field

Description of problem:
gfs2_fsck (and gfs_fsck too) fails to check EA dinodes if they are smaller than a block (no indirect addressing needed). It works correctly (detects and removes) for large (>block_size) EA dinodes. It seems to me that it fails to locate the EAs when they are this size (see the output below).

When gfs2_fsck is run on a filesystem with a small EA dinode whose ea_type field is invalid (0x99, for example), it fails to detect the error and exits with 0. If the filesystem is then mounted and such a file is accessed, the node is withdrawn from the cluster and the filesystem cannot be accessed. This can be a bit confusing, as gfs2_fsck still thinks it's all OK.

A metadata example of a corrupted ea_type is attached. Tested on x86_64.

Version-Release number of selected component (if applicable):
gfs-utils-0.1.19-3.el5
gfs2-utils-0.1.58-1.el5
kmod-gfs-0.1.33-2.el5
GFS fsck 0.1.19 (built May 4 2009 19:34:42)
GFS2 fsck 0.1.58 (built May 29 2009 15:43:58)

How reproducible:
always

Steps to Reproduce:
1. Create a fresh FS (gfs2 or gfs1).
2. Mount it with "-o acl" and create a file on it (file-01).
3. Force some xattr population:
   for i in `seq 50000 50100`; do setfacl -m u:$i:rw mountedFS/file-01 ; done
4. Unmount and change the ea_type field in the EA dinode of the file to 0x99.
5. Run gfs2_fsck on the filesystem; it exits 0. Notice in the "-v -v" output that no EA dinode is found.
6. Mount the filesystem and try to stat the file.

Actual results:
The filesystem error is not detected and a run-time filesystem panic can occur.

Expected results:
The filesystem error is detected (as it is for "large" EA dinodes) and repaired.

Additional info:
Output snippet from fsck on an FS containing one small EA dinode:

Pass1b complete
Starting pass1c
Looking for inodes containing ea blocks...
Pass1c complete

The same for a larger EA:

Looking for inodes containing ea blocks...
EA in inode 4656 (0x1230)
(pass1c.c:266) Found eattr at 7105 (0x1bc1)
(metawalk.c:674) Extended attributes exist for inode #4656 (0x1230).
(metawalk.c:609) Checking EA indirect block #7105 (0x1bc1).
(metawalk.c:571) Checking EA leaf block #4657 (0x1231).
(pass1c.c:202) Pointers Required: 2
Pointers Reported: 2
(metawalk.c:571) Checking EA leaf block #7106 (0x1bc2).
(metawalk.c:571) Checking EA leaf block #7107 (0x1bc3).
Pass1c complete

/var/log/messages on file access:

Jun 17 13:20:56 dell-pe1855-02 kernel: GFS: fsid=a3cluster:a3gfs2.0: fatal: filesystem consistency error
Jun 17 13:20:56 dell-pe1855-02 kernel: GFS: fsid=a3cluster:a3gfs2.0: inode = 24/24
Jun 17 13:20:56 dell-pe1855-02 kernel: GFS: fsid=a3cluster:a3gfs2.0: function = ea_foreach_i
Jun 17 13:20:56 dell-pe1855-02 kernel: GFS: fsid=a3cluster:a3gfs2.0: file = /builddir/build/BUILD/gfs-kmod-0.1.33/_kmod_build_/src/gfs/eattr.c, line = 134
Jun 17 13:20:56 dell-pe1855-02 kernel: GFS: fsid=a3cluster:a3gfs2.0: time = 1245259256
Jun 17 13:20:56 dell-pe1855-02 kernel: GFS: fsid=a3cluster:a3gfs2.0: about to withdraw from the cluster
Jun 17 13:20:56 dell-pe1855-02 kernel: GFS: fsid=a3cluster:a3gfs2.0: telling LM to withdraw
Jun 17 13:20:56 dell-pe1855-02 kernel: GFS: fsid=a3cluster:a3gfs2.0: withdrawn
Jun 17 13:20:56 dell-pe1855-02 kernel:
Jun 17 13:20:56 dell-pe1855-02 kernel: Call Trace:
Jun 17 13:20:56 dell-pe1855-02 kernel: [<ffffffff88607fcc>] :gfs:gfs_lm_withdraw+0xc4/0xd3
Jun 17 13:20:56 dell-pe1855-02 kernel: [<ffffffff800bd580>] delayacct_end+0x5d/0x86
Jun 17 13:20:56 dell-pe1855-02 kernel: [<ffffffff80064a18>] __wait_on_bit+0x60/0x6e
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff80015a20>] sync_buffer+0x0/0x3f
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff80064a92>] out_of_line_wait_on_bit+0x6c/0x78
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff8861fb87>] :gfs:gfs_consist_inode_i+0x3d/0x42
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff885f4b62>] :gfs:gfs_dreread+0x87/0xc7
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff885f96e2>] :gfs:ea_foreach_i+0x108/0x118
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff885f9751>] :gfs:ea_foreach+0x5f/0x178
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff885fae25>] :gfs:ea_find_i+0x0/0x6b
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff885f98a3>] :gfs:gfs_ea_find+0x39/0x46
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff885fb0e7>] :gfs:gfs_ea_get_i+0x22/0x88
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff885f9efa>] :gfs:gfs_ea_get+0x70/0x87
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff800640dd>] wait_for_completion+0x1f/0xa2
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff88614e86>] :gfs:gfs_getxattr+0x93/0xa4
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff8012a2d0>] inode_doinit_with_dentry+0x176/0x47c
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff800312e8>] d_splice_alias+0xd4/0xfb
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff886152d7>] :gfs:gfs_lookup+0x3e2/0x41a
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff885fe04d>] :gfs:lock_on_glock+0x66/0x6d
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff80128917>] avc_has_perm+0x43/0x55
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff8000d57d>] do_lookup+0xe5/0x1e6
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff8000a8db>] __link_path_walk+0xa01/0xf42
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff8000f043>] link_path_walk+0x42/0xb2
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff8000d31d>] do_path_lookup+0x270/0x2e7
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff80012e0b>] getname+0x15b/0x1c2
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff80023cdf>] __user_walk_fd+0x37/0x4c
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff8003f4de>] vfs_lstat_fd+0x18/0x47
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff80025c6d>] filldir+0x0/0xb7
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff8002af75>] sys_newlstat+0x19/0x31
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff8005e229>] tracesys+0x71/0xe0
Jun 17 13:20:57 dell-pe1855-02 kernel: [<ffffffff8005e28d>] tracesys+0xd5/0xe0
Jun 17 13:20:57 dell-pe1855-02 kernel:
Jun 17 13:20:57 dell-pe1855-02 kernel: inode_doinit_with_dentry: getxattr returned 5 for dev=dm-1 ino=24
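
For illustration only, the kind of ea_type sanity check the report expects fsck to apply to every EA header, including those in small (single-block) EA dinodes, might look roughly like the sketch below. This is not the fsck.gfs2 code. The 16-byte big-endian header layout and the type codes 0 to 3 are assumptions modeled on my reading of struct gfs2_ea_header in the GFS2 on-disk format header, and the names check_ea_header and make_header are hypothetical helpers. GFS1 may define a different set of valid types.

```python
import struct

# ASSUMPTION: on-disk EA header layout (big-endian), modeled on
# struct gfs2_ea_header: rec_len, data_len, name_len, type, flags,
# num_ptrs, pad. Total size is 16 bytes.
EA_HEADER = struct.Struct(">IIBBBBI")

# ASSUMPTION: valid type codes are 0 (unused), 1 (user), 2 (system),
# 3 (security), so anything above 3 is treated as corrupt.
EATYPE_LAST = 3


def ea_type_is_valid(ea_type):
    """Return True if ea_type falls inside the assumed valid range."""
    return 0 <= ea_type <= EATYPE_LAST


def check_ea_header(raw):
    """Hypothetical helper: parse the EA header at the start of raw
    and return (ea_type, is_valid)."""
    fields = EA_HEADER.unpack_from(raw)
    ea_type = fields[3]  # fourth field is the type byte
    return ea_type, ea_type_is_valid(ea_type)


def make_header(ea_type):
    """Hypothetical helper: build a header with the given type byte
    and arbitrary but plausible values for the other fields."""
    return EA_HEADER.pack(32, 8, 4, ea_type, 0, 0, 0)


if __name__ == "__main__":
    print(check_ea_header(make_header(2)))     # a system EA
    print(check_ea_header(make_header(0x99)))  # the corruption from step 4
```

With the 0x99 value from step 4 of the reproducer, the check reports the header as invalid, which is the condition fsck would need to flag and repair regardless of whether the EA block is reached directly or through an indirect block.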