Bug 241096 - GFS: bug in gfs truncate
GFS: bug in gfs truncate
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: gfs-kmod (Show other bugs)
5.0
All Linux
medium Severity medium
: ---
: ---
Assigned To: Robert Peterson
GFS Bugs
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2007-05-24 00:38 EDT by Vasily Averin
Modified: 2010-01-11 22:26 EST (History)
2 users (show)

See Also:
Fixed In Version: RHBA-2007-0577
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2007-11-07 15:06:59 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Vasily Averin 2007-05-24 00:38:04 EDT
Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> from Virtuozzo/OpenVz linux kernel team
has discowered following issue:

[GFS] bug in gfs truncate

GFS enforces a strange invariant: tree of indirect blocks
must be constructed for all the blocks potentially allocateable
inside inode size. Technically this is a stupid requirement,
but both gfs kernel and user space gfs_fsck consider inodes
with bad size as totally invalid and destroy them.

It means that GFS has to be especially careful to maintain
correct inode size. One point of failure is truncate():
if quota does not allow to grow inode size, it leaves i_size
at wrong value. Subsequent write to inode moves wrong i_size
to disk size and height of tree remains not updated. Such inode
is invalid and it is destroyed by subsequent fsck.

diff -urp ../gfs-kmod-0.1.16-orig/gfs-kernel-0.1.16/src/gfs/ops_inode.c
gfs-kernel-0.1.16/src/gfs/ops_inode.c
--- ../gfs-kmod-0.1.16-orig/gfs-kernel-0.1.16/src/gfs/ops_inode.c	2007-05-20
19:18:19.000000000 +0400
+++ gfs-kernel-0.1.16/src/gfs/ops_inode.c	2007-05-20 00:43:03.000000000 +0400
@@ -1395,8 +1507,11 @@ gfs_setattr(struct dentry *dentry, struc
 		}
 
 		error = gfs_truncatei(ip, attr->ia_size, gfs_truncator_page);
-		if (error)
+		if (error) {
+			if (inode->i_size != ip->i_di.di_size)
+				i_size_write(inode, ip->i_di.di_size);
 			goto fail;
+		}
 
 		if ((sdp->sd_vfs->s_flags & MS_SYNCHRONOUS) &&
 		    !gfs_is_jdata(ip))
Comment 1 Robert Peterson 2007-06-26 10:51:45 EDT
I looked at the code and I don't see how this can be a problem.

The suggested code is fixing the inode size on error, but this is
the wrong place to do it.  All the functions, like gfs_truncatei
have error paths that are supposed to clean up after themselves and
as far as I can see, the code is doing that properly.

I ran this quick test to see if my theory was correct.  I put a 1M
quota on a gfs file system, then exceeded that quota.  The inode
information seemed to be correct.  Here's what I did:

[root@trin-10 root]# mount -tgfs /dev/trin_vg/hell /mnt/hell/
[root@trin-10 root]# gfs_quota limit -l 1 -u testmonkey -f /mnt/hell
[root@trin-10 root]# chmod 777 /mnt/hell
[root@trin-10 root]# su testmonkey
[testmonkey@trin-10 root]$ cd /mnt/hell
[testmonkey@trin-10 hell]$ mkdir monkey
[testmonkey@trin-10 hell]$ cd monkey/
[testmonkey@trin-10 monkey]$ dd if=/dev/zero of=/mnt/hell/1000K bs=1K count=1000
1000+0 records in
1000+0 records out
1024000 bytes (1.0 MB) copied, 0.20244 seconds, 5.1 MB/s
[testmonkey@trin-10 monkey]$ cp /home/devel/gfs2-2.6.git.tgz /mnt/hell/monkey/
GFS: fsid=bob_cluster2:test_gfs.0: quota exceeded for user 500
cp: writing `/mnt/hell/monkey/gfs2-2.6.git.tgz': Disk quota exceeded
[testmonkey@trin-10 monkey]$ exit
[root@trin-10 root]# ls -lr /mnt/hell/monkey/gfs2-2.6.git.tgz 
-rw-r--r-- 1 testmonkey testmonkey 16384 Jun 26 09:27
/mnt/hell/monkey/gfs2-2.6.git.tgz
[root@trin-10 ../gfs-kernel/src/gfs]# stat !$
stat /mnt/hell/monkey/gfs2-2.6.git.tgz
  File: `/mnt/hell/monkey/gfs2-2.6.git.tgz'
  Size: 16384           Blocks: 40         IO Block: 4096   regular file
Device: fe03h/65027d    Inode: 30          Links: 1
Access: (0644/-rw-r--r--)  Uid: (  500/testmonkey)   Gid: (  500/testmonkey)
Access: 2007-06-26 09:27:16.000000000 -0500
Modify: 2007-06-26 09:27:16.000000000 -0500
Change: 2007-06-26 09:27:16.000000000 -0500

Maybe I misunderstood what the problem is.  Please give an
example of how to recreate the quota problem and what bad value(s)
you are seeing.
Comment 2 Monakhov Dmitriy 2007-06-27 08:25:54 EDT
>All the functions, like gfs_truncatei
>have error paths that are supposed to clean up after themselves and
>as far as I can see, the code is doing that properly.
Actually error path just not exist. So we have to fix i_size explicitly.

Simple test case:
/* gcc  trunc.c -otrunc */
#include <unistd.h>
#include <unistd.h>
#include <fcntl.h>
int main(int argc, char** argv)
{
        int fd = open(argv[1], O_RDWR|O_CREAT|O_TRUNC, 0666);
        lseek(fd, 1024*1024*1024, SEEK_SET);
        /* i_size = 0, di_size = 0 */
        ftruncate(fd, 1024*1024*1024); /* this have to fail because of quota*/
        /* after trucate failed we have:i_size = 1073741824, di_size = 0 */
        write(fd, "", 1);
        close(fd);
}

# mount /dev/vzvg/gfs  /mnt/
su testmonkey;
$ touch /mnt/file # create file for later test
$ dd if=/dev/zero of=/mnt/BIG_FILE # write until -EDQUOT
$ ./trunc /mnt/file # test itself 
$ exit
# sync;sync;sync # force commit journal
#umount /mnt/; gfs_fsck /dev/vzvg/gfs
Initializing fsck
Clearing journals (this may take a while).....
Journals cleared.
Starting pass1
Found unused inode marked in-use
Clear unused inode at block 30? (y/n) y
Pass1 complete      
Starting pass1b
Pass1b complete      
Starting pass1c
Pass1c complete      
Starting pass2
Pass2 complete      
Starting pass3
Pass3 complete      
Starting pass4
Pass4 complete      
Starting pass5
ondisk and fsck bitmaps differ at block 30
Fix bitmap for block 30? (y/n) y
Succeeded.
RG #1 free count inconsistent: is 63398 should be 63399
RG #1 used inode count inconsistent: is 8 should be 7
Update resource group counts? (y/n) y
Resource group counts updated
Pass5 complete      
Writing changes to disk
Comment 3 RHEL Product and Program Management 2007-06-29 18:03:34 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 4 Robert Peterson 2007-06-29 18:06:07 EDT
After looking at the code path and how gfs calls vmtruncate, I agree
that this is the correct fix.  Crosswrite for gfs2 has already been
proposed, but not yet in the git tree yet.
The suggested patch was tested on system trin-10 and fixes the problem.
I'll try to squeeze it into RHEL5.1 but we're right at the cutoff, so
I don't know if it will make it there or 5.2.
Comment 6 Robert Peterson 2007-07-11 18:16:53 EDT
The fix was tested on system roth-01 using information in the bugzilla.
The fix was committed to RHEL5 and RHEL51 branches.  Changing the status
to modified.
Comment 9 errata-xmlrpc 2007-11-07 15:06:59 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0577.html

Note You need to log in before you can comment on or make changes to this bug.