Bug 469773 - GFS2: gfs2_grow doesn't grow file system properly
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: gfs2-utils
Version: 5.3
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Robert Peterson
QA Contact: Cluster QE
Keywords: ZStream
Duplicates: 491951 492932
Depends On:
Blocks: 483527
Reported: 2008-11-03 17:57 EST by Nate Straz
Modified: 2010-01-11 22:41 EST (History)
CC List: 7 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 482756
Environment:
Last Closed: 2009-09-02 07:01:39 EDT


Attachments
Patch to fix the problem (1.84 KB, patch)
2009-01-21 08:54 EST, Robert Peterson
Patch for the block size problem (774 bytes, patch)
2009-01-21 11:11 EST, Robert Peterson

Description Nate Straz 2008-11-03 17:57:00 EST
Description of problem:

Our test suite for growfs on GFS doesn't work on GFS2 after updating the commands: the file system size reported by df is not updated by the time the gfs2_grow command exits.

Version-Release number of selected component (if applicable):
gfs2-utils-0.1.49-1.el5

How reproducible:
100%

Steps to Reproduce:
1. df /mnt/gfs2
2. lvextend
3. gfs2_grow /mnt/gfs2; df /mnt/gfs2
4. Compare output from 1 and 3
  
Actual results:
lvextend -l +50%FREE growfs/gfs2 on west-02
growing gfs2 on west-01
verifying grow
size of gfs /mnt/gfs2 did not increase,
was: 79008, is now: 79008
after 1 seconds


Expected results:
The new size should be available immediately after gfs2_grow exits.

Additional info:
Comment 1 Nate Straz 2008-11-04 16:51:03 EST
Moving this out to RHEL 5.4.  This could cause problems with management tools which expect the grow to work right away, but it's too late in the 5.3 cycle to get this in.
Comment 2 Steve Whitehouse 2008-12-03 05:18:30 EST
We also need to look into what happens when we add new journals to a live filesystem. Currently they seem to be ignored, so that if a node were to mount the newly created journal and then fail, its journal might not be recoverable by one of the previously existing nodes.

I think this is a result of changing the jindex from a special file to a directory: we no longer keep the shared lock on it all the time like we used to.

I spotted this recently when looking at the recovery code.
Comment 3 Robert Peterson 2009-01-21 08:52:02 EST
While fixing this bug and testing the fix, I found another related
nasty bug in gfs2_grow.  It relates to alternate block sizes.  Here
is the symptom:

[root@roth-01 ../src/redhat/RPMS/x86_64]# lvcreate --name roth_lv -L 5G /dev/roth_vg
  Logical volume "roth_lv" created
[root@roth-01 ../src/redhat/RPMS/x86_64]# mkfs.gfs2 -O -b1024 -t bobs_roth:test_gfs -p lock_dlm -j 1 /dev/roth_vg/roth_lv
Device:                    /dev/roth_vg/roth_lv
Blocksize:                 1024
Device Size                5.00 GB (5242880 blocks)
Filesystem Size:           5.00 GB (5242878 blocks)
Journals:                  1
Resource Groups:           20
Locking Protocol:          "lock_dlm"
Lock Table:                "bobs_roth:test_gfs"

[root@roth-01 ../src/redhat/RPMS/x86_64]# mount -tgfs2 /dev/roth_vg/roth_lv /mnt/gfs2
[root@roth-01 ../src/redhat/RPMS/x86_64]# /usr/sbin/lvresize -L +1T /dev/roth_vg/roth_lv
  Extending logical volume roth_lv to 1.00 TB
  Logical volume roth_lv successfully resized
[root@roth-01 ../src/redhat/RPMS/x86_64]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       71G   58G  9.1G  87% /
/dev/sda1              99M   94M  220K 100% /boot
tmpfs                 279M     0  279M   0% /dev/shm
/dev/mapper/roth_vg-roth_lv
                      5.0G  131M  4.9G   3% /mnt/gfs2
[root@roth-01 ../src/redhat/RPMS/x86_64]# gfs2_grow /mnt/gfs2 ; df -h
FS: Mount Point: /mnt/gfs2
FS: Device:      /dev/mapper/roth_vg-roth_lv
FS: Size:        5242878 (0x4ffffe)
FS: RG size:     262140 (0x3fffc)
DEV: Size:       269746176 (0x10140000)
The file system grew by 258304MB.
gfs2_grow complete.
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       71G   58G  9.1G  87% /
/dev/sda1              99M   94M  220K 100% /boot
tmpfs                 279M     0  279M   0% /dev/shm
/dev/mapper/roth_vg-roth_lv
                      257G  131M  257G   1% /mnt/gfs2
[root@roth-01 ../src/redhat/RPMS/x86_64]# 

So I extended the logical volume by 1TB, but gfs2_grow only allocated
enough resource groups for one fourth of that, or 256G.

I debugged that problem and will post a patch shortly.  We will
definitely want to z-stream this one for 5.3.z.
Comment 4 Robert Peterson 2009-01-21 08:54:54 EST
Created attachment 329604 [details]
Patch to fix the problem

This patch was tested on system roth-01.
Comment 5 Robert Peterson 2009-01-21 08:57:32 EST
The same commands/output from comment #3, but with the patch applied:

[root@roth-01 ../bob/cluster/gfs2/mkfs]# lvcreate --name roth_lv -L 5G /dev/roth_vg
  Logical volume "roth_lv" created
[root@roth-01 ../bob/cluster/gfs2/mkfs]# mkfs.gfs2 -O -b1024 -t bobs_roth:test_gfs -p lock_dlm -j 1 /dev/roth_vg/roth_lv
Device:                    /dev/roth_vg/roth_lv
Blocksize:                 1024
Device Size                5.00 GB (5242880 blocks)
Filesystem Size:           5.00 GB (5242878 blocks)
Journals:                  1
Resource Groups:           20
Locking Protocol:          "lock_dlm"
Lock Table:                "bobs_roth:test_gfs"

[root@roth-01 ../bob/cluster/gfs2/mkfs]# mount -tgfs2 /dev/roth_vg/roth_lv /mnt/gfs2
[root@roth-01 ../bob/cluster/gfs2/mkfs]# /usr/sbin/lvresize -L +1T /dev/roth_vg/roth_lv
  Extending logical volume roth_lv to 1.00 TB
  Logical volume roth_lv successfully resized
[root@roth-01 ../bob/cluster/gfs2/mkfs]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       71G   58G  9.1G  87% /
/dev/sda1              99M   94M  220K 100% /boot
tmpfs                 279M     0  279M   0% /dev/shm
/dev/mapper/roth_vg-roth_lv
                      5.0G  131M  4.9G   3% /mnt/gfs2
[root@roth-01 ../bob/cluster/gfs2/mkfs]# ./gfs2_grow /mnt/gfs2 ; df -h
FS: Mount Point: /mnt/gfs2
FS: Device:      /dev/mapper/roth_vg-roth_lv
FS: Size:        5242878 (0x4ffffe)
FS: RG size:     262140 (0x3fffc)
DEV: Size:       1078984704 (0x40500000)
The file system grew by 1048576MB.
gfs2_grow complete.
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       71G   58G  9.1G  87% /
/dev/sda1              99M   94M  220K 100% /boot
tmpfs                 279M     0  279M   0% /dev/shm
/dev/mapper/roth_vg-roth_lv
                      1.1T  131M  1.1T   1% /mnt/gfs2
[root@roth-01 ../bob/cluster/gfs2/mkfs]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       71G   58G  9.1G  87% /
/dev/sda1              99M   94M  220K 100% /boot
tmpfs                 279M     0  279M   0% /dev/shm
/dev/mapper/roth_vg-roth_lv
                      1.1T  131M  1.1T   1% /mnt/gfs2
[root@roth-01 ../bob/cluster/gfs2/mkfs]#
Comment 6 Nate Straz 2009-01-21 09:50:06 EST
I re-ran our growfs test script and was able to reproduce this with gfs2-utils-0.1.53-1.el5.  The test script does multiple file system grows in a row.  In this case the second grow did not immediately return the new size.

Starting io load to filesystems
adding /dev/sdb10 to VG growfs on dash-03
lvextend -l +50%FREE growfs/gfs1 on dash-02
growing gfs1 on dash-03
verifying grow
lvextend -l +50%FREE growfs/gfs2 on dash-03
growing gfs2 on dash-01
verifying grow
size of gfs /mnt/gfs2 did not increase,
was: 265702, is now: 265702

To Reproduce:

1. /usr/tests/sts-rhel5.3/gfs/bin/growfs -2 -i 1
Comment 7 Robert Peterson 2009-01-21 11:11:37 EST
Created attachment 329621 [details]
Patch for the block size problem

There are two problems to be fixed: (1) The non-default block size
problem, and (2) The fact that OTHER NODES do not see changes made
by gfs2_grow until some time after gfs2_grow ends, due to fast_statfs.

The previously posted patch does not fix problem 2.
This patch fixes problem 1 only.  After some discussion on irc,
we decided that problem 2 should be fixed rather than documented
around, and that the solution may very well involve the gfs2 kernel
module.  So it's likely we'll need a kernel bug record as well.

Steve's suggestion was to make statfs check the rindex to see if it
has changed.  "In the unlikely event that it has changed, we go back
to slow statfs [code path] for just the one call."

Nate also discovered that gfs's fast_statfs feature has the same problem
but it's apparently worse: it never re-syncs on the other nodes.
If fast_statfs is not used for gfs, the file system size is cluster
coherent (i.e. the bug does not recreate on gfs1 unless fast_statfs is
used).  I think we've known this is broken for a very long time.
I'm not sure it's easy to fix for gfs, and I'm not sure it's worth it.
But we do need to fix gfs2.

When we come up with a solution for problem 2, I'll likely use this
bugzilla to fix that, and open another for problem 1.
Comment 8 Robert Peterson 2009-01-29 10:05:01 EST
Even though the symptoms are the same, there is a user space problem
and another problem that will likely be fixed in the gfs2 kernel code.
My intent is now to fix problem #1 (user space) described in comment #7
using this bug record.  I cloned this record to bug #482756 so we can
do the kernel portion there.  This fix can be shipped independently
though.
Comment 9 Robert Peterson 2009-01-29 18:44:33 EST
Incidentally, the patch was pushed to the master branch of the
gfs2-utils git repo, and the STABLE2 and STABLE3 branches of the
cluster git repo.
Comment 10 Robert Peterson 2009-01-29 19:00:54 EST
This patch is now pushed to the RHEL5 branch of the cluster git repo
for inclusion into 5.4.  It was tested on roth-01.  So I'm changing
the status to MODIFIED.

This problem is serious enough that I think we need to z-stream it.
I'm bumping the priority and severity to reflect that.  I'm also
adding Ben Kahn and Chris Feist to the cc list toward that end.
Comment 13 Nate Straz 2009-02-13 16:58:54 EST
How close should gfs2_grow get to filling the block device?  In testing with gfs2-utils-0.1.53-1.el5_3.1 I was still about 680MB short of the end of the block device and the RG size was 256MB.

growing gfs1 on z1
FS: Mount Point: /mnt/gfs1
FS: Device:      /dev/mapper/growfs-gfs1
FS: Size:        7139327 (0x6cefff)
FS: RG size:     254973 (0x3e3fd)
DEV: Size:       14282752 (0xd9f000)
The file system grew by 6976MB.
gfs2_grow complete.
...
File system didn't grow to fill volume
fs = 13946, lv = 14625.54

The last two numbers are both in MB.
Comment 14 Robert Peterson 2009-02-16 09:29:20 EST
Unlike gfs, gfs2_grow adds new space on even resource group (RG)
boundaries.  That has the advantage that the rindex file can be
rebuilt in gfs2_fsck using simple block calculations.  The
disadvantage is that gfs2_grow may leave some space at the end of
the device that is unusable unless/until the device is extended
to the next RG boundary.  The "free space" returned by df will
show the space minus the blocks used by the new RGs and their
bitmaps, so the only way to tell whether there's a problem is for
me to examine the file system with gfs2_edit.  Given the RG size shown
above, if there really is 680MB of space unaccounted for in the
file system, including the RG and bitmap space, then that would be
a bug.  I would expect less than 256MB after the last RG and its
bitmaps.  But again, I'd want to take a look to see how everything
was laid out.
Comment 15 Robert Peterson 2009-02-16 13:37:47 EST
Regarding comment #13:  I examined Nate's gfs2 file system with
gfs2_edit and determined that gfs2_grow apparently did the right thing.
Here is exactly what I did:

From gfs2_edit I first determined that the file system block size is 1K.
Then I got the device size (in terms of the 1K block size):

[root@z1 tool]# gfs2_edit -p size /dev/growfs/gfs1 | head -1
Device size: 14282752 (0xd9f000)

So the actual device size is 0xd9f000 blocks of 1K.
Next, I printed out the last two entries of the rindex file:

[root@z1 tool]# gfs2_edit -p rindex /dev/growfs/gfs1 | tail -13
RG #54
  ri_addr               13768626            0xd217b2
  ri_length             64                  0x40
  ri_data0              13768690            0xd217f2
  ri_data               254908              0x3e3bc
  ri_bitbytes           63727               0xf8ef
RG #55
  ri_addr               14023599            0xd5fbaf
  ri_length             64                  0x40
  ri_data0              14023663            0xd5fbef
  ri_data               254908              0x3e3bc
  ri_bitbytes           63727               0xf8ef

Then I did the math.  The spacing between RGs is 0xd5fbaf - 0xd217b2,
which equals 0x3e3fd.  So if gfs2_grow wanted to add another RG to the
file system, it would start at 0xd5fbaf + 0x3e3fd = 0xd9dfac and would
end at 0xd9dfac + 0x3e3fd = 0xddc3a9.
That end point is beyond the end of the device, 0xd9f000, from step 1.
Therefore, gfs2_grow could not possibly have added another full
RG after the last one.  Note that in this particular case, the
bitmaps take up 64 blocks of 1K each (as shown in ri_length) which
means free space in df will be missing that many blocks for each RG
due to the space reserved for bitmaps.
Comment 16 Robert Peterson 2009-03-31 10:12:51 EDT
*** Bug 492932 has been marked as a duplicate of this bug. ***
Comment 17 Robert Peterson 2009-03-31 10:17:52 EDT
I'm changing the summary of this bugzilla.  In reality, the
miscalculations cause one of two symptoms: (1) The file system
grows by too little, or (2) The file system can't grow at all
when it should.  The symptoms are more likely to occur when the
file system has a block size smaller than the default of 4K.  But
the error can occur even with 4K blocks if the file system is small
enough.
Comment 18 Robert Peterson 2009-03-31 18:14:09 EDT
*** Bug 491951 has been marked as a duplicate of this bug. ***
Comment 21 Nate Straz 2009-06-26 15:31:40 EDT
I have not seen this issue during GFS2 growfs testing.  Verified against gfs2-utils-0.1.58-1.el5 and kernel-2.6.18-154.el5.
Comment 23 errata-xmlrpc 2009-09-02 07:01:39 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1337.html
