Bug 1498068

Summary: Better mkfs.gfs2 defaults and bounds checking
Product: Red Hat Enterprise Linux 7 Reporter: Andreas Gruenbacher <agruenba>
Component: gfs2-utils Assignee: Andrew Price <anprice>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: low Docs Contact:
Priority: unspecified    
Version: 7.4 CC: anprice, cluster-maint, coughlan, gfs2-maint, nstraz, rhandlin
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: gfs2-utils-3.1.10-7.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-10-30 11:37:31 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Andreas Gruenbacher 2017-10-03 12:38:38 UTC
When mkfs.gfs2 is used to create very small filesystems, it first creates the base filesystem structure before it tries to add the journals.  The default journal size is 128 MiB, no matter how small the overall filesystem size.

For example, "mkfs.gfs2 -O -p lock_nolock /dev/vdc $((128 * 1024 * 1024 / 4096))" fails with:
  "Failed to create resource group index entry: No space left on device"

"mkfs.gfs2 -O -p lock_nolock /dev/vdc $((129 * 1024 * 1024 / 4096))" fails without a proper error message:
  "get_file_buf"

"mkfs.gfs2 -O -p lock_nolock /dev/vdc $((130 * 1024 * 1024 / 4096))" succeeds, but the resulting filesystem is pretty useless with its 128 MiB journal.

Instead, mkfs.gfs2 should shrink the total size occupied by journals to a sane proportion of the total filesystem size (for example, at most 1/8 of the space used for journals, down to the minimum journal size of 8 MiB).

In addition, mkfs.gfs2 should reject creating filesystems with parameters that don't make sense instead of trying to create the filesystem and failing.

It probably isn't useful even for debugging purposes to create a filesystem where more space is used up by journals than is left for data, so that might be an acceptable sanity check.
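
As a rough illustration of the sanity check suggested above, here is a minimal shell sketch. The function name journals_leave_room and its arguments are hypothetical and are not part of gfs2-utils. It also ignores the space taken by other metadata, so it only approximates "more space used by journals than is left for data".

```shell
# journals_leave_room DEVICE_MIB JOURNALS JOURNAL_MIB
# Hypothetical check: succeed only if the journals together take up no more
# space than what remains on the device (other metadata is ignored here).
journals_leave_room() {
    device_mib=$1
    journals=$2
    journal_mib=$3
    total=$(( journals * journal_mib ))
    [ "$total" -le $(( device_mib - total )) ]
}

# The 130 MiB case from the description: one default 128 MiB journal.
if journals_leave_room 130 1 128; then
    echo "ok"
else
    echo "rejected: journals would exceed the remaining space"
fi
```

With these inputs the check rejects the 130 MiB filesystem, because 128 MiB of journal leaves only 2 MiB for everything else.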

Comment 2 Andrew Price 2017-10-03 12:50:36 UTC
Not strictly a duplicate, so I'll leave it open, but this will be fixed at the same time as bug 1158142.

Comment 4 Nate Straz 2017-10-25 21:15:47 UTC
Can you state explicitly what restrictions and adjustments you're making per device size so we can verify it?

Comment 5 Andreas Gruenbacher 2017-10-30 11:48:45 UTC
I would just like mkfs.gfs2 to behave "reasonably" for small filesystem sizes.  All the filesystems that xfstests creates are "local" (-p lock_nolock) and with a single journal file.  mkfs.gfs2 should automatically adjust the journal size and the number of resource groups for all kinds of filesystems, not only "local" ones though, I believe.

The smallest filesystem that xfstests uses seems to be 32 MiB.  I'm currently using the following hack in _scratch_mkfs_sized to fix all tests except the ones which call mkfs.gfs2 for a device mapper device without specifying the filesystem size on the mkfs.gfs2 command line:

# mkfs.gfs2 doesn't automatically shrink journal files on small
# filesystems, so the journal files may end up being bigger than the
# filesystem, which will cause mkfs.gfs2 to fail.  Until that's fixed,
# shrink the journal size to at most one eighth of the filesystem and at
# least 8 MiB, the minimum size allowed.
MIN_JOURNAL_SIZE=8
DEFAULT_JOURNAL_SIZE=128
if (( fssize/8 / (1024*1024) < DEFAULT_JOURNAL_SIZE )); then
    (( JOURNAL_SIZE = fssize/8 / (1024*1024) ))
    (( JOURNAL_SIZE >= MIN_JOURNAL_SIZE )) || JOURNAL_SIZE=$MIN_JOURNAL_SIZE
    MKFS_OPTIONS="-J $JOURNAL_SIZE $MKFS_OPTIONS"
fi
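
To see what the hack above produces for a few sizes, the same arithmetic can be wrapped in a function. This is only a sketch: journal_opt is a hypothetical name, and it prints the option instead of modifying MKFS_OPTIONS.

```shell
# journal_opt FSSIZE_BYTES
# Print the -J option the hack above would add, or nothing when one eighth
# of the filesystem is at least the 128 MiB default.
journal_opt() {
    fssize=$1
    min=8
    default=128
    size=$(( fssize / 8 / (1024 * 1024) ))
    if [ "$size" -lt "$default" ]; then
        [ "$size" -ge "$min" ] || size=$min
        echo "-J $size"
    fi
}

journal_opt $(( 32 * 1024 * 1024 ))     # one eighth is 4 MiB, raised to the 8 MiB minimum
journal_opt $(( 100 * 1024 * 1024 ))    # one eighth, rounded down to whole MiB
journal_opt $(( 1024 * 1024 * 1024 ))   # one eighth equals the default, so nothing is printed
```

Note that for a 32 MiB filesystem the 8 MiB minimum wins, so the journal ends up at one quarter of the filesystem and not one eighth.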

Comment 7 Andrew Price 2018-02-08 15:40:58 UTC
(In reply to Andreas Gruenbacher from comment #5)
> The smallest filesystem that xfstests uses seems to be 32 MiB.

> # filesystem, which will cause mkfs.gfs2 to fail.  Until that's fixed,
> # shrink the journal size to at most one eighth of the filesystem and at
> # least 8 MiB, the minimum size allowed.

With an 8M minimum journal, a 1/8 cap would give us a minimum fs size of roughly 64M rather than 32M, but I'll aim for 1/4 so that we can get down to 32M anyway. Which xfstests should I run to check that the changes meet the requirements?
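
The arithmetic behind those two figures can be sketched as follows. It assumes a single journal at the 8 MiB minimum and ignores all other metadata, which is why the results are only rough lower bounds. min_fs_mib is a hypothetical helper name.

```shell
MIN_JOURNAL_MIB=8   # minimum journal size stated in this bug

# min_fs_mib DENOMINATOR [JOURNALS]
# Smallest filesystem (in MiB) for which JOURNALS minimum-size journals stay
# within 1/DENOMINATOR of the filesystem. JOURNALS defaults to 1.
min_fs_mib() {
    echo $(( MIN_JOURNAL_MIB * ${2:-1} * $1 ))
}

echo "1/8 cap: $(min_fs_mib 8) MiB"
echo "1/4 cap: $(min_fs_mib 4) MiB"
```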

Comment 8 Andreas Gruenbacher 2018-02-08 16:02:48 UTC
I know that generic/108 fails when trying to create a 100M filesystem on top of LVM.  I haven't documented which tests the _scratch_mkfs hack described in comment 5 fixes.

For testing this properly, the _scratch_mkfs hack should be removed and most of the tests in the auto group should still succeed.  The problem is that some of the tests currently hang and some fail either occasionally or consistently, so it will probably be easiest for me to regression test this for you for now.

Comment 9 Andrew Price 2018-02-12 20:30:00 UTC
I've submitted a patch upstream to scale down the journal size and there's a RHEL7 scratch build here if you'd like to give it a spin with xfstests: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=15267411

Comment 10 Andrew Price 2018-02-13 20:48:47 UTC
I've sent a v2 patch upstream with more sensible defaults. Here's a RHEL7 build for that one:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=15277790

Comment 11 Andreas Gruenbacher 2018-02-14 16:03:09 UTC
This last version seems to work much better with xfstests; one additional gfs2 kernel bug discovered so far.  Thanks!

Comment 14 Andrew Price 2018-04-12 10:34:58 UTC
*** Bug 1158142 has been marked as a duplicate of this bug. ***

Comment 15 Andrew Price 2018-04-15 17:43:13 UTC
Another test case to check is a segfault while building too-small filesystems:

# mkfs.gfs2 -O -p lock_nolock testvol 32860
It appears to contain an existing filesystem (gfs2)
This will destroy any data on testvol
Adding journals: Done 
Building resource groups: Segmentation fault


With the patch it fails more gracefully when there's not enough space:

# mkfs.gfs2 -O -p lock_nolock testvol 4096
It appears to contain an existing filesystem (gfs2)
This will destroy any data on testvol
Adding journals: Done 
Building resource groups: Done 
Creating quota file: Done
Writing superblock and syncing: Done
[...]

# mkfs.gfs2 -O -p lock_nolock testvol 4095
It appears to contain an existing filesystem (gfs2)
gfs2 will not fit on this device.
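
For reference, the block counts in these test cases can be converted to sizes as below. The sketch assumes a 4096-byte block size, which is not stated in this comment. blocks_to_kib is a hypothetical helper, not part of gfs2-utils.

```shell
BLOCK_SIZE=4096   # assumed block size in bytes

# blocks_to_kib BLOCKS
# Convert a block count to KiB under the assumed block size.
blocks_to_kib() {
    echo $(( $1 * BLOCK_SIZE / 1024 ))
}

for blocks in 4096 4095; do
    echo "$blocks blocks = $(blocks_to_kib "$blocks") KiB"
done
```

Under that assumption, 4096 blocks is exactly 16 MiB (16384 KiB) and 4095 blocks falls one block short of it.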

Comment 16 Nate Straz 2018-04-20 19:14:14 UTC
Is there a minimum amount of free space required after mkfs.gfs2? i.e. Could I create a file system with one block of free space and the rest journals? Is there some proportion of free space to device size required?

Is there a minimum number of resource groups that should be enforced?

Looking back through the bug I can see these criteria:

 - Free space should not be less than the total journal space
 - Journals should not be more than 1/4 of the device size
 - mkfs.gfs2 should fail gracefully and not segfault when mkfs will not succeed

Comment 17 Andrew Price 2018-04-23 10:17:12 UTC
(In reply to Nate Straz from comment #16)
> Is there a minimum amount of free space required after mkfs.gfs2? i.e. Could
> I create a file system with one block of free space and the rest journals?
> Is there some proportion of free space to device size required?
> 
> Is there a minimum number of resource groups that should be enforced?
> 
> Looking back through the bug I can see these criteria:
> 
>  - Free space should not be less than the total journal space

It's better to frame that as "journals should not take up more than 1/2 of the device". In reality the other metafs files will take up a little more space, but the journals always occupy much more space than they do.

>  - Journals should not be more than 1/4 of the device size

1/2 was finally chosen after trying 1/4, as 1/4 combined with the minimum journal size made the minimum fs size too large. That's the total journal size, so with 2 journals of minimum size 8M, the minimum fs size will be around 32M.

>  - mkfs.gfs2 should fail gracefully and not segfault when mkfs will not
> succeed

Yes.
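
A minimal sketch of the 1/2 rule as described in this comment, for anyone who wants to predict the limits on a given device. max_journal_mib is a hypothetical name, the calculation ignores the other metafs files, and the actual mkfs.gfs2 implementation may differ.

```shell
MIN_JOURNAL_MIB=8   # minimum journal size stated in this bug

# max_journal_mib DEVICE_MIB JOURNALS
# Print the largest per-journal size (MiB) that keeps the total journal space
# within half of the device. Return 1 and print nothing if that size would be
# below the minimum journal size.
max_journal_mib() {
    max=$(( $1 / 2 / $2 ))
    [ "$max" -ge "$MIN_JOURNAL_MIB" ] || return 1
    echo "$max"
}

for j in 1 2 3; do
    if size=$(max_journal_mib 32 "$j"); then
        echo "$j journal(s): up to $size MiB each"
    else
        echo "$j journal(s): will not fit"
    fi
done
```

For a 32 MiB device this gives 16 MiB for one journal, 8 MiB for two, and no fit for three.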

Comment 23 Nate Straz 2018-09-18 14:53:23 UTC
Added a new test scenario to gfs_fsck_stress and verified against gfs2-utils-3.1.10-8.el7.x86_64


SCENARIO - [minimumsize]
Test very small file system sizes
Creating 32M LV minimum on host-114
WARNING: gfs2 signature detected on /dev/fsck/minimum at offset 65536. Wipe it? [y/n]: [n]
  Aborted wiping of gfs2.
  1 existing signature left on the device.
Creating file system on /dev/fsck/minimum with options '-p lock_nolock -j 1' on host-114
It appears to contain an existing filesystem (gfs2)
/dev/fsck/minimum is a symbolic link to /dev/dm-2
This will destroy any data on /dev/dm-2
Discarding device contents (may take a while on large devices): Done
Adding journals: Done
Building resource groups: Done
Creating quota file: Done
Writing superblock and syncing: Done
Device:                    /dev/fsck/minimum
Block size:                4096
Device size:               0.03 GB (8192 blocks)
Filesystem size:           0.03 GB (8191 blocks)
Journals:                  1
Journal size:              8MB
Resource groups:           2
Locking protocol:          "lock_nolock"
Lock table:                ""
UUID:                      8cc63124-6bee-4e75-85f4-a42aeb34d182
Creating file system on /dev/fsck/minimum with options '-p lock_nolock -j 2' on host-114
It appears to contain an existing filesystem (gfs2)
/dev/fsck/minimum is a symbolic link to /dev/dm-2
This will destroy any data on /dev/dm-2
Discarding device contents (may take a while on large devices): Done
Adding journals: Done
Building resource groups: Done
Creating quota file: Done
Writing superblock and syncing: Done
Device:                    /dev/fsck/minimum
Block size:                4096
Device size:               0.03 GB (8192 blocks)
Filesystem size:           0.03 GB (8188 blocks)
Journals:                  2
Journal size:              8MB
Resource groups:           3
Locking protocol:          "lock_nolock"
Lock table:                ""
UUID:                      b5b4ebf9-a4af-4b70-b604-41d5529c93be
Creating file system on /dev/fsck/minimum with options '-p lock_nolock -j 3' on host-114
It appears to contain an existing filesystem (gfs2)
gfs2 will not fit on this device.
Creating file system on /dev/fsck/minimum with options '-p lock_nolock -j 1 -J 32' on host-114
It appears to contain an existing filesystem (gfs2)
gfs2 will not fit on this device.
Maximum size for 1 journals on this device is 16MB.
Creating file system on /dev/fsck/minimum with options '-p lock_nolock -j 1 -J 16' on host-114
It appears to contain an existing filesystem (gfs2)
/dev/fsck/minimum is a symbolic link to /dev/dm-2
This will destroy any data on /dev/dm-2
Discarding device contents (may take a while on large devices): Done
Adding journals: Done
Building resource groups: Done
Creating quota file: Done
Writing superblock and syncing: Done
Device:                    /dev/fsck/minimum
Block size:                4096
Device size:               0.03 GB (8192 blocks)
Filesystem size:           0.03 GB (8191 blocks)
Journals:                  1
Journal size:              16MB
Resource groups:           2
Locking protocol:          "lock_nolock"
Lock table:                ""
UUID:                      52c62067-0bfe-4c15-bf80-5dac0ba15f55
Creating file system on /dev/fsck/minimum with options '-p lock_dlm -j 2 -J 32 -t STSRHTS31141:minimum' on host-114
It appears to contain an existing filesystem (gfs2)
gfs2 will not fit on this device.
Maximum size for 2 journals on this device is 8MB.
Creating file system on /dev/fsck/minimum with options '-p lock_dlm -j 3 -J 8 -t STSRHTS31141:minimum' on host-114
It appears to contain an existing filesystem (gfs2)
gfs2 will not fit on this device.
Creating file system on /dev/fsck/minimum with options '-p lock_dlm -j 2 -J 8 -t STSRHTS31141:minimum' on host-114
It appears to contain an existing filesystem (gfs2)
/dev/fsck/minimum is a symbolic link to /dev/dm-2
This will destroy any data on /dev/dm-2
Discarding device contents (may take a while on large devices): Done
Adding journals: Done
Building resource groups: Done
Creating quota file: Done
Writing superblock and syncing: Done
Device:                    /dev/fsck/minimum
Block size:                4096
Device size:               0.03 GB (8192 blocks)
Filesystem size:           0.03 GB (8188 blocks)
Journals:                  2
Journal size:              8MB
Resource groups:           3
Locking protocol:          "lock_dlm"
Lock table:                "STSRHTS31141:minimum"
UUID:                      41ba3dbe-9d49-48d7-a654-671cd1680e1e
Mounting gfs2 /dev/fsck/minimum on host-114 with opts ''
Mounting gfs2 /dev/fsck/minimum on host-115 with opts ''
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/fsck-minimum   32M   19M   14M  57% /mnt/fsck
Unmounting /mnt/fsck on host-114
Unmounting /mnt/fsck on host-115
Starting fsck of /dev/fsck/minimum on host-114
fsck output in /tmp/gfs_fsck_stress.25159/1.minimumsize/1.fsck-host-114.log
fsck.gfs2 of /dev/fsck/minimum on host-114 took 0 seconds
Removing LV minimum on host-114

Comment 25 errata-xmlrpc 2018-10-30 11:37:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3272