Bug 1153316 - GFS2: fsck.gfs2 requires too much memory on large file systems
Summary: GFS2: fsck.gfs2 requires too much memory on large file systems
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: gfs2-utils
Version: 7.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Robert Peterson
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On: 1184482
Blocks: 1111393 1165285 1268045 1497636
Reported: 2014-10-15 18:32 UTC by Nate Straz
Modified: 2017-10-02 09:59 UTC (History)

Fixed In Version: gfs2-utils-3.1.8-1.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1268045
Environment:
Last Closed: 2015-11-19 03:52:42 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch 1 of 3. Too-long filename check patch (1.61 KB, patch)
2014-12-23 21:14 UTC, Robert Peterson
Patch 2 of 3: print out block number in pass3 (1.35 KB, patch)
2014-12-23 21:15 UTC, Robert Peterson
Patch 3 of 3: move to small blockmap (74.08 KB, patch)
2014-12-23 21:17 UTC, Robert Peterson
Tarball of the most recent patches (24.91 KB, application/octet-stream)
2015-02-12 20:00 UTC, Robert Peterson


Links
System: Red Hat Product Errata
ID: RHBA-2015:2178
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: gfs2-utils bug fix and enhancement update
Last Updated: 2015-11-19 07:52:21 UTC

Description Nate Straz 2014-10-15 18:32:41 UTC
Description of problem:

Trying to run fsck.gfs2 on a 250TB LV failed on a system with 24GB RAM.

[root@buzz-01 ~]# fsck.gfs2 -y /dev/mapper/buzzez-nodata
Initializing fsck
Validating Resource Group index.
Level 1 rgrp check: Checking if all rgrp and rindex values are good.
(level 1 passed)
This system doesn't have enough memory and swap space to fsck this file system.
Additional memory needed is approximately: 32012MB
Please increase your swap space by that amount and run gfs2_fsck again.

Version-Release number of selected component (if applicable):
gfs2-utils-3.1.7-1.el7.x86_64

How reproducible:
Easily

Steps to Reproduce:
1. Create a 250TB lun (thinp is ok)
2. mkfs.gfs2 -O -p lock_nolock $dev
3. fsck.gfs2 -y $dev

Actual results:

See above

Expected results:
fsck.gfs2 should not require large amounts of memory as the file system size grows.

Additional info:

Memory usage on a 100TB file system is about 16GB.

[root@buzz-01 ~]# lvcreate -L 100T -n big buzzez
WARNING: gfs2 signature detected on /dev/buzzez/big at offset 65536. Wipe it? [y/n] y
  Wiping gfs2 signature on /dev/buzzez/big.
  Logical volume "big" created
[root@buzz-01 ~]# lvs
  LV   VG           Attr       LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  big  buzzez       -wi-a----- 100.00t
[root@buzz-01 ~]# /usr/bin/time mkfs.gfs2 -j 5 -p lock_dlm -t buzzez:big /dev/buzzez/big -O
/dev/buzzez/big is a symbolic link to /dev/dm-7
This will destroy any data on /dev/dm-7
Device:                    /dev/buzzez/big
Block size:                4096
Device size:               102400.00 GB (26843545600 blocks)
Filesystem size:           102398.62 GB (26843185521 blocks)
Journals:                  5
Resource groups:           51204
Locking protocol:          "lock_dlm"
Lock table:                "buzzez:big"
UUID:                      4968d7e3-4018-42b2-9cfd-66026639ab87
22.24user 15.31system 4:21.11elapsed 14%CPU (0avgtext+0avgdata 55632maxresident)k
11176inputs+14850296outputs (1major+58098minor)pagefaults 0swaps
[root@buzz-01 ~]# /usr/bin/time fsck.gfs2 -y /dev/buzzez/big
Initializing fsck
Validating Resource Group index.
Level 1 rgrp check: Checking if all rgrp and rindex values are good.
(level 1 passed)
Starting pass1
pass1 completed in 3.780s
Starting pass1b
pass1b completed in 0.009s
Starting pass1c
pass1c completed in 0.000s
Starting pass2
pass2 completed in 3m4.543s
Starting pass3
pass3 completed in 0.000s
Starting pass4
pass4 completed in 0.000s
Starting pass5
pass5 completed in 4m20.591s
Starting check_statfs
check_statfs completed in 0.004s
gfs2_fsck complete
463.10user 15.21system 8:25.53elapsed 94%CPU (0avgtext+0avgdata 16310096maxresident)k
13542696inputs+272outputs (1major+7572320minor)pagefaults 0swaps

Comment 3 Nate Straz 2014-11-24 16:37:06 UTC
I finally got a fsck.gfs2 run in on an empty 250TB file system.

[root@mckinley-04 fsck]# /usr/bin/time fsck.gfs2 -y /dev/250TB/gfs2
Initializing fsck
Validating Resource Group index.
Level 1 rgrp check: Checking if all rgrp and rindex values are good.
(level 1 passed)
Starting pass1
pass1 completed in 13.524s
Starting pass1b
pass1b completed in 0.000s
Starting pass1c
pass1c completed in 0.000s
Starting pass2
pass2 completed in 13m38.589s
Starting pass3
pass3 completed in 0.000s
Starting pass4
pass4 completed in 0.000s
Starting pass5
pass5 completed in 19m45.703s
Starting check_statfs
check_statfs completed in 0.121s
gfs2_fsck complete
1628.86user 510.71system 36:12.72elapsed 98%CPU (0avgtext+0avgdata 17819152maxresident)k
33820072inputs+272outputs (0major+20585653minor)pagefaults 0swaps

Snapshot from top:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
90648 root      20   0 47.850g 0.017t    872 R 100.0 13.6   7:30.55 fsck.gfs2

Comment 4 Robert Peterson 2014-12-23 21:14:18 UTC
Created attachment 972515
Patch 1 of 3. Too-long filename check patch

This is the first of 3 patches for bz#1153316. Its purpose is
to check directory entries for file names that are too long to
physically fit within the constraints of the block.

Comment 5 Robert Peterson 2014-12-23 21:15:25 UTC
Created attachment 972516
Patch 2 of 3: print out block number in pass3

This patch changes pass3 so that it prints out the block number
when it finds a bad directory.

Comment 6 Robert Peterson 2014-12-23 21:17:37 UTC
Created attachment 972518
Patch 3 of 3: move to small blockmap

This patch changes fsck.gfs2 so that it uses a blockmap that
has a direct 1:1 correlation with the rgrp bitmap. So instead
of using 8 bits for the block state, we only use 2 bits.
That reduces the memory to 1/4 of its previous requirements.
This patch is still being tested and is likely to have bugs
that still need fixing. So far, it passes the hardest tests
though.
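
To illustrate the idea of a 2-bit-per-block map, here is a minimal Python sketch. It is not the fsck.gfs2 code from the patch; the class name, method names, and state values are hypothetical, and it only shows how four block states can be packed into each byte and how that changes the size of the map itself. The block count used in the usage note is the 100TB device from the description.

```python
class TwoBitBlockmap:
    """Hypothetical packed map: one 2-bit state per block, four blocks per byte."""

    BITS = 2
    PER_BYTE = 8 // BITS          # 4 block states fit in one byte
    MASK = (1 << BITS) - 1        # 0b11

    def __init__(self, nblocks):
        self.nblocks = nblocks
        # Round up so a partial final byte is still allocated.
        self.data = bytearray((nblocks + self.PER_BYTE - 1) // self.PER_BYTE)

    def _locate(self, block):
        if not 0 <= block < self.nblocks:
            raise IndexError(block)
        return block // self.PER_BYTE, (block % self.PER_BYTE) * self.BITS

    def set(self, block, state):
        if not 0 <= state <= self.MASK:
            raise ValueError(state)
        byte, shift = self._locate(block)
        # Clear the two bits for this block, then store the new state.
        cleared = self.data[byte] & ~(self.MASK << shift) & 0xFF
        self.data[byte] = cleared | (state << shift)

    def get(self, block):
        byte, shift = self._locate(block)
        return (self.data[byte] >> shift) & self.MASK


def blockmap_bytes(nblocks, bits_per_block):
    """Size in bytes of a map holding bits_per_block bits for each block."""
    return (nblocks * bits_per_block + 7) // 8
```

For the 26843545600-block device shown in the description, blockmap_bytes(26843545600, 8) is four times blockmap_bytes(26843545600, 2), which is the 1/4 reduction described above. This covers the map only, not the rest of fsck.gfs2's memory use.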

Comment 7 Robert Peterson 2015-02-12 20:00:24 UTC
Created attachment 991126
Tarball of the most recent patches

This tarball contains all the latest/greatest RHEL7 patches for
this bug. All of them have gone upstream except for the last two.
They've all been extensively tested on system gfs-i24c-01.lab.bos.

0001-fsck.gfs2-Change-basic-dentry-checks-for-too-long-of.patch
0002-fsck.gfs2-Print-out-block-number-when-pass3-finds-a-.patch
0003-fsck.gfs2-Adjust-when-hash-table-is-doubled.patch
0004-fsck.gfs2-Revise-undo-processing.patch
0005-fsck.gfs2-remove-duplicate-designation-during-undo.patch
0006-fsck.gfs2-Fix-a-use-after-free-in-pass2.patch
0007-fsck.gfs2-fix-double-free-bug.patch
0008-fsck.gfs2-rgrp-block-count-reform.patch
0009-fsck.gfs2-Change-block_map-to-match-bitmap.patch
0010-fsck.gfs2-Fix-journal-sequence-number-reporting-prob.patch

Comment 8 Robert Peterson 2015-02-13 14:45:44 UTC
All ten of these patches now appear in the master branch of the
upstream gfs2-utils git repo. So it's just a matter of pushing
them to the RHEL7 branch at this point.

Comment 15 Nate Straz 2015-08-24 21:07:02 UTC
fsck.gfs2 memory usage from gfs2-utils-3.1.8-4.el7.x86_64 on an empty file system.

32MB Resource Groups
FS Size  Max resident (GB)
   1T     0.199
  10T     1.985
 100T    19.839
 250T    49.596

256MB Resource Groups
FS Size  Max resident (GB)
   1T     0.143
  10T     1.434
 100T    14.343
 250T    35.858

2048MB Resource Groups
FS Size  Max resident (GB)
   1T     0.128
  10T     1.284
 100T    12.848
 250T    32.121
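
The figures above can be normalized to memory per terabyte to see how closely usage tracks file system size. The following Python sketch is only an illustration of that arithmetic; the variable and function names are made up here, and the data is copied from the three tables in this comment.

```python
# Max resident memory in GB, keyed by resource group size and FS size in TB.
# Values are taken from the tables in this comment.
measurements = {
    "32MB":   {1: 0.199, 10: 1.985, 100: 19.839, 250: 49.596},
    "256MB":  {1: 0.143, 10: 1.434, 100: 14.343, 250: 35.858},
    "2048MB": {1: 0.128, 10: 1.284, 100: 12.848, 250: 32.121},
}


def gb_per_tb(points):
    """Return memory used per TB of file system for each measured size."""
    return {size_tb: gb / size_tb for size_tb, gb in points.items()}


for rg_size, points in measurements.items():
    ratios = gb_per_tb(points)
    summary = ", ".join(f"{tb}T: {r:.3f}" for tb, r in ratios.items())
    print(f"{rg_size} RGs, GB per TB -> {summary}")
```

Within each resource group size the ratio stays nearly constant from 1T to 250T, so memory use on an empty file system grows roughly linearly with size, and larger resource groups lower the per-TB cost.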

Comment 16 Nate Straz 2015-09-21 17:43:02 UTC
Testing on gfs2-utils-3.1.8-4.el7.x86_64 with an 80% full file system ran into a runtime scalability issue in pass1c (bz 1257625) above 1TB, so I stopped after 16TB.

Size   Elapsed  Max resident
 16G     00:41    0.00GB
 32G     01:22    0.01GB
 64G     03:25    0.02GB
128G     06:09    0.03GB
256G     11:35    0.07GB
512G     25:53    0.14GB
  1T   2:17:17    0.27GB
  2T   6:34:12    0.54GB
  4T  16:23:25    1.07GB
  8T  36:47:47    2.13GB
 16T  66:29:01    4.26GB


Bob handed me a test binary which I used to keep going and I was able to get up to 64TB.

Size   Elapsed  Max resident
512G   0:28:56   0.10GB
  1T   0:58:18   0.20GB
  2T   2:11:11   0.40GB
  4T   4:23:17   0.80GB
  8T   9:09:03   1.61GB
 16T  16:37:00   3.20GB
 32T  34:46:32   6.39GB
 64T  81:03:57  12.78GB

At the current trend, I estimate that a 256TB file system would take about two weeks and use 52GB of RAM.
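
One way to arrive at an estimate of this size is a simple linear extrapolation from the 64T row of the second table. The Python sketch below assumes that both elapsed time and memory scale in proportion to file system size, which is an assumption made for illustration and not necessarily how the estimate above was produced; the function names are hypothetical.

```python
def hours(elapsed):
    """Convert an H:MM:SS elapsed string from the table into hours."""
    h, m, s = (int(part) for part in elapsed.split(":"))
    return h + m / 60 + s / 3600


def extrapolate_linear(size_tb, value, target_tb):
    """Scale a measured value in proportion to file system size."""
    return value * target_tb / size_tb


# 64T row from the test-binary table: 81:03:57 elapsed, 12.78GB max resident.
est_hours = extrapolate_linear(64, hours("81:03:57"), 256)
est_mem_gb = extrapolate_linear(64, 12.78, 256)

print(f"estimated elapsed: {est_hours / 24:.1f} days")
print(f"estimated max resident: {est_mem_gb:.1f} GB")
```

Note that the measured runtime more than doubles between the 32T and 64T rows, so a strictly linear projection of elapsed time is likely to be on the optimistic side.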

Comment 17 Nate Straz 2015-10-01 16:45:38 UTC
250TB empty GFS2 file system w/ default 256MB RGs

gfs2-utils-3.1.7-6.el7.x86_64  maxresident 53999408k
gfs2-utils-3.1.8-6.el7.x86_64  maxresident 37615632k

Definitely an improvement on empty file systems, but there is still a lot more work to do.  I will clone this bug to continue the work in the next release.
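
The size of the improvement can be worked out from the two maxresident figures. This Python sketch assumes the values reported by /usr/bin/time are in units of 1024 bytes; the names are illustrative only.

```python
KIB_PER_GIB = 1024 * 1024


def kib_to_gib(kib):
    """Convert a maxresident figure, assumed to be in KiB, to GiB."""
    return kib / KIB_PER_GIB


before_kib = 53999408  # gfs2-utils-3.1.7-6.el7
after_kib = 37615632   # gfs2-utils-3.1.8-6.el7

# Fraction of peak resident memory saved by the newer package.
reduction = 1 - after_kib / before_kib

print(f"before: {kib_to_gib(before_kib):.2f} GiB")
print(f"after:  {kib_to_gib(after_kib):.2f} GiB")
print(f"reduction: {reduction:.1%}")
```

Under that assumption the peak drops from roughly 51.5 to 35.9 in 1024-based units, a reduction of about 30%.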

Comment 19 errata-xmlrpc 2015-11-19 03:52:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2178.html

