637972 – GFS2: Not enough space reserved in gfs2_write_begin and possibly elsewhere.


Summary: GFS2: Not enough space reserved in gfs2_write_begin and possibly elsewhere.

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	6.1
Hardware:	All
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Ben Marzinski
QA Contact:	Cluster QE
Docs Contact:
URL:
Whiteboard:
Depends On:	626686
Blocks:	637970

Reported:	2010-09-27 21:31 UTC by Ben Marzinski
Modified:	2011-05-19 12:36 UTC
CC List:	10 users
Fixed In Version:	kernel-2.6.32-83.el6
Doc Type:	Bug Fix
Doc Text:
Clone Of:	626686
Environment:
Last Closed:	2011-05-19 12:36:43 UTC
Target Upstream Version:
Embargoed:

Attachments
Patch that fixes the issue (5.42 KB, patch) 2010-09-28 16:21 UTC, Ben Marzinski	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2011:0542	0	normal	SHIPPED_LIVE	Important: Red Hat Enterprise Linux 6.1 kernel security, bug fix and enhancement update	2011-05-19 11:58:07 UTC

Description Ben Marzinski 2010-09-27 21:31:25 UTC

+++ This bug was initially created as a clone of Bug #626686 +++

For a non-journaled write, gfs2 reserves enough space for all the indirect blocks it may possibly need to allocate or modify, plus a block for the dinode, a block for the local statfs change file, a block for quota changes, and a block if there are no journaled data blocks.

By picking a specific set of writes, I can get GFS2 to modify: all the indirect blocks that it reserves, the dinode, the local statfs change file, a resource group header block, and a resource group bit block.

This happens to come out to the same number of blocks as the reservation.  However, if gfs2 might need to write to the quota file during the transaction, then the reservation is too small and the transaction will fail. Also, if gfs2 can't allocate all the blocks from one resource group bit block, it will fail.

The solution seems to be to make gfs2 additionally reserve either as many blocks as it might allocate (both direct and indirect) or as many blocks as there are bit blocks in the resource group, whichever is fewer.  This guarantees that gfs2 will be able to allocate all of the blocks for the write, even if that requires modifying the maximum number of resource group bit blocks possible.

To make gfs2 use up at least all of the reserved blocks, I made a gfs2 filesystem with a 1024-byte block size, mounted it, and then ran:

# dd if=/dev/zero of=/mnt/test/foo bs=512 count=1
# dd if=/dev/zero of=/mnt/test/foo bs=4096 seek=238418579101562 count=1

This will force gfs2 to allocate all of the indirect blocks that it reserved.
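
To see why the second write needs so many indirect blocks, it helps to work out where it lands. dd counts seek in units of bs, so the write starts at byte offset bs * seek, just under 10^18 bytes into the file. A minimal sketch of that arithmetic follows; the function name dd_write_offset is hypothetical and is not part of dd or gfs2.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte offset at which dd starts writing its output.
 * dd interprets seek as a count of bs-sized blocks to skip.
 * (dd_write_offset is a hypothetical helper, for illustration only.)
 */
static uint64_t dd_write_offset(uint64_t bs, uint64_t seek)
{
	/* Refuse inputs whose product would overflow 64 bits. */
	assert(bs == 0 || seek <= UINT64_MAX / bs);
	return bs * seek;
}
```

For the command above, dd_write_offset(4096, 238418579101562) gives 976562499999997952. The first dd leaves only 512 bytes at the start of the file, so the second write lands in a huge hole and every level of the metadata tree between the dinode and the new data has to be allocated.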

--- Additional comment from bmarzins on 2010-09-27 17:19:43 EDT ---

Created attachment 450040 [details]
latest patch to fix issue

This is the latest patch I submitted to cluster-devel to fix this issue.  It adds an inline function, gfs2_rg_blocks(), that returns either the number of allocated blocks plus one for the rg header, or the total number of blocks in the rg, whichever is less.  This is used by functions that need to allocate blocks in transactions, to make sure they reserve enough blocks to deal with the resource groups.

Comment 2 Ben Marzinski 2010-09-28 16:21:26 UTC

Created attachment 450237 [details]
Patch that fixes the issue

This patch adds an inline function, gfs2_rg_blocks(), that returns either the number of allocated blocks plus one for the rg header, or the total number of blocks in the rg, whichever is less.  This is used by functions that need to allocate blocks in transactions, to make sure they reserve enough blocks to deal with the resource groups.
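
The rule described in this comment can be sketched as a standalone function. This is an illustration of the "whichever is less" logic only, not the attached patch: the name rg_blocks_to_reserve and both parameters are hypothetical, and the real gfs2_rg_blocks() takes its inputs from kernel structures that are not shown in this report.

```c
#include <assert.h>

/*
 * Sketch of the reservation rule, under assumed names:
 *   requested - blocks the caller intends to allocate (hypothetical)
 *   rg_total  - the upper bound the comment calls "the total number
 *               of blocks in the rg" (hypothetical)
 * Returns the smaller of (requested + 1) and rg_total, where the +1
 * accounts for the rg header block.
 */
static unsigned int rg_blocks_to_reserve(unsigned int requested,
					 unsigned int rg_total)
{
	assert(rg_total > 0);
	/* Compare before adding so requested + 1 cannot overflow. */
	return (requested < rg_total) ? requested + 1 : rg_total;
}
```

With rg_total = 10, a request for 3 blocks reserves 4, while a request for 100 blocks is capped at 10. The cap means a large write never reserves more for resource group bookkeeping than the resource group can require.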

Comment 3 RHEL Program Management 2010-10-05 01:51:15 UTC

This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 4 Aristeu Rozanski 2010-11-17 19:45:18 UTC

Patch(es) available on kernel-2.6.32-83.el6

Comment 6 Nate Straz 2010-11-30 22:19:04 UTC

Is there a procedure for reproducing this bug?

Comment 7 Ben Marzinski 2010-12-01 04:29:06 UTC

If you keep running a bunch of

# dd if=/dev/zero of=<gfs2_file> bs=512 count=1
# dd if=/dev/zero of=<gfs2_file> bs=4096 seek=238418579101562 count=1


to different files on a gfs2 filesystem with a 1024-byte block size, you should eventually get to a point where you need to allocate from more than one resource group bit block, and you should hit this.  I never did it myself.  I just noticed that it was wrong while adding fallocate support.

Comment 10 errata-xmlrpc 2011-05-19 12:36:43 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html
