434736 – GFS2: gfs2_adjust_quota has broken unstuffing code

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	5.2
Hardware:	All
OS:	Linux
Priority:	high
Severity:	low
Target Milestone:	rc
Target Release:	---
Assignee:	Don Zickus
QA Contact:	GFS Bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	307091 435075

Reported:	2008-02-25 04:03 UTC by Steve Whitehouse
Modified:	2008-05-21 15:10 UTC
CC List:	4 users
Fixed In Version:	RHBA-2008-0314
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2008-05-21 15:10:39 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
patch to correct lock ordering in gfs2_adjust_quota (2.28 KB, patch) 2008-02-28 19:49 UTC, Abhijith Das	no flags
Patch to add correct number of revokes (566 bytes, patch) 2008-03-03 08:57 UTC, Steve Whitehouse	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2008:0314	0	normal	SHIPPED_LIVE	Updated kernel packages for Red Hat Enterprise Linux 5.2	2008-05-20 18:43:34 UTC

Description Steve Whitehouse 2008-02-25 04:03:19 UTC

Description of problem:


If the unstuffing code in gfs2_adjust_quota ever runs, it will likely cause a
crash, both because the locking order is incorrect and because calling
gfs2_alloc_get recursively isn't allowed. In addition, gfs2_adjust_quota is
called under a transaction, so gfs2_inplace_reserve must not be called there
(it locks rgrps, and the rgrps must be locked before the transaction is started).

The simple solution is to add one block to the reservation at the higher
layer. This can be done unconditionally, even if an unstuff isn't needed, since
the block will be released back to the rgrp if it's not allocated during the
transaction.
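
The ordering rule described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every name in it (toy_fs, toy_reserve, toy_quota_update and so on) is hypothetical and is not a GFS2 kernel API. It models the rgrp lock as a flag, refuses a reservation once a transaction is open or when one is already held, and shows the higher layer reserving one spare block that is handed back if no unstuff happens.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; none of these names are GFS2 kernel APIs. */
struct toy_fs {
    bool rgrp_locked;      /* reservation held (models the rgrp lock) */
    bool in_transaction;   /* a transaction is currently open */
    unsigned reserved;     /* blocks reserved but not yet allocated */
    unsigned free_blocks;  /* blocks still available in the resource group */
};

/* Reserving takes the rgrp lock, so it must happen before the transaction. */
static int toy_reserve(struct toy_fs *fs, unsigned blocks)
{
    if (fs->in_transaction)
        return -1;                 /* wrong order: lock taken after trans start */
    if (fs->rgrp_locked)
        return -2;                 /* recursive reservation is not allowed */
    if (blocks > fs->free_blocks)
        return -3;                 /* not enough space */
    fs->rgrp_locked = true;
    fs->free_blocks -= blocks;
    fs->reserved = blocks;
    return 0;
}

static int toy_trans_begin(struct toy_fs *fs)
{
    if (fs->in_transaction)
        return -1;
    fs->in_transaction = true;
    return 0;
}

/* Consume one block from the existing reservation; no locking happens here. */
static int toy_alloc_block(struct toy_fs *fs)
{
    if (!fs->in_transaction || fs->reserved == 0)
        return -1;
    fs->reserved--;
    return 0;
}

static void toy_trans_end(struct toy_fs *fs)
{
    fs->in_transaction = false;
}

/* Return unused reserved blocks to the pool and drop the lock. */
static void toy_release(struct toy_fs *fs)
{
    fs->free_blocks += fs->reserved;
    fs->reserved = 0;
    fs->rgrp_locked = false;
}

/* Higher layer: always reserve one spare block for a possible unstuff. */
static int toy_quota_update(struct toy_fs *fs, unsigned base_blocks,
                            bool need_unstuff)
{
    int err = toy_reserve(fs, base_blocks + 1);
    if (err)
        return err;
    err = toy_trans_begin(fs);
    if (!err) {
        if (need_unstuff)
            err = toy_alloc_block(fs);   /* uses the spare block */
        toy_trans_end(fs);
    }
    toy_release(fs);                     /* unused blocks go back */
    return err;
}
```

In this model, calling toy_quota_update with need_unstuff set to false leaves free_blocks unchanged, while attempting toy_reserve after toy_trans_begin fails, which mirrors the constraint that rgrps must be locked before the transaction starts.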

Fixing this is required by the next step of the tree walking bz.

Comment 1 RHEL Program Management 2008-02-25 04:17:40 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 2 Abhijith Das 2008-02-28 19:49:55 UTC

Created attachment 296251 [details]
patch to correct lock ordering in gfs2_adjust_quota

Comment 3 Abhijith Das 2008-02-28 19:51:45 UTC

Using the patch in comment #2, unstuffing the quota inode doesn't crash the
machine. However, I'm seeing this withdraw message some time later. Any ideas?

GFS2: fsid=dm-3.0: fatal: assertion "tr->tr_num_revoke <= tr->tr_revokes" failed
GFS2: fsid=dm-3.0:   function = gfs2_trans_end, file = fs/gfs2/trans.c, line = 102
GFS2: fsid=dm-3.0: about to withdraw this file system
GFS2: fsid=dm-3.0: telling LM to withdraw
GFS2: fsid=dm-3.0: withdrawn
 [<e0d1cfec>] gfs2_lm_withdraw+0x73/0x7f [gfs2]
 [<e0d2dcc5>] gfs2_assert_withdraw_i+0x1e/0x30 [gfs2]
 [<e0d2dae0>] gfs2_trans_end+0xc1/0x129 [gfs2]
 [<e0d20c51>] gfs2_write_cache_jdata+0x27e/0x32f [gfs2]
 [<c04e21e0>] __next_cpu+0x12/0x21
 [<c041efff>] find_busiest_group+0x177/0x462
 [<e0d21291>] gfs2_jdata_writepages+0x1d/0x46 [gfs2]
 [<c045a0ef>] do_writepages+0x20/0x32
 [<c048d8a6>] __writeback_single_inode+0x170/0x2af
 [<c048dcbb>] sync_sb_inodes+0x170/0x213
 [<c048df0a>] writeback_inodes+0x6a/0xb0
 [<c045a52e>] wb_kupdate+0x7b/0xdb
 [<c045a94d>] pdflush+0x0/0x1a3
 [<c045aa58>] pdflush+0x10b/0x1a3
 [<c045a4b3>] wb_kupdate+0x0/0xdb
 [<c0435f05>] kthread+0xc0/0xeb
 [<c0435e45>] kthread+0x0/0xeb
 [<c0405c3b>] kernel_thread_helper+0x7/0x10
 =======================
 [<e0d2dccf>] gfs2_assert_withdraw_i+0x28/0x30 [gfs2]
 [<e0d2dae0>] gfs2_trans_end+0xc1/0x129 [gfs2]
 [<e0d20c51>] gfs2_write_cache_jdata+0x27e/0x32f [gfs2]
 [<c04e21e0>] __next_cpu+0x12/0x21
 [<c041efff>] find_busiest_group+0x177/0x462
 [<e0d21291>] gfs2_jdata_writepages+0x1d/0x46 [gfs2]
 [<c045a0ef>] do_writepages+0x20/0x32
 [<c048d8a6>] __writeback_single_inode+0x170/0x2af
 [<c048dcbb>] sync_sb_inodes+0x170/0x213
 [<c048df0a>] writeback_inodes+0x6a/0xb0
 [<c045a52e>] wb_kupdate+0x7b/0xdb
 [<c045a94d>] pdflush+0x0/0x1a3
 [<c045aa58>] pdflush+0x10b/0x1a3
 [<c045a4b3>] wb_kupdate+0x0/0xdb
 [<c0435f05>] kthread+0xc0/0xeb
 [<c0435e45>] kthread+0x0/0xeb
 [<c0405c3b>] kernel_thread_helper+0x7/0x10
 =======================
GFS2: fsid=dm-3.0: tr_num_revoke = 1, tr_revokes = 0
GFS2: Transaction created at: gfs2_write_cache_jdata+0x15e/0x32f [gfs2]

Comment 4 Steve Whitehouse 2008-03-03 08:57:21 UTC

Created attachment 296564 [details]
Patch to add correct number of revokes

Please try the attached patch, which fixes the number of revokes. I'd also
suggest checking that the size of the quota inode is being updated in the
correct places, since it appears that the writepages code thought that the
inode had been truncated.
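
The assertion quoted in comment 3 compares the revokes a transaction actually issued against the number it reserved up front. The following is a minimal userspace sketch of that accounting, not the kernel code: the field names tr_revokes and tr_num_revoke are taken from the log above, while the struct and the rv_* helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of per-transaction revoke accounting (not the kernel struct). */
struct rv_trans {
    unsigned tr_revokes;     /* revokes reserved when the transaction began */
    unsigned tr_num_revoke;  /* revokes actually added during the transaction */
};

/* Begin a transaction, declaring up front how many revokes it may issue. */
static struct rv_trans rv_trans_begin(unsigned revokes)
{
    struct rv_trans tr = { revokes, 0 };
    return tr;
}

/* Record one revoke against the transaction. */
static void rv_add_revoke(struct rv_trans *tr)
{
    tr->tr_num_revoke++;
}

/* Mirrors the condition "tr->tr_num_revoke <= tr->tr_revokes". */
static bool rv_trans_end_ok(const struct rv_trans *tr)
{
    return tr->tr_num_revoke <= tr->tr_revokes;
}
```

A transaction begun with zero reserved revokes that then issues one reproduces the state shown in the log (tr_num_revoke = 1, tr_revokes = 0), and rv_trans_end_ok returns false; reserving at least one revoke at the start makes the same sequence pass.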

Comment 5 Abhijith Das 2008-03-08 21:23:16 UTC

Posted combined patch to rhkernel-list.
http://post-office.corp.redhat.com/archives/rhkernel-list/2008-March/msg00240.html

Comment 8 Don Zickus 2008-03-19 16:24:33 UTC

Included in kernel-2.6.18-86.el5.
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 11 errata-xmlrpc 2008-05-21 15:10:39 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html
