Bug 434736

Summary: GFS2: gfs2_adjust_quota has broken unstuffing code
Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.2
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: low
Priority: high
Reporter: Steve Whitehouse <swhiteho>
Assignee: Don Zickus <dzickus>
QA Contact: GFS Bugs <gfs-bugs>
CC: edamato, lwang, rpeterso, swhiteho
Target Milestone: rc
Fixed In Version: RHBA-2008-0314
Doc Type: Bug Fix
Last Closed: 2008-05-21 15:10:39 UTC
Bug Blocks: 307091, 435075
Attachments:
patch to correct lock ordering in gfs2_adjust_quota (flags: none)
Patch to add correct number of revokes (flags: none)

Description Steve Whitehouse 2008-02-25 04:03:19 UTC
Description of problem:


If the unstuffing code in gfs2_adjust_quota ever runs, it will likely cause a
crash, because the locking order is incorrect and because gfs2_alloc_get must
not be called recursively. In addition, gfs2_adjust_quota is called within a
transaction, so gfs2_inplace_reserve must not be called there: it locks rgrps,
and the rgrps must be locked before the transaction is started.

The simple solution is to add one block to the reservation at the higher
layer. This can be done unconditionally, even if an unstuff isn't needed, since
the block will be released back to the rgrp if it's not allocated during the
transaction.
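
The ordering described above can be illustrated with a small user-space toy
model. This is only a sketch: every type and function name below (toy_rgrp,
toy_reservation, toy_quota_update and so on) is hypothetical and is not part of
the GFS2 kernel API. It assumes a single rgrp and shows the reservation being
taken before the transaction starts, one extra block being reserved
unconditionally, and any unused block going back to the rgrp afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these types or functions exist in GFS2. */
struct toy_rgrp {
    unsigned free_blocks;   /* blocks available in the resource group */
    bool locked;            /* true while a reservation holds the rgrp */
};

struct toy_reservation {
    struct toy_rgrp *rgrp;
    unsigned reserved;      /* blocks set aside before the transaction */
    unsigned used;          /* blocks actually allocated inside it */
    bool in_transaction;
};

/* Step 1: lock the rgrp and reserve blocks, before any transaction. */
static bool toy_reserve(struct toy_reservation *res, struct toy_rgrp *rgrp,
                        unsigned blocks)
{
    if (rgrp->locked || rgrp->free_blocks < blocks)
        return false;
    rgrp->locked = true;
    rgrp->free_blocks -= blocks;
    res->rgrp = rgrp;
    res->reserved = blocks;
    res->used = 0;
    res->in_transaction = false;
    return true;
}

/* Step 2: start the transaction; the rgrp must already be locked. */
static void toy_trans_begin(struct toy_reservation *res)
{
    assert(res->rgrp && res->rgrp->locked);
    res->in_transaction = true;
}

/* Inside the transaction, allocation only draws on the reservation. */
static bool toy_alloc_block(struct toy_reservation *res)
{
    if (!res->in_transaction || res->used == res->reserved)
        return false;
    res->used++;
    return true;
}

/* Step 3: end the transaction, return unused blocks, unlock the rgrp. */
static unsigned toy_trans_end_and_release(struct toy_reservation *res)
{
    unsigned unused = res->reserved - res->used;

    res->in_transaction = false;
    res->rgrp->free_blocks += unused;
    res->rgrp->locked = false;
    res->reserved = 0;
    res->used = 0;
    return unused;
}

/*
 * Higher-layer caller: always reserves one extra block in case the quota
 * inode has to be unstuffed. Returns the rgrp's free block count afterwards.
 */
static unsigned toy_quota_update(struct toy_rgrp *rgrp, unsigned data_blocks,
                                 bool needs_unstuff)
{
    struct toy_reservation res;

    if (!toy_reserve(&res, rgrp, data_blocks + 1))
        return rgrp->free_blocks;
    toy_trans_begin(&res);
    for (unsigned i = 0; i < data_blocks; i++)
        toy_alloc_block(&res);
    if (needs_unstuff)
        toy_alloc_block(&res);  /* consumes the extra block */
    toy_trans_end_and_release(&res);
    return rgrp->free_blocks;
}
```

In this model, calling toy_quota_update with needs_unstuff set to false costs
nothing extra: the spare block is reserved, never used, and handed back when
the transaction ends.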

Fixing this is required by the next step of the tree walking bz.

Comment 1 RHEL Program Management 2008-02-25 04:17:40 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 2 Abhijith Das 2008-02-28 19:49:55 UTC
Created attachment 296251 [details]
patch to correct lock ordering in gfs2_adjust_quota

Comment 3 Abhijith Das 2008-02-28 19:51:45 UTC
Using the patch in comment #2, unstuffing the quota inode doesn't crash the
machine. However, I'm seeing this withdraw message some time later. Any ideas?

GFS2: fsid=dm-3.0: fatal: assertion "tr->tr_num_revoke <= tr->tr_revokes" failed
GFS2: fsid=dm-3.0:   function = gfs2_trans_end, file = fs/gfs2/trans.c, line = 102
GFS2: fsid=dm-3.0: about to withdraw this file system
GFS2: fsid=dm-3.0: telling LM to withdraw
GFS2: fsid=dm-3.0: withdrawn
 [<e0d1cfec>] gfs2_lm_withdraw+0x73/0x7f [gfs2]
 [<e0d2dcc5>] gfs2_assert_withdraw_i+0x1e/0x30 [gfs2]
 [<e0d2dae0>] gfs2_trans_end+0xc1/0x129 [gfs2]
 [<e0d20c51>] gfs2_write_cache_jdata+0x27e/0x32f [gfs2]
 [<c04e21e0>] __next_cpu+0x12/0x21
 [<c041efff>] find_busiest_group+0x177/0x462
 [<e0d21291>] gfs2_jdata_writepages+0x1d/0x46 [gfs2]
 [<c045a0ef>] do_writepages+0x20/0x32
 [<c048d8a6>] __writeback_single_inode+0x170/0x2af
 [<c048dcbb>] sync_sb_inodes+0x170/0x213
 [<c048df0a>] writeback_inodes+0x6a/0xb0
 [<c045a52e>] wb_kupdate+0x7b/0xdb
 [<c045a94d>] pdflush+0x0/0x1a3
 [<c045aa58>] pdflush+0x10b/0x1a3
 [<c045a4b3>] wb_kupdate+0x0/0xdb
 [<c0435f05>] kthread+0xc0/0xeb
 [<c0435e45>] kthread+0x0/0xeb
 [<c0405c3b>] kernel_thread_helper+0x7/0x10
 =======================
 [<e0d2dccf>] gfs2_assert_withdraw_i+0x28/0x30 [gfs2]
 [<e0d2dae0>] gfs2_trans_end+0xc1/0x129 [gfs2]
 [<e0d20c51>] gfs2_write_cache_jdata+0x27e/0x32f [gfs2]
 [<c04e21e0>] __next_cpu+0x12/0x21
 [<c041efff>] find_busiest_group+0x177/0x462
 [<e0d21291>] gfs2_jdata_writepages+0x1d/0x46 [gfs2]
 [<c045a0ef>] do_writepages+0x20/0x32
 [<c048d8a6>] __writeback_single_inode+0x170/0x2af
 [<c048dcbb>] sync_sb_inodes+0x170/0x213
 [<c048df0a>] writeback_inodes+0x6a/0xb0
 [<c045a52e>] wb_kupdate+0x7b/0xdb
 [<c045a94d>] pdflush+0x0/0x1a3
 [<c045aa58>] pdflush+0x10b/0x1a3
 [<c045a4b3>] wb_kupdate+0x0/0xdb
 [<c0435f05>] kthread+0xc0/0xeb
 [<c0435e45>] kthread+0x0/0xeb
 [<c0405c3b>] kernel_thread_helper+0x7/0x10
 =======================
GFS2: fsid=dm-3.0: tr_num_revoke = 1, tr_revokes = 0
GFS2: Transaction created at: gfs2_write_cache_jdata+0x15e/0x32f [gfs2]


Comment 4 Steve Whitehouse 2008-03-03 08:57:21 UTC
Created attachment 296564 [details]
Patch to add correct number of revokes

Please try the following patch which fixes the number of revokes. Also I'd
suggest checking that the size of the quota inode is being updated in the
correct places since it would appear that the writepages code thought that the
inode had been truncated.
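
For readers unfamiliar with the assertion quoted in comment 3, the following
user-space sketch models the accounting it checks. The field names tr_revokes
and tr_num_revoke come from the log output above; the struct, the helper
functions and the idea that the count is fixed when the transaction begins are
assumptions made for illustration, and this is not the code from the attached
patch.

```c
#include <stdbool.h>

/* Toy model only: not the kernel's transaction structure. */
struct toy_revoke_trans {
    unsigned tr_revokes;     /* revokes reserved when the transaction began */
    unsigned tr_num_revoke;  /* revokes actually added during the transaction */
};

/* Begin a transaction with room for the given number of revokes. */
static struct toy_revoke_trans toy_revoke_trans_begin(unsigned revokes)
{
    struct toy_revoke_trans tr = { .tr_revokes = revokes, .tr_num_revoke = 0 };
    return tr;
}

/* Record one revoke against the transaction. */
static void toy_add_revoke(struct toy_revoke_trans *tr)
{
    tr->tr_num_revoke++;
}

/* Mirrors the failed check: tr_num_revoke <= tr_revokes. */
static bool toy_revoke_trans_end_ok(const struct toy_revoke_trans *tr)
{
    return tr->tr_num_revoke <= tr->tr_revokes;
}
```

A transaction begun with zero reserved revokes that then adds one fails the
check, which matches the logged values tr_num_revoke = 1, tr_revokes = 0.
Reserving at least as many revokes as are later added makes the check pass.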

Comment 5 Abhijith Das 2008-03-08 21:23:16 UTC
Posted combined patch to rhkernel-list.
http://post-office.corp.redhat.com/archives/rhkernel-list/2008-March/msg00240.html

Comment 8 Don Zickus 2008-03-19 16:24:33 UTC
in kernel-2.6.18-86.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 11 errata-xmlrpc 2008-05-21 15:10:39 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html