Bug 237558 - GFS2: problem with drop_inode logic in memory pressure situations
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All, OS: Linux
Priority: medium, Severity: medium
Assigned To: Christine Caulfield
QA Contact: Dean Jansa
Duplicates: 243718
Blocks: 204760
Reported: 2007-04-23 15:22 EDT by Josef Bacik
Modified: 2007-11-30 17:07 EST (History)
Fixed In Version: RHBA-2007-0959
Doc Type: Bug Fix
Last Closed: 2007-11-07 14:47:44 EST


Attachments
Patch to add (3.62 KB, patch)
2007-06-06 09:41 EDT, Christine Caulfield

Description Josef Bacik 2007-04-23 15:22:50 EDT
So I'm running that screwy test with rm -rf on one node and du -h on the other
node, and I get a hang on both nodes.  The du -h node was under memory pressure
(it only has about 256 MB of RAM) and hung like this:

du            D C57DE754   972  5455   4292                     (NOTLB)
       c57de768 00000096 00000002 c57de754 c57de750 00000000 c134122c 00000002 
       00000007 cace69b0 e5474f38 000009c6 00000713 cace6ad4 c1345de0 00000000 
       c526193c c134121c 00000296 009f0660 00000296 ffffffff 00000000 00000000 
Call Trace:
 [<d0acbffd>] holder_wait+0x8/0xc [gfs2]
 [<c061d40d>] __wait_on_bit+0x36/0x5d
 [<c061d48f>] out_of_line_wait_on_bit+0x5b/0x63
 [<d0acbff0>] wait_on_holder+0x41/0x46 [gfs2]
 [<d0acced8>] glock_wait_internal+0xf1/0x21e [gfs2]
 [<d0acd176>] gfs2_glock_nq+0x171/0x1a6 [gfs2]
 [<d0acd579>] gfs2_glock_nq_m+0x27/0x1a6 [gfs2]
 [<d0ac4a2c>] do_strip+0x19e/0x3b4 [gfs2]
 [<d0ac3817>] recursive_scan+0x108/0x193 [gfs2]
 [<d0ac396f>] trunc_dealloc+0xcd/0xea [gfs2]
 [<d0ac3998>] gfs2_file_dealloc+0xc/0xe [gfs2]
 [<d0ad9ebe>] gfs2_delete_inode+0xdd/0x154 [gfs2]
 [<c048c7a2>] generic_delete_inode+0xa6/0x110
 [<c048c81e>] generic_drop_inode+0x12/0x130
 [<d0ada065>] gfs2_drop_inode+0x33/0x35 [gfs2]
 [<c048be0a>] iput+0x63/0x66
 [<c048a03d>] dentry_iput+0x88/0xa2
 [<c048ade4>] prune_one_dentry+0x42/0x65
 [<c048afb8>] prune_dcache+0xf0/0x138
 [<c048b019>] shrink_dcache_memory+0x19/0x31
 [<c046467a>] shrink_slab+0xd5/0x138
 [<c0464eed>] try_to_free_pages+0x163/0x22a
 [<c0460e1f>] __alloc_pages+0x1e3/0x2e4
 [<d0ab0ca4>] dlm_lowcomms_get_buffer+0xe0/0x150 [dlm]
 [<d0aa934c>] _create_message+0x22/0x8a [dlm]
 [<d0aa9415>] create_message+0x61/0x68 [dlm]
 [<d0aab8ca>] _request_lock+0xfb/0x224 [dlm]
 [<d0aaba5a>] request_lock+0x67/0x86 [dlm]
 [<d0aad47f>] dlm_lock+0xcc/0x107 [dlm]
 [<d0b185ce>] gdlm_do_lock+0x9b/0x12f [lock_dlm]
 [<d0b18867>] gdlm_lock+0xf1/0xf9 [lock_dlm]
 [<d0ad01f8>] gfs2_lm_lock+0x30/0x3a [gfs2]
 [<d0acc8aa>] gfs2_glock_xmote_th+0xed/0x167 [gfs2]
 [<d0accbd3>] run_queue+0x2af/0x36d [gfs2]
 [<d0acd15e>] gfs2_glock_nq+0x159/0x1a6 [gfs2]
 [<d0acea69>] gfs2_inode_lookup+0x160/0x1b4 [gfs2]
 [<d0acebd5>] gfs2_lookupi+0x118/0x188 [gfs2]
 [<d0ad8e79>] gfs2_lookup+0x1d/0x4d [gfs2]
 [<c0481a54>] do_lookup+0xa0/0x13d
 [<c0483823>] __link_path_walk+0x81f/0xc7a
 [<c0483cc9>] link_path_walk+0x4b/0xc0
 [<c0483ff3>] do_path_lookup+0x191/0x1e2
 [<c04847ee>] __user_walk_fd+0x32/0x44
 [<c047e397>] vfs_lstat_fd+0x18/0x3e
 [<c047e3fb>] vfs_lstat+0x11/0x13
 [<c047e411>] sys_lstat64+0x14/0x28
 [<c0404eec>] syscall_call+0x7/0xb
 =======================

What bothers me is that we are doing an iput just to drop the inode from the
cache and free up some memory, yet we end up going down the deletion path.
If the inode has a nonzero nlink count and its glock is being demoted (which
is likely here, since the other node is trying to get hold of the lock as
well), we clear the nlink count, which makes us try to delete the inode.
This isn't good: we just want to free the inode cache, not delete files.
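
To make the concern concrete, here is a minimal userspace sketch of the decision described above. It is not the GFS2 source: struct model_inode, enum drop_action and model_drop_inode() are made-up names, and the logic is only my reading of the description (a pending demote clears nlink, and a zero nlink turns the final iput into a delete).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the in-core inode state (not a kernel type). */
struct model_inode {
    unsigned int nlink;          /* link count as cached on this node */
    bool glock_demote_pending;   /* another node has asked for the lock */
};

/* What the final iput ends up doing with the inode. */
enum drop_action { DROP_EVICT_FROM_CACHE, DROP_DELETE };

/*
 * Models the behaviour described in the report: if the link count is
 * nonzero and a demote is pending, the link count is cleared, and a
 * zero link count sends the inode down the delete path.
 */
enum drop_action model_drop_inode(struct model_inode *inode)
{
    assert(inode != NULL);

    if (inode->nlink != 0 && inode->glock_demote_pending)
        inode->nlink = 0;

    return inode->nlink == 0 ? DROP_DELETE : DROP_EVICT_FROM_CACHE;
}
```

In this model, an inode with nlink 1 and a pending demote comes back as DROP_DELETE, while the same inode without a pending demote is only evicted from the cache.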
Comment 1 Josef Bacik 2007-04-23 15:40:37 EDT
Hmm, well, I'm an idiot. Chances are the other node had already removed the
file and this is doing what it's supposed to. I'll try to figure out why we
are hanging.
Comment 2 Steve Whitehouse 2007-04-25 04:41:31 EDT
Well, at the point of the hang we should be holding both the inode's glock and
the inode's iopen glock in exclusive mode. We are then trying to get the lock
on an rgrp in order to deallocate some of the inode's blocks, and this is
what's causing the hang at this point.

So it would seem that something on the other node is holding on to one (or
more) of the rgrp locks and refusing to release it for some reason.

The other interesting point is that the inode in question is being disposed of
in the first place because we are short of memory. The allocation that
triggered the reclaim is for a DLM message being sent to request an otherwise
unrelated lock. I wonder if we need to change the DLM to use GFP_NOFS...
although I still can't see exactly what the other node might be doing at this
point in time that prevents it from granting the locks that we need.
Comment 3 Steve Whitehouse 2007-05-10 07:21:40 EDT
It would be interesting to know whether you still see this bug with the current
-nmw kernel, since it might be that the fix for bz #231910 has some bearing on this.
Comment 4 RHEL Product and Program Management 2007-05-10 07:44:51 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 5 Steve Whitehouse 2007-05-10 11:57:17 EDT
Needinfo status got dropped somehow.
Comment 6 Steve Whitehouse 2007-06-05 12:16:46 EDT
A further thought: this seems to be caused by a memory allocation in the DLM
code causing GFS2 to try to push out inodes, which then requires a lock that
is probably blocked behind the memory allocation in the DLM.

So maybe there are some DLM allocations which need to be marked GFP_NOFS?
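
A rough userspace model of the suspected recursion, assuming the analysis in this comment is right (it is still unconfirmed at this point in the thread). The MODEL_* flags, struct model_state and model_alloc() are invented for illustration and are not kernel APIs; the only idea borrowed from the kernel is that a GFP_KERNEL-style allocation may call back into the filesystem during reclaim, while a GFP_NOFS-style one may not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits, loosely modelled on GFP_KERNEL and GFP_NOFS. */
#define MODEL_ALLOW_FS    0x1u
#define MODEL_GFP_KERNEL  (MODEL_ALLOW_FS)
#define MODEL_GFP_NOFS    0x0u

/* State of the task doing the allocation. */
struct model_state {
    bool memory_tight;            /* allocation has to enter reclaim */
    bool lock_request_in_flight;  /* caller is part-way through a lock request */
};

enum alloc_result { ALLOC_OK, ALLOC_WOULD_DEADLOCK };

/*
 * Filesystem reclaim (dropping dentries and inodes) needs cluster locks
 * of its own, so it gets stuck if the caller is itself in the middle of
 * a lock request.
 */
bool model_fs_reclaim_blocks(const struct model_state *st)
{
    return st->lock_request_in_flight;
}

enum alloc_result model_alloc(const struct model_state *st, unsigned int flags)
{
    assert(st != NULL);

    if (!st->memory_tight)
        return ALLOC_OK;

    /* Only allocations that allow filesystem reclaim can recurse into it. */
    if ((flags & MODEL_ALLOW_FS) && model_fs_reclaim_blocks(st))
        return ALLOC_WOULD_DEADLOCK;

    /*
     * Model simplification: a real allocation may still fail or wait
     * here, it just does not call back into the filesystem.
     */
    return ALLOC_OK;
}
```

With memory tight and a lock request in flight, the model reports ALLOC_WOULD_DEADLOCK for MODEL_GFP_KERNEL and ALLOC_OK for MODEL_GFP_NOFS, which is the difference the suggested change relies on.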
Comment 7 Steve Whitehouse 2007-06-05 15:47:15 EDT
Dave & Patrick, please take a quick look at the stack trace in this bug and let
me know if you agree with me (comment #6) as to the cause. I'm assuming that
the easiest fix would probably be to use GFP_NOFS, but it's always possible
that it might be OK for the DLM to recurse like this... I suspect from the bug
report that it's not.
Comment 8 Christine Caulfield 2007-06-06 06:41:55 EDT
It looks like the allocation flag that is passed into dlm_lowcomms_get_buffer
is hard-coded to GFP_KERNEL, which is not really very handy when you have a
filesystem above it in the stack.

The allocation policy should probably be lockspace-specific as it was in RHEL4.
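
A sketch of what a lockspace-specific allocation policy could look like, written as a userspace model rather than DLM code. Everything here is an assumption for illustration: MODEL_LSFL_FS, struct model_lockspace and both functions are hypothetical names, and this is not the RHEL4 code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical allocation policies, standing in for GFP_KERNEL / GFP_NOFS. */
enum ls_alloc_policy { LS_ALLOC_KERNEL, LS_ALLOC_NOFS };

/* Hypothetical creation flag: "this lockspace is used by a filesystem". */
#define MODEL_LSFL_FS 0x4u

/* Hypothetical lockspace that remembers its own allocation policy. */
struct model_lockspace {
    unsigned int flags;
    enum ls_alloc_policy allocation;
};

/* The policy is chosen once, when the lockspace is created. */
void model_new_lockspace(struct model_lockspace *ls, unsigned int flags)
{
    assert(ls != NULL);
    ls->flags = flags;
    ls->allocation = (flags & MODEL_LSFL_FS) ? LS_ALLOC_NOFS
                                             : LS_ALLOC_KERNEL;
}

/*
 * Message buffers are allocated with the lockspace's policy instead of
 * a hard-coded one.
 */
enum ls_alloc_policy model_message_alloc_policy(const struct model_lockspace *ls)
{
    assert(ls != NULL);
    return ls->allocation;
}
```

The point of the design is that a filesystem-backed lockspace opts out of filesystem reclaim for its message allocations, while other users of the DLM keep the less restrictive default.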
Comment 9 Christine Caulfield 2007-06-06 09:41:38 EDT
Created attachment 156348 [details]
Patch to add

We can quibble about the names (and maybe the use of flags), but I think this
is roughly what's needed.
Comment 10 Steve Whitehouse 2007-06-06 09:46:19 EDT
Yes, that looks good. My only comment is that it would be nicer if we could
simply pass the allocation type to the DLM directly rather than inventing a new
flag for it.
Comment 11 Christine Caulfield 2007-06-06 09:52:21 EDT
Yeah, but that's an ABI change. Unless we add a new call to set it after creation.
Comment 12 David Teigland 2007-06-06 10:34:18 EDT
posted to rhkernel
http://post-office.corp.redhat.com/archives/rhkernel-list/2007-June/msg00482.html
Comment 13 RHEL Product and Program Management 2007-06-06 10:42:29 EDT
This request was evaluated by Red Hat Kernel Team for inclusion in a Red
Hat Enterprise Linux maintenance release, and has moved to bugzilla 
status POST.
Comment 14 David Teigland 2007-06-12 13:35:13 EDT
reposted to rhkernel
http://post-office.corp.redhat.com/archives/rhkernel-list/2007-June/msg01197.html
Comment 15 David Teigland 2007-06-12 13:39:01 EDT
*** Bug 243718 has been marked as a duplicate of this bug. ***
Comment 16 Don Zickus 2007-06-15 20:32:17 EDT
Included in 2.6.18-27.el5.
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 19 errata-xmlrpc 2007-11-07 14:47:44 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0959.html
