Bug 182057 - GFS out of memory hang due to large count of files in directory
Summary: GFS out of memory hang due to large count of files in directory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: gfs
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Wendy Cheng
QA Contact: GFS Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2006-02-19 21:56 UTC by Wendy Cheng
Modified: 2010-01-12 03:09 UTC (History)
CC List: 0 users

Fixed In Version: RHBA-2006-0561
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-08-10 21:34:57 UTC
Embargoed:


Attachments
gfs_malloc_leaf_free.patch (3.09 KB, patch)
2006-02-20 01:15 UTC, Wendy Cheng
gfs_malloc_split.patch (3.53 KB, patch)
2006-02-20 01:16 UTC, Wendy Cheng
gfs_gmalloc_dump.patch (303 bytes, patch)
2006-02-20 01:18 UTC, Wendy Cheng


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2006:0561 0 normal SHIPPED_LIVE GFS-kernel bug fix update 2006-08-10 04:00:00 UTC

Description Wendy Cheng 2006-02-19 21:56:06 UTC
Description of problem:
Split out from bugzilla 174576 (RHEL3 INRA customer issue), where GFS could hang
(looping) due to out-of-memory errors. A total of 3 patches will be checked into
CVS RHEL4 under this bugzilla.

Version-Release number of selected component (if applicable):


How reproducible:
Never tried in house. However, the customer has a directory with roughly
500000 files in it.

Steps to Reproduce:
1.
2.
3.
  
Actual results:
GFS hangs.

Expected results:
No hang.

Additional info:
All patches tested out by INRA.

Comment 1 Wendy Cheng 2006-02-20 01:15:15 UTC
Created attachment 124873 [details]
gfs_malloc_leaf_free.patch

Patch 3-1:
Fixes an out-of-memory error during directory delete. Found in the customer
environment, where gfs_inoded was deleting a maximum-size hash unit (0xffff
entries). It hung in leaf_free() during gmalloc while kmallocing
0xffff*sizeof(uint64_t) (=512K) of memory. The old code did a kmalloc, zeroed
out the buffer, then copied the zeroed contents into the bh buffer and
subsequently passed the bh to gfs_writei to write it out to disk. This patch
removes the unnecessary kmalloc and the memory copy by zeroing out the bh
buffer directly.
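
The change described in Patch 3-1 can be illustrated with a small user-space sketch. This is not the GFS code: malloc stands in for kmalloc, a plain byte array stands in for the bh buffer, and the function names (leaf_ptr_bytes, clear_via_scratch, clear_in_place) are hypothetical. It only shows why the scratch allocation was avoidable and how large it was for 0xffff entries.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LEAF_PTRS 0xffffu  /* entry count quoted in the comment above */

/* Bytes the old code asked for: 65535 * 8 = 524280, just under 512K. */
static size_t leaf_ptr_bytes(void)
{
    return (size_t)MAX_LEAF_PTRS * sizeof(uint64_t);
}

/* Old shape: allocate a scratch buffer, zero it, copy it into the target. */
static int clear_via_scratch(unsigned char *bh_data, size_t len)
{
    unsigned char *scratch;

    assert(bh_data != NULL);
    scratch = malloc(len);          /* stand-in for the large kmalloc */
    if (scratch == NULL)
        return -1;                  /* the kernel path looped here instead */
    memset(scratch, 0, len);
    memcpy(bh_data, scratch, len);
    free(scratch);
    return 0;
}

/* New shape: zero the target directly; no allocation, so nothing to fail. */
static void clear_in_place(unsigned char *bh_data, size_t len)
{
    assert(bh_data != NULL);
    memset(bh_data, 0, len);
}
```

Both functions leave the target buffer in the same state, which is why dropping the intermediate allocation does not change what gets written to disk.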

Comment 2 Wendy Cheng 2006-02-20 01:16:49 UTC
Created attachment 124875 [details]
gfs_malloc_split.patch

Patch 3-2:
GFS was trying to split a full-grown directory hash leaf (0xffff entries)
into two and subsequently hung. The buffer requirement,
0xffff*sizeof(uint64_t)/2 (roughly 256K), was too big for kmalloc to handle.
The allocations now fall back to vmalloc if kmalloc fails.
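
The fallback described in Patch 3-2 follows a common pattern: try the contiguous allocator first, use the second allocator only on failure, and remember which one succeeded so the buffer is released by the matching routine. The user-space sketch below shows that pattern only. The names (split_buf, contig_alloc, fallback_alloc) and the 128K cap are assumptions made for illustration; in the kernel the two allocators would be kmalloc and vmalloc, freed with kfree and vfree respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CONTIG_LIMIT (128u * 1024u)  /* assumed cap; real kmalloc limits vary */

enum buf_source { BUF_NONE, BUF_CONTIG, BUF_FALLBACK };

struct split_buf {
    void *ptr;
    enum buf_source source;  /* which allocator produced ptr */
};

/* Stand-in for kmalloc: refuses requests above CONTIG_LIMIT. */
static void *contig_alloc(size_t size)
{
    return size > CONTIG_LIMIT ? NULL : malloc(size);
}

/* Stand-in for vmalloc: no size cap in this sketch. */
static void *fallback_alloc(size_t size)
{
    return malloc(size);
}

/* Try the contiguous allocator first, then the fallback. */
static struct split_buf split_buf_alloc(size_t size)
{
    struct split_buf b = { contig_alloc(size), BUF_CONTIG };

    if (b.ptr == NULL) {
        b.ptr = fallback_alloc(size);
        b.source = b.ptr ? BUF_FALLBACK : BUF_NONE;
    }
    return b;
}

/* Kernel code would pick kfree() or vfree() based on b->source. */
static void split_buf_free(struct split_buf *b)
{
    assert(b != NULL);
    free(b->ptr);
    b->ptr = NULL;
    b->source = BUF_NONE;
}
```

Recording the source next to the pointer is one way to keep the free path correct; the actual patch may track this differently.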

Comment 3 Wendy Cheng 2006-02-20 01:18:36 UTC
Created attachment 124876 [details]
gfs_gmalloc_dump.patch

Patch 3-3:
Adds dump_stack() to gmalloc so we can identify the culprit whenever an
out-of-memory loop occurs.
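
The idea behind Patch 3-3 can be sketched in user space as a retrying allocator that reports its caller when an attempt fails. This assumes gmalloc retries until the allocation succeeds, which is what the "out of memory loop" wording suggests. Everything here is hypothetical: retry_alloc, the bounded retry count, the report-once policy, and the flaky_alloc and count_report demo helpers are illustration only, with the report callback standing in for dump_stack().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t size);
typedef void (*diag_fn)(size_t size);

/*
 * Retry an allocation up to max_tries times. On the first failure, call
 * report() once so the log shows who asked for the memory. Returns NULL
 * if every attempt fails (the sketch is bounded so it always terminates).
 */
static void *retry_alloc(alloc_fn alloc, diag_fn report,
                         size_t size, unsigned max_tries)
{
    int reported = 0;
    unsigned i;

    for (i = 0; i < max_tries; i++) {
        void *p = alloc(size);

        if (p != NULL)
            return p;
        if (!reported) {
            report(size);   /* stand-in for dump_stack() */
            reported = 1;
        }
    }
    return NULL;
}

/* Demo helpers: an allocator that fails a set number of times. */
static unsigned failures_left;
static unsigned reports;

static void *flaky_alloc(size_t size)
{
    if (failures_left > 0) {
        failures_left--;
        return NULL;
    }
    return malloc(size);
}

static void count_report(size_t size)
{
    (void)size;
    reports++;
}
```

Setting failures_left to 2 and calling retry_alloc(flaky_alloc, count_report, 64, 5) returns a valid buffer on the third attempt after a single report. Whether the real patch dumps the stack once or on every pass is not stated in this report.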

Comment 6 Red Hat Bugzilla 2006-08-10 21:34:58 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0561.html


