Description of problem:
Break-out of bugzilla 174576 (RHEL3 INRA customer issue) where GFS could hang (looping) due to out-of-memory errors. A total of 3 patches will be checked into RHEL4 CVS under this bugzilla.

Version-Release number of selected component (if applicable):

How reproducible:
Never tried in house; however, the customer has a directory with roughly 500000 files in it.

Steps to Reproduce:
1.
2.
3.

Actual results:
GFS hangs.

Expected results:
No hang.

Additional info:
All patches tested out by INRA.
Created attachment 124873 [details]
gfs_malloc_leaf_free.patch

Patch 3-1: Fixes a directory-delete out-of-memory error. Found in the customer environment where gfs_inoded is deleting a maximum-size hash unit (0xffff entries). It hangs in leaf_free() during gmalloc while kmallocing 0xffff*sizeof(uint64_t) (~512K) of memory. The old code did a kmalloc, zeroed out the buffer, copied the zeroed contents into the bh buffer, and then sent the bh into gfs_writei to write out to disk. This patch removes the unnecessary kmalloc and memory copy by zeroing out the bh buffer directly.
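The allocate-zero-copy-free pattern removed by the patch can be sketched in user space. This is a hedged illustration only: `bh_data`, `zero_entries_with_scratch`, and `zero_entries_direct` are hypothetical names, and `malloc`/`free` stand in for the kernel's `kmalloc`/`kfree`; the real fix lives in the GFS `leaf_free()` path.

```c
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

/* Before the patch (sketch): allocate a scratch buffer, zero it, copy it
 * into the destination, then free it.  In-kernel, the ~512K kmalloc here
 * is what can fail and send gmalloc into its out-of-memory retry loop. */
static int zero_entries_with_scratch(uint64_t *bh_data, size_t nentries)
{
    uint64_t *scratch = malloc(nentries * sizeof(uint64_t)); /* kmalloc in-kernel */
    if (!scratch)
        return -1;
    memset(scratch, 0, nentries * sizeof(uint64_t));
    memcpy(bh_data, scratch, nentries * sizeof(uint64_t));
    free(scratch);                                           /* kfree in-kernel */
    return 0;
}

/* After the patch (sketch): zero the destination buffer directly.
 * No allocation is made, so there is nothing left to fail. */
static void zero_entries_direct(uint64_t *bh_data, size_t nentries)
{
    memset(bh_data, 0, nentries * sizeof(uint64_t));
}
```

Both versions leave the buffer in the same state; the patched form simply cannot hit an allocation failure.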
Created attachment 124875 [details]
gfs_malloc_split.patch

Patch 3-2: GFS was trying to split a full-grown directory hash leaf (0xffff entries) into two and subsequently hung. The buffer requirement of 0xffff*sizeof(uint64_t)/2 (~256K) was too big for kmalloc to handle. Change the allocation to fall back to vmalloc when kmalloc fails.
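The kmalloc-then-vmalloc fallback can be sketched as below. This is a user-space analogue under stated assumptions: `fake_kmalloc`, `fake_vmalloc`, `alloc_leaf_buf`, and the `KMALLOC_LIMIT` cap are all stand-ins (the real code uses `kmalloc`/`kfree` and `vmalloc`/`vfree`, and the practical kmalloc size limit depends on the kernel). The point being demonstrated is the fallback logic, not the exact sizes.

```c
#include <stdlib.h>

/* Stand-in for the practical upper bound on kmalloc allocations;
 * the ~256K leaf-split buffer exceeds it. */
#define KMALLOC_LIMIT (128 * 1024)

static void *fake_kmalloc(size_t size)
{
    if (size > KMALLOC_LIMIT)
        return NULL;       /* large kmallocs fail */
    return malloc(size);
}

static void *fake_vmalloc(size_t size)
{
    return malloc(size);   /* vmalloc handles large, non-contiguous sizes */
}

/* The fallback pattern from the patch: try kmalloc first (cheaper,
 * physically contiguous memory), fall back to vmalloc when it fails.
 * The caller must record which allocator succeeded so it can later
 * call the matching free routine (kfree vs vfree). */
static void *alloc_leaf_buf(size_t size, int *is_vmalloc)
{
    void *buf = fake_kmalloc(size);
    *is_vmalloc = 0;
    if (!buf) {
        buf = fake_vmalloc(size);
        *is_vmalloc = 1;
    }
    return buf;
}
```

With this pattern, a 256K request that kmalloc rejects still succeeds via the vmalloc path instead of looping.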
Created attachment 124876 [details]
gfs_gmalloc_dump.patch

Patch 3-3: Add dump_stack() to gmalloc so we can identify the culprit whenever the out-of-memory loop occurs.
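The diagnostic idea can be sketched as an allocation loop that dumps the call chain on the first failure. Everything here is hypothetical scaffolding: `fake_dump_stack` stands in for the kernel's real `dump_stack()`, `gmalloc_sketch` is not the actual gmalloc (which retries indefinitely, hence the hang), and the bounded retry count exists only to keep the sketch terminating.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the kernel's dump_stack(), which prints the current
 * call chain to the console log. */
static int stack_dumped;
static void fake_dump_stack(void)
{
    stack_dumped = 1;
    fprintf(stderr, "call trace (sketch)\n");
}

/* gmalloc-style retry loop (sketch): on allocation failure, dump the
 * stack once so the log shows which caller is stuck looping on an
 * out-of-memory condition. */
static void *gmalloc_sketch(size_t size, int max_retries)
{
    for (int i = 0; i < max_retries; i++) {
        void *p = malloc(size);
        if (p)
            return p;
        if (!stack_dumped)
            fake_dump_stack();
    }
    return NULL;
}
```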
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2006-0561.html