Red Hat Bugzilla – Bug 214239
GFS performance issue - trimming glock
Last modified: 2010-01-11 22:23:13 EST
Description of problem:
Support reports performance issues with a group of 15-node clusters -
each consists of 14 FTP servers and 1 rsync server. The clusters are located
in different geographic locations and rsync to each other from time to time.
The GFS mounts serve as FTP storage for an IPTV application. After an rsync
run, system performance sinks - a single "LIST" command can take 2 to 16
minutes. Each cluster serves 30T of files (the largest directory contains
11000 x 2G files) over FTP.
Based on oprofile data, the system appears to suffer from several issues:
1. A known GFS problem with a large number of files within one directory,
as described in:
2. DLM daemons hog CPUs and loop around __find_lock_by_id:
After rsync, the system keeps a large number of dlm locks, which causes
the dlm daemons, particularly dlm_recvd, to loop around
__find_lock_by_id(), consuming (hogging) CPU and memory.
3. Known Linux rsync performance hits that leave a large number of
inode and dentry cache entries hanging around, subsequently causing
memory pressure.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
A few quick notes:
1) Both the gfs and dlm lock search (and/or hash) implementations need to be
re-examined, since they can easily consume ~50% of CPU cycles on a loaded
node, as the oprofile output below shows:
[root@engcluster1 tmp]# ./opreport_module /dlm -l
samples  %        symbol name
200506   47.1299  search_hashchain
152213   35.7784  search_bucket
53249    12.5164  __find_lock_by_id
2934      0.6897  process_asts
1875      0.4407  dlm_hash
1411      0.3317  _release_rsb
2) We may need to manually purge the dentry and inode caches, as we did with
RHEL3's inode_purge tunable. This would cut down lock counts and
subsequently alleviate dlm workloads and reduce cache memory fragmentation;
see the sketch below for one way to do it.
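As a minimal sketch of such a manual purge - assuming a kernel that provides
the /proc/sys/vm/drop_caches interface (mainline 2.6.16 and later; this is
not part of the patch discussed here) - dentries and inodes can be dropped
by hand:
shell> sync
shell> echo 2 > /proc/sys/vm/drop_caches   (drops dentry and inode caches only)
Note this discards clean caches system-wide, so it is a blunt instrument
compared to per-filesystem glock trimming.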
for "df" performance - yes, it is a side effect. You could do
shell> gfs_tool settune <mount_point> statfs_slots 128
to boost "df" performance if it is a concern. The default statfs_slots is
64. Make it bigger would help. See the following bz for details:
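To double-check that the new value took effect - assuming the gfs_tool
gettune subcommand shipped with this release - the current tunables can be
listed:
shell> gfs_tool gettune <mount_point> | grep statfs_slots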
djoo reported that the test RPMs ran well and have kept the FTP "LIST"
command latency within the expected bound (5 seconds). Will work with the
kernel folks to see what we want to do with the inode trimming patch.
Created attachment 146468 [details]
CVS check-in patch
We've decided to go for a GFS-only solution and are still taking input
to finalize the GFS patch (without base kernel changes). Uploaded is a
(working) draft patch checked into CVS on Monday. We would like external
folks to try it out and provide input.
After gfs.ko is loaded and the filesystem mounted, issue the following
command to kick off the trimming logic:
shell> gfs_tool settune <mount_point> glock_purge <percentage>
(e.g. "gfs_tool settune /mnt/gfs1 glock_purge 50")
This tells GFS to trim roughly 50% of the unused glocks every 5 seconds.
The default is 0 percent (no trimming). The operation can be dynamically
turned off by explicitly setting the percentage back to "0".
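To see whether trimming is keeping the lock count down - a hedged example,
assuming gfs_tool's counters subcommand - sample the per-mount statistics
before and after enabling the tunable:
shell> gfs_tool counters /mnt/gfs1
and watch the lock counts over a few 5-second trim intervals.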
Created attachment 146474 [details]
Glock trimming description
This write-up documents the technical implementation of the above patch.
One of the tunables mentioned in the document can be set as:
shell> gfs_tool settune <mount_point> demote_secs <seconds>
(e.g. "gfs_tool settune /mnt/gfs1 demote_secs 200")
This demotes gfs write locks into less restrictive states and subsequently
flushes the cached data to disk. A shorter demote interval helps avoid gfs
accumulating so much cached data that it results in bursts of flushing
activity or prolongs another node's lock access. The default is 300 seconds.
This command can be issued dynamically, but has to be done after mount time;
see the sketch below for reapplying the tunables after each mount.
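Since settune values do not persist across a remount, a minimal sketch of
reapplying both tunables from a script run after the GFS filesystems are
mounted (the mount point and values here are illustrative, not
recommendations):

#!/bin/sh
# reapply-gfs-tunables: settune values are lost at unmount,
# so run this after each mount (e.g. from an init script).
MNT=/mnt/gfs1
gfs_tool settune $MNT glock_purge 50    # trim ~50% of unused glocks every 5s
gfs_tool settune $MNT demote_secs 200   # demote idle write locks after 200s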
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.