Red Hat Bugzilla – Bug 163823
gfs_tool reclaim explanation
Last modified: 2010-01-11 22:06:22 EST
We are using GFS on a 4-node cluster. On a regular basis we run into the problem where a
large amount of the available storage is consumed by metadata.
For example, on this partition:
/dev/pool/gpo_worko 17G 7.0G 9.3G 43% /worko
a 100MB file could not be written.
The problem goes away as soon as 'gfs_tool reclaim' is issued.
There is no single usage pattern; it varies from 100000s of small files
that are written and deleted a couple of times a second, up to a 1TB file that is
written and sometimes deleted.
- is the behaviour supposed to be like this?
- if it is, is it planned to change it?
- is it recommended to run "gfs_tool reclaim" every couple of hours through cron?
GFS grabs free blocks as needed and turns them into metadata blocks. Once a
block is marked as metadata, it remains metadata even if it isn't being
used. gfs_tool reclaim can be used to convert these blocks back. Doing so locks the
entire fs while the reclaim takes place. Keeping metadata
blocks around like this is a bit of a kludge to deal with the way the NFS server works.
From the above, the best responses are (one of these should be enough):
- Change the usage pattern.
- If the workload comes from some kind of batch system, add gfs_tool reclaim
as part of the batch job.
- Create two separate filesystems: one for the tiny files, one for the larger ones.
- Create a single fs that is large enough to hold all of the unused
metadata and the large files.
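On the original question about running it from cron: if you do schedule it, pick a quiet window, since the whole fs is locked for the duration. A hypothetical /etc/crontab entry (the schedule, path, and mount point are assumptions) might look like:

```shell
# /etc/crontab fragment: reclaim unused GFS metadata weekly, off-hours.
# min hour dom mon dow user command
0 3 * * sun root /sbin/gfs_tool reclaim /worko
```

Depending on the version, gfs_tool reclaim may prompt for confirmation before proceeding, so test it interactively first to make sure it runs unattended.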