Description of problem:
If gfs_tool freeze is run multiple times from the same node against the same GFS filesystem, gfs_tool unfreeze needs to be run the same number of times to unfreeze the filesystem. Is this intentional? If it is intentional, is there a way to determine how many times a particular filesystem has been frozen, and from which nodes?

Version-Release number of selected component (if applicable):
GFS-6.1.6-1.i386.rpm
It appears to be intentional, although I am not sure why. When a freeze occurs, a counter in the in-core superblock is incremented. When the unfreeze occurs, it is decremented. Simple, but this does mean that for every freeze you do you must call an unfreeze. As I said, I am unsure why we do this. It would be easy to fix, but I first need to determine if there is a valid reason for it working the way it does now. As for a way to find out how many times a filesystem has been frozen ... there does not appear to (currently) be a way to do this. It would be nice to be able to 'cat /proc/fs/gfs' to get this info. I am looking into the possibility of adding this interface.
If you do determine a valid reason for freeze working the way it does now, please let me know that as well. (Then we can document it in the GLS course to help handle the customer/student question of why freeze works like this.) :)
This is by design. The reason is as follows: Say you have two processes, A and B, on the same node. Process A freezes a filesystem and continues processing, perhaps doing a backup of the filesystem. Before A completes, process B decides to freeze the same filesystem. Process B continues to do whatever it does, and process A finishes and does an unfreeze. Note that B still expects the filesystem to be frozen. If freeze/unfreeze were designed *not* to increment/decrement a freeze counter, the unfreeze done by A would prematurely unfreeze the filesystem while B still expects it to be frozen. I've added code to include the value of sd_freeze_count when displaying the filesystem counters. This can be seen by running 'gfs_tool counters <mountpoint>'. It's listed as "freeze count". Note that because the value of these counters is retrieved via an ioctl command, this is not possible in GFS2, since most (possibly all) of the ioctl code was removed there. I added the change for GFS(1) for RHEL4 and RHEL5.
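To illustrate the A/B scenario, here is a minimal Python sketch of a counted freeze. It is a toy model only, not the GFS kernel code: the class name FreezeCounter and its methods are hypothetical, freeze_count merely stands in for sd_freeze_count, and raising an error on an unmatched unfreeze is an assumption of the sketch rather than documented GFS behavior.

```python
class FreezeCounter:
    """Toy model of a counted freeze (hypothetical, not the GFS implementation)."""

    def __init__(self):
        # Stands in for sd_freeze_count in the in-core superblock.
        self.freeze_count = 0

    def freeze(self):
        # Every freeze request increments the counter.
        self.freeze_count += 1

    def unfreeze(self):
        # Assumption of this sketch: an unmatched unfreeze is an error.
        if self.freeze_count == 0:
            raise RuntimeError("unfreeze without a matching freeze")
        self.freeze_count -= 1

    @property
    def frozen(self):
        # The filesystem stays frozen while any freeze is outstanding.
        return self.freeze_count > 0


fs = FreezeCounter()
fs.freeze()                        # process A freezes and starts its backup
fs.freeze()                        # process B freezes the same filesystem
fs.unfreeze()                      # A finishes and unfreezes
still_frozen_after_a = fs.frozen   # B's freeze is still outstanding
fs.unfreeze()                      # B finishes and unfreezes
frozen_after_b = fs.frozen         # no freezes remain
```

With a plain boolean flag instead of a counter, A's unfreeze would have thawed the filesystem while B still relied on it being frozen.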
That rationale sounds quite reasonable. It seems to me that what's left here is just to figure out if there's some way to make the GFS2 freeze counter information available in case some sysadmin is attempting to debug an issue where a process fails to issue unfreeze properly.
Yes. If we want to do something similar for GFS2, we will probably have to expose it via sysfs. I can see how getting the value of this counter would be quite useful.
It's an open question whether or not freeze/thaw bdev functions should maintain such a counter: currently they don't. Can we avoid multiple counters? How many entry points are there for a freeze request?