Description of problem:
When the first node mounts a gfs fs, other nodes must not be allowed to mount
until that first node has recovered all journals and called others_may_mount().
It's the job of lock_dlm to enforce this, but it doesn't -- it allows other
nodes to mount while the first may still be recovering journals. This can
potentially lead to fs corruption.

Version-Release number of selected component (if applicable):

How reproducible:
often

Steps to Reproduce:
1. all mounted nodes fail at once and are reset
2. all nodes come back and mount at once
3. the first mounter doesn't complete recovery of all journals before others
   also mount

Actual results:

Expected results:

Additional info:
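For illustration, here is a minimal user-space sketch of the ordering the bug
violates. This is not the actual lock_dlm code; the names (recover_journal,
first_mount_recovery_done, NUM_JOURNALS) and the pthread-based gate are
hypothetical stand-ins. The point is only that other mounters must wait on a
gate that the first mounter opens after replaying every journal, which models
gfs calling others_may_mount(); the reported bug is equivalent to letting the
other mounters skip the wait.

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NUM_JOURNALS 3

static pthread_mutex_t gate_mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gate_cv = PTHREAD_COND_INITIALIZER;
static bool first_mount_recovery_done = false;

/* Hypothetical stand-in for replaying one journal. */
static void recover_journal(int jid)
{
    printf("first mounter: replaying journal %d\n", jid);
}

/* First mounter: must replay ALL journals before opening the gate. */
static void *first_mounter(void *arg)
{
    (void)arg;
    for (int jid = 0; jid < NUM_JOURNALS; jid++)
        recover_journal(jid);

    /* Only now is it safe to let other nodes mount; this models the
     * others_may_mount() callback being made after full recovery. */
    pthread_mutex_lock(&gate_mu);
    first_mount_recovery_done = true;
    pthread_cond_broadcast(&gate_cv);
    pthread_mutex_unlock(&gate_mu);
    return NULL;
}

/* Any other mounter: block until the first mounter has finished. */
static void *other_mounter(void *arg)
{
    pthread_mutex_lock(&gate_mu);
    while (!first_mount_recovery_done)
        pthread_cond_wait(&gate_cv, &gate_mu);
    pthread_mutex_unlock(&gate_mu);
    printf("node %ld: mount proceeds\n", (long)arg);
    return NULL;
}

int main(void)
{
    pthread_t t[3];
    pthread_create(&t[0], NULL, first_mounter, NULL);
    pthread_create(&t[1], NULL, other_mounter, (void *)1L);
    pthread_create(&t[2], NULL, other_mounter, (void *)2L);
    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    return 0;
}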
Email from Ken: there is a chance of corruption. The scenario I see is:

1) 3 machines are mounted.
2) They all fail at the same time.
3) Machine A comes back up and starts replay on the three journals serially.
4) Machine B comes back up, replays its own journal really quickly while
   Machine A is still working on the first journal.
5) Machine B starts a workload and comes across blocks that are inconsistent
   because the third journal hasn't been replayed yet.

Because all the machines died, there are no expired locks to protect the data.
In order to hit the failure case, you always need at least three nodes to have
been mounted at one time or another. But not all three nodes need to be
running at the time of the power failure. (The key is that there must be a
dirty journal beyond the first two to be mounted.)
Fixed on the RHEL4 and STABLE branches. The likelihood of this bug causing a
problem or corruption is even smaller than originally thought: even if the
lock module doesn't prevent other mounts until the first mounter's recovery
is done, there is a gfs lock that the other mounters block on, which has
nearly the same effect.
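A short sketch of that mitigating effect, under the same caveats as above
(illustrative user-space code, hypothetical names, not gfs's actual glock
layout): if the first mounter holds an exclusive lock for the whole recovery
pass, a later mounter that must briefly take the same lock at mount time ends
up waiting anyway, even if the lock module lets its mount begin.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_rwlock_t mount_glock = PTHREAD_RWLOCK_INITIALIZER;

static void *first_mounter(void *arg)
{
    (void)arg;
    pthread_rwlock_wrlock(&mount_glock); /* exclusive across recovery */
    for (int jid = 0; jid < 3; jid++) {
        printf("first mounter: replaying journal %d\n", jid);
        usleep(1000);
    }
    pthread_rwlock_unlock(&mount_glock);
    return NULL;
}

static void *later_mounter(void *arg)
{
    (void)arg;
    /* Even without the lock-module gate, acquiring this lock stalls
     * until the first mounter's recovery releases it. */
    pthread_rwlock_rdlock(&mount_glock);
    printf("later mounter: mount proceeds\n");
    pthread_rwlock_unlock(&mount_glock);
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, first_mounter, NULL);
    usleep(100); /* give the first mounter a head start (illustrative only) */
    pthread_create(&b, NULL, later_mounter, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}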
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2005-740.html