Bug 162725
| Summary: | Full gfs access allowed to node in inquorate state | | |
| --- | --- | --- | --- |
| Product: | [Retired] Red Hat Cluster Suite | Reporter: | Henry Harris <henry.harris> |
| Component: | dlm | Assignee: | David Teigland <teigland> |
| Status: | CLOSED NOTABUG | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4 | CC: | axel.thimm, cluster-maint, kanderso, lhh |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2005-07-13 22:25:14 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Henry Harris
2005-07-08 01:39:18 UTC
More information: Apparently, this happens (and is more easily reproducible?) if you shut two of the nodes down cleanly.

That is correct, Lon. I used the shutdown command. If I pull the power cord on one node, the remaining two nodes cannot access GFS until a fence_ack_manual command is issued. I will try pulling the power cord on two nodes so that the remaining node is inquorate and see what happens.

This looks correct. GFS activity is only blocked if a node that has the fs mounted fails. This can result in the somewhat unexpected situation seen here, where the cluster loses quorum but fs's mounted by nodes that haven't failed continue running normally. If there was one fs mounted by the three nodes, and two failed, then that fs would indeed be "blocked" until quorum was regained. [GFS activity is never fully blocked, though, even in this case, as gfs will continue to use locks it already holds -- only new lock requests are blocked.]

If you get a split-brain cluster, whether gfs is blocked or not doesn't matter. The quorate cluster partition will always fence the inquorate partition. This guarantees that no inquorate nodes will modify any fs's after fencing, so gfs can be enabled on the quorate nodes.

Dave - thanks for the explanation. That helps a lot. I will change the status to NOTABUG.

*** Bug 170926 has been marked as a duplicate of this bug. ***

There is a scenario where this behaviour may be harmful. Consider a node that has GFS mounted and kicks the other nodes out of the cluster because of a partially malfunctioning network connection (no fencing, just CMAN):

    Sep 29 12:06:42 zs01 kernel: CMAN: removing node zs03 from the cluster : Missed too many heartbeats
    Sep 29 12:06:43 zs03 kernel: CMAN: Being told to leave the cluster by node 1

(Taken from bug 169693.) In the end this node has removed the others and keeps the filesystem mounted in an inquorate state.
Now the other nodes form a new cluster and remount GFS, while the sick node has GFS mounted in another cluster partition. I believe this is what happened in the quoted bug. The true cause of failure seems to be the broken cluster network connectivity, but GFS should have a safeguard against that.

Would it make sense to add an option to totally block the filesystem when quorum is lost? After hitting the mentioned bug it would certainly make me sleep better again. After all, when a cluster has lost its quorum, there is some havoc going on, and GFS trying to save whatever is left to save doesn't seem right.

A more specific example:

1. nodes in cluster: A, B, C
2. A has GFS file system "foo" mounted
3. B, C have no GFS file systems mounted
4. A has a failed network connection
5. cluster partitions
6. A remains in the original, now inquorate, cluster
7. A continues to use GFS foo, unaffected
8. B, C form a new, quorate cluster
9. B, C start fenced prior to mounting GFS foo
10. A is fenced by B or C
11. B, C can mount GFS foo

To avoid the problems in steps 4-5 you could use bonding with two network connections (although there's some doubt about how well this works at the moment). There could be problems that cause CMAN to see a failed network connection when in fact the connection is healthy. To work around these you can play with the CMAN tunables (hello_timer, deadnode_timeout, max_retries) or possibly run network-intensive applications over a separate network.

I'm having difficulty connecting the last paragraph of comment 6 to the situation I've described:

- What does "blocking" a GFS file system mean specifically in this context, and what would it accomplish?
- Losing quorum should be considered a perfectly normal event in the course of running a cluster; it doesn't imply any kind of bug or malfunction -- it's just an operating mode in which there's a greater limit on the actions the cluster may perform.

For one, I'm looking for explanations/fixes to bug 169693.
And if this bug is related, then if node A's activities on the file system had been blocked, the filesystem wouldn't have diverged into two different views.

Also, the scenario above assumes that in step 10 you have fenced configured without clean_start. If fencing non-participating nodes at start-up is a manifest part of the cluster integrity concept, then that degree of freedom should be removed from the user.

Losing quorum should not be considered a perfectly normal event. By definition the cluster does not exist anymore, whether through hardware/software fault or human misconfiguration. If a cluster needs to be degraded in its number of members, there are mechanisms to do so without dropping into an inquorate state.

Just to mention bug 169693 again: if the cluster concept took the inquorate state more seriously and blocked activities immediately, rather than relying on being fenced in the future, the split view of GFS would not have happened. Even if the true cause of the GFS corruption in bug 169693 might turn out to be bug 164331, blocking upon entering the inquorate state would have prevented the corruption. I see this kind of safeguard as an important asset for an enterprise-grade filesystem, or not?

clean_start should never be set to 1 in cluster.conf; it can lead to fs corruption. References to it will probably be removed. When a new cluster gains quorum and the fence domain is first activated, nodes in an unknown state (not in the quorate cluster) must be fenced, because we don't know if they are in a cluster partition of their own using the fs. If they are and they are not fenced, the fs can easily be corrupted. So, if nodes on both sides of a partitioned cluster ever have the same fs mounted at once, then either the fencing system was not used correctly (e.g. clean_start was 1) or there's a bug somewhere in the fencing system. I suspect that this clean_start=1 is the root cause of your problems.
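For reference, the clean_start knob discussed here is configured on the fence daemon in cluster.conf. A minimal hedged fragment follows; the cluster name and the post_join_delay value are illustrative, and attribute availability should be checked against the fenced version actually in use:

```
<cluster name="alpha" config_version="1">
  <!-- clean_start="0" (the default) makes fenced fence any node whose
       state is unknown when the fence domain first forms; setting it to
       "1" skips that startup fencing and risks fs corruption, as noted. -->
  <fence_daemon clean_start="0" post_join_delay="3"/>
</cluster>
```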
While gfs is mounted on a node (whether the cluster is quorate or not) there is no way for us to "block the file system" as you suggest.

Quorum and fencing solve two different problems. Quorum adds "sanity" to a split-brain scenario by allowing one partition to go ahead and do work. Fencing forcibly prevents nodes that are not participating in the quorate cluster, and could potentially still be writing to the fs, from doing so. If we could be certain that a node in an inquorate cluster (or a node that's been removed from the cluster) would not write to the fs, then we wouldn't need to fence it.

And there are improvements that we could add along those lines. In our example above, some new userland daemon on node A could recognize that it's the only remaining node in an inquorate cluster. Based on this knowledge it could decide to do something akin to 'gfs_tool withdraw' to forcibly shut down local access to the fs and return errors to any processes using the fs. If that was successful, it could then record somewhere (probably on shared storage) that it has safely and completely shut down/unmounted the fs. When B and C start up, they could safely bypass fencing A if they saw this record that A had in fact safely shut down fs access. That kind of feature would be really neat to work on, given the time...
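To make the control flow of that hypothetical daemon concrete, here is a small, testable simulation of its logic. Everything below is an illustration of the idea, not a real cman/gfs API: the quorum observations stand in for periodically polling the cluster manager, and the withdraw/record callables stand in for 'gfs_tool withdraw' and the shared-storage marker described above.

```python
from typing import Callable, Iterable


def watch_quorum(quorum_states: Iterable[bool],
                 withdraw: Callable[[], None],
                 record_clean_shutdown: Callable[[], None]) -> None:
    """On the first inquorate observation, withdraw from the fs,
    record the clean shutdown, and stop watching."""
    for quorate in quorum_states:
        if quorate:
            continue
        withdraw()               # shut down local fs access, return errors to users
        record_clean_shutdown()  # e.g. write a marker on shared storage
        return                   # B,C could now bypass fencing this node


# Simulated run: node A sees itself quorate twice, then its partition
# loses quorum.
events = []
watch_quorum([True, True, False],
             withdraw=lambda: events.append("withdraw"),
             record_clean_shutdown=lambda: events.append("record"))
print(events)  # → ['withdraw', 'record']
```

The design point is the ordering: only after the withdraw succeeds is the clean shutdown recorded, so a node that crashes mid-withdraw leaves no record and would still be fenced normally.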