Bug 1426324
| Summary: | common-ha: setup after teardown often fails | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Kaleb KEITHLEY <kkeithle> |
| Component: | common-ha | Assignee: | Kaleb KEITHLEY <kkeithle> |
| Status: | CLOSED ERRATA | QA Contact: | surabhi <sbhaloth> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.2 | CC: | amukherj, asoman, bturner, bugs, jthottan, rcyriac, rhinduja, rhs-bugs, skoduri, storage-qa-internal |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | RHGS 3.2.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.8.4-16 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1426323 | Environment: | |
| Last Closed: | 2017-03-23 06:05:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1426323 | | |
| Bug Blocks: | 1351528 | | |
Description
Kaleb KEITHLEY
2017-02-23 17:03:00 UTC
Note: we would only take the change:

```diff
- pcs cluster destroy
+ pcs cluster destroy --all
```

Proposing this as a blocker post discussion with Kaleb. I understand the fix may be the cleaner approach, but would like to understand whether it is a blocker for this release at this point.

We do run --cleanup internally as part of the "gluster nfs-ganesha disable" command, which does the teardown. The only issue was that if the setup fails or ends up in an inconsistent state, the admin needs to manually run --teardown and --cleanup prior to re-setup, which is being documented as part of bug 1399122 in the Troubleshooting section. Are those steps not enough?

It appears in testing that the loop at line 326 is not always run, and thus the

```
pcs cluster stop $server --force
pcs cluster node remove $server
```

steps aren't being performed. As a result, those nodes remain members of the cluster, and since cleanup only runs on this node, the other nodes' membership in the cluster is "remembered" the next time a cluster is created.

Tom Jelinek, the pcs/pcsd developer, says that shutting down by stopping and removing one node at a time is problematic, e.g. the quorum state could cause (unspecified) issues. He says `pcs cluster destroy --all` is a better implementation.

upstream patch: https://review.gluster.org/#/c/16737/
downstream patch: https://code.engineering.redhat.com/gerrit/#/c/98648/

Tried teardown with `gluster nfs-ganesha disable` and then enabling again on both RHEL6 and RHEL7; don't see any issue with bringing up the cluster again. With this fix there is no need to do manual cleanup after teardown and before enabling ganesha. Marking the BZ verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html
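For context, a minimal sketch of the two teardown approaches discussed above, assuming a ganesha-ha.sh-style script; the function names and the `HA_SERVERS` variable are illustrative assumptions, not the exact code from the patches linked above.

```bash
#!/bin/bash
# Illustrative sketch only: function and variable names are assumed,
# not copied from ganesha-ha.sh or the patch under review.

HA_SERVERS="node1 node2 node3"   # assumed space-separated list of HA nodes

# Old behavior: stop and remove cluster nodes one at a time. If this
# loop is skipped (as observed in testing) or fails partway through,
# the remaining nodes stay in the cluster and their membership is
# "remembered" the next time a cluster is created.
teardown_per_node()
{
    local server
    for server in ${HA_SERVERS}; do
        pcs cluster stop ${server} --force
        pcs cluster node remove ${server}
    done
    pcs cluster destroy    # destroys the configuration on this node only
}

# Fixed behavior: destroy the cluster configuration on every node in a
# single operation, avoiding the per-node quorum issues the pcs/pcsd
# developer described.
teardown_all_nodes()
{
    pcs cluster destroy --all
}
```

The design point is that `pcs cluster destroy --all` removes the cluster configuration from all nodes in one step, so no node is left remembering stale membership even if the script fails partway through or only ever runs on the local node.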