Bug 1425112 - [Ganesha] : Unable to bring up a Ganesha HA cluster on RHEL 6.9.
Alias: None
Product: GlusterFS
Classification: Community
Component: common-ha
Version: 3.8
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: Kaleb KEITHLEY
QA Contact:
Depends On: 1424944 1425110
Reported: 2017-02-20 15:13 UTC by Kaleb KEITHLEY
Modified: 2017-03-18 10:52 UTC (History)

Fixed In Version: glusterfs-3.8.10
Doc Text:
Clone Of: 1425110
Last Closed: 2017-03-14 11:10:23 UTC
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:

Comment 1 Kaleb KEITHLEY 2017-02-20 15:16:38 UTC
CLI Output

+ pcs property set stonith-enabled=false
Error: unable to get cib
Error: unable to get cib
+ sleep 4
+ pcs cluster start --all
gqas014: Unable to connect to gqas014.sbu.lab.eng.bos.redhat.com (Connection error)
gqas015: Unable to connect to gqas015.sbu.lab.eng.bos.redhat.com (Connection error)
gqas009: Unable to connect to gqas009.sbu.lab.eng.bos.redhat.com (Connection error)
gqas010: Unable to connect to gqas010.sbu.lab.eng.bos.redhat.com (Connection error)
Error: unable to start all nodes

Cluster developers say this is the result of the new asynchronous behavior of the `pcs cluster setup ...` command.

SSL auth certificates have to be deployed to all nodes before the cluster will accept connections.

They suggest a delay of approximately 12 seconds between `pcs cluster setup ...` and `pcs cluster start --all`.
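
A minimal sketch of the suggested workaround, not the committed change itself. The helper name `setup_then_start` and the `PCS_SETTLE_DELAY` variable are hypothetical, and the `--name` form of `pcs cluster setup` is assumed from the pcs 0.9 series shipped with RHEL 6:

```shell
#!/bin/sh
# Sketch only: setup_then_start and PCS_SETTLE_DELAY are hypothetical names,
# not part of pcs or of the GlusterFS common-ha scripts.
setup_then_start() {
    cluster_name="$1"
    shift
    # Create the cluster on the remaining arguments (the node names).
    # Assumption: pcs 0.9 syntax, `pcs cluster setup --name <name> <nodes...>`.
    pcs cluster setup --name "$cluster_name" "$@" || return 1
    # Give the SSL auth certs time to reach every node before connecting.
    # 12 seconds is the approximate delay suggested by the cluster developers.
    sleep "${PCS_SETTLE_DELAY:-12}"
    pcs cluster start --all
}
```

It would be called as `setup_then_start <cluster-name> <node> [<node> ...]`. A fixed sleep is the simplest reading of the suggestion; the delay is left overridable because the 12 seconds is an estimate, not a guarantee.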

Comment 2 Worker Ant 2017-02-20 16:19:44 UTC
REVIEW: https://review.gluster.org/16691 (common-ha: unable to start HA, Connection Error) posted (#1) for review on release-3.8 by Kaleb KEITHLEY (kkeithle@redhat.com)

Comment 3 Worker Ant 2017-02-26 19:14:59 UTC
COMMIT: https://review.gluster.org/16691 committed in release-3.8 by Kaleb KEITHLEY (kkeithle@redhat.com) 
commit 5d499cc221850fb1f83b625df5a113e0b83d0a99
Author: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Date:   Mon Feb 20 11:14:53 2017 -0500

    common-ha: unable to start HA, Connection Error
    See BZ 1284404. pcsd behavior has changed and pcsd will not accept
    connections until SSL certificates have fully propagated throughout
    all the nodes
    HA devels suggest a 12 second delay between the `pcs cluster setup ...`
    and the `pcs cluster start --all`
    release-3.9 BZ: 1425110
    release-3.9 change: https://review.gluster.org/16690
    Change-Id: If94b6991a62f346dbead023c7e7f8282a995728c
    BUG: 1425112
    Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
    Reviewed-on: https://review.gluster.org/16691
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>

Comment 4 Niels de Vos 2017-03-18 10:52:28 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.10, please open a new bug report.

glusterfs-3.8.10 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-March/000068.html
[2] https://www.gluster.org/pipermail/gluster-users/
