1336947 – [NFS-Ganesha] : stonith-enabled option not set with new versions of cman,pacemaker,corosync and pcs

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	common-ha
Sub Component:
Version:	3.8.0
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Assignee:	Kaleb KEITHLEY
Depends On:	1334092 1336945
Blocks:	1336948

Reported:	2016-05-17 21:17 UTC by Kaleb KEITHLEY
Modified:	2016-06-16 14:06 UTC
CC List:	10 users
Fixed In Version:	glusterfs-3.8rc2
Clone Of:	1336945
Clones:	1336948
Environment:
Last Closed:	2016-06-16 14:06:58 UTC
Regression:	---
Mount Type:	---
Documentation:	---

Comment 1 Vijay Bellur 2016-05-17 21:39:50 UTC

REVIEW: http://review.gluster.org/14405 (common-ha: stonith-enabled option set error in new pacemaker) posted (#1) for review on release-3.8 by Kaleb KEITHLEY (kkeithle)

Comment 2 Vijay Bellur 2016-05-19 10:17:23 UTC

COMMIT: http://review.gluster.org/14405 committed in release-3.8 by Kaleb KEITHLEY (kkeithle) 
------
commit 8603b0ae4e82d5d8f3ad129c490a0c717aaadcb4
Author: Kaleb S KEITHLEY <kkeithle>
Date:   Tue May 17 17:35:02 2016 -0400

    common-ha: stonith-enabled option set error in new pacemaker
    
    Setting the option too early results in an error in newer versions
    of pacemaker. Postpone setting the option in order for it to succeed.
    
    N.B. We do not use a fencing agent. Yes, we know this is "not supported."
    
    Backport from mainline
    > http://review.gluster.org/#/c/14404/
    > BUG: 1336945
    > Change-Id: I86953fdd67e6736294dbd2d0795611837188bd9d
    
    Change-Id: I402992bcb90a92dbcc915a75fe03b25221625e98
    BUG: 1336947
    Signed-off-by: Kaleb S KEITHLEY <kkeithle>
    Reviewed-on: http://review.gluster.org/14405
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>

Comment 3 Ken Gaillot 2016-05-19 14:50:42 UTC

Your "too early" comment finally rang a bell: What you're seeing is an unfortunately necessary side effect of a fix for an issue in 6.7. The situation was discussed in BZ#1320740.

The bottom line is, yes, you have to wait a short while after startup before querying or writing to the CIB in RHEL 6.8, so the cluster has time to elect a DC. The easiest way is just to loop if the option change fails, or to loop until "crmadmin --dc_lookup --timeout=5000" exits 0 before doing the option change.
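
One way to read the first suggestion above (just loop if the option change fails) is a small retry wrapper. This is a minimal sketch, not the code from the common-ha patch: the function name `retry`, the attempt count, and the delay are hypothetical choices for illustration.

```shell
#!/bin/bash
# Sketch only: "retry" is a hypothetical helper, not part of common-ha.

# Run a command until it succeeds or the attempt limit is reached.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining args = the command to run
retry() {
    local max_tries=$1
    local delay=$2
    shift 2
    local tries=0
    until "$@"; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            echo "giving up after $tries attempts: $*" >&2
            return 1
        fi
        sleep "$delay"
    done
    return 0
}
```

With this in place, the option change from this bug could be attempted as `retry 10 2 pcs property set stonith-enabled=false`; the values 10 and 2 are arbitrary and would need tuning for a real cluster.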

Comment 4 Vijay Bellur 2016-05-19 17:07:02 UTC

REVIEW: http://review.gluster.org/14427 (common-ha: wait for cluster to elect DC before accessing CIB) posted (#1) for review on release-3.8 by Kaleb KEITHLEY (kkeithle)

Comment 5 Vijay Bellur 2016-05-24 09:36:21 UTC

COMMIT: http://review.gluster.org/14427 committed in release-3.8 by Kaleb KEITHLEY (kkeithle) 
------
commit cfb0ef35483eb0656ca190c1665e68c5ff448092
Author: Kaleb S KEITHLEY <kkeithle>
Date:   Thu May 19 13:04:50 2016 -0400

    common-ha: wait for cluster to elect DC before accessing CIB
    
    access attempts, e.g. `pcs property set stonith-enabled=false`
    will fail (or time out) if attempted "too early", i.e. before
    the cluster has elected its DC.
    
    see https://bugzilla.redhat.com/show_bug.cgi?id=1336947#c3 and
    https://bugzilla.redhat.com/show_bug.cgi?id=1320740
    
    Change-Id: Ifc0aa7ce652c1da339b9eb8fe17e40e8a09b1096
    BUG: 1336947
    Signed-off-by: Kaleb S KEITHLEY <kkeithle>
    Reviewed-on: http://review.gluster.org/14427
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: soumya k <skoduri>
    Reviewed-by: jiffin tony Thottan <jthottan>
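
The idea in this commit message, waiting for a DC before touching the CIB, could be sketched with the `crmadmin` probe suggested in comment 3. The following is an illustration under assumptions, not the committed change: `wait_for_dc`, its defaults of 12 attempts and a 1 second pause, and the optional probe-override arguments are hypothetical.

```shell
#!/bin/bash
# Sketch only: wait_for_dc and its defaults are hypothetical,
# not taken from the committed common-ha change.

# Poll until the cluster reports an elected DC, or give up.
#   $1 = maximum number of attempts (default 12)
#   $2 = seconds to sleep between attempts (default 1)
#   remaining args = probe command; defaults to the crmadmin
#                    call quoted in comment 3
wait_for_dc() {
    local max_tries=${1:-12}
    local delay=${2:-1}
    if [ $# -ge 2 ]; then shift 2; else shift $#; fi
    if [ $# -eq 0 ]; then
        set -- crmadmin --dc_lookup --timeout=5000
    fi
    local tries=0
    until "$@" >/dev/null 2>&1; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            return 1    # no DC seen within the attempt limit
        fi
        sleep "$delay"
    done
    return 0            # probe exited 0: a DC has been elected
}
```

A caller might then run `wait_for_dc && pcs property set stonith-enabled=false`, so the CIB write is only attempted once the probe has exited 0. Passing the probe as trailing arguments is only there to make the function easy to exercise on a machine without pacemaker installed.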

Comment 6 Niels de Vos 2016-06-16 14:06:58 UTC

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user