Bug 1441055

Summary: pacemaker service is disabled after creating nfs-ganesha cluster.
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Arthy Loganathan <aloganat>
Component: common-ha
Assignee: Kaleb KEITHLEY <kkeithle>
Status: CLOSED ERRATA
QA Contact: Manisha Saini <msaini>
Severity: medium
Docs Contact:
Priority: unspecified
Version: rhgs-3.2
CC: amukherj, dang, ffilz, jthottan, kkeithle, mbenjamin, rhs-bugs, skoduri, storage-qa-internal
Target Milestone: ---
Target Release: RHGS 3.3.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-3.8.4-29
Doc Type: Bug Fix
Doc Text:
During NFS-Ganesha cluster setup, the pcsd service first destroyed any existing cluster, which disabled the pacemaker service. This meant that the pacemaker service did not start automatically after a reboot. The pacemaker service is now explicitly re-enabled after successful NFS-Ganesha cluster setup, and the pacemaker service is started automatically after node reboot.
Story Points: ---
Clone Of:
: 1452614 1641673
Environment:
Last Closed: 2017-09-21 04:37:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1417151, 1452614
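
The Doc Text above describes the fix as explicitly re-enabling pacemaker once cluster setup has succeeded. The patch itself is not quoted in this report, so the following is only a sketch of that idea, not the shipped change. The ensure_enabled helper and the SYSTEMCTL override variable are hypothetical names introduced for illustration; "systemctl is-enabled" and "systemctl enable" are the standard systemd subcommands.

```shell
# Sketch only: this is NOT the code from the downstream patch.
# SYSTEMCTL is a hypothetical override so the helper can be dry-run.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# Enable a unit for boot only if it is not already enabled.
ensure_enabled() {
    unit=$1
    # is-enabled exits non-zero for disabled units, so ignore its status
    # and compare the printed state instead.
    state=$("$SYSTEMCTL" is-enabled "$unit" 2>/dev/null) || true
    if [ "$state" != "enabled" ]; then
        "$SYSTEMCTL" enable "$unit"
    fi
}
```

On an affected build, running "ensure_enabled pacemaker.service" on each node after "gluster nfs-ganesha enable" reports success would restore the boot-time setting by hand.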

Description Arthy Loganathan 2017-04-11 06:09:31 UTC
Description of problem:
The pacemaker service is left disabled after the nfs-ganesha cluster is enabled, even if pacemaker was enabled before the cluster was created.

Version-Release number of selected component (if applicable):
glusterfs-ganesha-3.8.4-18.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Enable pacemaker service on all the ganesha nodes.
   systemctl enable pacemaker.service
2. Enable nfs-ganesha.
   gluster nfs-ganesha enable
3. Check pcs status.

Actual results:
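The pacemaker service is running but in the disabled state (pcs status reports "pacemaker: active/disabled"), so it will not start automatically after a reboot.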

Expected results:
pacemaker service should be enabled.

Additional info:

[root@dhcp46-42 ~]# systemctl enable pacemaker.service
Created symlink from /etc/systemd/system/multi-user.target.wants/pacemaker.service to /usr/lib/systemd/system/pacemaker.service.
[root@dhcp46-42 ~]# systemctl status pacemaker.service
● pacemaker.service - Pacemaker High Availability Cluster Manager
   Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:pacemakerd
           http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/index.html

Apr 11 11:31:47 dhcp46-42.lab.eng.blr.redhat.com pacemakerd[20025]:   notice: Stopping stonith-ng
Apr 11 11:31:47 dhcp46-42.lab.eng.blr.redhat.com stonith-ng[20030]:   notice: Caught 'Terminated' signal
Apr 11 11:31:47 dhcp46-42.lab.eng.blr.redhat.com pacemakerd[20025]:   notice: Stopping cib
Apr 11 11:31:47 dhcp46-42.lab.eng.blr.redhat.com cib[20029]:   notice: Caught 'Terminated' signal
Apr 11 11:31:47 dhcp46-42.lab.eng.blr.redhat.com cib[20029]:   notice: Node dhcp46-101.lab.eng.blr.redhat.com state is now lost
Apr 11 11:31:47 dhcp46-42.lab.eng.blr.redhat.com cib[20029]:   notice: Purged 1 peers with id=2 and/or uname=dhcp46-101.lab.eng.blr.redhat.com from the membership cache
Apr 11 11:31:47 dhcp46-42.lab.eng.blr.redhat.com cib[20029]:   notice: Disconnected from Corosync
Apr 11 11:31:47 dhcp46-42.lab.eng.blr.redhat.com cib[20029]:   notice: Disconnected from Corosync
Apr 11 11:31:47 dhcp46-42.lab.eng.blr.redhat.com pacemakerd[20025]:   notice: Shutdown complete
Apr 11 11:31:47 dhcp46-42.lab.eng.blr.redhat.com systemd[1]: Stopped Pacemaker High Availability Cluster Manager.
[root@dhcp46-42 ~]# 
[root@dhcp46-42 ~]# gluster nfs-ganesha enable
Enabling NFS-Ganesha requires Gluster-NFS to be disabled across the trusted pool. Do you still want to continue?
 (y/n) y
This will take a few minutes to complete. Please wait ..
nfs-ganesha : success 
[root@dhcp46-42 ~]# systemctl status pacemaker.service
● pacemaker.service - Pacemaker High Availability Cluster Manager
   Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2017-04-11 11:34:07 IST; 1min 42s ago
     Docs: man:pacemakerd
           http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/index.html
 Main PID: 29070 (pacemakerd)
   CGroup: /system.slice/pacemaker.service
           ├─29070 /usr/sbin/pacemakerd -f
           ├─29072 /usr/libexec/pacemaker/cib
           ├─29073 /usr/libexec/pacemaker/stonithd
           ├─29074 /usr/libexec/pacemaker/lrmd
           ├─29076 /usr/libexec/pacemaker/attrd
           ├─29077 /usr/libexec/pacemaker/pengine
           └─29078 /usr/libexec/pacemaker/crmd

Apr 11 11:35:13 dhcp46-42.lab.eng.blr.redhat.com lrmd[29074]:   notice: dhcp46-42.lab.eng.blr.redhat.com-nfs_unblock_monitor_10000:30946:stderr [ 0 bytes (0 B) copied, 0.0030081...0.0 kB/s ]
Apr 11 11:35:23 dhcp46-42.lab.eng.blr.redhat.com lrmd[29074]:   notice: dhcp46-42.lab.eng.blr.redhat.com-nfs_unblock_monitor_10000:31091:stderr [ 0+0 records in ]
Apr 11 11:35:23 dhcp46-42.lab.eng.blr.redhat.com lrmd[29074]:   notice: dhcp46-42.lab.eng.blr.redhat.com-nfs_unblock_monitor_10000:31091:stderr [ 0+0 records out ]
Apr 11 11:35:23 dhcp46-42.lab.eng.blr.redhat.com lrmd[29074]:   notice: dhcp46-42.lab.eng.blr.redhat.com-nfs_unblock_monitor_10000:31091:stderr [ 0 bytes (0 B) copied, 0.0025322...0.0 kB/s ]
Apr 11 11:35:33 dhcp46-42.lab.eng.blr.redhat.com lrmd[29074]:   notice: dhcp46-42.lab.eng.blr.redhat.com-nfs_unblock_monitor_10000:31191:stderr [ 0+0 records in ]
Apr 11 11:35:33 dhcp46-42.lab.eng.blr.redhat.com lrmd[29074]:   notice: dhcp46-42.lab.eng.blr.redhat.com-nfs_unblock_monitor_10000:31191:stderr [ 0+0 records out ]
Apr 11 11:35:33 dhcp46-42.lab.eng.blr.redhat.com lrmd[29074]:   notice: dhcp46-42.lab.eng.blr.redhat.com-nfs_unblock_monitor_10000:31191:stderr [ 0 bytes (0 B) copied, 0.0044053...0.0 kB/s ]
Apr 11 11:35:43 dhcp46-42.lab.eng.blr.redhat.com lrmd[29074]:   notice: dhcp46-42.lab.eng.blr.redhat.com-nfs_unblock_monitor_10000:31336:stderr [ 0+0 records in ]
Apr 11 11:35:43 dhcp46-42.lab.eng.blr.redhat.com lrmd[29074]:   notice: dhcp46-42.lab.eng.blr.redhat.com-nfs_unblock_monitor_10000:31336:stderr [ 0+0 records out ]
Apr 11 11:35:43 dhcp46-42.lab.eng.blr.redhat.com lrmd[29074]:   notice: dhcp46-42.lab.eng.blr.redhat.com-nfs_unblock_monitor_10000:31336:stderr [ 0 bytes (0 B) copied, 0.0032410...0.0 kB/s ]
Hint: Some lines were ellipsized, use -l to show in full.

[root@dhcp46-42 ~]# pcs status
Cluster name: ganesha-ha-360
Stack: corosync
Current DC: dhcp46-42.lab.eng.blr.redhat.com (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Tue Apr 11 11:38:35 2017		Last change: Tue Apr 11 11:34:51 2017 by root via cibadmin on dhcp46-42.lab.eng.blr.redhat.com

4 nodes and 24 resources configured

Online: [ dhcp46-101.lab.eng.blr.redhat.com dhcp46-42.lab.eng.blr.redhat.com dhcp47-155.lab.eng.blr.redhat.com dhcp47-167.lab.eng.blr.redhat.com ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ dhcp46-101.lab.eng.blr.redhat.com dhcp46-42.lab.eng.blr.redhat.com dhcp47-155.lab.eng.blr.redhat.com dhcp47-167.lab.eng.blr.redhat.com ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp46-101.lab.eng.blr.redhat.com dhcp46-42.lab.eng.blr.redhat.com dhcp47-155.lab.eng.blr.redhat.com dhcp47-167.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ dhcp46-101.lab.eng.blr.redhat.com dhcp46-42.lab.eng.blr.redhat.com dhcp47-155.lab.eng.blr.redhat.com dhcp47-167.lab.eng.blr.redhat.com ]
 Resource Group: dhcp46-42.lab.eng.blr.redhat.com-group
     dhcp46-42.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp46-42.lab.eng.blr.redhat.com
     dhcp46-42.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp46-42.lab.eng.blr.redhat.com
     dhcp46-42.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp46-42.lab.eng.blr.redhat.com
 Resource Group: dhcp46-101.lab.eng.blr.redhat.com-group
     dhcp46-101.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp46-101.lab.eng.blr.redhat.com
     dhcp46-101.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp46-101.lab.eng.blr.redhat.com
     dhcp46-101.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp46-101.lab.eng.blr.redhat.com
 Resource Group: dhcp47-155.lab.eng.blr.redhat.com-group
     dhcp47-155.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp47-155.lab.eng.blr.redhat.com
     dhcp47-155.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp47-155.lab.eng.blr.redhat.com
     dhcp47-155.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp47-155.lab.eng.blr.redhat.com
 Resource Group: dhcp47-167.lab.eng.blr.redhat.com-group
     dhcp47-167.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp47-167.lab.eng.blr.redhat.com
     dhcp47-167.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp47-167.lab.eng.blr.redhat.com
     dhcp47-167.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp47-167.lab.eng.blr.redhat.com

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
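
The "Daemon Status" block above is where the problem is visible: pacemaker is reported as active/disabled. A minimal sketch for pulling that boot-time state out of "pcs status" output follows. It assumes the "name: running-state/boot-state" layout shown in this report, and daemon_boot_state is a hypothetical helper name, not part of pcs.

```shell
# Print the boot-time state (enabled/disabled) that `pcs status` reports
# for one daemon. Reads `pcs status` text on stdin and exits non-zero
# if the daemon is not listed under "Daemon Status:".
daemon_boot_state() {
    awk -v d="$1" '
        /^Daemon Status:/ { in_block = 1; next }
        in_block && $1 == (d ":") {
            split($2, state, "/")   # e.g. "active/disabled"
            print state[2]
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    '
}
```

Usage: "pcs status | daemon_boot_state pacemaker". Parsing the text output keeps the check close to what the reporter looked at, but it is fragile if the layout changes; "systemctl is-enabled pacemaker.service" is the more direct query.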

Comment 4 Atin Mukherjee 2017-06-09 12:52:17 UTC
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/108431/

Comment 6 Atin Mukherjee 2017-06-14 10:04:47 UTC
One more downstream-only patch, https://code.engineering.redhat.com/gerrit/#/c/109017/, is needed to fix the regression caused by https://code.engineering.redhat.com/gerrit/#/c/108431/

Comment 8 Manisha Saini 2017-07-17 11:40:19 UTC
Verified this bug with the following package versions:

# rpm -qa | grep ganesha
nfs-ganesha-2.4.4-15.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.4-15.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-33.el7rhgs.x86_64


Steps:
1. Enable pacemaker service on all the ganesha nodes.
   systemctl enable pacemaker.service
2. Enable nfs-ganesha.
   gluster nfs-ganesha enable
3. Check pcs status.
4. Reboot the node. When the node comes back up, check that the pacemaker service is running.
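
Step 4 can be scripted along these lines. This is a sketch rather than the procedure QA actually ran: check_after_reboot and the SYSTEMCTL override are hypothetical names, while "systemctl is-enabled" and "systemctl is-active" are the standard systemd queries.

```shell
# Hypothetical override so the check can be exercised without systemd.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# Succeeds only if the unit is both enabled for boot and currently running.
# Prints the result in the same running-state/boot-state order as pcs status.
check_after_reboot() {
    unit=$1
    # Both queries exit non-zero on a negative answer; keep only the text.
    enabled=$("$SYSTEMCTL" is-enabled "$unit" 2>/dev/null) || true
    active=$("$SYSTEMCTL" is-active "$unit" 2>/dev/null) || true
    echo "$unit: $active/$enabled"
    [ "$enabled" = "enabled" ] && [ "$active" = "active" ]
}
```

Run "check_after_reboot pacemaker.service" on the rebooted node; a non-zero exit status means the service either did not start or is not enabled for boot.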

[root@dhcp42-125 yum.repos.d]# systemctl enable pacemaker.service
[root@dhcp42-125 ganesha]# gluster nfs-ganesha enable
Enabling NFS-Ganesha requires Gluster-NFS to be disabled across the trusted pool. Do you still want to continue?
 (y/n) y
This will take a few minutes to complete. Please wait ..
nfs-ganesha : success 

[root@dhcp42-125 ganesha]# pcs status
Cluster name: ganesha-ha-360
Stack: corosync
Current DC: dhcp42-127.lab.eng.blr.redhat.com (version 1.1.16-12.el7-94ff4df) - partition with quorum
Last updated: Mon Jul 17 17:05:41 2017
Last change: Mon Jul 17 17:05:25 2017 by root via cibadmin on dhcp42-125.lab.eng.blr.redhat.com

4 nodes configured
24 resources configured

Online: [ dhcp42-119.lab.eng.blr.redhat.com dhcp42-125.lab.eng.blr.redhat.com dhcp42-127.lab.eng.blr.redhat.com dhcp42-129.lab.eng.blr.redhat.com ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ dhcp42-119.lab.eng.blr.redhat.com dhcp42-125.lab.eng.blr.redhat.com dhcp42-127.lab.eng.blr.redhat.com dhcp42-129.lab.eng.blr.redhat.com ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp42-119.lab.eng.blr.redhat.com dhcp42-125.lab.eng.blr.redhat.com dhcp42-127.lab.eng.blr.redhat.com dhcp42-129.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ dhcp42-119.lab.eng.blr.redhat.com dhcp42-125.lab.eng.blr.redhat.com dhcp42-127.lab.eng.blr.redhat.com dhcp42-129.lab.eng.blr.redhat.com ]
 Resource Group: dhcp42-125.lab.eng.blr.redhat.com-group
     dhcp42-125.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp42-125.lab.eng.blr.redhat.com
     dhcp42-125.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp42-125.lab.eng.blr.redhat.com
     dhcp42-125.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp42-125.lab.eng.blr.redhat.com
 Resource Group: dhcp42-127.lab.eng.blr.redhat.com-group
     dhcp42-127.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp42-127.lab.eng.blr.redhat.com
     dhcp42-127.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp42-127.lab.eng.blr.redhat.com
     dhcp42-127.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp42-127.lab.eng.blr.redhat.com
 Resource Group: dhcp42-129.lab.eng.blr.redhat.com-group
     dhcp42-129.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp42-129.lab.eng.blr.redhat.com
     dhcp42-129.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp42-129.lab.eng.blr.redhat.com
     dhcp42-129.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp42-129.lab.eng.blr.redhat.com
 Resource Group: dhcp42-119.lab.eng.blr.redhat.com-group
     dhcp42-119.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp42-119.lab.eng.blr.redhat.com
     dhcp42-119.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp42-119.lab.eng.blr.redhat.com
     dhcp42-119.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp42-119.lab.eng.blr.redhat.com

Daemon Status:
  corosync: active/disabled
  pacemaker: active/enabled
  pcsd: active/enabled

Comment 10 Soumya Koduri 2017-09-11 12:41:32 UTC
Doc text looks good to me.

Comment 11 Kaleb KEITHLEY 2017-09-11 12:51:08 UTC
agreed, doc text looks okay

Comment 13 errata-xmlrpc 2017-09-21 04:37:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774