Bug 1259221 - Add node of nfs-ganesha not working on rhel7.1
Summary: Add node of nfs-ganesha not working on rhel7.1
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: nfs-ganesha
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.1.1
Assignee: Soumya Koduri
QA Contact: Apeksha
URL:
Whiteboard:
Depends On:
Blocks: 1251815 1259225
Reported: 2015-09-02 08:47 UTC by Apeksha
Modified: 2015-10-05 10:43 UTC (History)

Fixed In Version: glusterfs-3.7.1-15
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1259225
Environment:
Last Closed: 2015-10-05 10:43:43 UTC
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2015:1846 0 normal SHIPPED_LIVE Moderate: Red Hat Gluster Storage 3.1 update 2015-10-05 14:43:08 UTC

Description Apeksha 2015-09-02 08:47:54 UTC
Description of problem:
Add node of nfs-ganesha not working on rhel7.1

Version-Release number of selected component (if applicable):
glusterfs-3.7.1-13.el7rhgs.x86_64
nfs-ganesha-2.2.0-6.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Set up a 2-node cluster
2. Peer probe a new node and complete all the prerequisites
3. Run the add node script; it fails

Actual results:
Add node script fails

Expected results: Add node script must be successful


Additional info:

Comment 3 Apeksha 2015-09-02 09:20:54 UTC
After changing the permissions of the secret.pem file:

[root@dhcp37-137 ~]# /usr/libexec/ganesha/ganesha-ha.sh --add /etc/ganesha dhcp37-100.lab.eng.blr.redhat.com 10.70.36.219
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
tmp.8gfaQYPwGm                                                                      100%    0     0.0KB/s   00:00    
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
/tmp/tmp.rvg8s6iYLy/exports: No such file or directory
Unknown operation 'nfs-ganesha'.
dhcp37-137.lab.eng.blr.redhat.com: Corosync updated
dhcp37-56.lab.eng.blr.redhat.com: Corosync updated
dhcp37-100.lab.eng.blr.redhat.com: Succeeded
Adding nfs_start-dhcp37-100.lab.eng.blr.redhat.com nfs-mon-clone (kind: Mandatory) (Options: first-action=start then-action=start)
CIB updated
dhcp37-100.lab.eng.blr.redhat.com: Starting Cluster...
Removing Constraint - location-nfs_start-dhcp37-100.lab.eng.blr.redhat.com-dhcp37-100.lab.eng.blr.redhat.com-INFINITY
Removing Constraint - order-nfs_start-dhcp37-100.lab.eng.blr.redhat.com-nfs-mon-clone-mandatory
Deleting Resource - nfs_start-dhcp37-100.lab.eng.blr.redhat.com
Removing Constraint - colocation-dhcp37-137.lab.eng.blr.redhat.com-cluster_ip-1-dhcp37-137.lab.eng.blr.redhat.com-trigger_ip-1-INFINITY
Removing Constraint - location-dhcp37-137.lab.eng.blr.redhat.com-cluster_ip-1
Removing Constraint - location-dhcp37-137.lab.eng.blr.redhat.com-cluster_ip-1-dhcp37-56.lab.eng.blr.redhat.com-1000
Removing Constraint - location-dhcp37-137.lab.eng.blr.redhat.com-cluster_ip-1-dhcp37-137.lab.eng.blr.redhat.com-2000
Removing Constraint - order-nfs-grace-clone-dhcp37-137.lab.eng.blr.redhat.com-cluster_ip-1-mandatory
Deleting Resource - dhcp37-137.lab.eng.blr.redhat.com-cluster_ip-1
Removing Constraint - order-dhcp37-137.lab.eng.blr.redhat.com-trigger_ip-1-nfs-grace-clone-mandatory
Deleting Resource - dhcp37-137.lab.eng.blr.redhat.com-trigger_ip-1
Removing Constraint - colocation-dhcp37-56.lab.eng.blr.redhat.com-cluster_ip-1-dhcp37-56.lab.eng.blr.redhat.com-trigger_ip-1-INFINITY
Removing Constraint - location-dhcp37-56.lab.eng.blr.redhat.com-cluster_ip-1
Removing Constraint - location-dhcp37-56.lab.eng.blr.redhat.com-cluster_ip-1-dhcp37-137.lab.eng.blr.redhat.com-1000
Removing Constraint - location-dhcp37-56.lab.eng.blr.redhat.com-cluster_ip-1-dhcp37-56.lab.eng.blr.redhat.com-2000
Removing Constraint - order-nfs-grace-clone-dhcp37-56.lab.eng.blr.redhat.com-cluster_ip-1-mandatory
Deleting Resource - dhcp37-56.lab.eng.blr.redhat.com-cluster_ip-1
Removing Constraint - order-dhcp37-56.lab.eng.blr.redhat.com-trigger_ip-1-nfs-grace-clone-mandatory
Deleting Resource - dhcp37-56.lab.eng.blr.redhat.com-trigger_ip-1
Adding dhcp37-137.lab.eng.blr.redhat.com-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone dhcp37-137.lab.eng.blr.redhat.com-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Adding dhcp37-56.lab.eng.blr.redhat.com-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone dhcp37-56.lab.eng.blr.redhat.com-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Adding dhcp37-100.lab.eng.blr.redhat.com-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone dhcp37-100.lab.eng.blr.redhat.com-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
CIB updated
ganesha-ha.conf                                                                     100% 1112     1.1KB/s   00:00    
ganesha-ha.conf                                                                     100% 1112     1.1KB/s   00:00    
ganesha-ha.conf                                                                     100% 1112     1.1KB/s   00:00    


nfs-ganesha process did not start on the newly added node

pcs status output:
[root@dhcp37-137 ~]# pcs status
Cluster name: G1441129758.55
Last updated: Wed Sep  2 04:02:41 2015
Last change: Wed Sep  2 04:02:34 2015
Stack: corosync
Current DC: dhcp37-137.lab.eng.blr.redhat.com (1) - partition with quorum
Version: 1.1.12-a14efad
3 Nodes configured
13 Resources configured


Online: [ dhcp37-100.lab.eng.blr.redhat.com dhcp37-137.lab.eng.blr.redhat.com dhcp37-56.lab.eng.blr.redhat.com ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp37-100.lab.eng.blr.redhat.com dhcp37-137.lab.eng.blr.redhat.com dhcp37-56.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ dhcp37-100.lab.eng.blr.redhat.com dhcp37-137.lab.eng.blr.redhat.com dhcp37-56.lab.eng.blr.redhat.com ]
 dhcp37-137.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Stopped 
 dhcp37-137.lab.eng.blr.redhat.com-trigger_ip-1	(ocf::heartbeat:Dummy):	Started dhcp37-137.lab.eng.blr.redhat.com 
 dhcp37-56.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp37-56.lab.eng.blr.redhat.com 
 dhcp37-56.lab.eng.blr.redhat.com-trigger_ip-1	(ocf::heartbeat:Dummy):	Started dhcp37-56.lab.eng.blr.redhat.com 
 dhcp37-100.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp37-56.lab.eng.blr.redhat.com 
 dhcp37-100.lab.eng.blr.redhat.com-trigger_ip-1	(ocf::heartbeat:Dummy):	Started dhcp37-56.lab.eng.blr.redhat.com 
 dhcp37-100-dead_ip-1	(ocf::heartbeat:Dummy):	Started dhcp37-100.lab.eng.blr.redhat.com 

Failed actions:
    nfs-grace_monitor_5000 on dhcp37-56.lab.eng.blr.redhat.com 'unknown error' (1): call=109, status=Timed Out, exit-reason='none', last-rc-change='Tue Sep  1 04:49:30 2015', queued=0ms, exec=0ms
    nfs-grace_monitor_5000 on dhcp37-137.lab.eng.blr.redhat.com 'unknown error' (1): call=109, status=Timed Out, exit-reason='none', last-rc-change='Tue Sep  1 04:49:25 2015', queued=0ms, exec=20003ms


PCSD Status:
  dhcp37-137.lab.eng.blr.redhat.com: Online
  dhcp37-56.lab.eng.blr.redhat.com: Online
  dhcp37-100.lab.eng.blr.redhat.com: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

Comment 4 Soumya Koduri 2015-09-02 11:56:59 UTC
There is another issue here. If 'add-node' is performed from the node that is HA_VOL_SERVER, the export config state is not copied properly. The reason is that passwordless scp of the config files does not work when the source and destination nodes are the same. This needs to be fixed in the HA script as well.
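
One possible way to handle the same-node case is sketched below. This is only an illustration, not the code from ganesha-ha.sh or from the upstream fix: the function name build_copy_cmd, its arguments, and the default key path are all assumptions. The helper just prints the command it would run, so the decision logic can be checked without touching any node.

```shell
#!/bin/sh
# Hypothetical helper, not taken from ganesha-ha.sh.
# Picks a plain local copy when the source node is the node running the
# script, and key-based scp otherwise.
# The key path is an assumption for illustration; adjust it to your setup.
SECRET_PEM="${SECRET_PEM:-/var/lib/glusterd/nfs/secret.pem}"

build_copy_cmd() {
    src_node=$1     # node that holds the config (for example HA_VOL_SERVER)
    local_node=$2   # node the script is running on
    src_path=$3     # file to fetch
    dst_path=$4     # where to place it locally

    if [ "$src_node" = "$local_node" ]; then
        # Same node: no ssh key exchange is involved, so cp is enough.
        printf 'cp %s %s\n' "$src_path" "$dst_path"
    else
        # Different node: use scp with the shared key.
        printf 'scp -i %s %s:%s %s\n' "$SECRET_PEM" "$src_node" "$src_path" "$dst_path"
    fi
}
```

For example, build_copy_cmd nodeA nodeA /etc/ganesha/exports /tmp/exports prints a cp command, while passing two different node names prints an scp command.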

Comment 5 Soumya Koduri 2015-09-03 09:24:08 UTC
Fixes are merged upstream (bug 1259225).
Please note that the 'scp' issue mentioned above does not occur if the secret.pem file is copied to all the nodes, including the one where it was generated (as already documented in the RHGS 3.1 admin guide). So we will backport only the fix for the 'systemctl' command path.
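
This report does not spell out what was wrong with the 'systemctl' command path, so the following is only a generic sketch of how a script can avoid depending on a fixed binary location. The helper name resolve_cmd is hypothetical and is not part of the HA script.

```shell
#!/bin/sh
# Hypothetical helper: look a command up on PATH instead of assuming
# a fixed location such as /bin or /usr/bin.
resolve_cmd() {
    name=$1
    # command -v prints how the shell would resolve the name and fails
    # if the name cannot be found.
    path=$(command -v "$name" 2>/dev/null) || return 1
    printf '%s\n' "$path"
}

# Example use: only rely on systemctl if it can be found on this host.
if SYSTEMCTL=$(resolve_cmd systemctl); then
    echo "systemctl found at $SYSTEMCTL"
else
    echo "systemctl not found on PATH" >&2
fi
```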

Comment 9 Soumya Koduri 2015-09-08 13:35:29 UTC
As mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1259221#c5 , we do not need to backport the other fix (upstream 12091), as we have documented in the RHGS 3.1 admin guide that ssh-copy-id of the secret.pem file needs to be done on all the nodes, including the one where it is generated.
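
As a rough illustration of that documented prerequisite, the dry-run sketch below prints one ssh-copy-id command per node, with the generating node included in the list. The function name, the example host names, and the public key path are assumptions; follow the admin guide for the exact procedure.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the ssh-copy-id command for every
# node passed in, including the node where the key pair was generated.
# KEY_PUB and the host names below are assumptions for illustration.
KEY_PUB="${KEY_PUB:-/var/lib/glusterd/nfs/secret.pem.pub}"

print_copy_id_cmds() {
    for node in "$@"; do
        printf 'ssh-copy-id -i %s root@%s\n' "$KEY_PUB" "$node"
    done
}

# List every cluster node, including the local one.
print_copy_id_cmds node1.example.com node2.example.com node3.example.com
```

Printing the commands rather than running them makes it easy to confirm that no node, in particular the local one, has been left out.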

Comment 10 Apeksha 2015-09-16 05:31:28 UTC
Ganesha Add node works fine on rhel7.1 with glusterfs-3.7.1-15.el7rhgs.x86_64

Comment 12 errata-xmlrpc 2015-10-05 10:43:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1846.html

