Bug 1228626 - nfs-ganesha: add node fails to add a new node to the cluster
Summary: nfs-ganesha: add node fails to add a new node to the cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: nfs-ganesha
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.1.0
Assignee: Meghana
QA Contact: Saurabh
URL:
Whiteboard:
Depends On:
Blocks: 1202842 1233246 1234216
Reported: 2015-06-05 10:49 UTC by Saurabh
Modified: 2016-01-19 06:14 UTC (History)

Fixed In Version: glusterfs-3.7.1-5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1233246
Environment:
Last Closed: 2015-07-29 04:58:14 UTC
Embargoed:


Attachments
sosreport of nfs5 (11.38 MB, application/x-xz), 2015-06-05 10:56 UTC, Saurabh
sosreport of nfs9 (6.34 MB, application/x-xz), 2015-06-05 10:57 UTC, Saurabh


Links
System: Red Hat Product Errata
ID: RHSA-2015:1495
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Gluster Storage 3.1 update
Last Updated: 2015-07-29 08:26:26 UTC

Description Saurabh 2015-06-05 10:49:50 UTC
Description of problem:
I tried to add a node to the nfs-ganesha cluster and it failed.
I started with 4 nodes and then tried to add a fifth.

Version-Release number of selected component (if applicable):
glusterfs-3.7.0-3.el6rhs.x86_64
nfs-ganesha-2.2.0-0.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. Create a 6x2 volume and start it.
2. Complete the prerequisites, then start nfs-ganesha.
3. gluster peer probe <new node>
4. Run pcs cluster auth for the new node (on all nodes, including the new node).
5. Execute the command:
   time /usr/libexec/ganesha/ganesha-ha.sh --add /etc/ganesha nfs9 10.70.44.96

Actual results:
Step 5 fails with the following output:

          </lrm_resource>
          <lrm_resource class="ocf" id="nfs8-cluster_ip-1" provider="heartbeat" type="IPaddr">
            <lrm_rsc_op call-id="61" crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9" exec-time="204" id="nfs8-cluster_ip-1_last_0" interval="0" last-rc-change="1433516461" last-run="1433516461" on_node="nfs8" op-digest="2d9313cea8ae5ad4f2081572c52479f9" op-status="0" operation="start" operation_key="nfs8-cluster_ip-1_start_0" queue-time="0" rc-code="0" transition-key="83:19:0:fabf9728-6e76-4e5a-a423-6076f508dfbc" transition-magic="0:0;83:19:0:fabf9728-6e76-4e5a-a423-6076f508dfbc"/>
            <lrm_rsc_op call-id="62" crm-debug-origin="build_active_RAs" crm_feature_set="3.0.9" exec-time="92" id="nfs8-cluster_ip-1_monitor_15000" interval="15000" last-rc-change="1433516461" on_node="nfs8" op-digest="0956e8cc0cc5af9258989f049e793028" op-status="0" operation="monitor" operation_key="nfs8-cluster_ip-1_monitor_15000" queue-time="0" rc-code="0" transition-key="84:19:0:fabf9728-6e76-4e5a-a423-6076f508dfbc" transition-magic="0:0;84:19:0:fabf9728-6e76-4e5a-a423-6076f508dfbc"/>
          </lrm_resource>
        </lrm_resources>
      </lrm>
    </node_state>
  </status>
</cib>

Call failed: Update does not conform to the configured schema

Error: Resource '10.70.44.96-cluster_ip-1' does not exist
Error: Resource '10.70.44.96-trigger_ip-1' does not exist
Error: Resource '10.70.44.96-cluster_ip-1' does not exist
Adding nfs5-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs5-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs6-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs6-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs7-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs7-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs8-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs8-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Error: unable to create resource/fence device 'nfs5-cluster_ip-1', 'nfs5-cluster_ip-1' already exists on this system
Error: unable to create resource/fence device 'nfs5-trigger_ip-1', 'nfs5-trigger_ip-1' already exists on this system
Adding nfs5-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs5-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Error: unable to create resource/fence device 'nfs6-cluster_ip-1', 'nfs6-cluster_ip-1' already exists on this system
Error: unable to create resource/fence device 'nfs6-trigger_ip-1', 'nfs6-trigger_ip-1' already exists on this system
Adding nfs6-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs6-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Error: unable to create resource/fence device 'nfs7-cluster_ip-1', 'nfs7-cluster_ip-1' already exists on this system
Error: unable to create resource/fence device 'nfs7-trigger_ip-1', 'nfs7-trigger_ip-1' already exists on this system
Adding nfs7-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs7-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Error: unable to create resource/fence device 'nfs8-cluster_ip-1', 'nfs8-cluster_ip-1' already exists on this system
Error: unable to create resource/fence device 'nfs8-trigger_ip-1', 'nfs8-trigger_ip-1' already exists on this system
Adding nfs8-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs8-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Error: unable to create resource/fence device 'nfs9-cluster_ip-1', 'nfs9-cluster_ip-1' already exists on this system
Error: unable to create resource/fence device 'nfs9-trigger_ip-1', 'nfs9-trigger_ip-1' already exists on this system
Adding nfs9-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs9-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
CIB updated
ln: creating symbolic link `/var/run/gluster/shared_storage/nfs-ganesha/nfs9/nfs/ganesha/nfs5/ganesha': File exists
ln: creating symbolic link `/var/run/gluster/shared_storage/nfs-ganesha/nfs9/nfs/statd/nfs5/statd': File exists
ln: creating symbolic link `/var/run/gluster/shared_storage/nfs-ganesha/nfs9/nfs/ganesha/nfs6/ganesha': File exists
ln: creating symbolic link `/var/run/gluster/shared_storage/nfs-ganesha/nfs9/nfs/statd/nfs6/statd': File exists
ln: creating symbolic link `/var/run/gluster/shared_storage/nfs-ganesha/nfs9/nfs/ganesha/nfs7/ganesha': File exists
ln: creating symbolic link `/var/run/gluster/shared_storage/nfs-ganesha/nfs9/nfs/statd/nfs7/statd': File exists
ln: creating symbolic link `/var/run/gluster/shared_storage/nfs-ganesha/nfs9/nfs/ganesha/nfs8/ganesha': File exists
ln: creating symbolic link `/var/run/gluster/shared_storage/nfs-ganesha/nfs9/nfs/statd/nfs8/statd': File exists
/etc/ganesha/ganesha.conf: line 1: NFS_Core_Param: command not found
/etc/ganesha/ganesha.conf: line 3: Rquota_Port: command not found
/etc/ganesha.conf: No such file or directory
Killed by signal 1.
Connection to nfs5 closed.
/etc/ganesha/exports: No such file or directory
Killed by signal 1.
Connection to nfs5 closed.
Starting ganesha.nfsd: 
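
The "NFS_Core_Param: command not found" and "Rquota_Port: command not found" lines above have the format bash prints when a file is read as a shell script, so it looks as if ganesha.conf was interpreted by the shell at some point. How that happens inside ganesha-ha.sh is an assumption on my part. The sketch below only reproduces the symptom with a hypothetical stand-in config file and shows one way to read a value without executing the file; conf_get is a made-up helper name and the port value is a placeholder.

```shell
#!/bin/bash
# Sketch only: this file is a hypothetical stand-in for ganesha.conf.
conf=$(mktemp)
trap 'rm -f "$conf"' EXIT

cat > "$conf" <<'EOF'
NFS_Core_Param {
        Rquota_Port = 4501;
}
EOF

# Sourcing the file makes bash treat each line as a command to run,
# which is what produces the "command not found" messages.
source_errors=$(bash -c '. "$1"' _ "$conf" 2>&1) || true
printf '%s\n' "$source_errors"

# conf_get KEY FILE: print the value of a "KEY = value;" line without
# executing the file. Hypothetical helper, not part of ganesha-ha.sh.
conf_get() {
    awk -v key="$1" '
        $1 == key && $2 == "=" { v = $3; sub(/;$/, "", v); print v; exit }
    ' "$2"
}

conf_get Rquota_Port "$conf"
```

Reading the file as data avoids the errors regardless of what the config block contains.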



Expected results:
The command in step 5 should add the new node to the cluster.

Additional info:
If the expectation is that the configuration has to be updated manually on all nodes, that is not correct either. Since we provide a command-line tool for adding a node, that tool should take care of everything.

Comment 2 Saurabh 2015-06-05 10:53:30 UTC
Output of pcs status:

Cluster name: new-ganesha
Last updated: Fri Jun  5 21:46:24 2015
Last change: Fri Jun  5 21:28:14 2015
Stack: cman
Current DC: nfs5 - partition with quorum
Version: 1.1.11-97629de
4 Nodes configured
18 Resources configured


Online: [ nfs5 nfs6 nfs7 nfs8 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ nfs5 nfs6 nfs7 nfs8 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ nfs5 nfs6 nfs7 nfs8 ]
 nfs9-cluster_ip-1	(ocf::heartbeat:IPaddr):	FAILED (unmanaged) [ nfs5 nfs6 nfs7 nfs8 ]
 nfs9-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs5 
 nfs5-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs5 
 nfs5-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs5 
 nfs6-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs6 
 nfs6-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs6 
 nfs7-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs7 
 nfs7-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs7 
 nfs8-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs8 
 nfs8-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs8 

Failed actions:
    nfs9-cluster_ip-1_stop_0 on nfs5 'not configured' (6): call=85, status=complete, last-rc-change='Fri Jun  5 21:28:16 2015', queued=0ms, exec=49ms
    nfs9-cluster_ip-1_stop_0 on nfs5 'not configured' (6): call=85, status=complete, last-rc-change='Fri Jun  5 21:28:16 2015', queued=0ms, exec=49ms
    nfs9-cluster_ip-1_stop_0 on nfs6 'not configured' (6): call=81, status=complete, last-rc-change='Fri Jun  5 21:28:17 2015', queued=0ms, exec=88ms
    nfs9-cluster_ip-1_stop_0 on nfs6 'not configured' (6): call=81, status=complete, last-rc-change='Fri Jun  5 21:28:17 2015', queued=0ms, exec=88ms
    nfs9-cluster_ip-1_stop_0 on nfs7 'not configured' (6): call=81, status=complete, last-rc-change='Fri Jun  5 21:28:16 2015', queued=0ms, exec=63ms
    nfs9-cluster_ip-1_stop_0 on nfs7 'not configured' (6): call=81, status=complete, last-rc-change='Fri Jun  5 21:28:16 2015', queued=0ms, exec=63ms
    nfs9-cluster_ip-1_stop_0 on nfs8 'not configured' (6): call=81, status=complete, last-rc-change='Fri Jun  5 21:28:16 2015', queued=0ms, exec=91ms
    nfs9-cluster_ip-1_stop_0 on nfs8 'not configured' (6): call=81, status=complete, last-rc-change='Fri Jun  5 21:28:16 2015', queued=0ms, exec=91ms
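
For anyone re-testing, a quick way to spot the failed resource in output like the above is to filter the resource list on its state column. This is a minimal sketch, assuming the column layout shown in this comment; failed_resources is a hypothetical helper name, not a pcs subcommand.

```shell
#!/bin/bash
# failed_resources: read `pcs status` text on stdin and print the name of
# every primitive resource whose state column is FAILED.
# Hypothetical helper for illustration; it is not part of pcs.
failed_resources() {
    awk '$2 ~ /^\(ocf::/ && $3 == "FAILED" { print $1 }'
}

# Sample lines copied from the status output above.
sample='
 nfs9-cluster_ip-1 (ocf::heartbeat:IPaddr): FAILED (unmanaged) [ nfs5 nfs6 nfs7 nfs8 ]
 nfs9-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs5
 nfs5-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs5
'

printf '%s\n' "$sample" | failed_resources
```

On a live cluster the same filter can be fed directly: pcs status | failed_resources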

Comment 3 Saurabh 2015-06-05 10:56:08 UTC
Created attachment 1035167 [details]
sosreport of nfs5

Comment 4 Saurabh 2015-06-05 10:57:04 UTC
Created attachment 1035168 [details]
sosreport of nfs9

Comment 6 Meghana 2015-06-19 05:59:02 UTC
A patch for this has been posted upstream and is under review.

Comment 8 Saurabh 2015-07-05 10:55:29 UTC
Based on my testing, I am moving this BZ to VERIFIED.

Here are the logs:
[root@nfs11 ~]# time bash /usr/libexec/ganesha/ganesha-ha.sh --add /etc/ganesha nfs16 10.70.44.96
ganesha.conf                                                                                                        100% 1329     1.3KB/s   00:00    
tmp.CBWq1Awsbc                                                                                                      100% 1329     1.3KB/s   00:00    
Starting ganesha.nfsd: [  OK  ]
nfs11: Corosync updated
nfs12: Corosync updated
nfs13: Corosync updated
nfs14: Corosync updated
nfs16: Updated cluster.conf...
Adding nfs_start-nfs16 nfs-mon-clone (kind: Mandatory) (Options: first-action=start then-action=start)
CIB updated
nfs16: Starting Cluster...
Removing Constraint - location-nfs_start-nfs16-nfs16-INFINITY
Removing Constraint - order-nfs_start-nfs16-nfs-mon-clone-mandatory
Attempting to stop: nfs_start-nfs16...Stopped
Deleting Resource - nfs_start-nfs16
Removing Constraint - colocation-nfs11-cluster_ip-1-nfs11-trigger_ip-1-INFINITY
Removing Constraint - location-nfs11-cluster_ip-1
Removing Constraint - location-nfs11-cluster_ip-1-nfs12-1000
Removing Constraint - location-nfs11-cluster_ip-1-nfs13-2000
Removing Constraint - location-nfs11-cluster_ip-1-nfs14-3000
Removing Constraint - location-nfs11-cluster_ip-1-nfs11-4000
Removing Constraint - order-nfs-grace-clone-nfs11-cluster_ip-1-mandatory
Deleting Resource - nfs11-cluster_ip-1
Removing Constraint - order-nfs11-trigger_ip-1-nfs-grace-clone-mandatory
Deleting Resource - nfs11-trigger_ip-1
Removing Constraint - colocation-nfs12-cluster_ip-1-nfs12-trigger_ip-1-INFINITY
Removing Constraint - location-nfs12-cluster_ip-1
Removing Constraint - location-nfs12-cluster_ip-1-nfs13-1000
Removing Constraint - location-nfs12-cluster_ip-1-nfs14-2000
Removing Constraint - location-nfs12-cluster_ip-1-nfs11-3000
Removing Constraint - location-nfs12-cluster_ip-1-nfs12-4000
Removing Constraint - order-nfs-grace-clone-nfs12-cluster_ip-1-mandatory
Deleting Resource - nfs12-cluster_ip-1
Removing Constraint - order-nfs12-trigger_ip-1-nfs-grace-clone-mandatory
Deleting Resource - nfs12-trigger_ip-1
Removing Constraint - colocation-nfs13-cluster_ip-1-nfs13-trigger_ip-1-INFINITY
Removing Constraint - location-nfs13-cluster_ip-1
Removing Constraint - location-nfs13-cluster_ip-1-nfs14-1000
Removing Constraint - location-nfs13-cluster_ip-1-nfs11-2000
Removing Constraint - location-nfs13-cluster_ip-1-nfs12-3000
Removing Constraint - location-nfs13-cluster_ip-1-nfs13-4000
Removing Constraint - order-nfs-grace-clone-nfs13-cluster_ip-1-mandatory
Deleting Resource - nfs13-cluster_ip-1
Removing Constraint - order-nfs13-trigger_ip-1-nfs-grace-clone-mandatory
Deleting Resource - nfs13-trigger_ip-1
Removing Constraint - colocation-nfs14-cluster_ip-1-nfs14-trigger_ip-1-INFINITY
Removing Constraint - location-nfs14-cluster_ip-1
Removing Constraint - location-nfs14-cluster_ip-1-nfs11-1000
Removing Constraint - location-nfs14-cluster_ip-1-nfs12-2000
Removing Constraint - location-nfs14-cluster_ip-1-nfs13-3000
Removing Constraint - location-nfs14-cluster_ip-1-nfs14-4000
Removing Constraint - order-nfs-grace-clone-nfs14-cluster_ip-1-mandatory
Deleting Resource - nfs14-cluster_ip-1
Removing Constraint - order-nfs14-trigger_ip-1-nfs-grace-clone-mandatory
Deleting Resource - nfs14-trigger_ip-1
Adding nfs11-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs11-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs12-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs12-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs13-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs13-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs14-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs14-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs16-trigger_ip-1 nfs-grace-clone (kind: Mandatory) (Options: first-action=start then-action=start)
Adding nfs-grace-clone nfs16-cluster_ip-1 (kind: Mandatory) (Options: first-action=start then-action=start)
CIB updated
ganesha-ha.conf                                                                                                     100%  922     0.9KB/s   00:00    
ganesha-ha.conf                                                                                                     100%  922     0.9KB/s   00:00    
ganesha-ha.conf                                                                                                     100%  922     0.9KB/s   00:00    
ganesha-ha.conf                                                                                                     100%  922     0.9KB/s   00:00    
ganesha-ha.conf                                                                                                     100%  922     0.9KB/s   00:00    

real	0m58.806s
user	0m18.423s
sys	0m5.517s
[root@nfs11 ~]# pcs status
Cluster name: nozomer
Last updated: Sun Jul  5 16:21:03 2015
Last change: Sun Jul  5 16:18:39 2015
Stack: cman
Current DC: nfs11 - partition with quorum
Version: 1.1.11-97629de
5 Nodes configured
20 Resources configured


Online: [ nfs11 nfs12 nfs13 nfs14 nfs16 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ nfs11 nfs12 nfs13 nfs14 nfs16 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ nfs11 nfs12 nfs13 nfs14 nfs16 ]
 nfs11-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs11 
 nfs11-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs11 
 nfs12-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs12 
 nfs12-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs12 
 nfs13-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs13 
 nfs13-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs13 
 nfs14-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs14 
 nfs14-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs14 
 nfs16-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started nfs16 
 nfs16-trigger_ip-1	(ocf::heartbeat:Dummy):	Started nfs16
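
The check I did by eye on the status above (every online node runs its own cluster_ip resource) can be sketched as a small script. This assumes the pcs status layout shown in this comment and the <node>-cluster_ip-1 naming seen in the logs; check_vips is a hypothetical helper name.

```shell
#!/bin/bash
# check_vips: read `pcs status` text on stdin and report every online node
# whose own <node>-cluster_ip-1 resource is not started on that node.
# Exits non-zero if any such node is found. Hypothetical helper.
check_vips() {
    awk '
        # Collect node names from a line such as: Online: [ nfs11 nfs12 ]
        $1 == "Online:" { for (i = 3; i < NF; i++) nodes[$i] = 1 }
        # Record a node whose own cluster_ip resource is started on it.
        $1 ~ /-cluster_ip-1$/ && $3 == "Started" {
            owner = $1
            sub(/-cluster_ip-1$/, "", owner)
            if (owner == $4) ok[owner] = 1
        }
        END {
            rc = 0
            for (n in nodes) {
                if (!(n in ok)) { print "VIP not on home node: " n; rc = 1 }
            }
            exit rc
        }
    '
}

# Sample lines copied from the status output above.
status_text='Online: [ nfs11 nfs12 nfs13 nfs14 nfs16 ]
 nfs11-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs11
 nfs12-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs12
 nfs13-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs13
 nfs14-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs14
 nfs16-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs16'

printf '%s\n' "$status_text" | check_vips && echo "all VIPs on their home nodes"
```

A VIP that has failed over to another node is also reported, so this is only meaningful right after the add, when no failover is expected.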

Comment 9 errata-xmlrpc 2015-07-29 04:58:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html