Bug 1213352
Summary: | nfs-ganesha: HA issue, the iozone process is not moving ahead, once the nfs-ganesha is killed | |
---|---|---|---
Product: | [Community] GlusterFS | Reporter: | Saurabh <saujain>
Component: | ganesha-nfs | Assignee: | Kaleb KEITHLEY <kkeithle>
Status: | CLOSED CURRENTRELEASE | QA Contact: |
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | 3.7.0 | CC: | bugs, mzywusko, sankarshan
Target Milestone: | --- | Keywords: | Triaged
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2017-03-06 17:50:09 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
Saurabh
2015-04-20 11:12:28 UTC
Need the following information:
1. showmount -e VIP output
2. NFS-Ganesha logs
3. pcs status output

So I have four nodes, namely nfs[1,2,3,4]. nfs-ganesha came up only on nfs2 and nfs3, and presently I killed the nfs-ganesha process on nfs2, so I collected the showmount output from nfs3:

[root@nfs3 ~]# showmount -e 10.70.36.217
Export list for 10.70.36.217:
/vol0 (everyone)
[root@nfs3 ~]# showmount -e 10.70.36.218
Export list for 10.70.36.218:
/vol0 (everyone)
[root@nfs3 ~]# showmount -e 10.70.36.219
Export list for 10.70.36.219:
/vol0 (everyone)
[root@nfs3 ~]# showmount -e 10.70.36.220
Export list for 10.70.36.220:
/vol0 (everyone)

node 1,
#####################################
[root@nfs1 ~]# ps -eaf | grep nfs
root 5338 6760 0 14:57 pts/0 00:00:00 grep nfs
[root@nfs1 ~]# pcs status
Cluster name: ganesha-ha-2
Last updated: Mon Apr 20 14:58:03 2015
Last change: Mon Apr 20 12:28:04 2015
Stack: cman
Current DC: nfs1 - partition with quorum
Version: 1.1.11-97629de
4 Nodes configured
22 Resources configured

Online: [ nfs1 nfs2 nfs3 nfs4 ]

Full list of resources:

 Clone Set: nfs_start-clone [nfs_start]
     nfs_start (ocf::heartbeat:ganesha_nfsd): FAILED nfs3 (unmanaged)
     nfs_start (ocf::heartbeat:ganesha_nfsd): FAILED nfs1 (unmanaged)
     nfs_start (ocf::heartbeat:ganesha_nfsd): FAILED nfs2 (unmanaged)
     Stopped: [ nfs4 ]
 nfs1-dead_ip-1 (ocf::heartbeat:Dummy): Started nfs4
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 nfs1-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs3
 nfs1-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs3
 nfs2-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs2
 nfs2-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs2
 nfs3-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs3
 nfs3-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs3
 nfs4-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs3
 nfs4-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs3
 nfs4-dead_ip-1 (ocf::heartbeat:Dummy): Started nfs1

Failed actions:
    nfs_start_stop_0 on nfs3 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms
    nfs_start_stop_0 on nfs3 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms
    nfs_start_stop_0 on nfs1 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40001ms
    nfs_start_stop_0 on nfs1 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40001ms
    nfs_start_stop_0 on nfs2 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms
    nfs_start_stop_0 on nfs2 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms

node 2,
##########################################
[root@nfs2 ~]# ps -eaf | grep nfs
root 5260 16826 0 14:58 pts/0 00:00:00 grep nfs
root 6216 1 0 12:27 ? 00:00:05 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -p /var/run/ganesha.nfsd.pid
[root@nfs2 ~]# pcs status
Cluster name: ganesha-ha-2
Last updated: Mon Apr 20 14:58:49 2015
Last change: Mon Apr 20 12:28:04 2015
Stack: cman
Current DC: nfs1 - partition with quorum
Version: 1.1.11-97629de
4 Nodes configured
22 Resources configured

Online: [ nfs1 nfs2 nfs3 nfs4 ]

Full list of resources:

 Clone Set: nfs_start-clone [nfs_start]
     nfs_start (ocf::heartbeat:ganesha_nfsd): FAILED nfs3 (unmanaged)
     nfs_start (ocf::heartbeat:ganesha_nfsd): FAILED nfs1 (unmanaged)
     nfs_start (ocf::heartbeat:ganesha_nfsd): FAILED nfs2 (unmanaged)
     Stopped: [ nfs4 ]
 nfs1-dead_ip-1 (ocf::heartbeat:Dummy): Started nfs4
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 nfs1-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs3
 nfs1-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs3
 nfs2-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs2
 nfs2-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs2
 nfs3-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs3
 nfs3-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs3
 nfs4-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs3
 nfs4-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs3
 nfs4-dead_ip-1 (ocf::heartbeat:Dummy): Started nfs1

Failed actions:
    nfs_start_stop_0 on nfs3 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms
    nfs_start_stop_0 on nfs3 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms
    nfs_start_stop_0 on nfs1 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40001ms
    nfs_start_stop_0 on nfs1 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40001ms
    nfs_start_stop_0 on nfs2 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms
    nfs_start_stop_0 on nfs2 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms

node 3,
#############################################
[root@nfs3 ~]# ps -eaf | grep nfs
root 20901 18085 0 14:59 pts/0 00:00:00 grep nfs
root 26369 1 0 12:27 ? 00:00:05 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT -p /var/run/ganesha.nfsd.pid
[root@nfs3 ~]# pcs status
Cluster name: ganesha-ha-2
Last updated: Mon Apr 20 14:59:22 2015
Last change: Mon Apr 20 12:28:04 2015
Stack: cman
Current DC: nfs1 - partition with quorum
Version: 1.1.11-97629de
4 Nodes configured
22 Resources configured

Online: [ nfs1 nfs2 nfs3 nfs4 ]

Full list of resources:

 Clone Set: nfs_start-clone [nfs_start]
     nfs_start (ocf::heartbeat:ganesha_nfsd): FAILED nfs3 (unmanaged)
     nfs_start (ocf::heartbeat:ganesha_nfsd): FAILED nfs1 (unmanaged)
     nfs_start (ocf::heartbeat:ganesha_nfsd): FAILED nfs2 (unmanaged)
     Stopped: [ nfs4 ]
 nfs1-dead_ip-1 (ocf::heartbeat:Dummy): Started nfs4
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 nfs1-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs3
 nfs1-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs3
 nfs2-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs2
 nfs2-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs2
 nfs3-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs3
 nfs3-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs3
 nfs4-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs3
 nfs4-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs3
 nfs4-dead_ip-1 (ocf::heartbeat:Dummy): Started nfs1

Failed actions:
    nfs_start_stop_0 on nfs3 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms
    nfs_start_stop_0 on nfs3 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms
    nfs_start_stop_0 on nfs1 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40001ms
    nfs_start_stop_0 on nfs1 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40001ms
    nfs_start_stop_0 on nfs2 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms
    nfs_start_stop_0 on nfs2 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms

node 4,
######################################
[root@nfs4 ~]# ps -eaf | grep nfs
root 16073 27004 0 04:12 pts/0 00:00:00 grep nfs
[root@nfs4 ~]# pcs status
Cluster name: ganesha-ha-2
Last updated: Mon Apr 20 04:13:00 2015
Last change: Mon Apr 20 01:41:11 2015
Stack: cman
Current DC: nfs1 - partition with quorum
Version: 1.1.11-97629de
4 Nodes configured
22 Resources configured

Online: [ nfs1 nfs2 nfs3 nfs4 ]

Full list of resources:

 Clone Set: nfs_start-clone [nfs_start]
     nfs_start (ocf::heartbeat:ganesha_nfsd): FAILED nfs3 (unmanaged)
     nfs_start (ocf::heartbeat:ganesha_nfsd): FAILED nfs1 (unmanaged)
     nfs_start (ocf::heartbeat:ganesha_nfsd): FAILED nfs2 (unmanaged)
     Stopped: [ nfs4 ]
 nfs1-dead_ip-1 (ocf::heartbeat:Dummy): Started nfs4
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ nfs1 nfs2 nfs3 nfs4 ]
 nfs1-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs3
 nfs1-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs3
 nfs2-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs2
 nfs2-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs2
 nfs3-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs3
 nfs3-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs3
 nfs4-cluster_ip-1 (ocf::heartbeat:IPaddr): Started nfs3
 nfs4-trigger_ip-1 (ocf::heartbeat:Dummy): Started nfs3
 nfs4-dead_ip-1 (ocf::heartbeat:Dummy): Started nfs1

Failed actions:
    nfs_start_stop_0 on nfs3 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms
    nfs_start_stop_0 on nfs3 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms
    nfs_start_stop_0 on nfs1 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40001ms
    nfs_start_stop_0 on nfs1 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40001ms
    nfs_start_stop_0 on nfs2 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms
    nfs_start_stop_0 on nfs2 'unknown error' (1): call=20, status=Timed Out, last-rc-change='Mon Apr 20 12:27:09 2015', queued=0ms, exec=40002ms

Created attachment 1016358 [details]
nfs-ganesha logs from nfs2
Created attachment 1016359 [details]
nfs-ganesha logs from nfs3
Saurabh, can you confirm that you used the VIP of the server to mount the volume on the client? Without it, failover will invariably fail.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days