Bug 1436141 - [RFE] Extend Capability of Gluster NFS process Failover with CTDB
Summary: [RFE] Extend Capability of Gluster NFS process Failover with CTDB
Keywords:
Status: CLOSED DUPLICATE of bug 1371178
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: ctdb
Version: rhgs-3.1
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Michael Adam
QA Contact: Vivek Das
URL:
Whiteboard:
Depends On:
Blocks: RHGS-3.4-GSS-proposed-tracker
 
Reported: 2017-03-27 09:50 UTC by Abhishek Kumar
Modified: 2018-11-21 10:46 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-21 10:46:54 UTC
Embargoed:




Links:
Red Hat Bugzilla 1370090 (urgent, CLOSED): [GSS] - Unable to Failover Gluster NFS with CTDB (last updated 2022-03-13 14:05:48 UTC)

Description Abhishek Kumar 2017-03-27 09:50:14 UTC
Description of problem:
Extend the capability of Gluster NFS (gnfs) process failover with CTDB: CTDB should monitor the gnfs process and fail over the public IP when it dies (see Expected results below).

Version-Release number of selected component (if applicable):


How reproducible:

Every time


Here are the configuration steps:

RHEL 6 (Gluster NFS with CTDB):
 
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.8 (Santiago)
 
# cat /etc/redhat-storage-release
Red Hat Gluster Storage Server 3.1 Update 3
 
# rpm -qa ctdb
ctdb-4.4.3-8.el6rhs.x86_64
 
Tested with two scenarios:

1). Without the CTDB_MANAGES_NFS=yes and NFS_HOSTNAME="nfs_ctdb" parameters in the /etc/sysconfig/nfs file
 
i. Of the two nodes, only one was HEALTHY (OK), as reported by 'ctdb status' (see the sketch after this list); the other was UNHEALTHY.

ii. Even though one node was UNHEALTHY, failover took place without any problem when a node went down.

iii. When failover took place, the status of the node changed to HEALTHY (OK).

iv. When the NFS process was killed manually, failover did not take place.
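
(Node state in the items above was read from 'ctdb status'. A rough illustration of the output for this scenario is below; the node IPs are placeholders, not values from this setup:)

# ctdb status
Number of nodes:2
pnn:0 192.168.100.11     OK (THIS NODE)
pnn:1 192.168.100.12     UNHEALTHY
[... generation/recovery lines trimmed ...]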

-------------------------------------------------------------------------------------------------------------
 
2). With the CTDB_MANAGES_NFS=yes and NFS_HOSTNAME="nfs_ctdb" parameters added in /etc/sysconfig/nfs (see the sketch after this list)
 
i. Of the two nodes, only one was HEALTHY (OK); the other was UNHEALTHY.

ii. Even though one node was UNHEALTHY, failover took place without any problem when a node went down.

iii. When failover took place, the status of the node changed to HEALTHY (OK).

iv. When the NFS process was killed manually, failover did not take place.
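
(For clarity, the two settings added in this scenario look roughly like this in /etc/sysconfig/nfs; the hostname value is the one used in this test and would normally be the public name served by CTDB:)

# grep -E 'CTDB_MANAGES_NFS|NFS_HOSTNAME' /etc/sysconfig/nfs
CTDB_MANAGES_NFS=yes
NFS_HOSTNAME="nfs_ctdb"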

==============================================

RHEL 7 (Gluster NFS with CTDB):
 
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.2 (Maipo)
 
# cat /etc/redhat-storage-release
Red Hat Gluster Storage Server 3.1 Update 3

# rpm -qa ctdb
ctdb-4.4.3-8.el7rhgs.x86_64
 
Tested with two scenarios:

1). Without the CTDB_MANAGES_NFS=yes parameter in /etc/sysconfig/nfs

i. All nodes were HEALTHY. The public IP was running on one of the nodes.

ii. On the client side, the volume was mounted with NFS (vers=3), as in the mount sketch after this list.

iii. When the NFS process was killed, the client mount went stale.

Even after waiting, it did not recover: since the Gluster NFS service is not monitored by CTDB, no failover takes place.

iv. Even though the NFS service was killed on one of the nodes, ctdb status still showed all nodes as HEALTHY.

v. The client started working again only after restarting the glusterd daemon or rebooting the node.
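
(The client mount in step ii was a plain NFSv3 mount of the volume, presumably through the CTDB public IP; a minimal sketch with placeholder IP, volume and mount point names:)

# mount -t nfs -o vers=3 <public-ip>:/<volname> /mnt/gnfs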
 
======================================================
 
2). With the CTDB_MANAGES_NFS=yes parameter added in /etc/sysconfig/nfs
 
i. All nodes were HEALTHY. The public IP was running on one of the nodes.

ii. On the client side, the volume was mounted with NFS (vers=3).

iii. When the NFS process was killed, the client went into a hung state.

iv. Failover of the public IP took place within approximately 30 seconds; after that, the client worked fine.
 
Version-Release number of selected component (if applicable):

RHGS 3.1.3
ctdb-4.4.3-8.el6rhs.x86_64


Actual results:

CTDB doesn't manage gnfs.

Expected results:

A CTDB gluster-nfs callout that only monitors gnfs (marking the node unhealthy when the process dies, so the public IP fails over) but does not start/stop gnfs.
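
A minimal sketch of what such a monitor-only callout could look like, assuming a legacy-style CTDB event script; the script name/path, the pgrep pattern used to find the gnfs process, and the exact event handling are illustrative assumptions, not part of this report:

#!/bin/sh
# Illustrative /etc/ctdb/events.d/60.gnfs (hypothetical name/path).
# Monitor-only: never starts or stops gnfs; it only reports gnfs health so
# that CTDB marks the node UNHEALTHY and moves the public IP when the
# gluster NFS process dies. The gnfs lifecycle stays with glusterd.

case "$1" in
    monitor)
        # The process-match pattern is an assumption; adjust it to however
        # the gluster NFS process appears on the system.
        if ! pgrep -f 'glusterfs.*nfs' >/dev/null 2>&1; then
            echo "ERROR: gluster NFS process is not running"
            exit 1   # non-zero from "monitor" flags the node unhealthy
        fi
        ;;
    startup|shutdown)
        # Deliberately empty: no start/stop of gnfs from CTDB.
        ;;
esac

exit 0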

Additional info:

Comment 6 Anoop C S 2018-11-21 10:46:54 UTC

*** This bug has been marked as a duplicate of bug 1371178 ***

