1344906 – [nfs-ganesha]: Updates in 3.1.3 administration guide

Summary: [nfs-ganesha]: Updates in 3.1.3 administration guide

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	doc-Administration_Guide
Sub Component:
Version:	rhgs-3.1
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	RHGS 3.1.3
Assignee:	Bhavana
QA Contact:	storage-qa-internal@redhat.com
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1311847

Reported:	2016-06-12 11:56 UTC by Shashank Raj
Modified:	2016-11-08 03:53 UTC
CC List:	14 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	If this bug requires documentation, please select an appropriate Doc Type value.
Clone Of:
Environment:
Last Closed:	2016-06-29 14:22:03 UTC
Embargoed:
Dependent Products:

Description Shashank Raj 2016-06-12 11:56:53 UTC

Document URL:

http://jenkinscat.gsslab.pnq.redhat.com:8080/view/Gluster/job/doc-Red_Hat_Gluster_Storage-3.1.3-Administration_Guide%20%28html-single%29/lastBuild/artifact/tmp/en-US/html-single/index.html#sect-NFS_Ganesha

Section Number and Name:

Multiple sections

Describe the issue:

The following changes are required for the 3.1.3 documentation:

1. Under 7.2.4. NFS-Ganesha, the following line needs to be modified:

Current: Red Hat Gluster Storage is supported with the community’s V2.2-stable release of NFS-Ganesha

Update: Red Hat Gluster Storage is supported with the community’s V2.3.1-stable release of NFS-Ganesha

2. Under important section:

Current: You must to ensure to enable the NFS firewall service

Update: You must ensure to enable the NFS firewall service

3. Changes in the support matrix (if any):

Current: The following table contains the feature matrix of the NFS support on Red Hat Gluster Storage 3.1:

Update: a) The following table contains the feature matrix of the NFS support on Red Hat Gluster Storage 3.1.3:

b) Update the changes (if any) in support matrix

4. Dynamic exports to volume:

Current: Note: Modifying an export in place is currently not supported.

Update: This doesn't hold true now

5. Under section, 7.2.4.3.1.Prerequisites to run NFS-Ganesha

Current: For Red Hat Enterprise Linux 6, install pacemaker using the following command
# yum install pacemaker

Update: Remove this sentence; the accompanying note can also be removed.

6. Under section, 7.2.4.6.NFS-Ganesha Service Downtime

Current: So the maximum total time taken to failover the VIP is (c=a+b) approximately 17 - 22 seconds. In other words, the time taken for NFS clients to detect server reboot or resume I/O is 17 - 22 seconds.

Update: The IO on client resumes after the grace period, so is the 17-22 seconds valid?

Suggestions for improvement:

Additional information:

Comment 2 Shashank Raj 2016-06-12 11:58:40 UTC

@Soumya,

Please confirm the points mentioned in description and update accordingly.

Comment 3 Soumya Koduri 2016-06-12 17:08:20 UTC

Thanks for the suggestions. A few minor corrections:

>>> 4. Dynamic exports to volume:

>>> Current: Note: Modifying an export in place is currently not supported. 

>>> Update: This doesn't hold true now

This still holds true. To modify export options, we unexport and re-export them; modifying an export in place is not yet supported.


>>> 6. Under section,  7.2.4.6.NFS-Ganesha Service Downtime

>>> Current: So the maximum total time taken to failover the VIP is (c=a+b) approximately 17 - 22 seconds. In other words, the time taken for NFS clients to detect server reboot or resume I/O is 17 - 22 seconds. 

>>> Update: The IO on client resumes after the grace period, so is the 17-22 seconds valid? 

Not all operations are blocked during the grace period. Apart from the fops mentioned in the next section, "7.2.4.6.1. Modifying the Fail-over Time", all other operations can resume immediately after failover. However, I think we need to cross-check whether the interval of the monitor/grace resource agents has changed, which may affect the failover time.

Comment 4 Shashank Raj 2016-06-13 07:06:32 UTC

Kaleb,

Can you please check point 6 as discussed in comment 3 and confirm whether any modifications are required there?

Comment 5 Kaleb KEITHLEY 2016-06-13 11:16:07 UTC

I'm not sure what a and b are in the equation c=a+b.

1) The ganesha_mon resource agent runs every 20 seconds. If ganesha.nfsd dies (crashes, oomkill, admin kill), the maximum time to detect it is 20 seconds, plus whatever time pacemaker needs to effect the fail-over.

2) If the whole node dies (including network failure), then it's whatever time pacemaker needs to detect that the node is gone, plus the time to effect the fail-over. Empirically we see this to be typically < 5 seconds.

So _max_ fail-over time is approx 21-22 seconds, and average times are typically less.

Comment 6 Shashank Raj 2016-06-14 06:50:50 UTC

So, the final changes will be:

1. Under 7.2.4. NFS-Ganesha, the following line needs to be modified:

Current: Red Hat Gluster Storage is supported with the community’s V2.2-stable release of NFS-Ganesha

Update: Red Hat Gluster Storage is supported with the community’s V2.3.1-stable release of NFS-Ganesha

2. Under important section:

Current: You must to ensure to enable the NFS firewall service

Update: You must ensure to enable the NFS firewall service

3. Changes in the support matrix (if any):

Current: The following table contains the feature matrix of the NFS support on Red Hat Gluster Storage 3.1: 

Update: The following table contains the feature matrix of the NFS support on Red Hat Gluster Storage 3.1.3: 

4. Under section, 7.2.4.3.1.Prerequisites to run NFS-Ganesha

Current: For Red Hat Enterprise Linux 6, install pacemaker using the following command 
		# yum install pacemaker

Update: Remove this sentence; the accompanying note can also be removed.

5. Under section,  7.2.4.6.NFS-Ganesha Service Downtime

Update:

The following list describes how the time taken for NFS clients to detect a server reboot or resume I/O is calculated.

1) If ganesha.nfsd dies (crashes, oomkill, admin kill), the maximum time to detect it and put the ganesha cluster into grace is 20 seconds, plus whatever time pacemaker needs to effect the fail-over.

Note:

 The time taken to detect that the service is down can be changed by running the following commands on all the nodes:

# pcs resource op remove nfs-mon monitor
# pcs resource op add nfs-mon monitor interval=<interval_period_value>

2) If the whole node dies (including network failure), then the down time is whatever time pacemaker needs to detect that the node has gone, plus the time to put the cluster into grace, plus the time to effect the fail-over, which is also ~20 seconds.

So _max_ fail-over time is approx 20-22 seconds, and average times are typically less.

In other words, the time taken for NFS clients to detect server reboot or resume I/O is 20 - 22 seconds. 
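
As a rough sketch of the arithmetic above, the worst case can be modelled as the monitor interval plus the time pacemaker needs to effect the fail-over. The helper names below are hypothetical, and the 2-second pacemaker figure and the 10s interval are illustrative assumptions rather than measured or recommended values. The second helper only prints the pcs commands from the note, so that they can be reviewed before being run as root on each node:

```shell
#!/bin/sh

# Hypothetical helper: worst-case fail-over estimate, in seconds.
#   $1 = monitor interval of the ganesha_mon resource agent (20 per this bug)
#   $2 = ASSUMED time pacemaker needs to effect the fail-over (illustrative)
estimate_max_failover() {
    echo $(( $1 + $2 ))
}

# Hypothetical helper: print (do not execute) the two pcs commands from the
# note above for a chosen monitor interval value.
#   $1 = interval value passed through unchanged, e.g. 10s (an assumption)
print_monitor_update() {
    echo "pcs resource op remove nfs-mon monitor"
    echo "pcs resource op add nfs-mon monitor interval=$1"
}

# 20-second monitor interval plus an assumed 2 seconds for pacemaker.
estimate_max_failover 20 2

# Show what would be run to change the monitor interval to 10s.
print_monitor_update 10s
```

With a 20-second monitor interval and an assumed 2 seconds for pacemaker, the estimate is 22 seconds, which is the upper end of the range quoted above. Lowering the monitor interval shortens the detection part of that sum only; the pacemaker part is unaffected.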


>>>>>>>> Plus, below section needs to be removed from 3.1.3 installation guide

>>>> 4.2. Installing NFS-Ganesha during an ISO Installation

remove the 3rd point

>>>> 4.3. Installing NFS-Ganesha using yum

 For Red Hat Enterprise Linux 6: Install Pacemaker and the glusterfs-ganesha package: 

The RHEL 6 point can be merged with the RHEL 7 point.

Comment 8 Bhavana 2016-06-14 11:25:23 UTC



The changes have been made to the admin guide:

http://jenkinscat.gsslab.pnq.redhat.com:8080/view/Gluster/job/doc-Red_Hat_Gluster_Storage-3.1.3-Administration_Guide%20%28html-single%29/lastBuild/artifact/tmp/en-US/html-single/index.html#sect-NFS_Ganesha

The changes have been made to the install guide:

http://jenkinscat.gsslab.pnq.redhat.com:8080/job/doc-Red_Hat_Gluster_Storage-3.1-Installation_Guide%20%28html-single%29/lastStableBuild/artifact/tmp/en-US/html-single/index.html#idm140335734315040


Let me know if this looks ok.

Comment 9 Soumya Koduri 2016-06-14 12:21:20 UTC

The changes look good to me.

Comment 11 Shashank Raj 2016-06-14 14:40:10 UTC

Verified this bug based on the content in doc links provided in comment 8. All the requested changes have been made and it looks good now.

Based on the above observation, marking this bug as Verified.
