1305207 – Description for taking "statedump" on GlusterFS native client confusing

Summary: Description for taking "statedump" on GlusterFS native client confusing

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	doc-Administration_Guide
Sub Component:
Version:	rhgs-3.0
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	low
Target Milestone:	---
Target Release:	RHGS 3.3.0
Assignee:	Laura Bailey
QA Contact:	SATHEESARAN
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1417154

Reported:	2016-02-06 05:36 UTC by Peter Portante
Modified:	2017-09-21 04:24 UTC
CC List:	17 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-09-21 04:24:24 UTC
Embargoed:
Dependent Products:

Description Peter Portante 2016-02-06 05:36:38 UTC

Following the PDF found here: https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/pdf/Administration_Guide/Red_Hat_Storage-3-Administration_Guide-en-US.pdf

The description of section 14.6 on pages 248 and 249 for how to get statedumps is confusing to follow for native client processes.

1. It might be helpful to use separate sections for server side state dumps and client side state dumps.  The beginning of the description for the client processes follows immediately after a "gluster volume info VOLNAME" command example with no clear delineation.

2. It might be helpful to describe *how* to get the PID of the client process.

3. There is a paragraph that mentions the "GlusterFS Management Daemon" which has a "kill -SIGUSR1 PID_of_the_glusterd_process" example, which is confusing because one might think this is now talking about the server side "glusterd" process, throwing into question the previous description about client processes.

4. The reference to "glusterd" in the name format is also confusing.

5. For the client side state dumps, it might be helpful to explain to the user that unless the directory /var/run/gluster/ exists, no state dump will be taken.
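
Points 2 and 5 can be sketched together as a small shell helper. This is only an illustration, not text from the guide: the function name take_statedump is hypothetical, and the suggestion that a FUSE client shows up as a process named "glusterfs" is an assumption that should be checked on the client in question.

```shell
# Hypothetical helper (not part of GlusterFS): ask one gluster process,
# identified by PID, to write a statedump.
take_statedump() {
    pid=$1
    # Directory the dump is expected to land in. This helper only makes sure
    # it exists; it does not change where the process writes its dump.
    dump_dir=${2:-/var/run/gluster}

    # Per point 5, no state dump is taken if the directory is missing.
    mkdir -p "$dump_dir" || return 1

    # SIGUSR1 is the signal the guide uses to request a statedump.
    kill -USR1 "$pid"
}

# Example usage on a client machine (PID 12345 is a placeholder):
#   pgrep -x glusterfs      # assumption: FUSE client processes are named "glusterfs"
#   take_statedump 12345
```

Creating /var/run/gluster normally requires root, so the helper would usually be run as root on the client.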

Comment 3 Atin Mukherjee 2016-07-14 04:08:49 UTC

Laura,

Can you track this for 3.2?

~Atin

Comment 13 Peter Portante 2017-05-24 02:57:09 UTC

Hi Laura, sorry, but I think somebody more knowledgeable in Gluster internals should answer these kinds of questions.

Comment 21 Poornima G 2017-07-12 07:49:27 UTC

Apologies for the delayed response. Isn't the heading "Process-based state dump" a little confusing, as the client-side statedump section and server-side statedump section both include the command mentioned in "Process-based state dump"?
I am not sure about having client side and server side as separate headings, as the command is not really different for client side and server side. The usage of the command is:
Usage: volume statedump <VOLNAME> [[nfs|quotad] [all|mem|iobuf|callpool|priv|fd|inode|history]... | [client <hostname:process-id>]]

i.e.
"kill -SIGUSR1 process_id" works for bricks (server side), FUSE clients (client side) and most other daemons on the server nodes. That said, we can also use the command "gluster volume statedump <VOLNAME>" to dump the state of all brick processes on all nodes/servers. However, the above command will not work for quotad, nfs and gfapi clients (which include Samba, NFS-Ganesha, QEMU etc.).

It's quite hard to categorise without overlapping. With the above data points, I think:
1. To take a statedump of any gluster process (exceptions include gfapi clients, quotad and nfs), use the following command:
kill -SIGUSR1 process_id
...
2. To take a statedump of all the bricks on all the nodes/servers in a single shot, use the command below:
gluster volume statedump <VOLNAME>

3. To take a statedump of any gfapi-based process...:
"# gluster volume statedump  VOLNAME client <host>:<pid>"
.....

4. For quotad and nfs, I am not very sure; I will add a needinfo on the quota and nfs maintainers.
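
The categorisation in points 1 to 3 can be sketched as a small dispatcher that prints the command for a given kind of target. This is only an illustration of the proposal as written here: the function name statedump_cmd and the target labels (process, bricks, gfapi) are hypothetical, the command strings are the ones quoted in this comment, and quotad and nfs are left out because point 4 is still open.

```shell
# Hypothetical helper: print (but do not run) the statedump command that
# matches the categories proposed in points 1 to 3.
statedump_cmd() {
    kind=$1
    case "$kind" in
        # Point 1: any single gluster process, addressed by PID.
        process) printf 'kill -SIGUSR1 %s\n' "$2" ;;
        # Point 2: all bricks of a volume on all nodes, in one shot.
        bricks)  printf 'gluster volume statedump %s\n' "$2" ;;
        # Point 3: a gfapi-based client, addressed as <host>:<pid>.
        gfapi)   printf 'gluster volume statedump %s client %s:%s\n' "$2" "$3" "$4" ;;
        *)       printf 'unknown target kind: %s\n' "$kind" >&2
                 return 1 ;;
    esac
}

# Example usage (volume name, host and PIDs are placeholders):
#   statedump_cmd process 12345
#   statedump_cmd bricks myvol
#   statedump_cmd gfapi myvol client1.example.com 6789
```

Printing the command rather than executing it keeps the sketch safe to try on a machine without gluster installed.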

Your thoughts? I am actually OK with the current model as well, just thinking aloud.



- Some places mention "kill -SIGUSR1 process_id" and some mention "kill -USR1 process_id". Both are the same and both are right, but I think there should be uniformity.

- Also the command in 19.8.3:
"# gluster volume statedump  VOLNAME <host>:<pid>"
should be
"# gluster volume statedump  VOLNAME client <host>:<pid>"

Comment 22 Laura Bailey 2017-07-13 04:47:20 UTC

That's okay, hope you had a good holiday! :)

The problem we were trying to solve here is that customers were coming to this section, which had no headings, and seeing client info mixed in with server info. So whatever we do here, I think we need to be really clear about server vs client execution, even if it is a very similar process.


Would it make sense to rework the chapter like this?

- Basic definition of state dump
- Configuring location of state dump file
- Performing a state dump
  - For a single process
  - For all server processes
  - For all client processes

Does this make a bit more sense, Poornima?


Other questions, to be sure I understand the usage:
1) Does this command work if you run it on a client machine 
   or do you need to include host:pid?
     # gluster volume statedump volname client
2) Can you confirm that nfs and quotad keywords are not useful for client-side state dump?

Comment 23 Sanoj Unnikrishnan 2017-07-13 06:42:59 UTC

Note that for quotad as well, we can trigger the state dump in both ways:
1) kill -SIGUSR1 <pid_of_quotad>
2) gluster volume statedump <volname> quotad


For some processes, like the self-heal daemon, we do not seem to have a gluster command wrapper for "kill -SIGUSR1 <pid>".

IMO, we can have a single section for the commands.
We can highlight the processes that can be dumped using the gluster command:
 > gluster volume statedump <VOLNAME>
 > gluster volume statedump <VOLNAME> nfs all
 > gluster volume statedump <VOLNAME> quotad  
 > gluster volume statedump <VOLNAME> client <host>:<pid> 
etc..

All these commands trigger a state dump by internally performing
kill -SIGUSR1 <pid>

And then we can specify that for some processes, like the self-heal daemon, there is no gluster command wrapper yet, and hence a state dump for such processes can be triggered explicitly by using:

kill -SIGUSR1 <pid>

Comment 26 Sanoj Unnikrishnan 2017-07-14 10:20:04 UTC

Hi Laura,
IMHO we do not even need a client and server separation.
We can have a single section named "taking a statedump".

We can have the client-specific state dump command along with the other examples.
That way we will not specify the kill command multiple times.


There is a mistake in 19.8.1:
> gluster volume statedump client hostname:pid
should be
> gluster volume statedump VOLNAME client hostname:pid

Comment 29 SATHEESARAN 2017-08-13 01:39:06 UTC

Verified the content.
The content that explains how to get the different statedumps is clear and no longer confuses the reader.

Comment 30 Laura Bailey 2017-08-29 04:10:42 UTC

Fixed in RHGS 3.3 documentation.
