Bug 1305207 - Description for taking "statedump" on GlusterFS native client confusing
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: doc-Administration_Guide
Hardware: x86_64 Linux
Priority: unspecified, Severity: low
Target Release: RHGS 3.3.0
Assigned To: Laura Bailey
Depends On:
Blocks: 1417154
Reported: 2016-02-06 00:36 EST by Peter Portante
Modified: 2017-09-21 00:24 EDT
Doc Type: Bug Fix
Last Closed: 2017-09-21 00:24:24 EDT
Type: Bug

Description Peter Portante 2016-02-06 00:36:38 EST
Following the PDF found here: https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/pdf/Administration_Guide/Red_Hat_Storage-3-Administration_Guide-en-US.pdf

The description of section 14.6 on pages 248 and 249 for how to get statedumps is confusing to follow for native client processes.

1. It might be helpful to use separate sections for server side state dumps and client side state dumps.  The beginning of the description for the client processes follows immediately after a "gluster volume info VOLNAME" command example with no clear delineation.

2. It might be helpful to describe *how* to get the PID of the client process.

3. There is a paragraph that mentions the "GlusterFS Management Daemon" which has a "kill -SIGUSR1 PID_of_the_glusterd_process" example, which is confusing because one might think this is now talking about the server side "glusterd" process, throwing into question the previous description about client processes.

4. The reference to "glusterd" in the name format is also confusing.

5. For the client side state dumps, it might be helpful to explain to the user that unless the directory /var/run/gluster/ exists no state dump will be taken.
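Points 2 and 5 above can be illustrated with a short shell sketch. It is unverified against any particular release and rests on assumptions: `client_pids_for_mount` is a hypothetical helper, `/mnt/glusterfs` is a placeholder mount point, and the FUSE client is assumed to run as a `glusterfs` process whose last command-line argument is the mount point.

```shell
# Hypothetical helper (not part of GlusterFS): print the PIDs of glusterfs
# client processes whose last argument is the given mount point.
# Reads "PID ARGS..." lines on stdin, e.g. from: ps -e -o pid= -o args=
client_pids_for_mount() {
    mount_point=$1
    awk -v mp="$mount_point" '
        # $2 is the executable path; assume the mount point is the last field
        $2 ~ "(^|/)glusterfs$" && $NF == mp { print $1 }
    '
}

# Usage on a client, as root (/mnt/glusterfs is a placeholder):
#   mkdir -p /var/run/gluster   # per point 5, no dump is written without it
#   ps -e -o pid= -o args= | client_pids_for_mount /mnt/glusterfs |
#       while read -r pid; do kill -USR1 "$pid"; done
```

Matching on the executable name keeps the brick (`glusterfsd`) and management (`glusterd`) processes out of the result, which is the distinction point 3 asks the guide to make clearer.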
Comment 3 Atin Mukherjee 2016-07-14 00:08:49 EDT

Can you track this for 3.2?

Comment 13 Peter Portante 2017-05-23 22:57:09 EDT
Hi Laura, sorry, but I think somebody more knowledgeable in Gluster internals should answer these kinds of questions.
Comment 21 Poornima G 2017-07-12 03:49:27 EDT
Apologies for the delayed response. Isn't the heading "Process-based state dump" a little confusing, as the client-side statedump section and server-side statedump section both include the command mentioned in "Process-based state dump"?
I am not sure about having client side and server side as separate headings, as the command is not really different for client side and server side. The usage of the command is:
Usage: volume statedump <VOLNAME> [[nfs|quotad] [all|mem|iobuf|callpool|priv|fd|inode|history]... | [client <hostname:process-id>]]

"kill -SIGUSR1 process_id" works for bricks (server side), FUSE clients (client side), and most other daemons on the server nodes. That said, we can also use the command "gluster volume statedump <VOLNAME>" to take a statedump of all brick processes on all nodes/servers. However, the above command will not work for quotad, nfs, and gfapi clients (which include Samba, NFS-Ganesha, QEMU, etc.).

It's quite hard to categorise without overlapping. With the above data points, I think:
1. To take a statedump of any gluster process (exceptions include gfapi clients, quotad, and nfs), use the following command:
kill -SIGUSR1 process_id
2. To take statedump of all the bricks on all the nodes/servers in a single shot, please use the below command:
gluster volume statedump <VOLNAME>

3. To take statedump of any gfapi based processes...:
"# gluster volume statedump  VOLNAME client <host>:<pid>"

4. For quotad and nfs, I am not very sure; I will add a needinfo on the quota and nfs maintainers.

Your thoughts? I am actually OK with the current model as well, just thinking aloud.

- Some places mention "kill -SIGUSR1 process_id" and others mention "kill -USR1 process_id". Both are the same and both are correct, but I think there should be uniformity.

- Also the command in 19.8.3:
"# gluster volume statedump  VOLNAME <host>:<pid>"
should be
"# gluster volume statedump  VOLNAME client <host>:<pid>"
Comment 22 Laura Bailey 2017-07-13 00:47:20 EDT
That's okay, hope you had a good holiday! :)

The problem we were trying to solve here is that customers were coming to this section, which had no headings, and seeing client info mixed in with server info. So whatever we do here, I think we need to be really clear about server vs client execution, even if it is a very similar process.

Would it make sense to rework the chapter like this?

- Basic definition of state dump
- Configuring location of state dump file
- Performing a state dump
  - For a single process
  - For all server processes
  - For all client processes

Does this make a bit more sense, Poornima?

Other questions, to be sure I understand the usage:
1) Does this command work if you run it on a client machine 
   or do you need to include host:pid?
     # gluster volume statedump volname client
2) Can you confirm that nfs and quotad keywords are not useful for client-side state dump?
Comment 23 Sanoj Unnikrishnan 2017-07-13 02:42:59 EDT
Note that for quotad as well, we can trigger the state dump in both ways:
1) kill -SIGUSR1 <pid_of_quotad>
2) gluster volume statedump <volname> quotad

For some processes, like the self-heal daemon, we do not seem to have a gluster command wrapper for "kill -SIGUSR1 <pid>".

IMO, we can have a single section for the commands:
We can highlight the processes that can be dumped using gluster command:
 > gluster volume statedump <VOLNAME>
 > gluster volume statedump <VOLNAME> nfs all
 > gluster volume statedump <VOLNAME> quotad  
 > gluster volume statedump <VOLNAME> client <host>:<pid> 

All these commands trigger state dump by internally performing 
kill -SIGUSR1 <pid>

And then we can specify that for some processes, like the self-heal daemon, there is no gluster command wrapper yet, and hence a state dump for such processes can be triggered explicitly by using:

kill -SIGUSR1 <pid>
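The choices described in this comment can be summarized as a small dispatch helper. This is only a sketch: `statedump_cmd` is a hypothetical function, it prints the command rather than running it, and the mapping reflects what is stated in this thread rather than a verified list for every release. The volume name, host, and PID in the usage line are placeholders.

```shell
# Hypothetical helper (not part of GlusterFS): print, but do not run, the
# statedump command for a given process kind, using the commands quoted
# in this thread.
statedump_cmd() {
    kind=$1
    case "$kind" in
        bricks) echo "gluster volume statedump $2" ;;              # args: VOLNAME
        nfs)    echo "gluster volume statedump $2 nfs all" ;;      # args: VOLNAME
        quotad) echo "gluster volume statedump $2 quotad" ;;       # args: VOLNAME
        client) echo "gluster volume statedump $2 client $3:$4" ;; # args: VOLNAME HOST PID
        pid)    echo "kill -SIGUSR1 $2" ;;                         # args: PID (no CLI wrapper)
        *)      echo "unknown process kind: $kind" >&2; return 1 ;;
    esac
}

# Usage (placeholder values):
#   statedump_cmd client testvol host1 4321
#   prints: gluster volume statedump testvol client host1:4321
```

Printing the command first lets an administrator review it before execution, and keeps the `kill` form in one place instead of repeating it for each process type.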
Comment 26 Sanoj Unnikrishnan 2017-07-14 06:20:04 EDT
Hi Laura
IMHO we do not even need a client and server separation.
We can have a single section named "taking a statedump".

We can have the client-specific state dump command along with the other examples.
That way we will not specify the kill command multiple times.

There is a mistake in 19.8.1:
> gluster volume statedump client hostname:pid
should be
> gluster volume statedump VOLNAME client hostname:pid
Comment 29 SATHEESARAN 2017-08-12 21:39:06 EDT
Verified the content.
The content that explains how to get the different statedumps is clear and no longer confuses the reader.
Comment 30 Laura Bailey 2017-08-29 00:10:42 EDT
Fixed in RHGS 3.3 documentation.
