Bug 1699381

Summary: [RFE] sos for the gluster commands
Product: Red Hat Enterprise Linux 7
Reporter: Bhushan Ranpise <branpise>
Component: sos
Assignee: Pavel Moravec <pmoravec>
Status: CLOSED ERRATA
QA Contact: Miroslav Hradílek <mhradile>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.7
CC: agk, bmr, jhunsaker, mhradile, myllynen, plambri, pmoravec, sasundar, sbradley
Target Milestone: rc
Keywords: FutureFeature, OtherQA
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: sos-3.8-3.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-03-31 20:04:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Bhushan Ranpise 2019-04-12 14:37:35 UTC
Description of problem:

I have a case, 02358588, where the customer wants the following gluster-related commands to be captured by the gluster sos plugin itself (sketched in the example after the list):

- gluster vol info
- gluster vol status
- gluster peer status
- gluster volume get <volname> all (for all volumes)
- gluster get-state
- "killall -USR1 glusterd" on the node where you see the problem and in one node without it to compare to see glusterd state info
- "for gpid in $(ps aux | grep "glusterfs " | grep :_ | awk '{print $2}') ; do kill -USR1 $gpid ; done" on node with hanging VM to collect glusterfs state info
- ps -Ll -p <pid of a zombie process>
- cat /proc/<pid of a zombie process>/task/*/stack
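
A minimal sketch, assuming the sos-3.x plugin API (add_cmd_output(), get_command_output()), of what the requested collection could look like. The class name, trigger package and per-volume loop are illustrative assumptions, not the actual sos change:

from sos.plugins import Plugin, RedHatPlugin

class GlusterCollection(Plugin, RedHatPlugin):
    """GlusterFS command collection (illustrative sketch only)"""

    plugin_name = 'gluster_collection'   # hypothetical name, not the real plugin
    packages = ('glusterfs-server',)     # assumed trigger package

    def setup(self):
        # Commands requested in the case, collected unconditionally
        self.add_cmd_output([
            "gluster volume info",
            "gluster volume status",
            "gluster peer status",
            "gluster get-state",
        ])

        # "gluster volume get <volname> all" for every volume: enumerate the
        # volumes first, then add one command per volume.
        vols = self.get_command_output("gluster volume list")
        if vols and vols['status'] == 0:
            for vol in vols['output'].splitlines():
                if vol.strip():
                    self.add_cmd_output("gluster volume get %s all" % vol.strip())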


Comment 6 Pavel Moravec 2019-04-15 06:27:46 UTC
Following up on #c5:

- new to add (the dump-option handling is sketched after this list):
gluster volume get <volname> all (for all volumes)
gluster get-state
killall -USR1 glusterd    (to be added to glusterfs/glusterfsd killall cmd, under dump option)
for gpid in $(ps aux | grep "glusterfs " | grep :_ | awk '{print $2}') ; do kill -USR1 $gpid ; done   (also under dump option)

ps -Ll -p <pid of a zombie process>
cat /proc/<pid of a zombie process>/task/*/stack
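
The dump-option part could look roughly like the sketch below, assuming the sos-3.x get_option() API and PID discovery via /proc (both assumptions, not the actual implementation). SIGUSR1 makes glusterd/glusterfs/glusterfsd write statedumps, which the plugin can then pick up from /var/run/gluster:

import os
import signal

def trigger_gluster_statedumps(plugin):
    # Only act when the plugin's 'dump' option is enabled
    if not plugin.get_option("dump"):
        return
    # Equivalent of 'killall -USR1 glusterd' plus the per-process
    # 'kill -USR1' loop from the description.
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        try:
            with open("/proc/%s/comm" % pid) as f:
                comm = f.read().strip()
        except IOError:
            continue
        if comm in ("glusterd", "glusterfs", "glusterfsd"):
            try:
                os.kill(int(pid), signal.SIGUSR1)
            except OSError:
                pass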

I recall there were some concerns about /proc collection, but I haven't found anything specific to task/*/stack - this seems safe to me.

Question1: Shall we do so for *all* zombie processes, even those unrelated to gluster? Or how should we filter which zombie processes to get details for?
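
One possible filter for Question1, offered purely as an assumption to make the question concrete: only collect details for zombies whose parent is a gluster daemon, reading the process state and PPID from /proc/<pid>/stat:

import os

GLUSTER_DAEMONS = ("glusterd", "glusterfs", "glusterfsd")

def gluster_related_zombies():
    zombies = []
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        try:
            with open("/proc/%s/stat" % pid) as f:
                stat = f.read()
            # comm can contain spaces, so parse the fields after the last ')'
            state, ppid = stat.rsplit(")", 1)[1].split()[:2]
            with open("/proc/%s/comm" % ppid) as f:
                parent = f.read().strip()
        except (IOError, IndexError, ValueError):
            continue
        if state == "Z" and parent in GLUSTER_DAEMONS:
            zombies.append(pid)
    return zombies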

Question2: We can also enable the '-k gluster.dump=on' option via some new RHHI preset in sosreport - just let me know how to determine that the system belongs to RHHI (i.e., is some specific package installed?).

Comment 7 Pavel Moravec 2019-04-15 10:00:36 UTC
Further, how much sense does it make to collect /proc/<pid of a zombie process>/task/*/stack, when a zombie process is just a PID (so any stack must be empty)?

Comment 8 Jake Hunsaker 2019-04-15 13:45:42 UTC
> Question2: We can also enable the '-k gluster.dump=on' via some new RHHI
> preset in sosreport - just let me know how to determine the system belongs
> to RHHI (i.e. some specific package installed?).


A preset could work - from BZ1699401 we'll be adding an RHHI profile to sos-collector, so we can get it from both directions here.

RHHI requires the presence of _both_ the gluster and rhv packages. For the nodes that means the qemu-kvm-rhev package.
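
A standalone sketch of that package-based check (package names are taken from the comment above; in sos itself this would presumably hook into the Red Hat policy's preset probing rather than shelling out to rpm):

import subprocess

def looks_like_rhhi():
    """Treat a host as RHHI only if both the gluster and RHV sides are installed."""
    def installed(pkg):
        return subprocess.call(["rpm", "-q", pkg],
                               stdout=subprocess.DEVNULL,
                               stderr=subprocess.DEVNULL) == 0

    return installed("glusterfs-server") and installed("qemu-kvm-rhev")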

Comment 19 Pavel Moravec 2019-08-27 10:10:00 UTC
Jake,
could you please verify this BZ?

A yum repository for the build of sos-3.8-1.el7 (task 23200819) is available at:

http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.8/1.el7/

You can install the rpms locally by putting this .repo file in your /etc/yum.repos.d/ directory:

http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.8/1.el7/sos-3.8-1.el7.repo

RPMs and build logs can be found in the following locations:
http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.8/1.el7/noarch/

The full list of available rpms is:
http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.8/1.el7/noarch/sos-3.8-1.el7.src.rpm
http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.8/1.el7/noarch/sos-3.8-1.el7.noarch.rpm

Build output will be available for the next 21 days.

Comment 32 Pavel Moravec 2019-10-09 06:36:34 UTC
Hello,
could you please verify this BZ? Thanks in advance.


A yum repository for the build of sos-3.8-3.el7 (task 23918180) is available at:

http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.8/3.el7/

You can install the rpms locally by putting this .repo file in your /etc/yum.repos.d/ directory:

http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.8/3.el7/sos-3.8-3.el7.repo

RPMs and build logs can be found in the following locations:
http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.8/3.el7/noarch/

The full list of available rpms is:
http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.8/3.el7/noarch/sos-3.8-3.el7.src.rpm
http://brew-task-repos.usersys.redhat.com/repos/official/sos/3.8/3.el7/noarch/sos-3.8-3.el7.noarch.rpm

Build output will be available for the next 21 days.



(The independent (/var)/run issue: I will follow up on it later.)

Comment 40 errata-xmlrpc 2020-03-31 20:04:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:1127