Bug 1021035

Summary: [sos] - Error "glusterfsd: no process killed" when running sosreport.
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: sos
Version: 2.1
Hardware: x86_64
OS: Linux
Status: CLOSED EOL
Severity: medium
Priority: unspecified
Reporter: Ben Turner <bturner>
Assignee: Raghavendra Bhat <rabhat>
QA Contact: Ben Turner <bturner>
Docs Contact:
CC: bmr, rhs-bugs
Target Milestone: ---
Target Release: ---
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-03 17:16:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Ben Turner 2013-10-18 19:58:38 UTC
Description of problem:

When I run sosreport on my glusterfs client I see the following error:

glusterfsd: no process killed

I don't see anything in the plugin that references glusterfsd, and I am not sure where this error message is coming from. I don't see any negative effects from the error, but I also don't see why anything to do with sosreport should be killing glusterfsd.

Version-Release number of selected component (if applicable):

sos-2.2-44.el6.noarch

How reproducible:

Every time.

Steps to Reproduce:
1.  Run sosreport on a client mounting a RHS volume over glusterfs.

Actual results:

[root@localhost ~]# sosreport 

sosreport (version 2.2)

This utility will collect some detailed  information about the
hardware and setup of your Red Hat Enterprise Linux system.
The information is collected and an archive is  packaged under
/tmp, which you can send to a support representative.
Red Hat Enterprise Linux will use this information for diagnostic purposes ONLY
and it will be considered confidential information.

This process may take a while to complete.
No changes will be made to your system.

Press ENTER to continue, or CTRL-C to quit.

Please enter your first initial and last name [sysreg-prod]: 6.4client
Please enter the case number that you are generating this report for [None]: 1020995

glusterfsd: no process killed
  Running plugins. Please wait ...

  Completed [55/55] ...      
Creating compressed archive...

Your sosreport has been generated and saved in:
  /tmp/sosreport-6.4client.1020995-20131018193209-2f16.tar.xz

The md5sum is: 1a12dcd3b990b8fdee0f30eb44882f16

Please send this file to your support representative.

Expected results:

No error saying, "glusterfsd: no process killed".

Additional info:

Comment 2 Bryn M. Reeves 2014-07-09 09:22:25 UTC
The gluster plugin sends SIGUSR1 to the glusterfs and glusterfsd processes to cause them to generate statedump information:

        self.make_preparations(self.statedump_dir)
        #self.collectExtOutput("killall -USR1 glusterfs glusterfsd")
        os.system("killall -USR1 glusterfs glusterfsd");
        # let all the processes catch the signal and create statedump file
        # entries.
        time.sleep(1)
        self.wait_for_statedump(self.statedump_dir)
        self.addCopySpec(self.statedump_dir)

This change was made by the gluster team because the previous version of the plugin used a gluster command to request the dumps, which had side-effects for other hosts in the cluster.

If there are no running glusterfs/glusterfsd processes, the 'glusterfsd: no process killed' message from killall leaks through to the terminal because the plugin calls os.system directly; it should use self.checkExtProg() instead. I fixed this in the upstream version a few months back:

commit adc1be1ecab468aed5dbac64b7b7d1ce0cc92180
Author: Bryn M. Reeves <bmr>
Date:   Tue Apr 8 13:48:40 2014 +0100

    Replace os.system() in gluster plugin with self.check_ext_prog()
    
    Plugins should not open-code calls to external commands. Use the
    built-in check_ext_prog() interface instead and test for success
    before attempting to collect the statedump files.
    
    Signed-off-by: Bryn M. Reeves <bmr>
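
The effect of that commit can be sketched as follows. This is a simplified stand-in, not the actual sos code: the helper name check_ext_prog comes from the commit above, but its body here and the collect_statedumps wrapper are illustrative assumptions.

```python
import subprocess

def check_ext_prog(cmd):
    """Minimal stand-in for sos's check_ext_prog() helper: run an
    external command and return True only if it exits with status 0.
    (Assumption: the real helper also logs the invocation; output is
    simply discarded here so nothing leaks to the terminal.)"""
    try:
        return subprocess.call(cmd, shell=True,
                               stdout=subprocess.DEVNULL,
                               stderr=subprocess.DEVNULL) == 0
    except OSError:
        return False

def collect_statedumps(plugin):
    # killall exits non-zero when no process matched, so a host with
    # gluster installed but not running simply skips the collection
    # instead of printing "glusterfsd: no process killed".
    if check_ext_prog("killall -USR1 glusterfs glusterfsd"):
        plugin.wait_for_statedump(plugin.statedump_dir)
        plugin.addCopySpec(plugin.statedump_dir)
```

The key difference from os.system() is that the command's exit status is tested before any statedump files are collected, and its output never reaches the user's terminal.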

This still logs a warning if the signal couldn't be delivered. Since that's entirely normal on a system with gluster installed but not running, I demoted it to an 'info' (-v) level message this week:

commit 255c0e5a043d9f36c5ed0d2cb9b61950f2642f6c
Author: Bryn M. Reeves <bmr>
Date:   Tue Jul 8 20:54:13 2014 +0100

    [gluster] log killall failures at info level
    
    Failing to send a signal to the gluster daemons will happen on
    any system with gluster installed but not running; don't log it
    at warning level.
    
    Signed-off-by: Bryn M. Reeves <bmr>

Really it should also be using a Python interface for delivering the signal rather than shelling out to killall, but that can be fixed later.
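
A pure-Python replacement for the killall call might look like the sketch below. This is a hypothetical helper, not part of sos: it scans /proc for matching process names and delivers the signal with os.kill, so no external command (and no stray killall output) is involved. It assumes a Linux /proc filesystem.

```python
import os
import signal

def signal_procs_by_name(names, sig=signal.SIGUSR1):
    """Send `sig` to every process whose /proc/<pid>/comm matches one
    of `names`, and return the list of PIDs signalled. A pure-Python
    stand-in for `killall -USR1 glusterfs glusterfsd`. Note that comm
    is truncated to 15 characters by the kernel; 'glusterfsd' fits."""
    signalled = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/comm" % entry) as f:
                comm = f.read().strip()
            if comm in names:
                os.kill(int(entry), sig)
                signalled.append(int(entry))
        except (IOError, OSError):
            # Process exited mid-scan, or we lack permission; skip it.
            continue
    return signalled
```

With this approach, "no process matched" is just an empty return value that the plugin can test, rather than a message printed to the terminal.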

Comment 3 Vivek Agarwal 2015-12-03 17:16:50 UTC
Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release you requested us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/

If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.