Bug 796998

Summary: gluster 3.3 beta2 /etc/init.d/glusterd restart kills glusterfsd, which I don't think it's supposed to do
Product: Red Hat Gluster Storage Reporter: csb sysadmin <admin>
Component: glusterfs Assignee: Amar Tumballi <amarts>
Status: CLOSED ERRATA QA Contact: Ben Turner <bturner>
Severity: medium Docs Contact:
Priority: high    
Version: 2.0 CC: bturner, gluster-bugs, rfortier, sdharane, shaines, vbellur, vraman
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.4.0qa5-1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-09-23 22:32:36 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description csb sysadmin 2012-02-24 01:25:43 UTC
Description of problem:

gluster 3.3 beta2 "/etc/init.d/glusterd restart" kills glusterfsd, which Joe Julian says it is not supposed to do. It should just restart the console management daemon. It looks like 3.2.4 has the same issue. This causes active reads/writes to fail.

/etc/init.d/glusterd:

    GLUSTERFSD=glusterfsd

    # Stop the service $BASE
    stop()
    {
           echo -n $"Stopping $BASE:"
           killproc $BASE
           echo
           pidof -c -o %PPID -x $GLUSTERFSD &> /dev/null
           [ $? -eq 0 ] &&  killproc $GLUSTERFSD &> /dev/null

Version-Release number of selected component (if applicable): 

3.3 beta2


How reproducible:

always


Steps to Reproduce:
1. /etc/init.d/glusterd restart
2. Observe that active reads and writes stop / crash since glusterfsd is killed
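
The effect of the restart on the brick processes can also be checked directly by comparing glusterfsd PIDs before and after. The following is only a rough sketch: brick_pids and bricks_survived are hypothetical helper names, not part of the gluster init script.

```shell
#!/bin/sh
# Sketch: report whether the brick processes survive a glusterd restart.
# brick_pids and bricks_survived are hypothetical helper names.

# Print the PIDs of all running glusterfsd processes, one per line.
brick_pids()
{
    pidof glusterfsd | tr ' ' '\n' | sort -n
}

# Succeed only if every PID listed in $1 (before) also appears in $2 (after).
bricks_survived()
{
    for pid in $1; do
        echo "$2" | grep -qx "$pid" || return 1
    done
    return 0
}

# Usage (as root, on a node with running bricks):
#   before=$(brick_pids)
#   /etc/init.d/glusterd restart
#   after=$(brick_pids)
#   bricks_survived "$before" "$after" && echo "bricks untouched" || echo "bricks were killed"
```

With the init script quoted in the description, the "after" list would be expected to be empty or to contain different PIDs.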
  
Actual results:

active reads and writes stop / crash since glusterfsd is killed

Expected results:

active reads and writes should not stop since only the glusterd management process should be restarted
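
For illustration, a stop() that matches this expectation could look roughly like the sketch below. It is not the merged patch. BASE=glusterd is an assumption (the excerpt above uses $BASE without showing its definition), and killproc is assumed to come from /etc/init.d/functions, as is usual on Red Hat style systems.

```shell
#!/bin/sh
# Sketch of a stop() that leaves the brick processes alone.
# Assumption: BASE names the management daemon.
BASE=glusterd

# killproc is normally provided by /etc/init.d/functions on Red Hat style systems.
if [ -r /etc/init.d/functions ]; then
    . /etc/init.d/functions
fi

# Stop only the management daemon; glusterfsd is deliberately not touched.
stop()
{
    printf 'Stopping %s:' "$BASE"
    rc=0
    killproc "$BASE" || rc=$?
    echo
    return $rc
}
```

The only difference from the excerpt is that the pidof/killproc pair for $GLUSTERFSD is dropped, so a restart no longer touches the data path.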

Additional info:

Comment 1 Amar Tumballi 2012-03-13 12:34:43 UTC
patch sent : http://review.gluster.com/2919

Comment 2 Anand Avati 2012-05-19 01:42:22 UTC
CHANGE: http://review.gluster.com/2919 (init.d: stop only 'glusterd' process on '/etc/init.d/glusterd stop') merged in master by Anand Avati (avati)

Comment 3 Amar Tumballi 2012-05-21 12:40:59 UTC
Not a blocker. The patch has already been pushed upstream; after more baking time, let's pull it in. It also has some implications for the upgrade process, hence it is not being treated as a blocker.

Comment 4 Amar Tumballi 2012-05-26 04:13:56 UTC
Not planning to have it in RHS 2.0 right away; closed upstream.

Comment 6 Amar Tumballi 2012-06-01 06:53:53 UTC
The bug fix is only in upstream, not in release-3.3. Hence moving it out of ON_QA and setting it to MODIFIED (as a standard practice @ Red Hat).

Comment 7 Ben Turner 2012-12-17 20:15:23 UTC
It looks like this has affected the stop functionality of the init script, as it leaves glusterfsd running when service glusterd stop is run. This jumped out at me in self-heal testing, when running:

service glusterd stop on node 2
create a bunch of files on client
check the brick on node 2, files exist

When stop is executed, should glusterfsd still be running?  If it should be stopped maybe we could do something like:

if [ "$1" = "stop" ]; then
    # kill the brick processes as well
    killproc $GLUSTERFSD
fi

Comment 8 Amar Tumballi 2013-01-11 07:23:49 UTC
Ben, IMO, if the command is "service glusterd stop", it should stop only glusterd, and not glusterfsd. That is because glusterd is a management daemon, and we should not be stopping the data path process glusterfsd for that.

If one is writing a script to kill the brick, then please use 'kill glusterfsd' in the script itself.
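
A test script following that advice might look something like the sketch below. stop_node is a hypothetical helper name, and pkill is used here as one possible way to do the 'kill glusterfsd' step.

```shell
#!/bin/sh
# Sketch of a test helper that takes a whole node down on purpose.
# stop_node is a hypothetical name, not part of glusterfs.

stop_node()
{
    # Stop the management daemon first.
    service glusterd stop || return 1

    # Then stop the bricks explicitly; pkill -x matches the exact process name
    # and sends SIGTERM by default.
    status=0
    pkill -x glusterfsd || status=$?

    # pkill exits 1 when no process matched, which is acceptable here.
    [ "$status" -le 1 ]
}
```

This keeps the brick shutdown visible in the test itself instead of relying on the init script to do it as a side effect.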

Comment 9 Ben Turner 2013-01-14 17:46:10 UTC
Verified on glusterfs-3.4.0qa5-1.

Comment 10 Scott Haines 2013-09-23 22:32:36 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html