Bug 1113959

Summary: Spec %post server does not wait for the old glusterd to exit
Product: [Community] GlusterFS
Reporter: Kaleb KEITHLEY <kkeithle>
Component: build
Assignee: Kaleb KEITHLEY <kkeithle>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.5.1
CC: gluster-bugs, kkeithle, kparthas, lmohanty, ndevos, pkarampu, puiterwijk
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.5.2beta1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1113543
Environment:
Last Closed: 2014-07-31 11:43:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1113543
Bug Blocks: 1104511

Description Kaleb KEITHLEY 2014-06-27 10:49:42 UTC
+++ This bug was initially created as a clone of Bug #1113543 +++

Description of problem:
The %post server scriptlet of glusterfs.spec runs:
killall glusterd &> /dev/null
glusterd --xlator-option *.upgrade=on -N
The killall does not wait for the old glusterd to actually exit, so the new glusterd cannot bind to its listening port and quits; the original glusterd then exits as well, leaving no glusterd running at all.
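
As an illustration of the race only (not the change that was later posted for review), the scriptlet would need to wait for the old process to exit and release its ports before starting the new one, for example with a bounded poll on pidof, reusing the same pidof idiom the spec already contains:

    killall glusterd &> /dev/null
    # wait up to ~30 seconds for the old glusterd to exit and free its port
    for i in $(seq 1 30); do
        pidof -c -o %PPID -x glusterd &> /dev/null || break
        sleep 1
    done
    glusterd --xlator-option *.upgrade=on -N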


Version-Release number of selected component (if applicable):
glusterfs-3.5.1


How reproducible:
Every time


Steps to Reproduce:
1. Run glusterd
2. Upgrade from 3.5.0 to 3.5.1
3.

Actual results:
No glusterd running anymore


Expected results:
An upgraded glusterd running


Additional info:

--- Additional comment from Niels de Vos on 2014-06-26 09:35:37 EDT ---

The post-installation script for the glusterfs-server package handles the restart of glusterd incorrectly. This caused an outage when the glusterfs-server package was automatically updated.

After checking the logs together with Patrick, we came to the conclusion that the running glusterd should have received a signal and would be exiting. However, the script does not wait for the running glusterd to exit and starts a new glusterd process immediately after sending the SIGTERM. If the first glusterd process has not exited yet, the new glusterd process cannot listen on port 24007 and exits. The first glusterd will eventually exit too, leaving the service unavailable.

Snippet from the .spec:

 735 %post server
 ...
 769 pidof -c -o %PPID -x glusterd &> /dev/null
 770 if [ $? -eq 0 ]; then
 ...
 773     killall glusterd &> /dev/null
 774     glusterd --xlator-option *.upgrade=on -N
 775 else
 776     glusterd --xlator-option *.upgrade=on -N
 777 fi
 ...


I am not sure what the best way is to start glusterd with these specific options just once. Maybe they should be listed in /etc/sysconfig/glusterd so that the standard init script or systemd unit handles it?
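
Purely as a hypothetical sketch of that suggestion (neither the file content nor the service wiring below is defined anywhere in this bug), %post could record the one-shot options and let the regular service start consume them:

    # hypothetical /etc/sysconfig/glusterd written by %post
    GLUSTERD_OPTIONS="--xlator-option *.upgrade=on -N"

    # a sysv init script or systemd unit would then start glusterd roughly as:
    #   . /etc/sysconfig/glusterd
    #   /usr/sbin/glusterd $GLUSTERD_OPTIONS
    # and would have to clear GLUSTERD_OPTIONS again afterwards, since the
    # *.upgrade=on -N invocation is a one-shot run that exits on its own.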

--- Additional comment from Kaleb KEITHLEY on 2014-06-26 11:31:35 EDT ---

Which is the primary concern, that the new glusterd was started too soon? That we need a cleaner solution for starting glusterd with the *.upgrade=on option? Or both?

--- Additional comment from Anand Avati on 2014-06-26 17:18:17 EDT ---

REVIEW: http://review.gluster.org/8185 (build/glusterfs.spec.in: %post server doesn't wait for old glusterd) posted (#1) for review on master by Kaleb KEITHLEY (kkeithle)

--- Additional comment from Anand Avati on 2014-06-27 06:36:21 EDT ---

REVIEW: http://review.gluster.org/8185 (build/glusterfs.spec.in: %post server doesn't wait for old glusterd) posted (#2) for review on master by Kaleb KEITHLEY (kkeithle)

Comment 1 Anand Avati 2014-06-27 11:06:40 UTC
REVIEW: http://review.gluster.org/8190 (build/glusterfs.spec.in: %post server doesn't wait for old glusterd) posted (#1) for review on release-3.5 by Kaleb KEITHLEY (kkeithle)

Comment 2 Anand Avati 2014-06-30 15:39:12 UTC
REVIEW: http://review.gluster.org/8190 (build/glusterfs.spec.in: %post server doesn't wait for old glusterd) posted (#2) for review on release-3.5 by Kaleb KEITHLEY (kkeithle)

Comment 3 Anand Avati 2014-07-02 10:31:28 UTC
COMMIT: http://review.gluster.org/8190 committed in release-3.5 by Niels de Vos (ndevos) 
------
commit cc372dc0d0561b2995d89cab4e84dcebef0c346c
Author: Kaleb S. KEITHLEY <kkeithle>
Date:   Fri Jun 27 07:04:47 2014 -0400

    build/glusterfs.spec.in: %post server doesn't wait for old glusterd
    
    'killall glusterd' needs to wait for the old glusterd to exit
    before starting the updated one, otherwise the new process can't
    bind to its socket ports
    
    Change-Id: I78af70419d8b1ac878ee9711acdc01b308b0e46f
    BUG: 1113959
    Signed-off-by: Kaleb S. KEITHLEY <kkeithle>
    Reviewed-on: http://review.gluster.org/8190
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Lalatendu Mohanty <lmohanty>
    Reviewed-by: Humble Devassy Chirammal <humble.devassy>

Comment 4 Niels de Vos 2014-07-21 15:41:57 UTC
The first (and last?) Beta for GlusterFS 3.5.2 has been released [1]. Please verify whether this release resolves this bug for you. If the glusterfs-3.5.2beta1 release does not resolve this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-devel/2014-July/041636.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 5 Niels de Vos 2014-07-31 11:43:31 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.5.2, please reopen this bug report.

glusterfs-3.5.2 has been announced on the Gluster Users mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-July/041217.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user