Bug 1102343

Summary: service squid restart sometimes leaves duplicate processes
Product: Red Hat Enterprise Linux 6 Reporter: Fernando Lozano <fernando>
Component: squid Assignee: Luboš Uhliarik <luhliari>
Status: CLOSED ERRATA QA Contact: Ondřej Pták <optak>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 6.5 CC: agk, dkutalek, optak, ovasik, psimerda, thozza, tlavigne, vanhoof
Target Milestone: rc Keywords: EasyFix
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: squid-3.1.23-8.el6 Doc Type: Bug Fix
Doc Text:
Prior to this update, it was possible to start a new instance of squid while a previous instance was still running. Consequently, the previous instance of squid was running simultaneously with the new instance. This update modifies the squid init script to verify that squid has been terminated before starting a new instance. As a result, the squid init script fails with an error when a new instance is initiated in this scenario, allowing the administrator to properly handle the situation.
Last Closed: 2015-07-22 06:28:28 UTC Type: Bug
Attachments:
Description Flags
new patch none

Description Fernando Lozano 2014-05-28 19:50:29 UTC
Description of problem:
Sometimes restarting squid leaves old processes running for a short while at the same time as the new processes, so the new processes fail to start properly, forcing me to run killall squid and start again.

Version-Release number of selected component (if applicable):
3.3.x/3.4.x

How reproducible:
You have to be (un)lucky. The time to stop squid varies a lot, and the problem only happens if it takes longer than SQUID_SHUTDOWN_TIMEOUT defined in /etc/sysconfig/squid

Steps to Reproduce:
1. service squid restart

Actual results:
squid won't proxy http requests

Expected results:
squid working as usual

Additional info:
I tested this on CentOS 6.5 using an unofficial RPM package from squid-cache.org. But I checked the squid RPM packages for Fedora and CentOS and the init.d script is the same, so the problem should affect RHEL6.

I think it's better to have service squid restart fail than to leave a broken squid, so I propose the following change to /etc/init.d/squid:

------------
restart() {
	stop
	RETVAL=$?
	if [ $RETVAL -eq 0 ] ; then
		rm -rf $SQUID_PIDFILE_DIR/*
		start
	else
		echo "Failure stopping squid or stopping squid took too long. Please check before restarting."
		return 1
	fi
}
-------------

Now the restart() function, instead of blindly calling start() after stop(), checks the return code from stop(). If stop() times out, it returns an error code, so restart() refuses to call start() afterwards.

This little change makes squid more robust for sysadmins who still have to use init.d scripts.

Increasing SQUID_SHUTDOWN_TIMEOUT reduces the frequency of restart failures, but doesn't guarantee success.
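
To illustrate why a longer timeout only narrows the window, here is a minimal sketch of a bounded wait. It is not taken from the shipped init script: the function name wait_for_exit, the polling via kill -0, and the fallback value of 30 seconds are all assumptions made for this example. The point is only that the caller gets a non-zero status when the process outlives the timeout, and that status is what restart() needs to check.

```shell
# Hypothetical helper for illustration; not part of the squid init script.
# Polls once per second until the process with the given PID is gone.
# Returns 0 if it exited within the timeout, 1 if it is still alive.
wait_for_exit() {
    pid=$1
    # 30 is an arbitrary fallback chosen for this sketch
    timeout=${2:-${SQUID_SHUTDOWN_TIMEOUT:-30}}
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # still running: the caller must not start a new instance
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Whatever timeout is chosen, a slow shutdown can still exceed it, so the return code has to be honoured rather than ignored.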

Comment 7 Pavel Šimerda (pavlix) 2015-03-18 15:07:36 UTC
Warning: The patch is horribly wrong, don't use it. According to our tests, it just runs "rm -rf /*".
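
For anyone wondering how that happens: if SQUID_PIDFILE_DIR is unset or empty, "$SQUID_PIDFILE_DIR/*" expands to "/*". A minimal sketch of a guard follows, assuming one wants to keep the cleanup at all. The helper name clean_pidfile_dir is made up for this example and is not part of any shipped script; the ${VAR:?} expansion itself is standard POSIX shell.

```shell
# Hypothetical helper for illustration; not part of the squid init script.
clean_pidfile_dir() {
    # ${VAR:?message} makes the expansion fail when VAR is unset or empty,
    # so rm is never executed with a bare "/*". The subshell keeps the
    # failure from terminating the calling script.
    ( rm -rf "${SQUID_PIDFILE_DIR:?SQUID_PIDFILE_DIR is not set}"/* )
}
```

With the variable unset, the function prints an error and returns non-zero without deleting anything.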

Comment 8 Pavel Šimerda (pavlix) 2015-03-18 15:13:57 UTC
s/patch/code/

Comment 9 Fernando Lozano 2015-03-18 15:21:48 UTC
The "rm -rf $SQUID_PIDFILE_DIR/*" was already part of the original / upstream script. The change I propose is simply not calling start() inside restart() if the previous stop() failed.

Comment 10 Pavel Šimerda (pavlix) 2015-03-18 15:54:42 UTC
Created attachment 1003314 [details]
new patch

Comment 11 Pavel Šimerda (pavlix) 2015-03-18 16:50:21 UTC
(In reply to Fernando Lozano from comment #9)
> The "rm -rf $SQUID_PIDFILE_DIR/*" was already part of the original /
> upstream script.

As far as I can see, it is not anywhere in the git history, which goes back to 2004. I suspect you are using different sources than those from RHEL6.

> The change I propose is simply not calling start() inside
> restart() if the previous stop() failed.

Since no patch was attached to the bug report, the only way to learn the proposed changes was to compare the code, which produced a patch that also added the offending line.

Anyway this is just a warning for anyone looking at this bug report to use the new patch rather than the original code.

Comment 12 Ondřej Pták 2015-04-02 11:53:43 UTC
This fix doesn't work in one specific scenario: when the lockfile exists and the server is not running. In this case, "service squid restart" should end up with a running squid server, but because the stop function fails, squid is not started.

Steps to Reproduce:
0. service squid stop
1. touch /var/lock/subsys/squid
2. service squid restart
3. service squid status

Actual results:
Stopping squid:                                            [FAILED]
Squid failed to stop in reasonable time and threfore wasn't started.
squid is stopped

Expected results:
squid (pid  3247) is running...

Notes:
Be careful not to break the functionality of condrestart, which should not restart squid if there is a lockfile but squid is not running.

Another thing: I'd like to propose a change to the error message, which is misleading: when the lockfile exists and the server is stopped, the stop action fails immediately while saying "...reasonable time...". It would be good to:
a) have another message for this case OR
b) change error message to be more general (not only timeout)
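
The behaviour described above could be sketched roughly as follows. This is only an illustration, not the shipped fix: is_running, stop and start are stubs invented for the example (a real script would probe the actual process), and the SQUID_RUNNING variable exists only to drive the stubs.

```shell
# All names here are hypothetical stand-ins, not the shipped init script.
LOCKFILE=${LOCKFILE:-/var/lock/subsys/squid}

# Stubs so the sketch runs on its own
is_running() { [ "${SQUID_RUNNING:-no}" = yes ]; }
stop()  { echo "stop";  SQUID_RUNNING=no; }
start() { echo "start"; SQUID_RUNNING=yes; }

restart() {
    if is_running; then
        # Refuse to start a second instance if the old one did not stop
        stop || { echo "squid failed to stop, not starting a new instance" >&2; return 1; }
    else
        # Stale lockfile without a process: nothing to stop, just clean up
        rm -f "$LOCKFILE"
    fi
    start
}

condrestart() {
    # Restart only when squid is actually running; a leftover lockfile
    # alone must not cause a start
    is_running || return 0
    restart
}
```

Keying both actions on whether the process is running, rather than on the lockfile, also avoids the misleading timeout message in the stale-lockfile case.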

Comment 17 Ondřej Pták 2015-06-11 13:34:27 UTC
squid-3.1.10-22.el6_5
======================
It's possible to "keep squid alive" to reproduce this issue by attaching a debugger to the process. This is a test with squid before the rebase, and it's clear that restart is not working properly. Stopping the service fails but start passes, so the return value of restart is 0, although the same process is running as before the restart.

# service squid stop
Stopping squid: ..................................................
# ps aux|grep squid
root     22721  0.0  0.1  73324  3340 ?        Ss   05:39   0:00 squid -f /etc/squid/squid.conf
squid    22723  0.0  0.5  76200 10668 ?        T    05:39   0:04 (squid) -f /etc/squid/squid.conf
squid    22727  0.0  0.0  20084  1076 ?        S    05:39   0:00 (unlinkd)
root     24787  0.1  1.3 204872 26240 pts/1    S+   13:01   0:00 gdb /usr/sbin/squid 22723
root     24893  1.0  0.0 103304   884 pts/0    S+   13:07   0:00 grep squid
# service squid restart
Stopping squid: ..................................................
Starting squid:                                            [  OK  ]
# echo $?
0
# service squid status
squid (pid  22723) is running...


squid-3.1.23-9.el6
==================
# service squid start
Starting squid:                                            [  OK  ]
# ps aux|grep squid
root     25836  0.0  0.1  73972  3488 ?        Ss   13:49   0:00 squid -f /etc/squid/squid.conf
squid    25838  0.1  0.5  76412 10712 ?        S    13:49   0:00 (squid) -f /etc/squid/squid.conf
squid    25842  0.0  0.0  20084  1076 ?        S    13:49   0:00 (unlinkd)
root     25844  0.0  0.0 103304   884 pts/0    S+   13:49   0:00 grep squid
# service squid restart
Stopping squid: ..................................................
Squid failed to stop in reasonable time and therefore wasn't started.
# echo $?
1
# service squid status
squid (pid  25838) is running...


Note:
All other tests related to init actions passed, except for condrestart/try-restart when squid is not running but the lockfile exists (bug 1230753).

Comment 19 errata-xmlrpc 2015-07-22 06:28:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1314.html