1102343 – service squid restart sometimes leaves duplicate processes



Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	squid
Sub Component:
Version:	6.5
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Luboš Uhliarik
QA Contact:	Ondřej Pták
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2014-05-28 19:50 UTC by Fernando Lozano
Modified:	2015-07-22 06:28 UTC (History)
CC List:	8 users (show)
Fixed In Version:	squid-3.1.23-8.el6
Doc Type:	Bug Fix
Doc Text:	Prior to this update, it was possible to start a new instance of squid while a previous instance was still running. Consequently, the previous instance of squid was running simultaneously with the new instance. This update modifies the squid init script to verify that squid has been terminated before starting a new instance. As a result, the squid init script fails with an error when a new instance is initiated in this scenario, allowing the administrator to properly handle the situation.
Clone Of:
Environment:
Last Closed:	2015-07-22 06:28:28 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)
new patch (346 bytes, patch) 2015-03-18 15:54 UTC, Pavel Šimerda (pavlix)	no flags	Details \| Diff

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2015:1314	0	normal	SHIPPED_LIVE	squid bug fix and enhancement update	2015-07-20 17:53:27 UTC

Description Fernando Lozano 2014-05-28 19:50:29 UTC

Description of problem:
Sometimes restarting squid leaves old processes running for a short while alongside the new ones, so the new processes fail to start properly, forcing me to run killall squid and start again.

Version-Release number of selected component (if applicable):
3.3.x/3.4.x

How reproducible:
You have to be (un)lucky. The time squid takes to stop varies a lot, and the problem only happens if it takes longer than the SQUID_SHUTDOWN_TIMEOUT defined in /etc/sysconfig/squid.

Steps to Reproduce:
1. service squid restart

Actual results:
squid won't proxy http requests

Expected results:
squid working as usual

Additional info:
I tested this on CentOS 6.5 using an unofficial RPM package from squid-cache.org. But I checked the squid RPM packages for Fedora and CentOS and the init.d script is the same, so the problem should affect RHEL6.

I think it's better to have service squid restart fail than to leave a broken squid running, so I propose the following change to /etc/init.d/squid:

------------
restart() {
	stop
	RETVAL=$?
	if [ $RETVAL -eq 0 ] ; then
		rm -rf $SQUID_PIDFILE_DIR/*
		start
	else
		echo "Failure stopping squid or stopping squid took too long. Please check before restarting."
		return 1
	fi
}
-------------

Now the restart() function, instead of blindly calling start() after stop(), checks the return code from stop(). If stop() times out, it returns an error code, so restart() refuses to call start() afterwards.

This little change makes squid more robust for sysadmins who still have to use init.d scripts.

Increasing SQUID_SHUTDOWN_TIMEOUT reduces the frequency of restart failures, but doesn't guarantee success.
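
To illustrate why a stop step needs to report a timeout at all, here is a minimal sketch of a polling wait. It is not the code from the squid init script: wait_for_exit is a hypothetical helper, and probing with "kill -0" is an assumption made for the example (it needs permission to signal the process, so it would be run as root or as the owning user).

```shell
#!/bin/sh
# Sketch only: wait_for_exit is a hypothetical helper, not part of the
# squid init script.
#
# wait_for_exit PID TIMEOUT
# Polls once per second until PID is gone or TIMEOUT seconds have elapsed.
# Returns 0 if the process exited, 1 if it was still alive at the timeout.
wait_for_exit() {
    pid=$1
    timeout=$2
    elapsed=0
    # kill -0 sends no signal; it only checks that the process can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

A caller can then refuse to start a new instance whenever wait_for_exit returns non-zero, which is the behaviour proposed for restart() above.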

Comment 7 Pavel Šimerda (pavlix) 2015-03-18 15:07:36 UTC

Warning: The patch is horribly wrong, don't use it. According to our tests, it just runs "rm -rf /*".
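
The comment does not say how the tests were run. One plausible explanation, stated here as an assumption, is that SQUID_PIDFILE_DIR was unset or empty in the tested script, so the argument "$SQUID_PIDFILE_DIR/*" collapsed to "/*". The sketch below only prints the pattern and never deletes anything; unsafe_pattern and safe_pattern are hypothetical names used for illustration.

```shell
#!/bin/sh
# Sketch only: both functions print the pattern that rm would receive.
# Nothing is deleted. The function names are hypothetical.

unsafe_pattern() {
    # Quoted so the pattern is printed rather than expanded by the shell.
    # With an empty SQUID_PIDFILE_DIR this prints "/*".
    printf '%s\n' "${SQUID_PIDFILE_DIR}/*"
}

safe_pattern() {
    # ${VAR:?message} aborts the expansion with an error when VAR is unset
    # or empty. The subshell keeps that abort from ending the caller.
    ( printf '%s\n' "${SQUID_PIDFILE_DIR:?is unset or empty}/*" ) 2>/dev/null
}
```

With an empty variable, unsafe_pattern prints "/*" while safe_pattern prints nothing and returns a non-zero status, so a cleanup line guarded this way would fail instead of touching the root directory.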

Comment 8 Pavel Šimerda (pavlix) 2015-03-18 15:13:57 UTC

s/patch/code/

Comment 9 Fernando Lozano 2015-03-18 15:21:48 UTC

The "rm -rf $SQUID_PIDFILE_DIR/*" was already part of the original / upstream script. The change I propose is simply not calling start() inside restart() if the previous stop() failed.

Comment 10 Pavel Šimerda (pavlix) 2015-03-18 15:54:42 UTC

Created attachment 1003314 [details]
new patch

Comment 11 Pavel Šimerda (pavlix) 2015-03-18 16:50:21 UTC

(In reply to Fernando Lozano from comment #9)
> The "rm -rf $SQUID_PIDFILE_DIR/*" was already part of the original /
> upstream script.

As far as I can see, it appears nowhere in the git history, that is, not since 2004. I suspect you are using different sources than those from RHEL6.

> The change I propose is simply not calling start() inside
> restart() if the previous stop() failed.

Since no patch was attached to the bug report, the only way to learn the proposed changes was to compare the code, which produced a patch that also added the offending line.

Anyway, this is just a warning for anyone looking at this bug report: use the new patch rather than the original code.

Comment 12 Ondřej Pták 2015-04-02 11:53:43 UTC

This fix doesn't work in one specific scenario: when the lockfile exists and the server is not running. In this case, "service squid restart" should end with a running squid server, but because the stop function fails, squid is not started.

Steps to Reproduce:
0. service squid stop
1. touch /var/lock/subsys/squid
2. service squid restart
3. service squid status

Actual results:
Stopping squid:                                            [FAILED]
Squid failed to stop in reasonable time and threfore wasn't started.
squid is stopped

Expected results:
squid (pid  3247) is running...

Notes:
Be careful not to break the functionality of condrestart, which should not restart squid if there is a lockfile but squid is not running.

Another thing: I'd like to propose changing the error message, which is misleading: when the lockfile exists and the server is stopped, the stop action fails immediately yet still says "...reasonable time...". It would be good to:
a) have another message for this case OR
b) change error message to be more general (not only timeout)
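
The distinction asked for above can be sketched as a small decision table. This is an illustration, not the shipped fix: describe_state and plan_action are hypothetical helpers, the "running" flag is passed in by hand (a real script would probe squid itself), and removing a stale lockfile before starting is an assumption about how the expected result could be reached.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not taken from the squid init script.

# describe_state LOCKFILE RUNNING
# RUNNING is "yes" or "no". Prints "running", "stale-lock" or "stopped".
describe_state() {
    lockfile=$1
    running=$2
    if [ "$running" = "yes" ]; then
        echo "running"
    elif [ -e "$lockfile" ]; then
        # Lockfile present but no server: nothing to wait for, so a
        # timeout message would be misleading in this state.
        echo "stale-lock"
    else
        echo "stopped"
    fi
}

# plan_action ACTION STATE
# ACTION is "restart" or "condrestart". Prints what the script should do.
plan_action() {
    case "$1:$2" in
        restart:running)     echo "stop-then-start" ;;
        restart:stale-lock)  echo "remove-lock-then-start" ;;
        restart:stopped)     echo "start" ;;
        condrestart:running) echo "stop-then-start" ;;
        condrestart:*)       echo "nothing" ;;
        *)                   echo "unknown"; return 1 ;;
    esac
}
```

Keeping the state check separate from the action makes it straightforward to print a different message for the stale-lock case than for a real shutdown timeout.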

Comment 17 Ondřej Pták 2015-06-11 13:34:27 UTC

squid-3.1.10-22.el6_5
======================
It's possible to "keep squid alive" to reproduce this issue by attaching a debugger to the process. This is a test with squid before the rebase, and it's clear that restart is not working properly. Stopping the service fails but start passes, so the return value of restart is 0, although the same process is running as before the restart.

# service squid stop
Stopping squid: ..................................................
# ps aux|grep squid
root     22721  0.0  0.1  73324  3340 ?        Ss   05:39   0:00 squid -f /etc/squid/squid.conf
squid    22723  0.0  0.5  76200 10668 ?        T    05:39   0:04 (squid) -f /etc/squid/squid.conf
squid    22727  0.0  0.0  20084  1076 ?        S    05:39   0:00 (unlinkd)
root     24787  0.1  1.3 204872 26240 pts/1    S+   13:01   0:00 gdb /usr/sbin/squid 22723
root     24893  1.0  0.0 103304   884 pts/0    S+   13:07   0:00 grep squid
# service squid restart
Stopping squid: ..................................................
Starting squid:                                            [  OK  ]
# echo $?
0
# service squid status
squid (pid  22723) is running...


squid-3.1.23-9.el6
==================
# service squid start
Starting squid:                                            [  OK  ]
# ps aux|grep squid
root     25836  0.0  0.1  73972  3488 ?        Ss   13:49   0:00 squid -f /etc/squid/squid.conf
squid    25838  0.1  0.5  76412 10712 ?        S    13:49   0:00 (squid) -f /etc/squid/squid.conf
squid    25842  0.0  0.0  20084  1076 ?        S    13:49   0:00 (unlinkd)
root     25844  0.0  0.0 103304   884 pts/0    S+   13:49   0:00 grep squid
# service squid restart
Stopping squid: ..................................................
Squid failed to stop in reasonable time and therefore wasn't started.
# echo $?
1
# service squid status
squid (pid  25838) is running...


Note:
All other tests related to init actions passed, except for condrestart/try-restart when squid is not running but the lockfile exists (bug 1230753).
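
The reproduction above keeps squid from exiting by attaching gdb. A similar effect can be had without a debugger by stopping the process with SIGSTOP, which is a different technique from the one used in this comment: a stopped process does not act on SIGTERM until it is continued. The helpers below are hypothetical names and have not been tried against squid itself.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for holding a process in the stopped
# state so that a shutdown request cannot complete.

# freeze PID: stop the process; a later SIGTERM stays pending.
freeze() {
    kill -STOP "$1"
}

# thaw PID: continue the process; pending signals are then delivered.
thaw() {
    kill -CONT "$1"
}

# is_alive PID: succeeds while the process can still be signalled.
is_alive() {
    kill -0 "$1" 2>/dev/null
}
```

Freezing the worker process before running "service squid restart" should make the stop step hit its timeout in the same way as the gdb approach, assuming the init script shuts squid down with a signal that a stopped process cannot handle.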

Comment 19 errata-xmlrpc 2015-07-22 06:28:28 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1314.html
