Description of problem:
Sometimes restarting squid leaves the old processes running for a short while alongside the new ones, so the new processes fail to start properly, forcing me to killall squid and start again.

Version-Release number of selected component (if applicable):
3.3.x/3.4.x

How reproducible:
You have to be (un)lucky. The time squid takes to stop varies a lot, and the problem only happens if it takes longer than SQUID_SHUTDOWN_TIMEOUT, defined in /etc/sysconfig/squid.

Steps to Reproduce:
1. service squid restart

Actual results:
squid won't proxy http requests

Expected results:
squid working as usual

Additional info:
I tested this on CentOS 6.5 using an unofficial RPM package from squid-cache.org. I also checked the squid RPM packages for Fedora and CentOS, and the init.d script is the same, so the problem should affect RHEL6.

I think it is better for "service squid restart" to fail than to leave a broken squid, so I propose the following change to /etc/init.d/squid:

------------
restart() {
    stop
    RETVAL=$?
    if [ $RETVAL -eq 0 ] ; then
        rm -rf $SQUID_PIDFILE_DIR/*
        start
    else
        echo "Failure stopping squid or stopping squid took too long. Please check before restarting."
        return 1
    fi
}
-------------

Instead of blindly calling start() after stop(), the restart() function now checks the return code from stop(). If stop() times out, it returns an error code, and restart() refuses to call start() afterwards.

This small change makes squid more robust for sysadmins who still have to use init.d scripts. Increasing SQUID_SHUTDOWN_TIMEOUT reduces the frequency of restart failures, but doesn't guarantee success.
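To illustrate the timing issue described above, here is a minimal sketch of a bounded wait that a stop() function could use to report a timeout to its caller. The helper name wait_for_exit and its arguments are hypothetical and not taken from the shipped init script; the timeout argument plays the role of SQUID_SHUTDOWN_TIMEOUT.

```shell
#!/bin/sh
# Hypothetical helper (not from the shipped init script): wait until the
# process with the given PID exits, or give up after a timeout in seconds.
wait_for_exit() {
    pid=$1
    timeout=$2
    waited=0
    # kill -0 sends no signal; it only tests whether the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # still running: the caller should not start a new instance
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0            # process is gone within the allowed time
}
```

A stop() built on something like this would return non-zero when the old process outlives the timeout, which is the signal restart() needs in order to refuse to call start().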
Warning: The patch is horribly wrong, don't use it. According to our tests, it just runs "rm -rf /*".
s/patch/code/
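For anyone wondering how the cleanup line can turn into "rm -rf /*": presumably SQUID_PIDFILE_DIR is unset or empty in the script being patched, so the path collapses to "/*". The sketch below only prints what would be run and deletes nothing. The function names and the example directory mentioned afterwards are hypothetical, chosen for illustration.

```shell
#!/bin/sh
# Print the cleanup command as the shell would see it, without running it.
# The pattern is quoted here so the glob is shown rather than expanded.
show_cleanup() {
    printf 'rm -rf %s\n' "$SQUID_PIDFILE_DIR/*"
}

# Safer variant: ${VAR:?message} makes the shell abort when VAR is unset
# or empty. It runs in a subshell here so only the subshell exits.
safe_cleanup_target() {
    ( printf '%s\n' "${SQUID_PIDFILE_DIR:?not set, refusing to clean up}/*" )
}
```

With SQUID_PIDFILE_DIR empty, show_cleanup prints "rm -rf /*", while safe_cleanup_target fails with a non-zero status instead of producing a path. With a value such as /var/run/squid (an example value, not confirmed from the RHEL6 script), it yields "/var/run/squid/*".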
The "rm -rf $SQUID_PIDFILE_DIR/*" was already part of the original / upstream script. The change I propose is simply not calling start() inside restart() if the previous stop() failed.
Created attachment 1003314 [details] new patch
(In reply to Fernando Lozano from comment #9)
> The "rm -rf $SQUID_PIDFILE_DIR/*" was already part of the original /
> upstream script.

As far as I can see, it is not anywhere in the git history, which goes back to 2004. I suspect you are using different sources than those from RHEL6.

> The change I propose is simply not calling start() inside
> restart() if the previous stop() failed.

Because no patch was attached to the bug report, the only way to see the proposed changes was to diff the posted code against the existing script, and that produced a patch which also added the offending line.

Anyway, this is just a warning for anyone looking at this bug report to use the new patch rather than the original code.
This fix doesn't work in one specific scenario: when the lockfile exists and the server is not running. In this case, "service squid restart" should end up with a running squid server, but because the stop function fails, squid is not started.

Steps to Reproduce:
0. service squid stop
1. touch /var/lock/subsys/squid
2. service squid restart
3. service squid status

Actual results:
Stopping squid: [FAILED]
Squid failed to stop in reasonable time and threfore wasn't started.
squid is stopped

Expected results:
squid (pid 3247) is running...

Notes:
Be careful not to break the functionality of condrestart, which should not restart squid if there is a lockfile but squid is not running.

Another thing: I'd like to propose a change to the error message, which is misleading. When the lockfile exists and the server is stopped, the stop action fails immediately, yet the message says "...reasonable time...". It would be good to either:
a) have a separate message for this case, OR
b) make the error message more general (not only about a timeout)
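One possible way to cover this scenario, as a sketch only: check whether squid is actually running before deciding that a failed stop should block the start. The functions is_running, stop and start below stand in for the init script's own functions and are assumptions, as is the wording of the error message. Only the lockfile path comes from the steps above.

```shell
#!/bin/sh
# Sketch only: is_running, stop and start are placeholders for the init
# script's real functions and are not the shipped code.
LOCKFILE=${LOCKFILE:-/var/lock/subsys/squid}

restart() {
    if ! is_running; then
        # Nothing to stop: drop a stale lockfile, if any, and just start.
        rm -f "$LOCKFILE"
        start
        return $?
    fi
    if stop; then
        start
    else
        # General wording, since the failure is not always a timeout.
        echo "Failed to stop squid; not starting a new instance." >&2
        return 1
    fi
}

condrestart() {
    # Restart only when squid is actually running; a leftover lockfile
    # alone must not trigger a start.
    if is_running; then
        restart
    fi
}
```

The design choice here is to treat "not running" as a separate case from "stop failed", so a stale lockfile no longer blocks restart, while condrestart still does nothing when squid is down.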
squid-3.1.10-22.el6_5
======================
It's possible to "keep squid alive" for reproducing this issue by attaching a debugger to the process. This is a test with squid before the rebase, and it's clear that restart is not working properly. Stopping the service fails, but the start passes, so the return value of restart is 0, although the same process is running as before the restart.

# service squid stop
Stopping squid: ..................................................
# ps aux|grep squid
root     22721  0.0  0.1  73324  3340 ?        Ss   05:39   0:00 squid -f /etc/squid/squid.conf
squid    22723  0.0  0.5  76200 10668 ?        T    05:39   0:04 (squid) -f /etc/squid/squid.conf
squid    22727  0.0  0.0  20084  1076 ?        S    05:39   0:00 (unlinkd)
root     24787  0.1  1.3 204872 26240 pts/1    S+   13:01   0:00 gdb /usr/sbin/squid 22723
root     24893  1.0  0.0 103304   884 pts/0    S+   13:07   0:00 grep squid
# service squid restart
Stopping squid: ..................................................
Starting squid: [ OK ]
# echo $?
0
# service squid status
squid (pid 22723) is running...

squid-3.1.23-9.el6
==================
# service squid start
Starting squid: [ OK ]
# ps aux|grep squid
root     25836  0.0  0.1  73972  3488 ?        Ss   13:49   0:00 squid -f /etc/squid/squid.conf
squid    25838  0.1  0.5  76412 10712 ?        S    13:49   0:00 (squid) -f /etc/squid/squid.conf
squid    25842  0.0  0.0  20084  1076 ?        S    13:49   0:00 (unlinkd)
root     25844  0.0  0.0 103304   884 pts/0    S+   13:49   0:00 grep squid
# service squid restart
Stopping squid: ..................................................
Squid failed to stop in reasonable time and therefore wasn't started.
# echo $?
1
# service squid status
squid (pid 25838) is running...

Note: All other tests related to init actions passed, except for condrestart/try-restart when squid is not running but the lockfile exists (bug 1230753).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-1314.html