Bug 160704

Summary: squid child processes exit with signal 6.. squid crashes
Product: Red Hat Enterprise Linux 4
Reporter: Will Bending <will.bending>
Component: squid
Assignee: Martin Stransky <stransky>
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Version: 4.0
CC: bob, jturner, m.kuratczyk, poelstra, vasiliy.kotikov, zenczykowski
Target Milestone: ---
Keywords: Security
Target Release: ---
Hardware: x86_64
OS: Linux
Fixed In Version: RHSA-2006-0052
Doc Type: Bug Fix
Last Closed: 2006-03-07 18:48:35 UTC
Bug Blocks: 168429
Attachments:
cache.log snippet (flags: none)
squid configuration file (flags: none)
snippet from strace of xstrdup() fatal: tried to dup a null pointer in squid child (flags: none)
Full strace from xstrdup() null pointer (bzipped because of large file size). (flags: none)
output of /usr/sbin/squid -NCDd 1 (flags: none)
squid.2.5.Stable11 squid.spec with updated build and config patches (flags: none)
Patch for RHEL4 is here (flags: none)

Description Will Bending 2005-06-16 19:29:03 UTC
Description of problem:
squid-2.5.STABLE6-3.4E.9:7.x86_64

I upgraded squid to this version released in RHSA-2005:415-16.  Syslog starts
logging the following message repeatedly every 1 minute or so to /var/log/messages:

Squid Parent: child process 17865 exited due to signal 6

Googling that indicates it is a fatal error on the child process and everyone
says to check the cache.log for the cause.  I have been unable to find a
resolution Googling for the messages in my cache.log.

Checking the cache.log file shows squid restarting repeatedly and trying to
rebuild the cache.  Eventually the parent process dies leaving its lock file
and pid file on the filesystem which must be deleted to restart the squid
service.  Squid *does* appear to function for about an hour after a fresh start
before the parent process dies with too many errors logged to syslog
(squid[5987]: Exiting due to repeated, frequent failures).

I checked the squid log file sizes to make sure they were not over the 2 GB
limit; that is not the problem.

I purged the cache directories off disk and started squid fresh. Problem
persists immediately.

Note: *I am reporting this bug as a security level issue since this version is
intended to fix security issues with previous versions of Squid.* I had to roll
back Squid to a previous version to get my web caches working again.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. upgrade squid to version squid-2.5.STABLE6-3.4E.9:7.x86_64.
2. service squid start.
3. squid crashes repeatedly.
  
Actual results:
squid child processes exit with signal 6.. squid crashes

Expected results:
Squid runs normally and does not crash.

Additional info:
See attachment showing cache.log snippet.

Note: I am submitting this with Severity = security since this version of Squid
was released to fix security issues in previous versions.
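
The "exited due to signal 6" messages described above can be tallied from syslog with a short script. This is only a minimal sketch and not part of the original report: the function name count_child_exits is hypothetical, and the sample lines simply follow the syslog message format that appears in this bug.

```python
import re
from collections import Counter

# Matches the parent's syslog message in the format quoted in this report.
EXIT_RE = re.compile(
    r"Squid Parent: child process (\d+) exited due to signal (\d+)"
)


def count_child_exits(lines):
    """Return a Counter mapping signal number -> number of child exits."""
    counts = Counter()
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            counts[int(match.group(2))] += 1
    return counts


# Sample lines in the format seen in this report (joined onto single lines).
SAMPLE = [
    "Squid Parent: child process 17865 exited due to signal 6",
    "Jul 11 15:55:21 lnxwc2 squid[17155]: Squid Parent: child process 17158 exited due to signal 6",
    "Jul 11 15:55:24 lnxwc2 squid[17155]: Squid Parent: child process 26795 started",
]

if __name__ == "__main__":
    print(count_child_exits(SAMPLE))
```

To run it against a live system, pass an open file instead of SAMPLE, for example count_child_exits(open("/var/log/messages")).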

Comment 1 Will Bending 2005-06-16 19:29:03 UTC
Created attachment 115559 [details]
cache.log snippet

Comment 2 Jay Fenlason 2005-06-16 20:37:21 UTC
Please attach your squid.conf file so I can try to reproduce the problem here. 

Comment 3 Will Bending 2005-06-16 21:43:58 UTC
Created attachment 115578 [details]
squid configuration file

Attached squid.conf as requested

Comment 4 Michal Kuratczyk 2005-06-27 10:50:48 UTC
I have the same problem on i686 (Pentium4). Using strace I found the following
error:

18082 write(2, "(squid): rfc1035.c:417: rfc1035RRUnpack: Assertion `(*off) <=
sz\' failed.\n", 74) = 74

I downgraded to 2.5.STABLE6-3.4E.5 and squid works ok now.

Comment 5 Maciej Żenczykowski 2005-06-29 13:59:48 UTC
I'm having the exact same problem.

Here's an example DNS reply packet which squid received:

"\234\37\201\200\0\1\0\1\0\0\0\0\003230\00264\00274\003217\7in-addr\4arpa\0\0\f\0\1\300\f\0\f\0\1\0\0\0\0\0\16\1o\7interia\2pl\0"

(retrieved from a strace) before quitting (assert failure and ABRT) with the
above message (also from strace; why isn't this logged to any file?)
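
For anyone wanting to inspect the captured reply, the sketch below decodes it in Python. It is not squid's rfc1035.c and makes no claim about where the `(*off) <= sz` assertion from comment 4 goes wrong; the helper names read_name and parse_reply are hypothetical, and the parser assumes exactly one question and one PTR answer, as in this capture. Under this reading the answer's RDLENGTH matches the bytes remaining in the packet, which suggests the reply itself is well-formed.

```python
import struct

# The reply exactly as strace printed it (octal escapes), split for readability.
PACKET = (
    b"\234\37\201\200\0\1\0\1\0\0\0\0"
    b"\003230\00264\00274\003217\7in-addr\4arpa\0\0\f\0\1"
    b"\300\f\0\f\0\1\0\0\0\0\0\16\1o\7interia\2pl\0"
)


def read_name(buf, off):
    """Decode a DNS name at off, following compression pointers.

    Returns (dotted name, offset of the first byte after the name).
    """
    labels = []
    resume = None  # where parsing continues once a pointer has been followed
    hops = 0
    while True:
        if off >= len(buf):
            raise ValueError("name runs past the end of the packet")
        length = buf[off]
        if length & 0xC0 == 0xC0:  # two-byte compression pointer
            if off + 2 > len(buf):
                raise ValueError("truncated compression pointer")
            if resume is None:
                resume = off + 2
            off = ((length & 0x3F) << 8) | buf[off + 1]
            hops += 1
            if hops > 16:
                raise ValueError("too many compression pointers")
            continue
        off += 1
        if length == 0:  # the root label terminates the name
            break
        if off + length > len(buf):
            raise ValueError("label runs past the end of the packet")
        labels.append(buf[off:off + length].decode("ascii"))
        off += length
    return ".".join(labels), (off if resume is None else resume)


def parse_reply(buf):
    """Parse a reply holding one question and one PTR answer."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack_from("!6H", buf, 0)
    qname, off = read_name(buf, 12)
    qtype, _qclass = struct.unpack_from("!2H", buf, off)
    off += 4
    _owner, off = read_name(buf, off)
    rtype, _rclass, ttl, rdlength = struct.unpack_from("!HHIH", buf, off)
    off += 10  # type (2) + class (2) + TTL (4) + RDLENGTH (2)
    if off + rdlength > len(buf):
        raise ValueError("RDATA runs past the end of the packet")
    target, _ = read_name(buf, off)
    return {
        "qdcount": qdcount, "ancount": ancount,
        "qname": qname, "qtype": qtype,
        "rtype": rtype, "ttl": ttl, "rdlength": rdlength,
        "target": target,
        "rdata_end": off + rdlength, "size": len(buf),
    }


if __name__ == "__main__":
    print(parse_reply(PACKET))
```

The bounds checks before every read are the point of the sketch: each offset is compared against the packet size before it is used, so a malformed packet raises an error instead of reading past the buffer.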


Comment 6 Martin Stransky 2005-07-08 09:32:56 UTC
Could you please check STABLE6-3.4E.6 version? src.rpm is here:

http://people.redhat.com/stransky/squid/squid-2.5.STABLE6-3.4E.6.src.rpm



Comment 7 Will Bending 2005-07-11 21:08:45 UTC
I upgraded to STABLE6-3.4E.6 as requested on both my sibling caches. 
It  is not crashing every few minutes anymore, but I am seeing the
following logged to my syslog on one of the machines:

Jul 11 15:55:19 lnxwc2 (squid): xstrdup: tried to dup a NULL pointer!
Jul 11 15:55:21 lnxwc2 squid[17155]: Squid Parent: child process 17158
exited due to signal 6
Jul 11 15:55:24 lnxwc2 squid[17155]: Squid Parent: child process 26795
started

*Note I did not re-init my disk caches since I didn't see anything
that looked like a significant problem in the cache logs after the
upgrade.

Comment 8 Knut Rauscher 2005-07-14 10:00:48 UTC
I'm having the exact same problem from squid-2.5.STABLE3-6.3E.9.i386.rpm to
squid-2.5.STABLE3-6.3E.13.i386.rpm (RHEL3)

Comment 9 Martin Stransky 2005-07-20 07:24:09 UTC
(In reply to comment #7)
> I upgraded to STABLE6-3.4E.6 as requested on both my sibling caches. 
> It  is not crashing every few minutes anymore, but I am seeing the
> following logged to my syslog on one of the machines:
>
> Jul 11 15:55:19 lnxwc2 (squid): xstrdup: tried to dup a NULL pointer!
> Jul 11 15:55:21 lnxwc2 squid[17155]: Squid Parent: child process 17158
> exited due to signal 6
> Jul 11 15:55:24 lnxwc2 squid[17155]: Squid Parent: child process 26795
> started
> 
> *Note I did not re-init my disk caches since I didn't see anything
> that looked like a significant problem in the cache logs after the
> upgrade.

Could you check it with strace?

Comment 10 Martin Stransky 2005-07-20 07:25:47 UTC
A bug for the assertion is here:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=163052

Comment 11 Martin Stransky 2005-07-20 08:51:55 UTC
Here is a new testing package:

http://people.redhat.com/stransky/squid/squid-2.5.STABLE6-3.4E.6.test2.src.rpm

Could you check it? I removed two patches from STABLE6-3.4E.6...

Comment 12 Will Bending 2005-07-20 19:00:21 UTC
Created attachment 116994 [details]
snippet from strace of xstrdup() fatal: tried to dup a null pointer in squid child

This is a snippet from the strace on a squid child that aborted due to the
fatal xstrdup() call.

Comment 13 Will Bending 2005-07-20 19:17:37 UTC
Created attachment 116995 [details]
Full strace from xstrdup() null pointer (bzipped because of large file size).

Full strace of the xstrdup() fatal call.  File is big (15MB) so I have
compressed it with bzip2.

Comment 14 Will Bending 2005-07-21 20:07:38 UTC
(In reply to comment #11)
> Here is a new testing package:
> 
> http://people.redhat.com/stransky/squid/squid-2.5.STABLE6-3.4E.6.test2.src.rpm
> 
> Could you check it? I removed two patches from STABLE6-3.4E.6...

I rolled both boxes to this version and the xstrdup() error persists on both. 
It's not enough to crash the squid parent, but aborts the children several times
a day.

Comment 15 Martin Stransky 2005-07-22 08:43:57 UTC
Thanks for testing. Could you please check this package?

http://people.redhat.com/stransky/squid/squid-2.5.STABLE10-2.src.rpm

It's a new package with all upstream fixes...

Comment 16 Will Bending 2005-07-23 18:15:34 UTC
Upgraded to squid-2.5.STABLE10-2.src.rpm which solves the xstrdup() problem and
squid seems stable.  I started getting reports of random access denied issues
from users.  Tested this with several pages and squid would refuse pages with
access denied by cache messages at random intervals.  Refreshing the page five
or six times would eventually render the page correctly.

Rolled back to the test2 version for now.

Comment 17 Martin Stransky 2005-08-24 13:12:13 UTC
Could you please check the original squid-2.5.STABLE6-3.4E.9:7.x86_64? Please
add "debug_options ALL,9" to /etc/squid/squid.conf, restart squid and attach the
/var/log/squid/cache.log file after some crashes. But be careful, this file may
be very big...

Comment 18 Will Bending 2005-08-24 20:54:18 UTC
(In reply to comment #17)
> Could you please check the original squid-2.5.STABLE6-3.4E.9:7.x86_64? Add
> please "debug_options ALL,9" to /etc/squid/squid.conf, restart it and attach
> /var/log/squid/cache.log file after some crashes. But be careful, this file may
> be very big...

Sure.  I will have to hold off for about a week however. We are right in the
middle of registration and this is the first week of classes, so I'll need to
keep things stable for now until we're past all the typical start of semester
tech issues.  I'll touch base in a week or so with some debugging information. 
Thanks --will

Comment 19 Martin Stransky 2005-08-25 08:49:51 UTC
Oh, sure :-) Thanks for your help.

Comment 20 rambler8 2005-09-08 15:56:25 UTC
I'm requesting the priority of this bug be increased to high because it is 
almost 3 months old and is now affecting our organization on an almost daily 
basis where:
1. squid crashes
2. entries in the cache.log file match those in attachment id 115559
3. no one is able to access the internet through squid until an administrator 
deletes and recreates the swap directories and restarts squid.

The problem is also not limited to x86_64 hardware; we are using i686.

Executives and admins at our organization are becoming increasingly annoyed due 
to the issue and the lack of prompt response from Red Hat, and it is creating an 
unfavorable opinion of Red Hat and Red Hat Enterprise Linux within our 
organization.

Comment 21 Bob Gorman 2005-09-12 18:05:08 UTC
This sounds similar to
  https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=165367



Comment 22 Martin Stransky 2005-09-13 12:58:18 UTC
Could you please create and attach a stack trace when squid crashes? A how-to is
here - http://people.redhat.com/stransky/squid.html

Comment 23 Will Bending 2005-09-13 15:05:20 UTC
(In reply to comment #22)
> Could you please create and attach a stack trace when squid crashes? A how-to is
> here - http://people.redhat.com/stransky/squid.html

Martin-

I've installed squid-2.5.STABLE6-3.4E.12.dumps.src.rpm on one of my caches. 
Will advise when I get a core and stack trace.

--will

Comment 24 Martin Stransky 2005-09-13 15:37:41 UTC
Thanks. By the way, I slightly updated the how-to page: for testing, squid needs
to be run as "# /usr/sbin/squid -NCDd1", not with "service squid start", because
the latter performs some cleanup before shutdown.

Comment 25 Will Bending 2005-09-13 16:51:35 UTC
Created attachment 118763 [details]
output of /usr/sbin/squid -NCDd 1

Comment 26 Will Bending 2005-09-13 16:54:04 UTC
(In reply to comment #24)
> Thanks. By the way, I slightly updated the how-to page: for testing, squid needs
> to be run as "# /usr/sbin/squid -NCDd1", not with "service squid start", because
> the latter performs some cleanup before shutdown.

I'm running from the shell as described. ulimit is unlimited, and I issued
ulimit -c unlimited as well.  Still not seeing cores after several tries.  Looks
like we're aborting before it gets a chance to crash.

See attachment with id=118763
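
For context, signal 6 on Linux is SIGABRT, the signal raised by abort() and therefore by a failed assert(). Whether a core file is written afterwards depends, among other things, on the process's core-size limit (the value set by ulimit -c). The sketch below illustrates both points in Python; it is an illustration only and not something used in this report. The names run_aborting_child and core_limit are hypothetical, and the child turns its own core dumps off purely to keep the demo from leaving files behind.

```python
import resource
import signal
import subprocess
import sys

# Child program: disable core dumps for itself (only to keep this demo tidy),
# then call abort(), which raises SIGABRT.
CHILD_CODE = (
    "import os, resource\n"
    "hard = resource.getrlimit(resource.RLIMIT_CORE)[1]\n"
    "resource.setrlimit(resource.RLIMIT_CORE, (0, hard))\n"
    "os.abort()\n"
)


def run_aborting_child():
    """Run a child that aborts; return the number of the signal that killed it."""
    proc = subprocess.run([sys.executable, "-c", CHILD_CODE])
    # subprocess reports death-by-signal as a negative return code.
    return -proc.returncode


def core_limit():
    """Return the (soft, hard) core-file size limits of this process."""
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    print("child killed by signal", run_aborting_child())
    print("core limits (soft, hard):", core_limit())
```

A soft limit of 0 means no core file is written; resource.RLIM_INFINITY corresponds to ulimit -c unlimited.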

Comment 27 Martin Stransky 2005-09-14 09:31:16 UTC
(In reply to comment #26)
> I'm running from the shell as described. ulimit is unlimited.  issued ulimit -c
> unlimited as well.  Still not seeing cores after several tries.  Looks like
> we're aborting before it gets a chance to crash.
> 
> See attachment with id=118763

Great, it looks like a dupe of this issue:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=163052

People report some problems with the upstream patch, so I'll make a package with
this patch and publish it for testing...

Comment 28 David Bestor 2005-09-14 13:16:29 UTC
Want to make these go away?:

> Jul 11 15:55:19 lnxwc2 (squid): xstrdup: tried to dup a NULL pointer!

Compare this with the one in the rpm. They are not the same...
http://www.squid-cache.org/Versions/v2/2.5/bugs/squid-2.5.STABLE6-concurrent_dns_lookups.patch

Fixed mine 2 weeks ago and have not seen xstrdup error since.


Comment 29 Martin Stransky 2005-09-14 13:36:29 UTC
(In reply to comment #28)
This thread is a little crowded so you can open a new bug for this issue if you
have this problem.

Comment 30 Vasiliy Kotikov 2005-09-16 20:16:40 UTC
I have up2dated to new release of squid 3.4E.11.i386.rpm and got squid crashing.
Config file and debug (ALL,9) log file (cache.log ~ 2Gb) gzipped and can be
found at http://ftp.mstuca.ru/uploads/squid/cache.log.bz2
http://ftp.mstuca.ru/uploads/squid/squid.conf


Comment 31 Martin Stransky 2005-09-21 07:49:16 UTC
(In reply to comment #30)
> I have up2dated to new release of squid 3.4E.11.i386.rpm and got squid crashing.
> Config file and debug (ALL,9) log file (cache.log ~ 2Gb) gzipped and can be
> found at http://ftp.mstuca.ru/uploads/squid/cache.log.bz2
> http://ftp.mstuca.ru/uploads/squid/squid.conf

Could you provide the cache.log file generated with debug (ALL,1) too?



Comment 32 Martin Stransky 2005-09-30 08:54:25 UTC
Due to PIE gdb can't read symbols from the debug package. If your squid crashes
(and it isn't a problem with assertion) and you can't obtain a stack-trace there
are new packages which aren't compiled with PIE:

http://people.redhat.com/stransky/squid.html

Comment 33 Martin Stransky 2005-10-06 08:38:38 UTC
The new testing binaries (19.assert) are here:

http://people.redhat.com/stransky/debug/compile/

Comment 34 rambler8 2005-10-06 13:41:06 UTC
Created attachment 119670 [details]
squid.2.5.Stable11 squid.spec with updated build and config patches

IMHO, Red Hat needs to release a clean rebuild of squid rpms from the upstream
2.5.STABLE11 source. 

After being totally frustrated by Red Hat's poor response in fixing this issue,
I decided to do it myself. 
1. downloaded and installed the squid-2.5.STABLE6-3.src.rpm
2. downloaded 2.5.STABLE11 source bz2 archive to /usr/src/redhat/SOURCES
(http://www.squid-cache.org/Versions/v2/2.5/squid-2.5.STABLE11.tar.bz2)
3. Removed all back ported patch references from the squid.spec file and added
entry for the new /etc/squid/cachemgr.conf file (see attachment)
4. Rebuilt the redhat build.patch and config.patch files used to configure the
Makefiles so they would apply to the new version
5. Built the RPM

I've been running this custom package on 3 machines for a week and haven't seen
this problem or quite a few other minor issues, nor have I encountered any
incompatibility issues.

If Red Hat wants to keep me and its other customers, they need to provide the
quality and support we paid for when we paid for a RHEL subscription. If I'm going
to have to do this much work to diagnose and fix a buggy package, I might as
well use a free distribution and apply updates by compiling the new version of
the author's source code.

Interestingly enough, this is the same way the STABLE11-2 package for FC4 is
created, except that FC4 also contains the delay pool patch and a few other
patches targeted for STABLE12.

So how about it RedHat? Please, give us the updates and support we paid for!

Comment 35 Martin Stransky 2005-10-06 15:11:34 UTC
I'm going to propose this for RHEL-3 and RHEL-4.

Comment 36 Bob Gorman 2005-10-06 19:09:37 UTC
I would also appreciate that, as it seems that a different approach to resolving 
this issue is warranted.  I'm willing to test a release candidate under RHEL3.


Comment 37 Martin Stransky 2005-10-07 08:53:38 UTC
Okay, I'll prepare the upstream packages for testing.

Comment 38 Martin Stransky 2005-10-13 14:00:22 UTC
There are requests to update to the current upstream release, so you can add
your comments to them. (Bug 170390, Bug 170392)

Comment 40 Martin Stransky 2005-10-18 13:49:06 UTC
Page with packages from upstream and hopefully fixed packages for RHEL3/4 is here:

http://people.redhat.com/stransky/squid/

Comment 46 Dave R. 2005-10-24 15:04:45 UTC
I am using squid-2.5.STABLE6-3.4E.11. The following URL consistently crashes
squid with the same error referred to in the summary:
http://24.141.233.85/vince%5CIMG_9065.JPG

Comment 47 Vasiliy Kotikov 2005-10-25 05:43:34 UTC
The version squid-2.5.STABLE3-6.3E.14.RC1 was rebuilt and has been working 
without crashing since the 18th of October. 
Thank You

Comment 48 Will Bending 2005-10-25 15:41:35 UTC
I too have been running squid-2.5.STABLE3-6.3E.14.RC1 on both my production
caches since 10/18/2005 with great success.  No more crashing and no more signal
6 exits.  

Dave R: Try the Release candidate packages on Martin's site.  They are working
fine for that URL you posted.

Thanks everybody :)

--will

Comment 49 Bob Gorman 2005-10-25 17:54:47 UTC
squid-2.5.STABLE3-6.3E.14.RC1 is for RHEL3.
squid-2.5.STABLE6-3.4E.11.RC1 is for RHEL4.

This bug is against RHEL4.

Are you guys running the RHEL3 version of squid under RHEL4?


Comment 50 Will Bending 2005-10-25 18:10:46 UTC
(In reply to comment #49)
> squid-2.5.STABLE3-6.3E.14.RC1 is for RHEL3.
> squid-2.5.STABLE6-3.4E.11.RC1 is for RHEL4.
> 
> This bug is against RHEL4.
> 
> Are you guys running the RHEL3 version of squid under RHEL4?
> 

I'm not, I just can't copy/paste :)
Please allow me to correct my last comment (#48).
I'm running squid-2.5.STABLE6-3.4E.11.RC1 on RHEL4 with success.  The version I
posted in comment #48 is incorrect.

Sorry for any confusion.

Comment 52 Bob Gorman 2005-10-28 16:47:27 UTC
The changelog from squid-2.5.STABLE6-3.4E.11.RC1 simply states:

- fix for #160704

What is the actual patch, Martin?



Comment 53 Martin Stransky 2005-10-31 08:00:50 UTC
Created attachment 120555 [details]
Patch for RHEL4 is here

Comment 56 Martin Stransky 2005-11-15 12:27:44 UTC
The new release-candidate packages for RHEL3/4 are available here:

http://people.redhat.com/stransky/squid/


Comment 57 Martin Stransky 2005-11-22 08:54:23 UTC
*** Bug 171169 has been marked as a duplicate of this bug. ***

Comment 63 Red Hat Bugzilla 2006-03-07 18:48:36 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0052.html


Comment 70 Issue Tracker 2007-06-19 08:57:08 UTC
Internal Status set to 'Resolved'
Status set to: Closed by Client
Resolution set to: 'RHEL 4 U4'

This event sent from IssueTracker by uthomas 
 issue 78935