Bug 160704
Summary: | squid child processes exit with signal 6.. squid crashes | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Will Bending <will.bending> |
Component: | squid | Assignee: | Martin Stransky <stransky> |
Status: | CLOSED ERRATA | QA Contact: | |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 4.0 | CC: | bob, jturner, m.kuratczyk, poelstra, vasiliy.kotikov, zenczykowski |
Target Milestone: | --- | Keywords: | Security |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | RHSA-2006-0052 | Doc Type: | Bug Fix |
Last Closed: | 2006-03-07 18:48:35 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 168429 | ||
Attachments: |
Description
Will Bending
2005-06-16 19:29:03 UTC
Created attachment 115559 [details]
cache.log snippet
Please attach your squid.conf file so I can try to reproduce the problem here.

Created attachment 115578 [details]
squid configuration file
Attached squid.conf as requested

I have the same problem on i686 (Pentium 4). Using strace I found the following error:

  18082 write(2, "(squid): rfc1035.c:417: rfc1035RRUnpack: Assertion `(*off) <= sz\' failed.\n", 74) = 74

I downgraded to 2.5.STABLE6-3.4E.5 and squid works OK now.

I'm having the exact same problem. Here's an example DNS reply packet which squid received (retrieved from an strace):

  "\234\37\201\200\0\1\0\1\0\0\0\0\003230\00264\00274\003217\7in-addr\4arpa\0\0\f\0\1\300\f\0\f\0\1\0\0\0\0\0\16\1o\7interia\2pl\0"

Squid then quit (assertion failure and ABRT) with the above message, which I also got from strace. Why isn't this logged to any file?

Could you please check the STABLE6-3.4E.6 version? The src.rpm is here: http://people.redhat.com/stransky/squid/squid-2.5.STABLE6-3.4E.6.src.rpm

I upgraded to STABLE6-3.4E.6 as requested on both my sibling caches. It is not crashing every few minutes anymore, but I am seeing the following logged to my syslog on one of the machines:

  Jul 11 15:55:19 lnxwc2 (squid): xstrdup: tried to dup a NULL pointer!
  Jul 11 15:55:21 lnxwc2 squid[17155]: Squid Parent: child process 17158 exited due to signal 6
  Jul 11 15:55:24 lnxwc2 squid[17155]: Squid Parent: child process 26795 started

*Note: I did not re-init my disk caches, since I didn't see anything that looked like a significant problem in the cache logs after the upgrade.

I'm having the exact same problem from squid-2.5.STABLE3-6.3E.9.i386.rpm to squid-2.5.STABLE3-6.3E.13.i386.rpm (RHEL3).

(In reply to comment #7)
> I upgraded to STABLE6-3.4E.6 as requested on both my sibling caches.
> It is not crashing every few minutes anymore, but I am seeing the
> following logged to my syslog on one of the machines:
>
> Jul 11 15:55:19 lnxwc2 (squid): xstrdup: tried to dup a NULL pointer!
> Jul 11 15:55:21 lnxwc2 squid[17155]: Squid Parent: child process 17158
> exited due to signal 6
> Jul 11 15:55:24 lnxwc2 squid[17155]: Squid Parent: child process 26795
> started
>
> *Note I did not re-init my disk caches since I didn't see anything
> that looked like a significant problem in the cache logs after the
> upgrade.

Could you check it with strace?

A bug for the assertion is here: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=163052

Here is a new testing package: http://people.redhat.com/stransky/squid/squid-2.5.STABLE6-3.4E.6.test2.src.rpm

Could you check it? I removed two patches from STABLE6-3.4E.6...

Created attachment 116994 [details]
snippet from strace of xstrdup() fatal: tried to dup a null pointer in squid child
This is a snippet from the strace on a squid child that aborted due to the
fatal xstrdup() call.
Created attachment 116995 [details]
Full strace from xstrdup() null pointer (bzipped because of large file size).
Full strace of the xstrdup() fatal call. File is big (15MB) so I have
compressed it with bzip2.
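
The DNS reply quoted earlier in this thread can be decoded by hand to see what the resolver was asked to unpack. The following is a minimal sketch, not Squid's rfc1035.c: the function names `read_name` and `decode_reply` are hypothetical, the parser assumes exactly one question, and it only interprets PTR rdata. It shows that a straightforward RFC 1035 reading consumes the packet with the offset ending exactly at the buffer size, so the `(*off) <= sz` condition holds here. That says nothing definitive about why the assertion fired in the affected builds.

```python
import struct

# DNS reply exactly as posted from strace (octal escapes kept as-is).
PACKET = (
    b"\234\37\201\200\0\1\0\1\0\0\0\0"
    b"\003230\00264\00274\003217\7in-addr\4arpa\0\0\f\0\1"
    b"\300\f\0\f\0\1\0\0\0\0\0\16\1o\7interia\2pl\0"
)

TYPE_PTR = 12


def read_name(buf, off, depth=0):
    """Return (dotted_name, next_offset), following compression pointers."""
    labels = []
    while True:
        if off >= len(buf):
            raise ValueError("name runs past end of packet")
        length = buf[off]
        if (length & 0xC0) == 0xC0:  # two-byte compression pointer
            if off + 2 > len(buf) or depth > 8:
                raise ValueError("bad compression pointer")
            target = ((length & 0x3F) << 8) | buf[off + 1]
            suffix, _ = read_name(buf, target, depth + 1)
            if suffix:
                labels.append(suffix)
            return ".".join(labels), off + 2
        off += 1
        if length == 0:  # root label terminates the name
            return ".".join(labels), off
        if off + length > len(buf):
            raise ValueError("label runs past end of packet")
        labels.append(buf[off:off + length].decode("ascii"))
        off += length


def decode_reply(buf):
    """Decode header, one question, and all answer records (sketch only)."""
    _ident, _flags, _qd, an, _ns, _ar = struct.unpack_from("!6H", buf, 0)
    off = 12
    qname, off = read_name(buf, off)
    qtype, _qclass = struct.unpack_from("!2H", buf, off)
    off += 4
    answers = []
    for _ in range(an):
        name, off = read_name(buf, off)
        rtype, _rclass, ttl, rdlen = struct.unpack_from("!HHIH", buf, off)
        off += 10
        if off + rdlen > len(buf):
            raise ValueError("rdata runs past end of packet")
        if rtype == TYPE_PTR:
            rdata = read_name(buf, off)[0]
        else:
            rdata = buf[off:off + rdlen]
        off += rdlen
        answers.append((name, rtype, ttl, rdata))
    return qname, qtype, answers, off


if __name__ == "__main__":
    qname, qtype, answers, off = decode_reply(PACKET)
    print(qname, qtype, answers, off, len(PACKET))
```

Read this way, the packet is a PTR query for 230.64.74.217.in-addr.arpa with a single PTR answer whose TTL is 0.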

(In reply to comment #11)
> Here is a new testing package:
>
> http://people.redhat.com/stransky/squid/squid-2.5.STABLE6-3.4E.6.test2.src.rpm
>
> Could you check it? I removed two patches from STABLE6-3.4E.6...

I rolled both boxes to this version and the xstrdup() error persists on both. It's not enough to crash the squid parent, but it aborts the children several times a day.

Thanks for testing. Could you please check this package? http://people.redhat.com/stransky/squid/squid-2.5.STABLE10-2.src.rpm It's a new package with all upstream fixes...

I upgraded to squid-2.5.STABLE10-2.src.rpm, which solves the xstrdup() problem, and squid seems stable. However, I started getting reports of random access-denied issues from users. I tested this with several pages, and squid would refuse pages with "access denied by cache" messages at random intervals. Refreshing the page five or six times would eventually render it correctly. Rolled back to the test2 version for now.

Could you please check the original squid-2.5.STABLE6-3.4E.9:7.x86_64? Please add "debug_options ALL,9" to /etc/squid/squid.conf, restart squid, and attach the /var/log/squid/cache.log file after some crashes. But be careful, this file may be very big...

(In reply to comment #17)
> Could you please check the original squid-2.5.STABLE6-3.4E.9:7.x86_64? Add
> please "debug_options ALL,9" to /etc/squid/squid.conf, restart it and attach
> /var/log/squid/cache.log file after some crashes. But be careful, this file may
> be very big...

Sure. I will have to hold off for about a week, however. We are right in the middle of registration and this is the first week of classes, so I'll need to keep things stable until we're past all the typical start-of-semester tech issues. I'll touch base in a week or so with some debugging information.

Thanks
--will

Oh, sure :-) Thanks for your help.
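
To put a number on how often the children abort, the syslog lines quoted in this thread can be tallied per day. This is a rough sketch: the regular expression is written only against the line format shown in this report, the function name `count_signal_exits` is hypothetical, and the sample input is the three lines posted earlier rather than a real log file.

```python
import re
from collections import Counter

# The three syslog lines quoted in this report, used as sample input.
SAMPLE = """\
Jul 11 15:55:19 lnxwc2 (squid): xstrdup: tried to dup a NULL pointer!
Jul 11 15:55:21 lnxwc2 squid[17155]: Squid Parent: child process 17158 exited due to signal 6
Jul 11 15:55:24 lnxwc2 squid[17155]: Squid Parent: child process 26795 started
"""

# Groups: day ("Jul 11"), child pid, signal number.
EXIT_RE = re.compile(
    r"^(\w{3}\s+\d+) \S+ \S+ squid\[\d+\]: "
    r"Squid Parent: child process (\d+) exited due to signal (\d+)"
)


def count_signal_exits(lines):
    """Count child exits keyed by (day, signal) across an iterable of lines."""
    counts = Counter()
    for line in lines:
        match = EXIT_RE.match(line)
        if match:
            day, _pid, signal = match.groups()
            counts[(day, int(signal))] += 1
    return counts


if __name__ == "__main__":
    for (day, signal), n in sorted(count_signal_exits(SAMPLE.splitlines()).items()):
        print(f"{day}: {n} child exit(s) on signal {signal}")
```

Passing an open log file instead of `SAMPLE.splitlines()` should work the same way, since the function only iterates over lines.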

I'm requesting that the priority of this bug be increased to high, because it is almost 3 months old and is now affecting our organization on an almost daily basis:
1. squid crashes
2. entries in the cache.log file match those in attachment id 115559
3. no one is able to access the internet through squid until an administrator deletes and recreates the swap directories and restarts squid.
The problem is also not limited to x86_64 hardware; we are using i686. Executives and admins at our organization are becoming increasingly annoyed by the issue and the lack of a prompt response from Red Hat, and it is creating an unfavorable opinion of Red Hat and Red Hat Enterprise Linux within our organization.

This sounds similar to https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=165367

Could you please create and attach a stack trace when squid crashes? The how-to is here: http://people.redhat.com/stransky/squid.html

(In reply to comment #22)
> Could you please create and attach a strack trace when squid crashes? How to is
> here - http://people.redhat.com/stransky/squid.html

Martin-
I've installed squid-2.5.STABLE6-3.4E.12.dumps.src.rpm on one of my caches. Will advise when I get a core and stack trace.
--will

Thanks. By the way, I slightly updated the how-to page: for this test squid needs to be run as "#/usr/sbin/squid -NCDd1", not with "service squid start", because the latter performs some cleanup before shutdown.

Created attachment 118763 [details]
output of /usr/sbin/squid -NCDd 1
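
When trying to capture a core from the foreground run discussed in this thread, it can help to confirm the core-file size limit that the squid process will inherit. The sketch below uses Python's standard `resource` module to raise the soft limit to the hard limit before a child is started. The command list repeats the invocation quoted in this thread, while `enable_core_dumps` and `run_squid_foreground` are hypothetical helper names. Whether a core is actually written may still depend on how the process exits and on system settings this report does not describe.

```python
import resource
import subprocess

# Foreground invocation quoted in this thread.
SQUID_CMD = ["/usr/sbin/squid", "-NCDd1"]


def enable_core_dumps():
    """Raise the soft core-size limit to the hard limit; return the new pair."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def describe(limit):
    """Render a limit value the way `ulimit -c` would label it."""
    return "unlimited" if limit == resource.RLIM_INFINITY else f"{limit} bytes"


def run_squid_foreground():
    """Start squid with the raised limit (child processes inherit rlimits)."""
    soft, hard = enable_core_dumps()
    print(f"core limit: soft={describe(soft)} hard={describe(hard)}")
    return subprocess.call(SQUID_CMD)
```

Calling `run_squid_foreground()` is left to the reader, since it starts a real squid process.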

(In reply to comment #24)
> Thanks. btw. I slightly updated how-to page, squid for test needs to be run as
> "#/usr/sbin/squid -NCDd1", not with "service squid start". It's because the
> latter perform some clean up before shutdown.

I'm running from the shell as described. ulimit is unlimited, and I issued ulimit -c unlimited as well. Still not seeing cores after several tries. Looks like we're aborting before it gets a chance to crash.

See attachment with id=118763

(In reply to comment #26)
> I'm running from the shell as described. ulimit is unlimited. issued ulimit -c
> unlimited as well. Still not seeing cores after several tries. Looks like
> we're aborting before it gets a chance to crash.
>
> See attachment with id=118763

Great, it looks like a dupe of this issue: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=163052

People report some problems with the upstream patch, so I'll make a package with this patch and publish it for testing...

Want to make these go away?
> Jul 11 15:55:19 lnxwc2 (squid): xstrdup: tried to dup a NULL pointer!

Compare this with the one in the rpm. They are not the same...
http://www.squid-cache.org/Versions/v2/2.5/bugs/squid-2.5.STABLE6-concurrent_dns_lookups.patch
I fixed mine 2 weeks ago and have not seen the xstrdup error since.

(In reply to comment #28)
This thread is a little crowded, so you can open a new bug for this issue if you have this problem.

I have up2dated to the new release of squid, 3.4E.11.i386.rpm, and squid is crashing. The config file and the compressed debug (ALL,9) log file (cache.log, ~2 GB) can be found at:
http://ftp.mstuca.ru/uploads/squid/cache.log.bz2
http://ftp.mstuca.ru/uploads/squid/squid.conf

(In reply to comment #30)
> I have up2dated to new release of squid 3.4E.11.i386.rpm and got squid craching.
> Config file and debug (ALL,9) log file (cache.log ~ 2Gb) gzipped and can be
> found at http://ftp.mstuca.ru/uploads/squid/cache.log.bz2
> http://ftp.mstuca.ru/uploads/squid/squid.conf

Could you provide the cache.log file generated with debug (ALL,1) too?

Due to PIE, gdb can't read symbols from the debug package. If your squid crashes (and it isn't a problem with the assertion) and you can't obtain a stack trace, there are new packages which aren't compiled with PIE: http://people.redhat.com/stransky/squid.html

The new testing binaries (19.assert) are here: http://people.redhat.com/stransky/debug/compile/

Created attachment 119670 [details]
squid.2.5.Stable11 squid.spec with updated build and config patches

IMHO, Red Hat needs to release a clean rebuild of the squid rpms from the upstream 2.5.STABLE11 source. After being totally frustrated by Red Hat's poor response in fixing this issue, I decided to do it myself:
1. Downloaded and installed the squid-2.5.STABLE6-3.src.rpm
2. Downloaded the 2.5.STABLE11 source bz2 archive to /usr/src/redhat/SOURCES (http://www.squid-cache.org/Versions/v2/2.5/squid-2.5.STABLE11.tar.bz2)
3. Removed all backported patch references from the squid.spec file and added an entry for the new /etc/squid/cachemgr.conf file (see attachment)
4. Rebuilt the Red Hat build.patch and config.patch files used to configure the Makefiles so they would apply to the new version
5. Built the RPM

I've been running this custom package on 3 machines for a week and haven't seen this problem or quite a few other minor issues, nor have I encountered any incompatibility issues.

If Red Hat wants to keep me and its other customers, they need to provide the quality and support we paid for when we paid for a RHEL subscription. If I'm going to have to do this much work to diagnose and fix a buggy package, I might as well use a free distribution and apply updates by compiling the new version of the author's source code.
Interestingly enough, this is the same way the STABLE11-2 package for FC4 is created, except FC4 also contains the delay pool patch and a few other patches targeted for STABLE12.

So how about it, Red Hat? Please, give us the updates and support we paid for!

I'm going to propose this for RHEL-3 and RHEL-4.

I would also appreciate that, as it seems that a different approach to resolving this issue is warranted. I'm willing to test a release candidate under RHEL3.

Okay, I'll prepare the upstream packages for testing.

There are requests for an update to current upstream, so you can write your comment there. (Bug 170390, Bug 170392)

A page with packages from upstream and hopefully fixed packages for RHEL3/4 is here: http://people.redhat.com/stransky/squid/

I am using squid-2.5.STABLE6-3.4E.11. The following URL consistently crashes squid with the same error referred to in the summary: http://24.141.233.85/vince%5CIMG_9065.JPG

The version squid-2.5.STABLE3-6.3E.14.RC1 was rebuilt and has been working without crashing since the 18th of October. Thank you.

I too have been running squid-2.5.STABLE3-6.3E.14.RC1 on both my production caches since 10/18/2005 with great success. No more crashing and no more signal 6 exits.

Dave R: Try the release candidate packages on Martin's site. They are working fine for that URL you posted.

Thanks everybody :)
--will

squid-2.5.STABLE3-6.3E.14.RC1 is for RHEL3.
squid-2.5.STABLE6-3.4E.11.RC1 is for RHEL4.

This bug is against RHEL4.

Are you guys running the RHEL3 version of squid under RHEL4?

(In reply to comment #49)
> squid-2.5.STABLE3-6.3E.14.RC1 is for RHEL3.
> squid-2.5.STABLE6-3.4E.11.RC1 is for RHEL4.
>
> This bug is against RHEL4.
>
> Are you guys running the RHEL3 version of squid under RHEL4?

I'm not, I just can't copy/paste :)

Please allow me to correct my last comment (#48). I'm running squid-2.5.STABLE6-3.4E.11.RC1 on RHEL4 with success. The version I posted in comment #48 is incorrect. Sorry for any confusion.

The changelog from squid-2.5.STABLE6-3.4E.11.RC1 simply states:
- fix for #160704
What is the actual patch, Martin?

Created attachment 120555 [details]
Patch for RHEL4 is here
The new release-candidate packages for RHEL3/4 are available here: http://people.redhat.com/stransky/squid/

*** Bug 171169 has been marked as a duplicate of this bug. ***

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2006-0052.html

Internal Status set to 'Resolved'
Status set to: Closed by Client
Resolution set to: 'RHEL 4 U4'
This event sent from IssueTracker by uthomas
issue 78935