Bug 165367
Summary: | Squid dies with signal 6 and restarts and dies ... | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 3 | Reporter: | Bob Gorman <bob> |
Component: | squid | Assignee: | Martin Stransky <stransky> |
Status: | CLOSED ERRATA | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | medium | | |
Version: | 3.0 | CC: | allen_armstrong, cthierer, dbarber, doug.irvine, illtud.daniel, norbert.warmuth, poelstra, vasiliy.kotikov |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | i386 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | RHSA-2006-0045 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2006-03-15 15:42:52 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 168424 | | |
Attachments: | | | |
Description
Bob Gorman
2005-08-08 16:41:29 UTC
Please add "debug_options ALL,9" to /etc/squid/squid.conf, restart squid, and attach the /var/log/squid/cache.log file after some crashes. But be careful, this file may be very big.

Probably in this section of squid.conf:

    # TAG: debug_options
    # Logging options are set as section,level where each source file
    # is assigned a unique section. Lower levels result in less
    # output, Full debugging (level 9) can result in a very large
    # log file, so be careful. The magic word "ALL" sets debugging
    # levels for all sections. We recommend normally running with
    # "ALL,1".
    # debug_options ALL,9

I will get this info to you in two weeks.

Could you please attach the squid config file? (/etc/squid/squid.conf)

I ran squid-2.5.STABLE3-6.3E.13.i386.rpm with "debug_options ALL,9" for 10 minutes and it generated a 2.1G cache.log file. It appeared to die differently this time, though. /var/log/messages contains:

    Sep 2 11:48:38 proxy2 squid[2091]: Squid Parent: child process 2093 started
    Sep 2 11:57:38 proxy2 squid[2091]: Squid Parent: child process 2093 exited due to signal 25
    Sep 2 11:57:41 proxy2 squid[2091]: Squid Parent: child process 2881 started
    Sep 2 11:57:41 proxy2 squid[2091]: Squid Parent: child process 2881 exited due to signal 25
    Sep 2 11:57:44 proxy2 squid[2091]: Squid Parent: child process 2885 started
    Sep 2 11:57:44 proxy2 squid[2091]: Squid Parent: child process 2885 exited due to signal 25
    Sep 2 11:57:47 proxy2 squid[2091]: Squid Parent: child process 2888 started
    Sep 2 11:57:47 proxy2 squid[2091]: Squid Parent: child process 2888 exited due to signal 25
    Sep 2 11:57:50 proxy2 squid[2091]: Squid Parent: child process 2901 started
    Sep 2 11:57:50 proxy2 squid[2091]: Squid Parent: child process 2901 exited due to signal 25
    Sep 2 11:57:53 proxy2 squid[2091]: Squid Parent: child process 2904 started
    Sep 2 11:57:53 proxy2 squid[2091]: Squid Parent: child process 2904 exited due to signal 25
    Sep 2 11:57:53 proxy2 squid[2091]: Exiting due to repeated, frequent failures

Which piece of the 2.1G cache.log file would you like?

Here are some relevant entries from the cache.log file from the original report of this issue.

    2005/07/28 13:00:01| ctx: enter level 0: 'http://servedby.advertising.com/site=707770/size=160600/bins=1/zpcd=/bbv_i=/bbv_ms=/bbv_noc=/bbv_o=/bnum=1577630670/'
    2005/07/28 13:00:01| WARNING: found whitespace in HTTP header name {xpires=Tuesday, 27-Jul-2010 17:00:01 GMT}
    2005/07/28 13:00:01| ctx: exit level 0
    2005/07/28 13:00:01| httpReadReply: Excess data from "GET http://servedby.advertising.com/site=707770/size=160600/bins=1/zpcd=/bbv_i=/bbv_ms=/bbv_noc=/bbv_o=/bnum=1577630670/"
    2005/07/28 13:00:10| Store rebuilding is 21.8% complete
    2005/07/28 13:00:18| ctx: enter level 0: 'http://servedby.advertising.com/site=695368/size=728090/bins=1/zpcd=/bbv_i=/bbv_ms=/bbv_noc=/bbv_o=/bnum=771829576/'
    2005/07/28 13:00:18| WARNING: found whitespace in HTTP header name {pires=Tuesday, 27-Jul-2010 17:00:18 GMT}
    2005/07/28 13:00:18| ctx: exit level 0
    2005/07/28 13:00:18| httpReadReply: Excess data from "GET http://servedby.advertising.com/site=695368/size=728090/bins=1/zpcd=/bbv_i=/bbv_ms=/bbv_noc=/bbv_o=/bnum=771829576/"
    2005/07/28 13:00:25| Store rebuilding is 42.2% complete
    2005/07/28 13:00:35| WARNING: newer swaplog entry for dirno 1, fileno 0000011B
    2005/07/28 13:00:37| ctx: enter level 0: 'http://servedby.advertising.com/site=695368/size=728090/bins=1/zpcd=/bbv_i=/bbv_ms=/bbv_noc=/bbv_o=/bnum=646692015/'
    2005/07/28 13:00:37| WARNING: found whitespace in HTTP header name {pires=Tuesday, 27-Jul-2010 17:00:37 GMT}
    2005/07/28 13:00:37| ctx: exit level 0
    2005/07/28 13:00:37| httpReadReply: Excess data from "GET http://servedby.advertising.com/site=695368/size=728090/bins=1/zpcd=/bbv_i=/bbv_ms=/bbv_noc=/bbv_o=/bnum=646692015/"
    ...
    2005/07/28 13:02:59| ctx: enter level 0: 'http://servedby.advertising.com/site=695349/size=120090/bnum=52939741/optn=16/zpcd=/bbv_i=/bbv_ms=/bbv_noc=/bbv_o=/ctrt=4?http://ar.atwola.com/redir/B0/-yifq8WIyhJOcILzHE3iEZEetnaCJ732SMpRX-mtJ_ssOxsoX4abjQ$$/'
    2005/07/28 13:02:59| WARNING: found whitespace in HTTP header name {Z4ot5yiLCto1iTmtSZszti4CwWuFrBXkw/DTQq8k8zY9D/nndno+z1xOspukMX J!; domain=.advertising.com; path=/; expires=Tuesday, 27-Jul-2010 17:02:59 GMT}
    2005/07/28 13:02:59| ctx: exit level 0
    2005/07/28 13:02:59| httpReadReply: Excess data from "GET http://servedby.advertising.com/site=695349/size=120090/bnum=52939741/optn=16/zpcd=/bbv_i=/bbv_ms=/bbv_noc=/bbv_o=/ctrt=4?http://ar.atwola.com/redir/B0/-yifq8WIyhJOcILzHE3iEZEetnaCJ732SMpRX-mtJ_ssOxsoX4abjQ$$/"

Hi, I came across this Bugzilla as I am having problems as well. I recently went through the archives of the squid-users list and posted there too. I have come to the conclusion that a recent patch to Red Hat 3/4 AS/ES seems to be causing squid to crash. As far as I can tell, it is happening to people on various kinds of machines.

Created attachment 118516 [details]
junk
squid.conf as requested.
Comment on attachment 118516 [details]
junk
Wrong file sent.
Comment on attachment 118516 [details]
junk
junk
Created attachment 118518 [details]
squid.conf
squid.conf as requested.
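
The /var/log/messages excerpt earlier in this report identifies each child exit only by signal number. As a rough reading aid, the sketch below tallies exits per signal and attaches the conventional Linux i386 signal name. The function names and the small lookup table are illustrative assumptions, not part of squid or of this report. On Linux i386, signal 6 is SIGABRT (raised by abort(), for instance after a failed assertion) and signal 25 is SIGXFSZ (file size limit exceeded). The latter would be consistent with the cache.log growing past 2 GB under ALL,9 debugging, though nothing in this thread confirms that cause.

```python
import re
from collections import Counter

# Assumption: signal numbers as defined on Linux i386, the platform in this
# report. Only a few signals are listed; this is not a complete table.
LINUX_I386_SIGNAL_NAMES = {
    6: "SIGABRT",   # abort(), e.g. after a failed assert()
    11: "SIGSEGV",  # invalid memory access
    25: "SIGXFSZ",  # file size limit exceeded
}

# Matches the "Squid Parent" lines quoted from /var/log/messages in this report.
EXIT_LINE = re.compile(
    r"Squid Parent: child process (?P<pid>\d+) exited due to signal (?P<sig>\d+)"
)


def tally_signal_exits(lines):
    """Return a Counter mapping signal number -> number of child exits."""
    counts = Counter()
    for line in lines:
        match = EXIT_LINE.search(line)
        if match:
            counts[int(match.group("sig"))] += 1
    return counts


def format_tally(counts):
    """Render the tally as text lines, most frequent signal first."""
    rows = []
    for sig, count in counts.most_common():
        name = LINUX_I386_SIGNAL_NAMES.get(sig, "unknown")
        rows.append(f"signal {sig} ({name}): {count} exit(s)")
    return rows


# A subset of the lines quoted from /var/log/messages in this report.
SAMPLE = [
    "Sep 2 11:48:38 proxy2 squid[2091]: Squid Parent: child process 2093 started",
    "Sep 2 11:57:38 proxy2 squid[2091]: Squid Parent: child process 2093 exited due to signal 25",
    "Sep 2 11:57:41 proxy2 squid[2091]: Squid Parent: child process 2881 started",
    "Sep 2 11:57:41 proxy2 squid[2091]: Squid Parent: child process 2881 exited due to signal 25",
    "Sep 2 11:57:53 proxy2 squid[2091]: Exiting due to repeated, frequent failures",
]

for row in format_tally(tally_signal_exits(SAMPLE)):
    print(row)
```

To run it against a real log, the lines of an open file can be passed to `tally_signal_exits` in place of `SAMPLE`.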
This sounds similar to https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=160704

Could you please create and attach a stack trace when squid crashes? The how-to is here: http://people.redhat.com/stransky/squid.html

Before I go through the effort, again, of reconfiguring and crashing a production server for diagnostic purposes, is there any other information I should gather at that time? Also, you asked for a debug_options_all_9 log dump. What should I do with this file? Do you really want it?

To get a core file for a stack trace I run squid from the command line (bash) like this, correct?:

    ulimit -c unlimited; /usr/sbin/squid -NCd1

I read the Squid Debug (stack trace) document you referenced. I prefer not to compile and install a different version of squid (squid-2.5.STABLE3-6.3E.15), as I believe that getting the stack trace for the specific RPM in question (squid-2.5.STABLE3-6.3E.13) is preferable, assuming it provides a suitable core file. Is that not okay? If I'm going to compile and install a custom, non-stock Red Hat RPM, I'll just get a current version (e.g. squid-2.5.STABLE10) and run with that. It can't be any worse and may just solve the problem. Your thoughts? I await your response. And, thanks!

The log dump file (generated with debug_options_all_9) can help me find where the problem is, so if you can't or don't want to create a stack trace, you can send me the log dump (and you could send me both, of course).

As for running squid (/usr/sbin/squid -NCd1), you're right; I'll modify the how-to to be clearer.

The version that I have on my pages (squid-2.5.STABLE3-6.3E.15) contains a patch for generating core dump files, because Linux doesn't create a core dump for a SUID process by default, so you can't obtain one from the current packages shipped with RHEL. Of course, you can do it, but it's more complicated: you have to run squid as user squid, not root, so you have to edit /etc/passwd (add a shell for the squid user), log in as user squid, and run the squid server. (And you probably have to add the -D option.)

Package squid-2.5.STABLE3-6.3E.15 contains additional fixes that are heading to RHEL-3/4 via a security advisory (they're the latest security issue for STABLE10).

Finally, if you are satisfied with the current stable upstream version (STABLE10, from Fedora or upstream), you can use it and close this report. According to RH policy I can't upgrade the whole package in RHEL, so I can patch only reported bugs, and lots of old minor bugs are unfixed.

Please do not close this report until a solution has been provided for RHEL3. There are multiple customers seeing this behavior, and using Fedora/upstream packages is not an acceptable solution on the whole.

Since you say I can't get a core file from a pre-packaged stock rpm, would getting a stack trace with this procedure be suitable?:

    # http://www.squid-cache.org/Doc/FAQ/FAQ-11.html#coredumps
    % gdb /path/to/squid
    handle SIGPIPE pass nostop noprint
    run -DNYCd3
    [wait for crash]
    backtrace
    quit

I can get you part of that log file, if I know what part you want, but I have no facility to deliver a 2.1GB file. Can you provide an FTP site to try and upload to?

I may well wait for the next official RHEL3 RPM release and try that, which sounds like it will be soon based upon this: http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=167413 Is there an estimated timeframe you can provide?

Here is another user with the issue: http://www.mail-archive.com/squid-users@squid-cache.org/msg31884.html

(In reply to comment #22)
> Here is another user with the issue:
> http://www.mail-archive.com/squid-users@squid-cache.org/msg31884.html
>

I know about it, I'm subscribed to squid-users and squid-devel. I think it's a dupe of some current open bugs.
(In reply to comment #21)
> Since you say I can't get a core file from a pre-packaged stock rpm, would
> getting a stack trace with this procedure be suitable?:
>
> # http://www.squid-cache.org/Doc/FAQ/FAQ-11.html#coredumps
> % gdb /path/to/squid
> handle SIGPIPE pass nostop noprint
> run -DNYCd3
> [wait for crash]
> backtrace
> quit

Sure, you can use it if it's better for you. Generally, you can use everything from http://www.squid-cache.org/Doc/FAQ/FAQ-11.html#coredumps if it works for you.

> I can get you part of that log file, if I know what part you want, but I have no
> facility to deliver a 2.1GB file. Can you provide an FTP site to try and upload
> to?

Could you compress this file and tell me its size? I'll look for some place.

> I may well wait for the next official RHEL3 RPM release and try that, which
> sounds like it will be soon based upon this:
> http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=167413
> Is there an estimated timeframe you can provide?

The packages planned for this issue are on my page: http://people.redhat.com/stransky/squid.html. But I'm almost sure they don't fix this issue.

At 03:45 AM 9/16/2005, Martin Stransky wrote:
> thanks for this file, I got it right so you can remove it from your site.
> Unfortunately, you have very loaded site so a crash isn't in this file
> (you can find it using string "Starting" when new squid process starts
> after crash). I'm afraid I need the stack-trace :-(

There should be some clues in there. I started a fresh log file, then started squid, then redirected users to use that server, waited until squid died (as indicated in /var/log/messages), redirected users away from the server, stopped squid, rotated the log, and saved the log. Squid ran for about 10 minutes. Below are the entries from /var/log/messages. Perhaps the timestamps can help you pinpoint the issue better. I did notice that these messages are different (see comment #5). Perhaps the problem has to do with log entries, and by changing the logging options we have moved the problem around. I will endeavor to get a stack trace.

There is a testing package for the assert issue: http://people.redhat.com/stransky/debug/squid-2.5.STABLE3-6.3E.16.assert.src.rpm Please let me know if it works or not.

Due to PIE, gdb can't read symbols from the debug package. If your squid crashes (and it isn't a problem with an assertion) and you can't obtain a stack trace, there are new packages which aren't compiled with PIE: http://people.redhat.com/stransky/squid.html Binaries are also available at: http://people.redhat.com/stransky/debug/compile/

The new testing binaries (17.assert) are here: http://people.redhat.com/stransky/debug/compile/ They should provide a correct backtrace.

The new testing binaries (19.assert) are here: http://people.redhat.com/stransky/debug/compile/

Created attachment 119676 [details]
squid-2.5.S3-6.3E.19a_logs.tgz
Ok, I tested squid-2.5.STABLE3-6.3E.19.assert.i386.rpm.
Squid still dies, only took about 10 minutes. Please find attachment
squid-2.5.S3-6.3E.19a_logs.tgz which contains:
/var/log/messages
/var/log/squid/cache.log
/var/log/squid/access.log
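
An earlier comment in this thread suggests locating a crash in cache.log by searching for the string "Starting", which appears when a new squid process starts after a crash. A minimal sketch of that search follows. It streams the input so that a multi-gigabyte log need not fit in memory, and it keeps only the last few lines before each match. The function names, the `context` parameter, and the sample log lines are illustrative assumptions; the exact wording of squid's startup line is not quoted in this thread, so the code matches only the substring the maintainer named. With ALL,9 debugging, other lines may contain that word too, so each match would still need a manual check.

```python
from collections import deque


def lines_before_restarts(lines, marker="Starting", context=3):
    """Yield (line_number, preceding_lines) for every line containing marker.

    Only the last `context` lines are held in memory at any time.
    """
    recent = deque(maxlen=context)
    for number, line in enumerate(lines, start=1):
        if marker in line:
            yield number, list(recent)
        recent.append(line.rstrip("\n"))


def scan_log(path, marker="Starting", context=200):
    """Print the lines that precede each restart marker in a log file."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, errors="replace") as handle:
        for number, before in lines_before_restarts(handle, marker, context):
            print(f"--- marker at line {number}; {len(before)} lines before it ---")
            for text in before:
                print(text)


# Made-up log lines for illustration only; they are not taken from the report.
SAMPLE_LOG = [
    "request A handled\n",
    "request B handled\n",
    "last line written before the crash\n",
    "Starting new squid process (hypothetical wording)\n",
    "request C handled\n",
]

for number, before in lines_before_restarts(SAMPLE_LOG, context=2):
    print(number, before)
```

For a log like the one attached here, `scan_log("/var/log/squid/cache.log")` would be the entry point; the 200-line default is an arbitrary choice and can be raised if the lines just before a restart turn out not to show the failure.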
Thanks.

I added one new fix to rfc1035.c from the latest upstream to the RHEL-3 packages, and they are here (20.assert): http://people.redhat.com/stransky/debug/compile/

RFC 1053 refers to the "Telnet X.3 PAD option". This is not an issue for me. I will now await a test release as mentioned in this thread: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=160704#c34

http://www.faqs.org/rfcs/rfc1035.html - DOMAIN NAMES - IMPLEMENTATION AND SPECIFICATION
Squid has a problem with parsing DNS requests (and asserts here), but there will be other problems, of course...

*** Bug 162025 has been marked as a duplicate of this bug. ***

*** Bug 116337 has been marked as a duplicate of this bug. ***

Excuse the RFC1035 to RFC1053 dyslexia. Will there be a new compiled version for testing as referenced here: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=160704#c37

It looks like RH managers don't want to release completely new packages for RHEL3/4; however, I opened two bugs for it (Bug 170390, Bug 170392), so you can write your requests there. Testing packages will be here after the weekend.

I don't see how an update from squid-2.5.3 to squid-2.5.11 is considered 'completely new'. When I look in the changelogs I see small changes, such as cosmetics, minor bugs and security issues, not large functional changes. Have these RH managers reviewed the changelogs? Now if there was an upgrade from squid-2.5 to squid-3.0 then 'completely new' would make more sense. Perhaps I am missing something. Please reconsider your position. The software you are shipping is not functional. Why pay for a Red Hat subscription?

A page with packages from upstream and hopefully fixed packages for RHEL3/4 is here: http://people.redhat.com/stransky/squid/

The new release-candidate packages for RHEL3/4 are available here: http://people.redhat.com/stransky/squid/

*** Bug 163052 has been marked as a duplicate of this bug. ***

*** Bug 178055 has been marked as a duplicate of this bug. ***

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0045.html

This is a RHEL3 bug. RHEL3 is still supported. So, it seems that saying RHEL4 solves a RHEL3 bug is missing the point. ;-)

Update #91 is a note from an internal ticket system, which was only linked to this bug and should not have appeared here (in fact I have marked it as private again). This particular bug should have been fixed in the errata mentioned in comment #88 (which is an update for RHEL3); please check if it solves your problem as well. Otherwise please contact Red Hat Technical Support.