Bug 163052

Summary: rfc1035.c:417: rfc1035RRUnpack: Assertion `(*off) <= sz' failed.
Product: Red Hat Enterprise Linux 3
Reporter: Norbert Warmuth <norbert.warmuth>
Component: squid
Assignee: Martin Stransky <stransky>
Status: CLOSED DUPLICATE
Severity: medium
Priority: medium
Version: 3.0
CC: bob, tao
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2005-11-22 08:47:42 UTC
Bug Blocks: 161600, 171169    

Description Norbert Warmuth 2005-07-12 15:49:04 UTC
Description of problem:
Since the update from 7:2.5.STABLE3-6.3E.9 to 7:2.5.STABLE3-6.3E.13,
squid restarts intermittently (approximately 50 times since June 27),
logging the following message to /var/log/messages:

    Squid Parent: child process 21560 exited due to signal 6
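Signal 6 on Linux is SIGABRT, the signal abort() raises after a failed C assert(), which fits the assertion message in the strace output. The following is a minimal Python sketch of how a parent process sees such an exit; the child command is a stand-in for illustration, not Squid itself.

```python
import signal
import subprocess
import sys

# Stand-in child (hypothetical): disable core files, then abort the way a
# failed C assert() would.
CHILD = (
    "import os, resource; "
    "resource.setrlimit(resource.RLIMIT_CORE, (0, 0)); "
    "os.abort()"
)

proc = subprocess.run([sys.executable, "-c", CHILD])

# On POSIX, a negative return code -N means the child was killed by signal N.
sig = signal.Signals(-proc.returncode)
print(proc.returncode, sig.name)
```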


strace(1) revealed:
[pid 21557] read(10, "GET http://82.135.27.162/bombeiros/ HTTP/1.0\r\nUser-Agent: w3m/0.5.1\r\nAccept: text/*, image/*, */*\r\nAccept-Encoding: gzip, compress, bzip, bzip2, deflate\r\nAccept-Language: en; q=1.0\r\nHost: 82.135.27.162\r\nPragma: no-cache\r\nCache-control: no-cache\r\n\r\n", 4095) = 248
[pid 21557] sendto(4, "\247l\1\0\0\1\0\0\0\0\0\0\003162\00227\003135\00282\7in-addr\4arpa\0\0\f\0\1", 44, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.10.2")}, 16) = 44 
[pid 21557] sendto(4, "u6\1\0\0\1\0\0\0\0\0\0\003162\00227\003135\00282\7in-addr\4arpa\0\0\f\0\1", 44, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.10.2")}, 16) = 44
[...]
[pid 21557] recvfrom(4, "\247l\201\200\0\1\0\1\0\2\0\0\003162\00227\003135\00282\7in-addr\4arpa\0\0\f\0\1\300\f\0\f\0\1\0\0\34\24\0*\22host-82-135-27-162\10customer\10m-online\3net\0\300\20\0\2\0\1\0\1pw\0\6\3ns2\300T\300\20\0\2\0\1\0\1pw\0\6\3ns1\300T", 512, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.10.2")}, [16]) = 134
[pid 21557] write(2, "(squid): rfc1035.c:417: rfc1035RRUnpack: Assertion `(*off) <= sz' failed.\n", 74) = 74
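
For reference, the reply captured by recvfrom() above can be decoded offline. The following is a minimal Python sketch, not Squid's code: the helper names (read_name, read_rr, parse_reply) are made up for illustration, and the byte string is transcribed from the strace line. It walks the message with explicit bounds checks and steps past each record using the rdlength read from the wire, so the offset can never move beyond the message size.

```python
import struct

# Reply bytes transcribed from the recvfrom() line of the strace output.
REPLY = (
    b"\247l\201\200\0\1\0\1\0\2\0\0"
    b"\003162\00227\003135\00282\7in-addr\4arpa\0\0\f\0\1"
    b"\300\f\0\f\0\1\0\0\34\24\0*"
    b"\22host-82-135-27-162\10customer\10m-online\3net\0"
    b"\300\20\0\2\0\1\0\1pw\0\6\3ns2\300T"
    b"\300\20\0\2\0\1\0\1pw\0\6\3ns1\300T"
)

TYPE_NS, TYPE_PTR = 2, 12


def read_name(buf, off):
    """Decode a domain name at off; return (name, offset just past it)."""
    labels, end, hops = [], None, 0
    while True:
        if off >= len(buf):
            raise ValueError("name runs past end of message")
        n = buf[off]
        if n & 0xC0 == 0xC0:  # compression pointer
            if off + 1 >= len(buf):
                raise ValueError("truncated compression pointer")
            if end is None:
                end = off + 2  # the name ends here in the original position
            hops += 1
            if hops > 16:
                raise ValueError("too many compression pointers")
            off = ((n & 0x3F) << 8) | buf[off + 1]
            continue
        off += 1
        if n == 0:
            break
        if off + n > len(buf):
            raise ValueError("label runs past end of message")
        labels.append(buf[off:off + n].decode("ascii"))
        off += n
    return ".".join(labels), (off if end is None else end)


def read_rr(buf, off):
    """Decode one resource record; return (record, offset of the next one)."""
    name, off = read_name(buf, off)
    if off + 10 > len(buf):
        raise ValueError("truncated RR header")
    rtype, _rclass, ttl, rdlength = struct.unpack_from("!HHIH", buf, off)
    off += 10
    if off + rdlength > len(buf):
        raise ValueError("rdata runs past end of message")
    if rtype in (TYPE_NS, TYPE_PTR):
        rdata, _ = read_name(buf, off)
    else:
        rdata = buf[off:off + rdlength]
    # Advance by the rdlength from the wire, not by how far name decoding went.
    return (name, rtype, ttl, rdata), off + rdlength


def parse_reply(buf):
    if len(buf) < 12:
        raise ValueError("truncated header")
    _id, _flags, qd, an, ns, ar = struct.unpack_from("!6H", buf, 0)
    off = 12
    for _ in range(qd):
        _, off = read_name(buf, off)
        off += 4  # QTYPE + QCLASS
    records = []
    for _ in range(an + ns + ar):
        rr, off = read_rr(buf, off)
        records.append(rr)
    return records, off


records, end = parse_reply(REPLY)
for rr in records:
    print(rr)
print("end offset:", end, "message size:", len(REPLY))
```

Parsed this way, the reply yields one PTR answer and two NS authority records and finishes at offset 134, the size recvfrom() returned, so this particular packet looks well-formed.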


After 'rndc flush' on 192.168.10.2 the requests succeeded and I haven't been 
able to reproduce it again with http://82.135.27.162/bombeiros/.


Version-Release number of selected component (if applicable):
squid-2.5.STABLE3-6.3E.13

Comment 1 Martin Stransky 2005-07-13 10:31:00 UTC

*** This bug has been marked as a duplicate of 163068 ***

Comment 2 Norbert Warmuth 2005-07-14 08:18:44 UTC
I am not able to view bug #163068 because I get a "not authorized to
access" error message.

Is there a workaround besides going back to squid 2.5.STABLE3-6.3E.9?

What's the release time frame for a squid package without this bug?




Comment 3 Martin Stransky 2005-07-14 08:24:07 UTC
Here is some info (Compile with --disable-internal-dns):

http://www.squid-cache.org/Versions/v2/2.5/bugs/#squid-2.5.STABLE5-rfc1035NameUnpack

Comment 4 Norbert Warmuth 2005-07-14 15:06:58 UTC
Thanks. I've rebuilt squid-2.5.STABLE3-6.3E.13 with
squid-2.5.STABLE5-rfc1035NameUnpack.patch applied and will test it later.

FYI: the RHEL3 spec file lacks two BuildRequires: openssl-devel and
cyrus-sasl-devel.



Comment 5 Martin Stransky 2005-07-18 09:30:11 UTC
Ok, thanks.

Comment 6 DMZGlobal LTD 2005-07-26 04:43:05 UTC
How did your testing get on, Norbert? I am experiencing a similar problem with
squid exiting due to signal 6 once or twice a day. We are also running build
squid-2.5.STABLE3-6.3E.13.

Martin, I see the squid website is offering STABLE10; when will the Red Hat EL3
packages reach this version? I see there are a large number of bugfixes
between these versions.

Cheers

Comment 7 Martin Stransky 2005-07-26 08:26:35 UTC
(In reply to comment #6)
> Martin, I see the squid website is offering STABLE10; when will the Red Hat EL3
> packages reach this version? I see there are a large number of bugfixes
> between these versions.

I want to push through as many fixes as possible (into U7), but we have to be
very careful with the new version: it must work on RHEL3 without breaking
compatibility with the old version.


Comment 8 Norbert Warmuth 2005-07-26 14:14:42 UTC
(In reply to comment 6)
> How did your testing get on, Norbert?

squid-2.5.STABLE3-6.3E.13 with squid-2.5.STABLE5-rfc1035NameUnpack.patch from 
upstream (see comment 3) applied still dies with signal 6. I have not tested
squid compiled with --disable-internal-dns. 

Since the downgrade to squid-2.5.STABLE3-6.3E.9 it has been running stably again.


(In reply to comment 7)
> I want to push through as many fixes as possible (into U7).

You mean U6, don't you? RHEL3 U5 is current, and U7 would mean we have to wait
6 months or longer for a fix.



Comment 9 Martin Stransky 2005-09-14 09:39:34 UTC
Unfortunately I meant U7; squid isn't planned for U6, AFAIK. RH has a limit on
the number of packages, and squid didn't make it through. BTW, Bugzilla is only
a bug-tracking system; if you want to escalate a bug, you have to use the
issue-tracker system and file your request there.

Comment 11 Martin Stransky 2005-09-29 08:28:14 UTC
There is a testing package for this issue:

http://people.redhat.com/stransky/debug/squid-2.5.STABLE3-6.3E.16.assert.src.rpm

Please let me know if it works or not.

Comment 14 Bastien Nocera 2005-09-30 14:40:57 UTC
Binaries are also available at:
http://people.redhat.com/stransky/debug/compile/

Comment 20 Martin Stransky 2005-10-03 12:30:35 UTC
The new testing binaries (17.assert) are here:

http://people.redhat.com/stransky/debug/compile/

They should provide a correct backtrace.

Comment 27 Norbert Warmuth 2005-10-05 15:28:47 UTC
17.assert does not work and failed again with "(squid): rfc1035.c:417:
rfc1035RRUnpack: Assertion `(*off) <= sz' failed.":

$ rpm -qf /usr/sbin/squid
squid-2.5.STABLE3-6.3E.17.assert
$ gdb /usr/sbin/squid core.19720
[...]
Loaded symbols for /lib/libnss_files.so.2
#0  0x009c8cdf in raise () from /lib/tls/libc.so.6
(gdb) bt
#0  0x009c8cdf in raise () from /lib/tls/libc.so.6
#1  0x009ca4e5 in abort () from /lib/tls/libc.so.6
#2  0x009c2609 in __assert_fail () from /lib/tls/libc.so.6
#3  0x080d0946 in rfc1035RRUnpack (buf=0xbfff8758 "\211", sz=134, 
    off=0xbfff8758, RR=0xcd12a30) at rfc1035.c:407
#4  0x080d0d9d in rfc1035MessageUnpack (buf=0x811c280 "C\201\200", sz=134, 
    answer=0xbfff8788) at rfc1035.c:607
#5  0x0806d61f in idnsGrokReply (buf=0x0, sz=0) at dns_internal.c:492
#6  0x0806d9cf in idnsRead (fd=4, data=0x0) at dns_internal.c:607
#7  0x08068c40 in comm_check_incoming_poll_handlers (nfds=1, fds=0xbfff8c70)
    at comm_select.c:238
#8  0x080697fc in comm_poll_dns_incoming () at comm_select.c:927
#9  0x080692f6 in comm_poll (msec=47) at comm_select.c:497
#10 0x0808ecd3 in main (argc=0, argv=0x2) at main.c:743
#11 0x009b678a in __libc_start_main () from /lib/tls/libc.so.6
#12 0x0804b701 in _start ()
(gdb) p *off
No symbol "off" in current context.
(gdb) up
#1  0x009ca4e5 in abort () from /lib/tls/libc.so.6
(gdb) 
#2  0x009c2609 in __assert_fail () from /lib/tls/libc.so.6
(gdb) 
#3  0x080d0946 in rfc1035RRUnpack (buf=0xbfff8758 "\211", sz=134, 
    off=0xbfff8758, RR=0xcd12a30) at rfc1035.c:407
407                 return 1;
(gdb) p *off
$1 = 137
(gdb) p sz
$2 = 134
(gdb) p *RR
$3 = {name = "26.209.237.80.in-addr.arpa", '\0' <repeats 101 times>, 
  type = 12, class = 1, ttl = 69046, rdlength = 81, 
  rdata = 0xcd12ac8 "ds80-237-209-26.dedicated.hosteurope.de"}
(gdb) p s
$4 = 10496
(gdb) $ exit
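
The values printed in the gdb session can be cross-checked with simple arithmetic. The sketch below assumes this reply has the same layout as the one in the strace capture from the description (12-byte header, one question, and an answer whose owner name is a 2-byte compression pointer). That layout is an assumption, since this packet itself was not captured; the helper encoded_name_len is made up for illustration.

```python
def encoded_name_len(name):
    """Uncompressed wire length: a length byte per label, plus a final zero byte."""
    return sum(1 + len(label) for label in name.split(".")) + 1


HEADER = 12                       # fixed DNS header size
sz = 134                          # from gdb: p sz
rdlength_in_rr = 81               # from gdb: p *RR
qname = "26.209.237.80.in-addr.arpa"
ptr_target = "ds80-237-209-26.dedicated.hosteurope.de"

# Question section: QNAME followed by QTYPE and QCLASS (2 bytes each).
question_end = HEADER + encoded_name_len(qname) + 4

# Answer RR: assumed 2-byte compressed owner name, then TYPE, CLASS, TTL,
# RDLENGTH (2 + 2 + 4 + 2 = 10 bytes) before the rdata begins.
rdata_start = question_end + 2 + 10

off_after = rdata_start + rdlength_in_rr
ptr_wire_len = encoded_name_len(ptr_target)

print("rdata starts at", rdata_start)
print("rdata start + rdlength =", off_after, "versus sz =", sz)
print("uncompressed PTR target length =", ptr_wire_len)
```

Under these assumptions the rdata begins at offset 56, and 56 + 81 gives exactly the 137 that *off holds, three bytes past sz = 134. The PTR target needs only 41 bytes in uncompressed wire form, so an rdlength of 81 looks too large for this record. This is one possible reading of the numbers, not a confirmed diagnosis.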


BTW at http://lists.debian.org/debian-security/2005/08/msg00144.html and
http://lists.debian.org/debian-security/2005/08/msg00128.html there
is a possible explanation for the failing assert in rfc1035RRUnpack.



Comment 28 Martin Stransky 2005-10-05 15:38:28 UTC
Great! Thanks for the report; a patch will be here ASAP.

Comment 29 Martin Stransky 2005-10-05 21:35:28 UTC
The new testing binaries (18.assert) are here:

http://people.redhat.com/stransky/debug/compile/

Comment 30 Martin Stransky 2005-10-06 05:42:28 UTC
Hold off on this package. I've found more problematic places, so a new package
will be here soon.

Comment 31 Martin Stransky 2005-10-06 08:35:25 UTC
The new testing binaries (19.assert) are here:

http://people.redhat.com/stransky/debug/compile/

Comment 32 Martin Stransky 2005-10-06 15:02:40 UTC
(In reply to comment #27)
> BTW at http://lists.debian.org/debian-security/2005/08/msg00144.html and
> http://lists.debian.org/debian-security/2005/08/msg00128.html there
> is a possible explanation for the failing assert in rfc1035RRUnpack.

Thanks, that may be another problem, so packages with these patches (and
everything from 19.assert) are coming soon.



Comment 33 Norbert Warmuth 2005-10-06 15:34:54 UTC
19.assert just failed again with "rfc1035.c:417: rfc1035RRUnpack: Assertion `(*off) <= sz' failed"
(unfortunately without a core dump because I forgot to run ulimit).


Comment 34 Martin Stransky 2005-10-06 16:26:17 UTC
Packages with fixes that you proposed (20.assert) are here:

http://people.redhat.com/stransky/debug/compile/


Comment 38 Norbert Warmuth 2005-10-12 07:08:45 UTC
Thanks! 20.assert works. No crashes since I installed it on 2005-10-07 07:00 UTC.



Comment 39 Martin Stransky 2005-10-13 14:00:08 UTC
Thanks for the feedback.

There are requests for an update to current upstream, so you can add your
comments there (Bug 170390, Bug 170392).

Comment 42 Martin Stransky 2005-10-18 08:16:19 UTC
*** Bug 163068 has been marked as a duplicate of this bug. ***

Comment 45 Martin Stransky 2005-10-18 13:48:29 UTC
A page with upstream packages and hopefully fixed packages for RHEL3/4 is here:

http://people.redhat.com/stransky/squid/

Comment 54 Martin Stransky 2005-11-15 12:27:10 UTC
The new release-candidate packages for RHEL3/4 are available here:

http://people.redhat.com/stransky/squid/


Comment 55 Martin Stransky 2005-11-22 08:47:42 UTC

*** This bug has been marked as a duplicate of 165367 ***