88526 – bind dies without error couple of times per week

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	bind
Sub Component:
Version:	6.2
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Daniel Walsh
QA Contact:	Ben Levenson
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2003-04-10 17:40 UTC by Toni Willberg
Modified:	2007-04-18 16:52 UTC
CC List:	0 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2003-04-11 14:54:50 UTC
Embargoed:

Description Toni Willberg 2003-04-10 17:40:03 UTC

Description of problem:

 BIND (named) running on this box dies a couple of times per week
 without logging any error in /var/log/messages.

System:
 bind-9.2.1-0.6x.3
 kernel-2.2.22-6.2.3
 glibc-2.1.3-29
 redhat-release-6.2-1

The server runs RH6.2 with all official update rpms and is a public DNS server
for dozens of zones. The load on the server is not very high, but it is not
idle either. I have another server with exactly the same rpms installed that
serves the same zones, and bind never crashes on that one.

I'm about to upgrade the server to a more recent RH release, but since RH6.2 is
still widely used, if this is a "real" bug, IMHO it should be tracked down and fixed.

I'd like to know if there are better ways to hunt this bug. I ran the named
process under strace with the following command:
  strace -t -q -T -F -f named -g -s -u named -d 999
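
Since named leaves nothing in /var/log/messages when it dies, one option
alongside strace is a small wrapper that records how the process ended. This is
only a minimal sketch, assuming a POSIX shell; run_and_report is a hypothetical
helper name, not something shipped with bind or strace:

```shell
# Run a command in the foreground and report how it ended.
# By shell convention an exit status above 128 usually means the
# process was killed by signal (status - 128); a program can also
# exit with such a code on its own, so treat the message as a hint.
run_and_report() {
    status=0
    "$@" || status=$?
    if [ "$status" -gt 128 ]; then
        echo "$1 killed by signal $((status - 128))"
    else
        echo "$1 exited with status $status"
    fi
    return "$status"
}
```

For example, `run_and_report named -g -u named` would keep named in the
foreground (as -g does in the strace command above) and print a line such as
"named killed by signal 11" after a segmentation fault on Linux. The echo could
be swapped for a call to logger so the message ends up in syslog.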

Here's a clip of the last lines of the strace output:
>>>
Apr 09 22:56:18.131 fctx 0x8231350: try
Apr 09 22:56:18.132 fctx 0x8231350: cancelqueries
Apr 09 22:56:18.132 fctx 0x8231350: getaddresses
Apr 09 22:56:18.134 expire_v4 set to MIN(2147483647,1050004578) import_rdataset
Apr 09 22:56:18.134 dns_adb_createfind: found A for name 0x81ebc90 in db
Apr 09 22:56:18.135 expire_v4 set to MIN(2147483647,1050004578) import_rdataset
Apr 09 22:56:18.136 dns_adb_createfind: found A for name 0x81ebbd8 in db
Apr 09 22:56:18.137 expire_v4 set to MIN(2147483647,1050004578) import_rdataset
Apr 09 22:56:18.138 dns_adb_createfind: found A for name 0x821aa80 in db
Apr 09 22:56:18.139 expire_v4 set to MIN(2147483647,1050004578) import_rdataset
Apr 09 22:56:18.139 dns_adb_createfind: found A for name 0x821a9c8 in db
Apr 09 22:56:18.140 fctx 0x8231350: query
Apr 09 22:56:18.141 resquery 0x821f038 (fctx 0x8231350): send
Apr 09 22:56:18.141 dispatch 0x80cbac8 response 0x811c890 205.152.16.20#53: attached to task 0x80cc828
Apr 09 22:56:18.143 resquery 0x821f038 (fctx 0x8231350): sent
Apr 09 22:56:18.143 resquery 0x821f038 (fctx 0x8231350): senddone
Apr 09 22:56:18.300 socket 0x80cbc80: dispatch_recv:  event 0x81e4ed8 -> task 0x80cbd58
Apr 09 22:56:18.300 socket 0x80cbc80: internal_recv: task 0x80cbd58 got event 0x80cbcd4
Apr 09 22:56:18.301 socket 0x80cbc80 205.152.16.20#53: packet received correctly
Apr 09 22:56:18.302 dispatch 0x80cbac8: got packet: requests 1, buffers 1, recvs 1
Apr 09 22:56:18.302 dispatch 0x80cbac8: got valid DNS message header, /QR 1, id 30437
Apr 09 22:56:18.303 dispatch 0x80cbac8: search for response in bucket 61: found
Apr 09 22:56:18.303 dispatch 0x80cbac8 response 0x811c890 205.152.16.20#53: [a] Sent event 0x810ed38 buffer 0x8230348 len 4096 to task 0x80cc828
Apr 09 22:56:18.304 sockmgr 0x808b548: watcher got message -3
Apr 09 22:56:18.305 sockmgr 0x808b548: watcher got message -2
Apr 09 22:56:18.305 socket 0x80cbc80: socket_recv: event 0x82155b8 -> task 0x80cbd58
Apr 09 22:56:18.306 resquery 0x821f038 (fctx 0x8231350): response
 <unfinished ...>
22:56:18 --- SIGSEGV (Segmentation fault) ---
22:56:18 +++ killed by SIGSEGV +++
<<<
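
With -d 999 the captured output gets large, so pulling out just the lines
leading up to the crash is worth scripting. A rough sketch, assuming the strace
and named output was saved to a file; tail_before_segv is a hypothetical helper
and the file name in the usage note is only an illustration:

```shell
# Print the last N lines (default 20) that precede the first line
# mentioning SIGSEGV in a saved trace file.
tail_before_segv() {
    file=$1
    count=${2:-20}
    # Line number of the first SIGSEGV marker; empty if there is none.
    line=$(grep -n 'SIGSEGV' "$file" | head -n 1 | cut -d: -f1)
    [ -n "$line" ] || return 1
    head -n "$((line - 1))" "$file" | tail -n "$count"
}
```

Called as `tail_before_segv named.trace 40`, it would print the 40 lines
before the "--- SIGSEGV" marker, which is the part shown in the clip above.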

Comment 1 Daniel Walsh 2003-04-11 14:54:50 UTC

Have you tried the current release of bind (bind-9.2.2-*) to see if this
still happens?

Dan
