Bug 10479

Summary: Bind and caching DNS stalls
Product: [Retired] Red Hat Linux
Reporter: mal
Component: bind
Assignee: Bernhard Rosenkraenzer <bero>
Status: CLOSED WORKSFORME
Severity: medium
Priority: medium
Version: 6.2
CC: dr, m.bizzarri, rmiddle
Hardware: All
OS: Linux
Last Closed: 2000-05-24 15:42:53 UTC

Description mal 2000-03-31 15:09:57 UTC
Hi,
There is a problem with bind-8.2.2_P3-1
and, I believe, bind-utils-8.2.2_P5-9
when it is used as a caching DNS server (with the config
files from the caching-nameserver package).
The problem is:
After some time (1 day to 1 week)
the caching DNS server just stops resolving some (not all!!!)
host names.
For one host I get a perfect DNS response;
for another host I get nothing, the query just times out.
The DNS server to which the caching server sends its requests
resolves all hostnames perfectly;
the caching server does not.
This happens at random intervals,
anywhere from 1 day to 1 week apart.
The only way to fix the problem is to restart named.
After doing
/etc/rc.d/init.d/named stop
/etc/rc.d/init.d/named start
the caching DNS server starts working fine again and resolves all hostnames.
As a workaround I added an entry to /etc/cron.daily/
to restart bind daily.
It is only a workaround, but it was the
only way I found to get the caching DNS server working reliably enough.
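
The daily-restart workaround described above could be sketched as a small script placed in /etc/cron.daily/. The file name restart-named and the NAMED_INIT variable are assumptions made for illustration; only the init script path and the stop/start sequence come from the report.

```shell
#!/bin/sh
# Hypothetical /etc/cron.daily/restart-named (the file name is an assumption).
# Restarts named once a day as a stopgap for the stalled-resolution problem.

# Init script path taken from the report; overridable so it can be tested.
NAMED_INIT="${NAMED_INIT:-/etc/rc.d/init.d/named}"

restart_named() {
    # Same stop/start sequence that was run by hand in the report.
    "$NAMED_INIT" stop
    "$NAMED_INIT" start
}

# Only act when the init script is actually installed and executable.
if [ -x "$NAMED_INIT" ]; then
    restart_named
fi
```

Scripts in /etc/cron.daily/ must be executable to be picked up, so the file would need chmod +x after it is created.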

Comment 1 mal 2000-03-31 15:11:59 UTC
In addition:
The text above should read bind-8.2.2_P5-9,
not bind-utils-8.2.2_P5-9.
I just copied and pasted the wrong package name.

Comment 2 Bernhard Rosenkraenzer 2000-05-05 14:15:59 UTC
I can't reproduce this - do you get any odd messages in /var/log/messages when
this happens?
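
One way to look for odd messages is to strip out the routine statistics entries (USAGE, NSTATS, XSTATS, Cleaned cache) that BIND 8 logs periodically and read what is left. The following is a rough sketch only; the function name odd_named_messages is hypothetical, and the choice of which entries count as routine is an assumption.

```shell
#!/bin/sh
# Print named log lines that are not routine periodic statistics.
# Assumption: "routine" means the USAGE, NSTATS, XSTATS and
# "Cleaned cache" entries; everything else from named[pid] is shown.
odd_named_messages() {
    # $1 = log file to scan, e.g. /var/log/messages
    grep 'named\[[0-9]*\]:' "$1" |
        grep -v -E 'named\[[0-9]+\]: (USAGE|NSTATS|XSTATS|Cleaned cache) '
}
```

It could be run as `odd_named_messages /var/log/messages` shortly after resolution stalls, so that anything unusual is still near the end of the output.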

Comment 3 nkadel 2000-05-23 21:51:59 UTC
I haven't seen the problem in weeks of running bind-8.2.2_P5 on several systems,
as both a stand-alone DNS server and a caching-nameserver.

Comment 4 mal 2000-05-24 15:42:59 UTC
I still have this problem with bind-8.2.2_P3-1.
The messages in /var/log/messages are pretty much as usual
(XSTATS, etc.),
with the exception of

/var/log/messages.2:May 10 12:50:28 columbusserver named[645]: Lame server on
'mail.yahoo.com.session.rservices.com' (in 'RSERVICES.COM'?): [199.171.195.8].53
'RITIG9.RIT.REUTERS.COM'
/var/log/messages.2:May 10 14:16:16 columbusserver named[645]: Lame server on
'mail.yahoo.com.session.rservices.com' (in 'RSERVICES.COM'?): [199.171.195.8].53
'RITIG9.RIT.REUTERS.COM'

These appear while the server cannot resolve yahoo.com.

Most of the workstations are configured to use only the Linux DNS server,
but some also have DNS servers from Reuters.
I do not know how this rservices.com DNS server was picked up.

Comment 5 Henri Schlereth 2000-08-02 01:55:21 UTC
Unable to duplicate in bind-8.2.2-P5-24 or previous versions. You really should upgrade to P5 anyway.
Possibly the lame server came from an ident off of a mail server?

Henri
RH Beta Team

Comment 6 mal 2000-08-07 23:03:32 UTC
The problem persists on 3 different Red Hat installations.
Bind version: bind-8.2.2_P5-9
This is the relevant part of the log:

Aug  7 16:27:46 localhost named[538]: Cleaned cache of 87 RRsets
Aug  7 16:27:46 localhost named[538]: USAGE 965680066 965323665 CPU=12.04u/3.41s
CHILDCPU=0u/0s
Aug  7 16:27:46 localhost named[538]: NSTATS 965680066 965323665 A=2331
Aug  7 16:27:46 localhost named[538]: XSTATS 965680066 965323665 RR=3219 RNXD=73
RFwdR=2295 RDupR=4 RFail=2 RFErr=0 RErr=0 RAXFR=0 RLame=8 ROpts=0 SSysQ=641
SAns=492 SFwdQ=1805 SDupQ=362 SErr=0 RQ=2331 RIQ=0 RFwdQ=0 RDupQ=176 RTCP=0
SFwdR=2295 SFail=0 SFErr=0 SNaAns=479 SNXD=10
Aug  7 17:27:46 localhost named[538]: Cleaned cache of 126 RRsets
Aug  7 17:27:46 localhost named[538]: USAGE 965683666 965323665 CPU=12.47u/3.59s
CHILDCPU=0u/0s
Aug  7 17:27:46 localhost named[538]: NSTATS 965683666 965323665 A=2443
Aug  7 17:27:46 localhost named[538]: XSTATS 965683666 965323665 RR=3335 RNXD=81
RFwdR=2377 RDupR=4 RFail=2 RFErr=0 RErr=0 RAXFR=0 RLame=8 ROpts=0 SSysQ=660
SAns=524 SFwdQ=1878 SDupQ=415 SErr=0 RQ=2443 RIQ=0 RFwdQ=0 RDupQ=193 RTCP=0
SFwdR=2377 SFail=0 SFErr=0 SNaAns=511 SNXD=12
Aug  7 17:28:29 localhost named[538]: named shutting down
Aug  7 17:28:29 localhost named[538]: USAGE 965683709 965323665 CPU=12.48u/3.6s
CHILDCPU=0u/0s
Aug  7 17:28:29 localhost named[538]: NSTATS 965683709 965323665 A=2443
Aug  7 17:28:29 localhost named[538]: XSTATS 965683709 965323665 RR=3335 RNXD=81
RFwdR=2377 RDupR=4 RFail=2 RFErr=0 RErr=0 RAXFR=0 RLame=8 ROpts=0 SSysQ=660
SAns=524 SFwdQ=1878 SDupQ=415 SErr=0 RQ=2443 RIQ=0 RFwdQ=0 RDupQ=193 RTCP=0
SFwdR=2377 SFail=0 SFErr=0 SNaAns=511 SNXD=12

HERE IT STOPPED WORKING AND WAS RESTARTED.
IT WAS NOT ABLE TO RESOLVE altavista.com.
AFTER THE RESTART BIND STARTED WORKING OK.

Aug  7 17:28:30 localhost named: named shutdown succeeded
Aug  7 17:28:38 localhost named[4297]: starting.  named 8.2.2-P5 Wed Apr 19
18:39:00 EDT 2000 ^Iroot@srv:/usr/src/redhat/BUILD/bind-8.2.2_P5/src/bin/named
Aug  7 17:28:38 localhost named[4297]: hint zone "" (IN) loaded (serial 0)
Aug  7 17:28:38 localhost named[4297]: Zone "0.0.127.in-addr.arpa" (file
named.local): No default TTL set using SOA minimum instead
Aug  7 17:28:38 localhost named[4297]: master zone "0.0.127.in-addr.arpa" (IN)
loaded (serial 1997022700)
Aug  7 17:28:38 localhost named[4297]: listening on [127.0.0.1].53 (lo)
Aug  7 17:28:39 localhost named[4297]: listening on [192.168.3.1].53 (eth1)
Aug  7 17:28:39 localhost named[4297]: Forwarding source address is
[0.0.0.0].1040
Aug  7 17:28:39 localhost named: named startup succeeded
Aug  7 17:28:39 localhost named[4298]: group = 25
Aug  7 17:28:39 localhost named[4298]: user = named
Aug  7 17:28:39 localhost named[4298]: Ready to answer queries.
Aug  7 18:28:38 localhost named[4298]: Cleaned cache of 49 RRsets
Aug  7 18:28:38 localhost named[4298]: USAGE 965687318 965683719 CPU=0.77u/0.1s
CHILDCPU=0u/0s
Aug  7 18:28:38 localhost named[4298]: NSTATS 965687318 965683719 A=130
Aug  7 18:28:38 localhost named[4298]: XSTATS 965687318 965683719 RR=160 RNXD=7
RFwdR=122 RDupR=0 RFail=0 RFErr=0 RErr=0 RAXFR=0 RLame=0 ROpts=0 SSysQ=23
SAns=35 SFwdQ=90 SDupQ=38 SErr=0 RQ=130 RIQ=0 RFwdQ=0 RDupQ=17 RTCP=0 SFwdR=122
SFail=0 SFErr=0 SNaAns=35 SNXD=0
Aug  7 18:39:52 localhost named[4298]: Lame server on 'www.eprisenow.com' (in
'EPRISENOW.COM'?): [12.127.16.69].53 'CMTU.MT.NS.ELS-GMS.ATT.NET'
Aug  7 18:39:52 localhost named[4298]: Lame server on 'www.eprisenow.com' (in
'EPRISENOW.COM'?): [199.191.128.105].53 'CBRU.BR.NS.ELS-GMS.ATT.NET'
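
To compare individual counters such as RLame or RFail between the XSTATS records in the log above, a single value can be pulled out of a record with standard text tools. This is a minimal sketch; the function name xstat_field is hypothetical, and it assumes each counter is logged as NAME=number, as in the records shown.

```shell
#!/bin/sh
# Print the value of one counter from an XSTATS record read on stdin.
# Wrapped records work too, because tokens are split on any whitespace.
# If several records are fed in, the value from the last one is printed.
xstat_field() {
    # $1 = counter name exactly as logged, e.g. RLame or SFwdQ
    tr -s ' \t\n' '\n\n\n' |
        sed -n "s/^$1=\([0-9][0-9]*\)\$/\1/p" |
        tail -n 1
}
```

For example, `grep XSTATS /var/log/messages | tail -n 1 | xstat_field RLame` would print the lame-delegation counter from the most recent record.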