Bug 458932 - named TCP connections hang
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: bind
Version: 5.3
Hardware: All Linux
Priority: low Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Adam Tkac
QA Contact: Petr Sklenar
Keywords: EasyFix, OtherQA, Patch
Depends On: 456417
Reported: 2008-08-13 05:47 EDT by Adam Tkac
Modified: 2013-04-30 19:41 EDT (History)
Doc Type: Bug Fix
Last Closed: 2009-01-20 17:16:42 EST


Description Adam Tkac 2008-08-13 05:47:29 EDT
+++ This bug was initially created as a clone of Bug #456417 +++

Escalated to Bugzilla from IssueTracker

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:47 EDT ---

I believe this was introduced following an upgrade to bind-9.2.4-28.0.1.el4, though I'm not certain which revision we were at previously. It appears that the inbound queue of TCP connections gets wedged on a per-interface basis.

We see this on our recursive servers that do a fair deal of volume (servicing SMTP outbound servers), and I'm not sure what triggers it other than the passage of time. One of our internal zones had a massive authority/additional section (requiring every query against it to be truncated and thus needing a TCP connection), and we were running into this every few hours. As a workaround, we enabled minimal-responses, after which it took a day to see the problem again.

The only apparent means of resolving this is to restart named.

It can be seen pretty clearly here. The first request illustrates that the name server is able to respond normally:

fs1.la:~ $ dig @fs1.la.vclk.net. .

; <<>> DiG 9.2.4 <<>> @fs1.la.vclk.net. .
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7086
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;.                              IN      A

;; AUTHORITY SECTION:
.                       10595   IN      SOA     A.ROOT-SERVERS.NET. NSTLD.VERISIGN-GRS.COM. 2008071601 1800 900 604800 86400

;; Query time: 0 msec
;; SERVER: 192.168.136.191#53(192.168.136.191)
;; WHEN: Wed Jul 16 19:04:49 2008
;; MSG SIZE  rcvd: 92

fs1.la:~ $


...however, a query against the same (root) zone with a TCP query hangs:

fs1.la:~ $ dig @fs1.la.vclk.net. . +vc

; <<>> DiG 9.2.4 <<>> @fs1.la.vclk.net. . +vc
; (1 server found)
;; global options:  printcmd
;; connection timed out; no servers could be reached
fs1.la:~ $


...though going in via a separate interface (localhost, in this case), still works fine:


fs1.la:~ $ dig @localhost . +vc

; <<>> DiG 9.2.4 <<>> @localhost . +vc
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 48280
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;.                              IN      A

;; AUTHORITY SECTION:
.                       10566   IN      SOA     A.ROOT-SERVERS.NET. NSTLD.VERISIGN-GRS.COM. 2008071601 1800 900 604800 86400

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Jul 16 19:05:18 2008
;; MSG SIZE  rcvd: 92

fs1.la:~ $


...additionally, note that there's a perfectly reasonable number of connections (we bumped up 'tcp-clients' to 128 already, initially thinking that Red Hat may be using a smaller value than ISC's default of 100):

fs1.la:~ $ date; netstat -n --tcp | awk '$4 == "192.168.136.191:53" && $NF == "ESTABLISHED"'
Wed Jul 16 19:05:29 PDT 2008
tcp       33      0 192.168.136.191:53          192.168.138.137:58996       ESTABLISHED
tcp       33      0 192.168.136.191:53          192.168.138.137:58998       ESTABLISHED
tcp       32      0 192.168.136.191:53          192.168.138.135:51298       ESTABLISHED
tcp       33      0 192.168.136.191:53          192.168.138.133:34745       ESTABLISHED
fs1.la:~ $


...it should be noted that once wedged, those connections persist and don't appear to be going away (the same tuples are seen for at least 15-30m). In the meantime, all new connections are getting wedged in SYN_RECV on the server side:

fs1.la:~ $ netstat -n --tcp | awk '$4 == "192.168.136.191:53" { print $NF }' | uniq -c
     77 SYN_RECV
      4 ESTABLISHED
fs1.la:~ $
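
A note on the per-state count above: `uniq -c` only merges adjacent identical lines, so that pipeline yields one row per state only when netstat happens to list the states in contiguous runs. A minimal sketch that sorts first follows; the function name count_states is hypothetical, and the address in the usage comment is the one from this report.

```shell
# Count TCP connections to one local address, grouped by state.
# Reads `netstat -n --tcp` style output on stdin; $1 is the local addr:port.
count_states() {
    awk -v addr="$1" '$4 == addr { print $NF }' | sort | uniq -c
}

# Example against a live system:
#   netstat -n --tcp | count_states 192.168.136.191:53
```

Sorting before counting keeps the result stable no matter how the kernel orders the socket table.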


...though we know that ACK is getting through, as querying on the machine against itself shows the client side of the connection (:53601 in this example) advancing to ESTABLISHED:

fs1.la:~ $ netstat -n | awk '/192.168.136.191/ && $5 ~ /192.168.136.191/'
tcp        0      0 192.168.136.191:53          192.168.136.191:53599       SYN_RECV
tcp        0      0 192.168.136.191:53          192.168.136.191:53597       SYN_RECV
tcp        0      0 192.168.136.191:53          192.168.136.191:53596       SYN_RECV
tcp        0      0 192.168.136.191:53          192.168.136.191:53598       SYN_RECV
tcp        0      0 192.168.136.191:53          192.168.136.191:53600       SYN_RECV
tcp        0     20 192.168.136.191:53600       192.168.136.191:53          FIN_WAIT1
tcp        0     20 192.168.136.191:53598       192.168.136.191:53          FIN_WAIT1
tcp        0     20 192.168.136.191:53599       192.168.136.191:53          FIN_WAIT1
fs1.la:~ $ dig @fs1.la +vc &
[1] 61824
fs1.la:~ $ netstat -n | awk '$4 ~ /192.168.136.191/ && $5 ~ /192.168.136.191/'
tcp        0      0 192.168.136.191:53          192.168.136.191:53599       SYN_RECV
tcp        0      0 192.168.136.191:53          192.168.136.191:53601       SYN_RECV
tcp        0      0 192.168.136.191:53          192.168.136.191:53597       SYN_RECV
tcp        0      0 192.168.136.191:53          192.168.136.191:53596       SYN_RECV
tcp        0      0 192.168.136.191:53          192.168.136.191:53598       SYN_RECV
tcp        0      0 192.168.136.191:53          192.168.136.191:53600       SYN_RECV
tcp        0     20 192.168.136.191:53600       192.168.136.191:53          FIN_WAIT1
tcp        0     19 192.168.136.191:53601       192.168.136.191:53          ESTABLISHED
tcp        0     20 192.168.136.191:53598       192.168.136.191:53          FIN_WAIT1
tcp        0     20 192.168.136.191:53599       192.168.136.191:53          FIN_WAIT1
fs1.la:~ $
; <<>> DiG 9.2.4 <<>> @fs1.la +vc
; (1 server found)
;; global options:  printcmd
;; connection timed out; no servers could be reached

[1]+  Exit 9                  dig @fs1.la +vc
fs1.la:~ $

This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:48 EDT ---

File uploaded: named.conf

This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171
it_file 143522

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:48 EDT ---

Attaching the named.conf for reference. Additionally, I just had gdb
generate a core file from the same instance of named described in the
original notes; it's up at dropbox.redhat.com now as 192171-named.core.bz2.


This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:49 EDT ---

In case it's relevant, this is chroot'ed with the stock bind-chroot
packages, and SELinux is enforcing, also using Red Hat stock configs.


This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:50 EDT ---

File uploaded: sysreport.192171.tar.bz2

This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171
it_file 143730

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:51 EDT ---

Just had another name server end up in a similar state. Collected data and
rolled it up to 1912171-fs2.la.tar on dropbox.redhat.com.

Has there been any word back from engineering yet? Namely, is the data
sufficient so far, or are there any additional details they'd like
collected?


This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:52 EDT ---

Notes to myself:

The filename at dropbox is 192171-fs2.la.tar

I've checked with the bind package maintainer but no luck; he isn't aware
of any problem like that and also said that the last patch is very unlikely
to be the root cause.

Flavio


This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:53 EDT ---

Thanks for the update, Neil.

Though it should be implied by the successful VC query against localhost,
this does not appear to be an fd leak; the named processes in question
(still have them in a 'broken' state) on these machines have 27 and 31
fd's listed under /proc.
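
For anyone repeating the fd check described above, a minimal sketch of counting a process's open descriptors through /proc is shown here. It assumes a Linux /proc filesystem and enough privilege to read the target's fd directory; the function name fd_count is hypothetical.

```shell
# Print the number of open file descriptors for the PID given as $1.
# Prints 0 if /proc/<pid>/fd is missing or unreadable.
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Example (pidof is assumed to be available, as on stock RHEL):
#   fd_count "$(pidof named)"
```

A count far below the process's nofile limit, as with the 27 and 31 reported here, argues against a slow descriptor leak.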

While we're waiting on a resolution:

  * Are there any other additional details that we could provide that
would assist in resolution? Before we start restarting bind when this
condition is seen, I'd like to make sure we're not throwing away useful
data.

  * Are there any caveats to the use of 'minimal-responses' when used on
a recursive server? As stated earlier, this was the most apparent
workaround, as it minimized our use of TCP queries. I'm not aware of any
drawbacks, since all of the clients are dumb (ie. straight libresolv, no
other name servers or lwresd clients), but thought I'd make sure.


This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:54 EDT ---

Kevin,

> * Are there any other additional details that we could provide that would
> assist in resolution? Before we start restarting bind when this condition is
> seen, I'd like to make sure we're not throwing away useful data.

A couple of things have come up that would be useful diagnostics here. 
1. Can you capture a tcpdump of the connection while reproducing this
issue so we can verify that the connection flow is correct?

2. Can you provide an strace output of the named process before starting
the connection that reproduces it?  The idea being that we would like to
make sure that the connection request reaches the process and what it does
with the fd. 
This command:
strace -fFtT -s 4096 -p <named pid> -o named.log 

3. Also, can you pinpoint the date and time of the occurrence a little more
closely, so that we can focus the study of the logs around the time the
issue occurs?

If possible, I think what would be best here is to reproduce this on
a system and gather the data above (1 & 2), then create a new sysreport
that has logs and data that directly correspond to the reproducer in
question (along with the time/date info to answer #3), and upload all of
this together.

> * Are there any caveats to the use of 'minimal-responses' when used on a
> recursive server? As stated earlier, this was the most apparent workaround,
> as it minimized our use of TCP queries. I'm not aware of any drawbacks, since
> all of the clients are dumb (ie. straight libresolv, no other name servers or
> lwresd clients), but thought I'd make sure.

I don't know, but I'll check on this. I've asked the package maintainer
and will see what he says and will check with support engineering as well.



Neil




Internal Status set to 'Waiting on Customer'
Status set to: Waiting on Client

This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:55 EDT ---

Just sent this data up to dropbox.redhat.com as 192171-named-fs2.la.tar.bz2
and 192171-named-fs1.la.tar.bz2. The strace left the process hung hard
(required kill -9) when it detached, so it might be a bit before we can
reproduce again.

The client in the 'fs2' case is 192.168.136.10, though I realized this
client would be noisy. Total trace is only ~60 seconds, so it shouldn't
be too terrible.

For the 'fs1' case, I hit it from 192.168.130.47 and all of the traffic
from it is a reproduce attempt.

In both cases, I did a UDP-based query first to help signal
start-of-testing.


This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:55 EDT ---

Hi there,

on fs1 - there is a full TCP connection opening and also the standard
query retransmissions. Assuming that the traffic dump was taken on the
server and there is no firewall blocking DNS queries, it indicates that
the problem seems to be in the application itself.

on fs2 - there are UDP queries working and TCP queries showing the same
problem as in fs1 case.

The strace output shows the test query 'this.is.a.test' being received:
16120 13:40:54 recvmsg(22, {msg_name(16)={sa_family=AF_INET,
sin_port=htons(37902), sin_addr=inet_addr("192.168.136.10")},
msg_iov(1)=[{"\\374\\37\\1\\0\\0\\1\\0\\0\\0\\0\\0\\0\\4this\\2is\\1a\\4test\\...},
msg_flags=0}, 0) = 32 <0.000016>

but it is not answered, and there is no syscall error, i.e. no sendmsg()
failing to write the answer. Both straces show the same behavior.

Found this message in logs many times:
Jul 13 05:23:23 fs1 kernel: audit(1215951803.850:3275): avc:  denied  {
write } for  pid=61279 comm="rndc"
name="cf_fs1_la_vclk_net_valueclick_com_2008-07-13--05-23-05" dev=sda3
ino=196796 scontext=root:system_r:ndc_t tcontext=root:object_r:var_t
tclass=file

That should be related to selinux but it is in 'permissive' mode.

Let me check the code and review the strace again before asking for
anything else.

Flavio




This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:56 EDT ---

I have named running here and gdb can attach okay, showing all symbols
correctly, but it doesn't work with either of the cores provided. In an
effort to get a core from the named process I tried sending SIGABRT, but no
core was created, even after setting ulimit -c unlimited. I also tried to
attach gdb to a running named and then force it to dump a core, but gdb
crashed before that.

I would ask to raise the log/trace level and then see if we can spot 
something but I have no idea if it will help or not.

I'll pass this one to engineering.

Flavio.

[root@dell-pe830-01 ~]# pidof named
4237
[root@dell-pe830-01 ~]# gdb /usr/sbin/named 4237
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host
libthread_db library "/lib/tls/libthread_db.so.1".

Attaching to program: /usr/sbin/named, process 4237
<snipped>

[New Thread -1240192080 (LWP 4241)]
../../gdb/linux-nat.c:901: internal-error: lin_lwp_attach_lwp: Assertion
`pid == GET_LWP (ptid) && WIFSTOPPED (status) && WSTOPSIG (status)'
failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) 
[New Thread -1240192080 (LWP 4241)]
../../gdb/linux-nat.c:901: internal-error: lin_lwp_attach_lwp: Assertion
`pid == GET_LWP (ptid) && WIFSTOPPED (status) && WSTOPSIG (status)'
failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)



This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:57 EDT ---

Hi,

Last question before I escalate - is it possible for you to disable SELinux
and check whether it still reproduces?

I think SELinux was updated, and according to the comment below it was
enabled:
Event posted 07-17-2008 12:41am BRT by kgraham@valueclick.com 	
In case its relevant, this is chroot'ed with the stock bind-chroot
packages and SELinux is enforcing also using Redhat stock configs.

thanks,
Flavio

Internal Status set to 'Waiting on Support'

This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171

--- Additional comment from tao@redhat.com on 2008-07-23 09:48:58 EDT ---

It does not appear to (note that again there are 4 established connections,
this seems to be a magic number), at least for the purposes of flipping to
permissive mode, or do they want it disabled altogether at boot-time?

fs1.la:~ $ dig @localhost +vc

; <<>> DiG 9.2.4 <<>> @localhost +vc
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15908
;; flags: qr rd ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;.                              IN      NS

;; ANSWER SECTION:
.                       486093  IN      NS      I.ROOT-SERVERS.NET.
.                       486093  IN      NS      J.ROOT-SERVERS.NET.
.                       486093  IN      NS      K.ROOT-SERVERS.NET.
.                       486093  IN      NS      L.ROOT-SERVERS.NET.
.                       486093  IN      NS      M.ROOT-SERVERS.NET.
.                       486093  IN      NS      A.ROOT-SERVERS.NET.
.                       486093  IN      NS      B.ROOT-SERVERS.NET.
.                       486093  IN      NS      C.ROOT-SERVERS.NET.
.                       486093  IN      NS      D.ROOT-SERVERS.NET.
.                       486093  IN      NS      E.ROOT-SERVERS.NET.
.                       486093  IN      NS      F.ROOT-SERVERS.NET.
.                       486093  IN      NS      G.ROOT-SERVERS.NET.
.                       486093  IN      NS      H.ROOT-SERVERS.NET.

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Jul 22 14:53:44 2008
;; MSG SIZE  rcvd: 228

fs1.la:~ $ dig @192.168.136.191 +vc

; <<>> DiG 9.2.4 <<>> @192.168.136.191 +vc
; (1 server found)
;; global options:  printcmd
;; connection timed out; no servers could be reached
fs1.la:~ $
fs1.la:~ $ dig @192.168.136.191

; <<>> DiG 9.2.4 <<>> @192.168.136.191
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31100
;; flags: qr rd ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;.                              IN      NS

;; ANSWER SECTION:
.                       486030  IN      NS      M.ROOT-SERVERS.NET.
.                       486030  IN      NS      A.ROOT-SERVERS.NET.
.                       486030  IN      NS      B.ROOT-SERVERS.NET.
.                       486030  IN      NS      C.ROOT-SERVERS.NET.
.                       486030  IN      NS      D.ROOT-SERVERS.NET.
.                       486030  IN      NS      E.ROOT-SERVERS.NET.
.                       486030  IN      NS      F.ROOT-SERVERS.NET.
.                       486030  IN      NS      G.ROOT-SERVERS.NET.
.                       486030  IN      NS      H.ROOT-SERVERS.NET.
.                       486030  IN      NS      I.ROOT-SERVERS.NET.
.                       486030  IN      NS      J.ROOT-SERVERS.NET.
.                       486030  IN      NS      K.ROOT-SERVERS.NET.
.                       486030  IN      NS      L.ROOT-SERVERS.NET.

;; Query time: 0 msec
;; SERVER: 192.168.136.191#53(192.168.136.191)
;; WHEN: Tue Jul 22 14:54:47 2008
;; MSG SIZE  rcvd: 228

fs1.la:~ $
fs1.la:~ $ netstat --tcp -n | awk '$4 == "192.168.136.191:53"'
tcp        0      0 192.168.136.191:53          192.168.136.191:34380       SYN_RECV
tcp        0      0 192.168.136.191:53          192.168.136.10:55021        SYN_RECV
tcp        0      0 192.168.136.191:53          192.168.136.191:34381       SYN_RECV
tcp        0      0 192.168.136.191:53          192.168.136.10:54984        SYN_RECV
tcp       37      0 192.168.136.191:53          192.168.138.141:36528       ESTABLISHED
tcp       33      0 192.168.136.191:53          192.168.138.133:50853       ESTABLISHED
tcp       33      0 192.168.136.191:53          192.168.138.132:50037       ESTABLISHED
tcp       31      0 192.168.136.191:53          192.168.138.132:49998       ESTABLISHED
fs1.la:~ $
fs1.la:~ # setenforce 0
fs1.la:~ # dig @192.168.136.191 +vc

; <<>> DiG 9.2.4 <<>> @192.168.136.191 +vc
; (1 server found)
;; global options:  printcmd
;; connection timed out; no servers could be reached
fs1.la:~ #



This event sent from IssueTracker by fleitner  [Support Engineering Group]
 issue 192171

--- Additional comment from kgraham@valueclick.com on 2008-07-23 14:04:25 EDT ---

The avc denies in comment 11 are indeed bogus, they're just noise from CFEngine;
we haven't seen any SELinux activity reported against named itself.

As noted in comment 14, when we see this bug manifest itself, there are always 4
TCP connections established (though tcp-clients has been manually set to 128
after our initial suspicion that Redhat wasn't using ISC's default of 100). The
client side of these are qmail-remote's (linked against the system's libresolv)
that are waiting for a read to return on that socket.

--- Additional comment from atkac@redhat.com on 2008-07-30 07:26:11 EDT ---

After investigation, this problem might be related to the latest security patch.
From the logs and other files, the server is under heavy load when this problem
happens.

Would it be possible to tell me whether a message like "internal_accept: fcntl()
failed: Too many open files" is in the system log when the TCP query times out,
please? Also, would it be possible to check whether the parameter
"recursive-clients 950;" in the options statement in named.conf helps? Thanks
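
One way to check for the message requested above is sketched here. The default path /var/log/messages is an assumption about where syslog writes named's output on the affected host, and the function name count_accept_failures is hypothetical.

```shell
# Count log lines containing the accept failure quoted in this comment.
# $1 is the log file to search; defaults to /var/log/messages (assumed path).
count_accept_failures() {
    grep -cF 'internal_accept: fcntl() failed: Too many open files' "${1:-/var/log/messages}"
}

# Example:
#   count_accept_failures /var/log/messages
```

The -F flag makes grep treat the pattern as a fixed string, so the parentheses and colons need no escaping.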

--- Additional comment from kgraham@valueclick.com on 2008-07-30 12:30:54 EDT ---

re comment 16 -- I had missed it in the noise of startup (our slaves are
file-less, so they always do a full set of axfr's at initialization), but yes,
we do see a handful of the fcntl errors, though they seem to occur at the onset
of the wedge; once wedged, they aren't logged.

The default for recursive-clients is 1000 -- is the goal of taking it down to
provide more buffer between it and 1k? Would increasing nfiles ulimit achieve
your goal without decreasing client concurrency?

--- Additional comment from atkac@redhat.com on 2008-07-30 12:52:40 EDT ---

(In reply to comment #17)
> The defaults for recursive-clients is 1000 -- is the goal of taking it down to
> provide more buffer between it and 1k? Would increasing nfiles ulimit achieve
> your goal without decreasing client concurrency?

Yes, increasing the nfiles limit should also help.

--- Additional comment from kgraham@valueclick.com on 2008-08-06 12:11:06 EDT ---

re comment 19 -- that appears to have done the trick, I haven't been able to reproduce since increasing ulimit. It strikes me that there are two things that need to happen here.

The first seems fairly trivial -- bind should setrlimit (and bail if unable to) to something akin to recursive-clients + ( 2 * max-transfers-in ) + max-transfers-out + overhead (this would presumably also include RH-specific updates to accommodate defaults, such as an nfiles increase in limits.conf).
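
The budget proposed above can be sketched as a small shell check. This only illustrates the arithmetic: the function names are hypothetical, the four inputs are passed in by hand rather than read from named.conf, and the overhead value of 64 is an arbitrary assumption. Note that it inspects the limit of the current shell, which matches named's limit only if named is started from the same environment.

```shell
# Estimate the descriptor budget:
#   recursive-clients + 2 * transfers-in + transfers-out + overhead
# Arguments: $1 clients, $2 transfers in, $3 transfers out, $4 overhead.
fd_budget() {
    echo $(( $1 + 2 * $2 + $3 + $4 ))
}

# Succeed if the soft nofile limit of this shell covers $1 descriptors.
nofile_covers() {
    limit=$(ulimit -n)
    [ "$limit" = "unlimited" ] || [ "$limit" -ge "$1" ]
}

# Illustrative values only: 1000 clients, 10 transfers each way, 64 overhead.
needed=$(fd_budget 1000 10 10 64)
if nofile_covers "$needed"; then
    echo "nofile limit covers the estimated $needed descriptors"
else
    echo "nofile limit $(ulimit -n) is below the estimated $needed descriptors"
fi
```

With the illustrative values the estimate exceeds 1024, a common default soft limit, which is consistent with the fcntl errors appearing under load.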

The second is the wedge that resulted after the original condition occurred. I haven't tried (and would probably fail if I had) to find the relevant areas of code, but presumably even with the fcntl errors, this still should not have occurred.

Is this an accurate assessment and has ISC weighed in?

--- Additional comment from atkac@redhat.com on 2008-08-12 11:15:57 EDT ---

This problem is already fixed in upstream:

2394.   [bug]           Default configuration options set the limit for
                        open files to 'unlimited' as described in the
                        documentation. [RT #18331]

Proposed patch will be attached.

--- Additional comment from atkac@redhat.com on 2008-08-12 11:16:54 EDT ---

Created an attachment (id=314103)
proposed patch

--- Additional comment from kgraham@valueclick.com on 2008-08-12 11:54:00 EDT ---

re comment 20 -- excellent, thanks. Two things -- does this need to be cloned against RHEL5, and can you please arrange for bug 455540 and bug 455564 to be included in the errata packages?
Comment 7 Petr Sklenar 2008-11-28 10:12:46 EST
Bug requires customer verification. Please provide customer with latest RHEL5.3 Beta packages
Comment 8 Kevin Graham 2008-11-30 15:55:44 EST
re comment 7 -- if this is the same proposed patch as Adam's patch in bug 456417, it has already been verified. That patch implements the fix within named that we had used as a configuration workaround.
Comment 9 Petr Sklenar 2008-12-01 06:31:35 EST
re Comment #8
Hello,
thank you for testing on RHEL4; the patch is the same for RHEL5. If it is possible to test with RHEL 5U3 (using the beta) as well, that would be safer.
Comment 14 errata-xmlrpc 2009-01-20 17:16:42 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0246.html
