Bug 462060

Summary: named service crashes with an assertion failed message
Product: Red Hat Enterprise Linux 4 Reporter: Adam Tkac <atkac>
Component: bind Assignee: Adam Tkac <atkac>
Status: CLOSED ERRATA QA Contact: Martin Cermak <mcermak>
Severity: high Docs Contact:
Priority: high    
Version: 4.7 CC: gimre, jwest, mprpic, ofourdan, ovasik, ralph, redhat-bugzilla, tao, william.smargiassi
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Under certain conditions, the named daemon could exit due to an assertion failure. The following message was logged to /var/log/messages:

named: socket.c:1649: INSIST(!sock->pending_recv) failed
named: exiting (due to assertion failure)

This update provides a fix to the socket module which prevents this assertion from failing, thus resolving the problem.
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-02-16 14:04:15 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 455802    
Bug Blocks: 552534    
Attachments:
Description Flags
proposed patch none

Description Adam Tkac 2008-09-12 11:22:49 UTC
+++ This bug was initially created as a clone of Bug #455802 +++

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1

Description of problem:
Since the upgrade to the following:

Version: 9.3.4
Release: 6.0.2.P1.el5_2

We have had an issue with the named service randomly dying. In /var/log/messages we see the following errors:

Jul 17 13:00:42 ns-mail named[19409]: socket.c:1649: INSIST(!sock->pending_recv) failed
Jul 17 13:00:42 ns-mail named[19409]: exiting (due to assertion failure)
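
To check other servers for the same failure, the log lines above can be matched mechanically. The following is only a minimal sketch: the function name `find_pending_recv_asserts` and the regular expression are illustrative assumptions, not part of BIND or any Red Hat tool, and the pattern is based solely on the message format quoted in this report.

```python
import re

# Pattern derived from the syslog lines quoted in this report, e.g.
#   Jul 17 13:00:42 ns-mail named[19409]: socket.c:1649: INSIST(!sock->pending_recv) failed
# The source line number differs between BIND builds, so it is captured rather than fixed.
ASSERT_RE = re.compile(
    r"named\[(?P<pid>\d+)\]: socket\.c:(?P<line>\d+): "
    r"INSIST\(!sock->pending_recv\) failed"
)


def find_pending_recv_asserts(lines):
    """Return a list of (pid, socket_c_line) tuples, one per matching log line."""
    hits = []
    for text in lines:
        match = ASSERT_RE.search(text)
        if match:
            hits.append((int(match.group("pid")), int(match.group("line"))))
    return hits
```

It could be fed the contents of /var/log/messages, for example `find_pending_recv_asserts(open("/var/log/messages"))`, assuming the file is readable by the user running the script.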

The server is running bind-chroot and also has the caching-nameserver package installed. In addition to the local related zones, we also have 10 forwards set up in the named.conf file.

This occurred on a fresh install of 5.2 which had the named service running for approximately 36 hours prior to this crash.

Version-Release number of selected component (if applicable):
bind-9.3.4-6.0.2.P1.el5_2

How reproducible:
Sometimes


Steps to Reproduce:
1. This has been random thus far.
2. A restart of the service is required as it crashes.

Actual Results:


Expected Results:


Additional info:
A point to also note is that we had a 4.6 server running the latest security patched BIND and this same error occurred on it after the recent BIND related security patches. The named.conf was identical on the 4.6 server too. We then decided to migrate to 5.2, in which we have seen this same issue happen today, thus the bug report.

--- Additional comment from atkac on 2008-09-12 07:15:29 EDT ---

This problem is already fixed in upstream:

2406.   [bug]           Sockets could be closed too early, leading to
                        inconsistent states in the socket module. [RT #18298]

The problem will probably be fixed together with bug #457036 by a rebase to an updated upstream version.

Comment 2 Adam Tkac 2008-09-19 13:36:45 UTC
Created attachment 317192 [details]
proposed patch

Comment 3 Bill Smargiassi 2008-10-09 16:05:19 UTC
I've experienced this crash as well.

RHEL 4.5
BIND version: bind-9.2.4-30.el4 (out of date, but I don't patch this system personally)

From our logs:

Oct  8 18:22:28 mlvv9n1x named[25034]: zone xxx.xxx.209.in-addr.arpa/IN: transferred serial 200307xxxx
Oct  8 18:22:28 mlvv9n1x named[25034]: transfer of 'xxx.xxx.209.in-addr.arpa/IN' from xxx.xxx.xxx.132#53: end of transfer
Oct  8 18:22:28 mlvv9n1x named[25034]: zone xxx.xxx.209.in-addr.arpa/IN: sending notifies (serial 200307xxxx)
Oct  8 18:22:41 mlvv9n1x named[25034]: socket.c:1615: INSIST(!sock->pending_recv) failed
Oct  8 18:22:41 mlvv9n1x named[25034]: exiting (due to assertion failure)

Comment 4 RHEL Program Management 2008-10-31 16:38:24 UTC
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".

Comment 5 Bill Smargiassi 2008-12-04 17:07:53 UTC
RHEL 4.7 now. Same bind version which now appears to be up to date?

Yep, still happening:

Dec  3 17:30:34 mlvv9n1x named[3834]: socket.c:1615: INSIST(!sock->pending_recv) failed
Dec  3 17:30:34 mlvv9n1x named[3834]: exiting (due to assertion failure)

If I have to request a support ticket, then why do you provide open access to bugzilla? Please reevaluate this bug fix which has caused 2 BIND crashes on our primary internal name server.

Comment 6 Adam Tkac 2008-12-18 14:00:36 UTC
(In reply to comment #5)
> If I have to request a support ticket, then why do you provide open access to
> bugzilla? Please reevaluate this bug fix which has caused 2 BIND crashes on
> our primary internal name server.

Could you file a support ticket, please? Bugs with a corresponding support ticket are tracked with higher priority than bugs without one.

Direct access to bugzilla is mainly for people who don't use RH support.

Comment 20 Martin Cermak 2010-06-03 07:27:45 UTC
As this happens just occasionally, I'll stay in ON_QA and SanityOnly state.

Comment 21 Martin Prpič 2010-06-11 12:33:03 UTC
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
* under certain conditions, the named daemon could exit due to an assertion failure. The following message was logged to /var/log/messages: named: socket.c:1649: 

INSIST(!sock->pending_recv) failed named: exiting (due to assertion failure) 

This update provides a fix to the socket module which prevents this assertion from failing, thus resolving the problem.

Comment 22 Douglas Silas 2010-06-14 07:07:17 UTC
Technical note updated. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,4 +1,4 @@
-* under certain conditions, the named daemon could exit due to an assertion failure. The following message was logged to /var/log/messages: named: socket.c:1649: 
+Under certain conditions, the named daemon could exit due to an assertion failure. The following message was logged to /var/log/messages: named: socket.c:1649: 
 
 INSIST(!sock->pending_recv) failed named: exiting (due to assertion failure)

Comment 24 errata-xmlrpc 2011-02-16 14:04:15 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0223.html