Bug 584356
Field | Value |
---|---|
Summary: | Bind fails with assertion |
Product: | Red Hat Enterprise Linux 5 |
Component: | bind |
Version: | 5.4 |
Status: | CLOSED INSUFFICIENT_DATA |
Severity: | high |
Priority: | medium |
Reporter: | Daniel Senie <dts> |
Assignee: | Adam Tkac <atkac> |
QA Contact: | qe-baseos-daemons |
CC: | dts, fdewaley, ovasik, stbulicek |
Target Milestone: | rc |
Hardware: | All |
OS: | Linux |
Doc Type: | Bug Fix |
Last Closed: | 2013-03-12 17:03:44 UTC |
Bug Blocks: | 798457, 743405 |
Description
Daniel Senie
2010-04-21 13:28:41 UTC
Would it be possible to attach a backtrace, please? You can obtain it this way:

1. add ENABLE_ZONE_WRITE=yes to your /etc/sysconfig/named
2. run "setsebool named_write_master_zones 1"
3. run "service named restart" (do NOT run "killall -HUP named" or "rndc reload") and wait for a crash. There should be a new file in the /var/named directory, called core.XXXX.
4. install the bind-debuginfo package (http://kbase.redhat.com/faq/docs/DOC-9908)
5. run "gdb /usr/sbin/named /var/named/core.XXXX"
6. inside the gdb session run "t a a bt full"
7. attach the gdb output

Make sure you attach the gdb output as a "private" attachment if it contains any security-sensitive information. Thank you in advance.

We have experienced this several times now. It takes at least several days for it to break. Attaching gdb session output separately.

Created attachment 416029 [details]
gdb output of named core
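Steps 5 and 6 of the backtrace instructions can also be run without an interactive gdb session. The following is a minimal sketch, not something posted in this report. It assumes gdb's `-batch` and `-ex` options and spells out `t a a bt full` as `thread apply all bt full`. The helper names `newest_core` and `gdb_backtrace_cmd` and the output file name are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sketch only: helper names and the output file name are illustrative assumptions.

# Print the most recently modified core.* file in the given directory.
# Prints nothing and returns non-zero if the directory holds no core file.
newest_core() {
    dir=$1
    newest=
    for f in "$dir"/core.*; do
        [ -f "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Print the gdb command line that dumps a full backtrace of every thread.
gdb_backtrace_cmd() {
    binary=$1
    core=$2
    printf 'gdb -batch -ex "thread apply all bt full" %s %s\n' "$binary" "$core"
}

# Example (commented out so the file can be sourced safely):
# core=$(newest_core /var/named) &&
#     eval "$(gdb_backtrace_cmd /usr/sbin/named "$core")" > named-backtrace.txt 2>&1
```

Printing the command instead of running it makes it easy to check before it touches a production name server. The resulting text file is what would be attached to the bug.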
Another languishing bug in a critical service. We are moving our name servers away from RedHat to other distros, so as to get versions of BIND that don't crash. We've had a cron job checking to make sure BIND is running, and kicking it when it's not, as a workaround since May. It's kind of critical and mission-critical to have one's name servers actually running and reliable. Guess RedHat disagrees. Oh well.

This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

This request was erroneously denied for the current release of Red Hat Enterprise Linux. The error has been fixed and this request has been re-proposed for the current release.

Eleven months since reported, and we're still running a Perl script every 5 minutes to ensure BIND is running. That script continues to save us regularly, a few times a month now, when BIND falls over dead. We've been migrating much of our DNS to non-RedHat systems to find functional stability.

(In reply to comment #3)
> Created attachment 416029 [details]
> gdb output of named core

Unfortunately this backtrace is not sufficient to fix this issue. Can you please try to get more information this way?

1. Put the following in your named.conf:

    logging {
        channel default_debug {
            file "data/named.run" versions 3 size 1m;
            print-category yes;
            severity debug 99;
        };
    };

2. Put OPTIONS='-d99' in /etc/sysconfig/named
3. Restart named and, when it crashes, please attach the /var/named/data/named.run* files.

Thank you in advance.

Happy to add some more debugging output, and pleased that there is finally some interest in getting to the bottom of this.
However, there's one challenge. Since I have a script that restarts the daemon automatically when it falls over, I will need to add to that kick script something that saves off the files you want, or else they'll get stomped out of existence. I wanted to ask if there's an alternative, such as a way to specify the file name with a unique ID (e.g. PID) in the third line of the example above. If not, I'll see what I can do in the Perl code to save off the file.

Because this is a mission-critical service, having the daemon just not be running until I get a chance to copy the data file out of the way and kicking the thing manually is NOT an option.

(In reply to comment #12)
> Happy to add some more debugging output, and pleased that there is finally some interest in getting to the bottom of this. However, there's one challenge. Since I have a script that restarts the daemon automatically when it falls over, I will need to add to that kick script something that saves off the files you want, or else they'll get stomped out of existence. I wanted to ask if there's an alternative, such as a way to specify the file name with a unique ID (e.g. PID) in the third line of the example above. If not, I'll see what I can do in the Perl code to save off the file.
>
> Because this is a mission-critical service, having the daemon just not be running until I get a chance to copy the data file out of the way and kicking the thing manually is NOT an option.

Unfortunately there is no way to append the PID to the debug log file names. Your script must be extended a little, for example this way (note I haven't tested the code below). Put something like this right before you restart the crashed named:

    #!/usr/bin/perl
    # Create /debuglogfiles if it is missing, then copy the debug logs there.
    system('if ! [ -d /debuglogfiles ]; then mkdir /debuglogfiles; fi; cp /var/named/data/named.run* /debuglogfiles');

Then simply attach the named.run* files.
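Because the log file name itself cannot carry a PID, one way to keep the logs from separate crashes apart is to copy them into a fresh, uniquely named directory before each restart. The following is a minimal shell sketch of that idea and has not been tested against a real named installation. The function name `save_named_logs` and the timestamp-plus-PID directory naming are hypothetical choices made for illustration. The paths in the usage comment are the ones already mentioned in this thread.

```shell
#!/bin/sh
# Sketch only: save_named_logs and its naming scheme are illustrative assumptions.

# Copy every named.run* file from log_dir into a new, uniquely named
# subdirectory of archive_root, and print that subdirectory's path.
save_named_logs() {
    log_dir=$1
    archive_root=$2
    # A UTC timestamp plus the calling shell's PID keeps successive crashes apart.
    dest="$archive_root/$(date -u +%Y%m%dT%H%M%SZ)-$$"
    mkdir -p "$dest" || return 1
    for f in "$log_dir"/named.run*; do
        [ -f "$f" ] || continue
        cp -p "$f" "$dest/" || return 1
    done
    printf '%s\n' "$dest"
}

# Example use in a watchdog, right before restarting a crashed named:
# save_named_logs /var/named/data /debuglogfiles && service named restart
```

The files are copied rather than moved, so whatever named later does with its own log files does not affect the archived copies. Two calls from the same shell within the same second would share a directory, which should not matter for a check that runs every few minutes.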
This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

Last I checked, Bind was critical infrastructure, and that's what this bug, reported 13 months ago, is all about. I am grateful RedHat has stated this policy today that it will not fix this bug in critical infrastructure. It reconfirms a strategy we are undertaking to move all servers running critical components such as DNS away from RedHat products and to another distribution with a vendor that is doing a far better job of fixing bugs of this sort. We started with RedHat with RHL 2.1, a great many years ago when the company was small. Sorry to say goodbye after all this time.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.

This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

Since there was no response to the needinfo request for more than 18 months, I'm closing this issue.