RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 768702 - BIND will not shutdown or log after log rotation
Summary: BIND will not shutdown or log after log rotation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: bind
Version: 6.2
Hardware: All
OS: Linux
unspecified
high
Target Milestone: rc
: ---
Assignee: Adam Tkac
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-12-18 04:45 UTC by Daniel McNamara
Modified: 2012-01-25 03:16 UTC (History)
1 user (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-01-23 10:15:31 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Daniel McNamara 2011-12-18 04:45:46 UTC
Description of problem:

Seems the race condition from #623122 still exists and becomes very apparent after logrotation (although I have no evidence of a causation link).

After log rotation:

RNDC shows the service is still running:

rndc status
version: 9.7.3-P3-RedHat-9.7.3-8.P3.el6
CPUs found: 16
worker threads: 16
number of zones: 27
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 0/0/1000
tcp clients: 0/100
server is up and running

Queries will still be responded to (both recursive and locally hosted).

RNDC works - for example you can turn on querylogging

[root@machine ~]# rndc querylog
[root@machine ~]# rndc status
version: 9.7.3-P3-RedHat-9.7.3-8.P3.el6
CPUs found: 16
worker threads: 16
number of zones: 27
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is ON
recursive clients: 0/0/1000
tcp clients: 0/100
server is up and running


However the named service no longer writes to the log file. A query can run but it won't be logged.

A restart does not work:

[root@machine ~]# /etc/init.d/named restart
Stopping named: .............umount: /var/named/chroot/var/named: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
                                                           [FAILED]
Starting named: named: already running                     [  OK  ]

Although the named process is running 

[root@machine ~]# ps aux |grep named
named     6233  0.0  0.5 772756 30884 ?        Ssl  Dec12   0:05 /usr/sbin/named -u named -t /var/named/chroot

The service no longer responds to queries or rndc commands as the inet ports are no longer active.

Only way to restart the service is to force kill the named process

[root@machine ~]# kill -9 6233
[root@machine ~]# /etc/init.d/named restart
Stopping named:                                            [  OK  ]
Starting named:                                            [  OK  ]

Once done everything remounts properly and the service works properly again.

Redoing [root@machine ~]# rndc querylog

A quick dig check shows the query now logging correctly to the log file

dig @127.0.0.1 -t a www.redhat.com

From /var/log/messages:

Dec 18 04:42:33 machine named[25513]: client 127.0.0.1#36785: query: www.redhat.com IN A + (127.0.0.1)


I can only replicate this issue after the logrotations occur on Sunday morning. I'm assuming there's an issue with the chroot mounting of /var causing this lock bug.


Version-Release number of selected component (if applicable):

bind-utils-9.7.3-8.P3.el6.x86_64
bind-libs-9.7.3-8.P3.el6.x86_64
bind-9.7.3-8.P3.el6.x86_64
bind-chroot-9.7.3-8.P3.el6.x86_64
bind-devel-9.7.3-8.P3.el6.x86_64

Comment 2 Adam Tkac 2011-12-19 09:39:33 UTC
It seems this bug is mix of two issues.

-----
1. named stopped to write log messages:

I see you run named in chroot. May I ask you if you have rsyslogd configured correctly, please?

$ cat /etc/sysconfig/named
...
#       Don't forget to add "$AddUnixListenSocket /var/named/chroot/dev/log"
#       line to your /etc/rsyslog.conf file. Otherwise your logging becomes
#       broken when rsyslogd daemon is restarted (due update, for example).
...

-----
2. named hangs on shutdown

We will release update which fixes hang during shutdown soon. May I ask you to obtain backtrace when named hangs to confirm that the update fixes your issue as well, please?

1. install bind-debuginfo
2. `gdb attach <named_pid>`
3. inside gdb session run `t a a bt`
4. attach full output
-----

Thank you in advance!

Comment 3 Daniel McNamara 2011-12-19 09:51:28 UTC
Oops,

Seems I entirely missed that rsyslog configuration option. Adding that to the servers now and will check again after next rotation to make sure. For now it would indeed seem to be my misconfiguration mistake.

My apologies!

Comment 5 Adam Tkac 2011-12-20 10:02:24 UTC
We released bind update which should fix this issue yesterday:

http://rhn.redhat.com/errata/RHBA-2011-1836.html

Can you please update to bind-9.7.3-8.P3.el6_2.1 and check if named still hangs on shutdown? If yes, please attach backtrace as written in comment #2.

Thank you in advance!

Comment 6 Daniel McNamara 2011-12-20 10:14:43 UTC
The updated RPMs have been pushed to the servers. Should the hang issue reoccur within the next few days I will attach a backtrace as requested

Comment 7 Ondrej Vasik 2012-01-23 09:45:14 UTC
Any news here? Was that "bind hangs" issue fixed by the RHBA-2011-1836 update? TIA for answer.

Comment 8 Daniel McNamara 2012-01-23 09:53:02 UTC
Confirm this issue has not reappeared since RHBA-2011-1836 push

Comment 9 Adam Tkac 2012-01-23 10:15:31 UTC
(In reply to comment #8)
> Confirm this issue has not reappeared since RHBA-2011-1836 push

Thanks for feedback, closing.

This issue was fixed by the http://rhn.redhat.com/errata/RHBA-2011-1836.html update.


Note You need to log in before you can comment on or make changes to this bug.