Bug 154335 - named init script relies on rndc to stop
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: bind
Version: 3
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Jason Vas Dias
QA Contact: Ben Levenson
Duplicates: 163598
Reported: 2005-04-10 05:26 EDT by Allen Kistler
Modified: 2007-11-30 17:11 EST
Fixed In Version: 9.2.5-3
Doc Type: Bug Fix
Last Closed: 2005-08-19 06:14:23 EDT

Attachments
Patch against named init for stop action (642 bytes, patch)
2005-04-10 05:26 EDT, Allen Kistler
Patch against FC3 named init (1.15 KB, patch)
2005-05-21 22:12 EDT, Allen Kistler

Description Allen Kistler 2005-04-10 05:26:16 EDT
Description of problem:
Because the init script relies on rndc to stop named, if the control port is
closed, named cannot be stopped.

Version-Release number of selected component (if applicable):
bind-9.2.5-1

How reproducible:
Always

Steps to Reproduce:
1. Enter 'controls { };' in named.conf
2. /etc/init.d/named start
3. /etc/init.d/named stop
  
Actual results:
Script prints "Stopping named:" but neither prints "[OK]" nor actually stops the
process.

Expected results:
"Stopping named: [OK]" and "ps -ef | grep named" should verify named is not running.

Additional info:
The init script contains the lines
       /usr/sbin/rndc stop >/dev/null 2>&1
        RETVAL=$?
        [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/named || {
#               killproc named
#               Never do this! Can cause corrupt zone files!

The comment is true (as far as it goes).  killproc first tries SIGTERM, which
named ignores, then tries SIGKILL.  named will gracefully exit, however, if it
is sent a SIGINT or a SIGQUIT.  This behavior is documented in the man page.

I suggest replacing
  /usr/sbin/rndc stop >/dev/null 2>&1
with
  killproc named -QUIT
to terminate named gracefully without rndc.  I have attached a diff.

I suspect the same is true of the FC4-test script.  Actions other than stop
also rely on rndc.
Comment 1 Allen Kistler 2005-04-10 05:26:17 EDT
Created attachment 112919
Patch against named init for stop action
Comment 2 Jason Vas Dias 2005-04-10 14:46:44 EDT
Running named without rndc control is not a supported configuration.
Many named features (e.g. cache flush, cache dump, statistics, hot reload...)
are completely disabled without rndc control.
I will put in a fix to prevent use of rndc anywhere in the initscript
with a flag in /etc/sysconfig/named such as 'NO_RNDC=yes', but this
should not be the default.
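
A minimal sketch of how such a flag could be consulted, assuming the init script sources /etc/sysconfig/named before acting. NO_RNDC is the name floated above; SYSCONFIG and stop_method are hypothetical names used only for illustration, and this is not necessarily how the shipped script ended up working.

```shell
#!/bin/bash
# Sketch only: NO_RNDC is the flag name suggested in this comment;
# SYSCONFIG and stop_method are hypothetical names for illustration.
SYSCONFIG=${SYSCONFIG:-/etc/sysconfig/named}

# Load site settings if the file exists; otherwise keep the defaults.
if [ -r "$SYSCONFIG" ]; then
    . "$SYSCONFIG"
fi

# Print which stop mechanism would be chosen; rndc stays the default.
stop_method() {
    if [ "${NO_RNDC:-no}" = yes ]; then
        echo signal
    else
        echo rndc
    fi
}
```

With NO_RNDC unset or set to anything other than "yes", stop_method prints "rndc", so the default behaviour is unchanged.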
Comment 3 Jason Vas Dias 2005-05-17 22:36:04 EDT
This is now fixed with bind-9.3.1-4 in rawhide-2005-05-18 / FC4 -
named.init now falls back on using 'killproc -TERM named' if 
rndc stop fails. Will update bind-9.2.5 for FC3 shortly. 

Comment 4 Allen Kistler 2005-05-21 22:12:42 EDT
Created attachment 114681
Patch against FC3 named init

Changes based on FC4-test update
Comment 5 Allen Kistler 2005-05-21 22:14:32 EDT
Hmm...  Don't know why I said QUIT in the o.p.  Must have been a long
night....

I got the script from the FC4 rpm and tried the appropriate
differences on the FC3 script.  It appears to work okay.

I've attached a patch against the FC3 script for what I tested.
Comment 6 Trevor Cordes 2005-07-14 05:45:14 EDT
The fix appears to introduce a bug on my system.  If I "service named stop" it
reports an error, named DOES stop running, and the stale pid file doesn't get
rm'd.  If I then "service named start", it starts but complains about the
existing pid file.

I traced the problem to /etc/init.d/named in the stop() function.  $? appears to
get the value 1 no matter whether the rndc stop fails or succeeds.  I don't know
enough bash quirks, but the || && exceptions you guys put on the rndc stop line
appear to alter the $? value before RETVAL=$? gets assigned, causing RETVAL to
always be non-zero.

I'm sure I could patch this with some more lines and temp vars, but you bash
wizards can do a better job.  Strange how no one else has reported this yet.

bind-9.2.5-2
Comment 7 Jason Vas Dias 2005-07-14 10:57:31 EDT
I'm not seeing any problems like this, and other users that run with rndc
disabled have not reported any such problems.

Please check that you've upgraded to the latest bash-3.0.18 FC3 version.

When you start named, do you see log messages like 
  "named[...]: couldn't add command channel"
(i.e., is rndc disabled on your system)?
I've tried starting/stopping/restarting named with rndc enabled and disabled,
and cannot reproduce this problem. 

It would be most helpful in enabling me to reproduce this problem if I could
see your configuration files - please tar them up:
  # tar -cphf /tmp/named.conf.tar /etc/{named.conf,rndc.key,rndc.conf}
and either append /tmp/named.conf.tar to this bug report or email it to me:
jvdias@redhat.com.

In the next version, I will edit the named.init file for clarity:
--- /etc/named.init, line 109:
stop() {
        # Stop daemons.
        echo -n $"Stopping $prog: "
        # Ask named to shut down over the control channel first.
        /usr/sbin/rndc stop >/dev/null 2>&1
        RETVAL=$?
        # If named is still running, fall back to SIGTERM.
        if pidof named >/dev/null; then
            killproc named -TERM >/dev/null 2>&1
            RETVAL=$?
        fi
        if [ $RETVAL -eq 0 ]; then
            rm -f /var/lock/subsys/named
            rm -f /var/run/named.pid
        elif pidof named >/dev/null; then
            # First attempt failed and named is still alive: retry once.
            /usr/sbin/rndc stop >/dev/null 2>&1
            RETVAL=$?
            if pidof named >/dev/null; then
                killproc named -TERM >/dev/null 2>&1
                RETVAL=$?
            fi
            if [ $RETVAL -eq 0 ]; then
                rm -f /var/lock/subsys/named
                rm -f /var/run/named.pid
            fi
        fi
        if [ $RETVAL -eq 0 ]; then
            success
        else
            failure
        fi
        echo
        return $RETVAL
}
Comment 8 Trevor Cordes 2005-07-14 12:19:51 EDT
I have all the latest errata, but an older kernel 2.6.10-1.770_FC3 (SCSI bugs
keeping me from moving up so far).  Bash is bash-3.0-18.

rndc is enabled and tested working.  If I issue the /usr/sbin/rndc stop manually
it works fine (exit 0) and named is stopped.  I edited the rndc stop line in
/etc/init.d/named to read just:

"/usr/sbin/rndc stop >/dev/null 2>&1"

and now the script works 100% as expected.  Splitting up the command and echoing
$? and RETVAL for debugging I can easily show that when the || &&'s are in there
the RETVAL is always non-zero EVEN THOUGH rndc stop is succeeding!  With the
||&& taken out the retval is as expected.

It totally appears to me like some simple bash semantics problem.  I can't
understand why it would be unique to my system.

For testing, I just changed the line to read:

/usr/sbin/rndc stop >/dev/null 2>&1 || echo foo && echo bar

and now I get:

#service named stop
Stopping named: bar
                                                           [  OK  ]
#service named start
Starting named:                                            [  OK  ]

So it looks like the logic in your short-circuit || is incorrect.  Does bash do
C-style order of ops for || && short circuit?  Looks like you want something
that explicit parens in C would provide:  a || (b && c)

Like I said, I don't know bash well enough to know how it handles that.  Your
updated script should fix the problem fine, though I'd probably test RETVAL
before "if pidof named" as perhaps rndc can return before named is truly dead
(otherwise why the sleep 2 in the restart() function?).
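
The timing concern above (rndc returning before named has fully exited) could be handled by polling rather than a fixed sleep. A rough sketch, assuming the pid is known; wait_for_exit is a hypothetical helper, not something in the init script or in /etc/init.d/functions.

```shell
#!/bin/bash
# wait_for_exit PID [TRIES]
# Hypothetical helper: poll once per second until PID no longer exists.
# Returns 0 once the process is gone, 1 if it is still there after TRIES polls.
wait_for_exit() {
    local pid=$1 tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        # kill -0 sends no signal; it only checks that the pid can be signalled.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}
```

A stop action could call something like wait_for_exit "$pid" 5 after a successful rndc stop, and only fall back to a signal if it returns non-zero.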
Comment 9 Robert Nichols 2005-07-18 12:30:30 EDT
The problem is incorrect logic in the script line that reverts to killproc() if
rndc fails (line break added to avoid word wrap):

   /usr/sbin/rndc stop >/dev/null 2>&1 || pidof named >/dev/null && \
        killproc named -TERM >/dev/null 2>&1
   RETVAL=$?

The killproc call is executed if rndc succeeds OR if a process "named" exists. 
If rndc succeeds, pidof is never called, the named process no longer exists
and killproc fails, resulting in RETVAL=1.

I believe the script line should be:

   /usr/sbin/rndc stop >/dev/null 2>&1 || { pidof named >/dev/null && \
        killproc named -TERM >/dev/null 2>&1; }

which will execute the "pidof" and "killproc" commands only if rndc fails.

This really should be cleaned up, since this problem causes "service named
restart" to complain "ln: `/var/run/named.pid': File exists" .
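
The behaviour described here can be reproduced without named. In the shell, || and && have equal precedence and associate left to right, so "a || b && c" is evaluated as "(a || b) && c". The following is a minimal sketch; rndc_stop, named_alive and kill_named are hypothetical stand-ins for the real commands, set up to model the case where rndc stop succeeds and named has already exited.

```shell
#!/bin/bash
# Hypothetical stand-ins so the sketch runs anywhere.
rndc_stop()   { return 0; }                          # "rndc stop" succeeded
named_alive() { return 1; }                          # named has already exited
kill_named()  { echo "killproc called"; return 1; }  # nothing left to kill

# Ungrouped: parsed as (rndc_stop || named_alive) && kill_named,
# so kill_named runs even though rndc_stop succeeded.
ungrouped() {
    rndc_stop || named_alive && kill_named
}

# Grouped: the fallback is only reached when rndc_stop fails.
grouped() {
    rndc_stop || { named_alive && kill_named; }
}

ungrouped || echo "ungrouped returned $?"
grouped   && echo "grouped returned 0"
```

The ungrouped form calls the fallback and returns its failure status, which matches the non-zero RETVAL seen in the init script; the grouped form returns 0.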
Comment 10 Trevor Cordes 2005-07-18 12:33:32 EDT
Thought so.  Exactly my point.  You sure the solution is curly braces instead of
parens?  (As I said, I'm no bash expert, but at least this is an easy one to fix.)
Comment 11 Robert Nichols 2005-07-18 13:04:39 EDT
Parentheses would work too, but at the expense of spawning a subprocess.
The curly braces just group the commands within the current process.
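
A small sketch of that difference, using variable assignment to make it visible. The variable names are hypothetical and exist only for the demonstration.

```shell
#!/bin/bash
# Braces group commands in the current shell, so the assignment persists.
brace_result=unset
true && { brace_result=set; }

# Parentheses run the group in a subshell, so the assignment is lost
# when the subshell exits.
paren_result=unset
true && ( paren_result=set )

echo "braces: $brace_result"
echo "parens: $paren_result"
```

For the init script either form yields the same exit status, since RETVAL=$? is read after the group finishes; the braces simply avoid the extra process.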
Comment 12 Jason Vas Dias 2005-07-19 12:52:47 EDT
This bug is now fixed with bind-9.2.5-3 .
Comment 13 Andy Shevchenko 2005-07-20 05:13:50 EDT
*** Bug 163598 has been marked as a duplicate of this bug. ***
Comment 14 Walter Justen 2005-08-19 06:14:23 EDT
The updated package has been published.
