Bug 154335

Summary: named init script relies on rndc to stop
Product: [Fedora] Fedora
Reporter: Allen Kistler <ackistler>
Component: bind
Assignee: Jason Vas Dias <jvdias>
Status: CLOSED CURRENTRELEASE
QA Contact: Ben Levenson <benl>
Severity: medium
Priority: medium
Version: 3
CC: andy, rnichols42, trevor
Hardware: All
OS: Linux
Fixed In Version: 9.2.5-3
Doc Type: Bug Fix
Last Closed: 2005-08-19 10:14:23 UTC
Attachments:
Description                                 Flags
Patch against named init for stop action    none
Patch against FC3 named init                none

Description Allen Kistler 2005-04-10 09:26:16 UTC
Description of problem:
Because the init script relies on rndc to stop named, if the control port is
closed, named cannot be stopped.

Version-Release number of selected component (if applicable):
bind-9.2.5-1

How reproducible:
Always

Steps to Reproduce:
1. Enter 'controls { };' in named.conf
2. /etc/init.d/named start
3. /etc/init.d/named stop
  
Actual results:
Script prints "Stopping named:" but neither prints "[OK]" nor actually stops the
process.

Expected results:
"Stopping named: [OK]" and "ps -ef | grep named" should verify named is not running.

Additional info:
The init script contains the lines
        /usr/sbin/rndc stop >/dev/null 2>&1
        RETVAL=$?
        [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/named || {
#               killproc named
#               Never do this! Can cause corrupt zone files!

The comment is true (as far as it goes).  killproc first tries SIGTERM, which
named ignores, then tries SIGKILL.  named will gracefully exit, however, if it
is sent a SIGINT or a SIGQUIT.  This behavior is documented in the man page.

I suggest replacing
  /usr/sbin/rndc stop >/dev/null 2>&1
with
  killproc named -QUIT
to terminate named gracefully without rndc.  I have attached a diff.

I suspect the same is true of the FC4-test script.  Actions other than stop
also rely on rndc.

Comment 1 Allen Kistler 2005-04-10 09:26:17 UTC
Created attachment 112919 [details]
Patch against named init for stop action

Comment 2 Jason Vas Dias 2005-04-10 18:46:44 UTC
Running named without rndc control is not a supported configuration.
Many named features (eg. cache flush, cache dump, statistics, hot reload...)
are completely disabled without rndc control.
I will put in a fix to prevent use of rndc anywhere in the initscript
with a flag in /etc/sysconfig/named such as 'NO_RNDC=yes', but this
should not be the default.


Comment 3 Jason Vas Dias 2005-05-18 02:36:04 UTC
This is now fixed with bind-9.3.1-4 in rawhide-2005-05-18 / FC4 -
named.init now falls back on using 'killproc -TERM named' if 
rndc stop fails. Will update bind-9.2.5 for FC3 shortly. 
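
For readers following along, the fallback described here can be sketched as below. This is only an illustration, not the code shipped in named.init: graceful_stop and signal_stop are hypothetical stand-ins for "rndc stop" and "killproc -TERM named", so the sketch runs without named installed.

```shell
#!/bin/sh
# Hypothetical stand-ins (assumptions, not part of the real init script).
graceful_stop() { return "${GRACEFUL_STATUS:-1}"; }           # plays the role of "rndc stop"
signal_stop()   { echo "fallback: sending TERM"; return 0; }  # plays the role of "killproc -TERM named"

stop_service() {
    # Try the control channel first.
    if graceful_stop; then
        return 0
    fi
    # Control channel unavailable: fall back to a signal.
    # The function's exit status is that of the fallback.
    signal_stop
}

GRACEFUL_STATUS=1   # simulate a closed control port
stop_service
RETVAL=$?
echo "RETVAL=$RETVAL"
```

Writing the fallback as an explicit if keeps the exit status easy to reason about, since the fallback can only run when the first attempt has failed.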



Comment 4 Allen Kistler 2005-05-22 02:12:42 UTC
Created attachment 114681 [details]
Patch against FC3 named init

Changes based on FC4-test update

Comment 5 Allen Kistler 2005-05-22 02:14:32 UTC
Hmm...  Don't know why I said QUIT in the o.p.  Must have been a long
night....

I got the script from the FC4 rpm and tried the appropriate
differences on the FC3 script.  It appears to work okay.

I've attached a patch against the FC3 script for what I tested.

Comment 6 Trevor Cordes 2005-07-14 09:45:14 UTC
The fix appears to introduce a bug on my system.  If I "service named stop" it
reports an error, named DOES stop running, and the stale pid file doesn't get
rm'd.  If I "service named start" then it starts but it complains about the
existing pid file.

I traced the problem to /etc/init.d/named in the stop() function.  $? appears to
get the value 1 no matter whether the rndc stop fails or succeeds.  I don't know
enough bash quirks, but the || && exceptions you guys put on the rndc stop line
appear to alter the $? value before RETVAL=$? gets assigned, causing RETVAL to
always be non-zero.

I'm sure I could patch this with some more lines and temp vars, but you bash
wizards can do a better job.  Strange how no one else has reported this yet.

bind-9.2.5-2


Comment 7 Jason Vas Dias 2005-07-14 14:57:31 UTC
I'm not seeing any problems like this, and other users that run with rndc
disabled have not reported any such problems.

Please check that you've upgraded to the latest bash-3.0.18 FC3 version.

When you start named, do you see log messages like
  "named[...]: couldn't add command channel"
(i.e., is rndc disabled on your system)?
I've tried starting/stopping/restarting named with rndc enabled and disabled,
and cannot reproduce this problem. 

It would be most helpful in enabling me to reproduce this problem if I could
see your configuration files - please tar them up:
  # tar -cphf /tmp/named.conf.tar /etc/{named.conf,rndc.key,rndc.conf}
and either append /tmp/named.conf.tar to this bug report or email it to me:
jvdias.

In the next version, I will edit the named.init file for clarity:
--- /etc/named.init, line 109:
stop() {
        # Stop daemons.
        echo -n $"Stopping $prog: "
        # Ask named to shut down through the control channel first.
        /usr/sbin/rndc stop >/dev/null 2>&1
        RETVAL=$?
        # If named is still running, fall back to SIGTERM.
        if pidof named >/dev/null; then
            killproc named -TERM >/dev/null 2>&1
            RETVAL=$?
        fi
        if [ $RETVAL -eq 0 ]; then
            rm -f /var/lock/subsys/named
            rm -f /var/run/named.pid
        elif pidof named >/dev/null; then
            # First attempt failed and named is still alive: retry once.
            /usr/sbin/rndc stop >/dev/null 2>&1
            RETVAL=$?
            if pidof named >/dev/null; then
                killproc named -TERM >/dev/null 2>&1
                RETVAL=$?
            fi
            if [ $RETVAL -eq 0 ]; then
                rm -f /var/lock/subsys/named
                rm -f /var/run/named.pid
            fi
        fi
        if [ $RETVAL -eq 0 ]; then
            success
        else
            failure
        fi
        echo
        return $RETVAL
}


Comment 8 Trevor Cordes 2005-07-14 16:19:51 UTC
I have all the latest errata, but an older kernel 2.6.10-1.770_FC3 (SCSI bugs
keeping me from moving up so far).  Bash is bash-3.0-18.

rndc is enabled and tested working.  If I issue the /usr/sbin/rndc stop manually
it works fine (exit 0) and named is stopped.  I edited the rndc stop line in
/etc/init.d/named to read just:

"/usr/sbin/rndc stop >/dev/null 2>&1"

and now the script works 100% as expected.  Splitting up the command and echoing
$? and RETVAL for debugging I can easily show that when the || &&'s are in there
the RETVAL is always non-zero EVEN THOUGH rndc stop is succeeding!  With the
||&& taken out the retval is as expected.

It totally appears to me like some simple bash semantics problem.  I can't
understand why it would be unique to my system.

For testing, I just changed the line to read:

/usr/sbin/rndc stop >/dev/null 2>&1 || echo foo && echo bar

and now I get:

#service named stop
Stopping named: bar
                                                           [  OK  ]
#service named start
Starting named:                                            [  OK  ]

So it looks like the logic in your short-circuit || is incorrect.  Does bash do
C-style order of ops for || && short circuit?  Looks like you want something
that explicit parens in C would provide:  a || (b && c)

Like I said, I don't know bash well enough to know how it handles that.  Your
updated script should fix the problem fine, though I'd probably test RETVAL
before "if pidof named" as perhaps rndc can return before named is truly dead
(otherwise why the sleep 2 in the restart() function?).


Comment 9 Robert Nichols 2005-07-18 16:30:30 UTC
The problem is incorrect logic in the script line that reverts to killproc() if
rndc fails (line break added to avoid word wrap):

   /usr/sbin/rndc stop >/dev/null 2>&1 || pidof named >/dev/null && \
        killproc named -TERM >/dev/null 2>&1
   RETVAL=$?

The killproc call is executed if rndc succeeds OR if a process "named" exists. 
If rndc succeeds, pidof is never called, the named process no longer exists
and killproc fails, resulting in RETVAL=1.

I believe the script line should be:

   /usr/sbin/rndc stop >/dev/null 2>&1 || { pidof named >/dev/null && \
        killproc named -TERM >/dev/null 2>&1; }

which will execute the "pidof" and "killproc" commands only if rndc fails.

This really should be cleaned up, since this problem causes "service named
restart" to complain "ln: `/var/run/named.pid': File exists".
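
The parsing issue above can be reproduced without named. In the sketch below, rndc_stop, named_alive and kill_named are hypothetical stubs standing in for "rndc stop", "pidof named" and "killproc named -TERM". The stub kill_named always fails, which is an assumption modelling killproc finding no process to signal.

```shell
#!/bin/sh
# Hypothetical stubs (assumptions, not the real commands).
rndc_stop()   { return "$RNDC_STATUS"; }              # pretends to be "rndc stop"
named_alive() { return "$ALIVE_STATUS"; }             # pretends to be "pidof named"
kill_named()  { echo "killproc called"; return 1; }   # fails: nothing left to kill

RNDC_STATUS=0    # rndc succeeds
ALIVE_STATUS=1   # named is already gone

# Ungrouped: || and && have equal precedence and associate left to right,
# so this is parsed as (rndc_stop || named_alive) && kill_named.
rndc_stop || named_alive && kill_named
ungrouped=$?

# Grouped: kill_named can only run when rndc_stop fails.
rndc_stop || { named_alive && kill_named; }
grouped=$?

echo "ungrouped=$ungrouped grouped=$grouped"
```

With these stubs the ungrouped form calls kill_named even though the stop succeeded and ends with a non-zero status, while the grouped form skips it and ends with status 0.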

Comment 10 Trevor Cordes 2005-07-18 16:33:32 UTC
Thought so.  Exactly my point.  You sure the solution is curly braces instead of
parens?  (As I said, I'm no bash expert, but at least this is an easy one to fix.)

Comment 11 Robert Nichols 2005-07-18 17:04:39 UTC
Parentheses would work too, but at the expense of spawning a subprocess.
The curly braces just group the commands within the current process.
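
A small sketch of that difference, using only shell builtins: an assignment made inside parentheses happens in a subshell and is lost, while one made inside braces persists in the current shell. The variable name RETVAL is reused here only for illustration.

```shell
#!/bin/sh
RETVAL=0

# Parentheses run the list in a subshell: the assignment does not survive.
false || ( RETVAL=1 )
after_parens=$RETVAL

# Braces run the list in the current shell: the assignment persists.
false || { RETVAL=1; }
after_braces=$RETVAL

echo "after_parens=$after_parens after_braces=$after_braces"
```

For the init script line either grouping gives the same exit status, so the choice mainly affects whether an extra process is spawned and whether variables set inside the group remain visible afterwards.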

Comment 12 Jason Vas Dias 2005-07-19 16:52:47 UTC
This bug is now fixed with bind-9.2.5-3.

Comment 13 Andy Shevchenko 2005-07-20 09:13:50 UTC
*** Bug 163598 has been marked as a duplicate of this bug. ***

Comment 14 Walter Justen 2005-08-19 10:14:23 UTC
The updated package has been published.