Bug 850944 - "service dnsmasq restart" (or dnsmasq package update) kills all instances of dnsmasq on system, including those started by libvirtd
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: dnsmasq
Version: 6.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Tomáš Hozza
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks: 895654
Reported: 2012-08-22 20:22 UTC by Laine Stump
Modified: 2017-09-30 03:07 UTC

Fixed In Version: dnsmasq-2.48-13.el6
Doc Type: Bug Fix
Doc Text:
Cause: The dnsmasq init script contained a mistake in its "stop", "restart" and "condrestart" commands: their IF statements used the "pidof" binary instead of the "pidfileofproc" function.
Consequence: If dnsmasq instances other than the system one started by the initscript were running, repeated calls to "service dnsmasq stop/restart" killed all running dnsmasq instances, including those not started by the initscript.
Fix: The "pidof" binary was replaced with the "pidfileofproc" function in the IF statements of the init script's "stop", "restart" and "condrestart" commands.
Result: If dnsmasq instances other than the system one are running, calling "service dnsmasq stop/restart" stops or restarts only the system instance.
Clone Of:
Environment:
Last Closed: 2013-02-21 10:44:50 UTC
Target Upstream Version:
Embargoed:


Attachments
fix dnsmasq initscript to only recognize/act on self-initiated dnsmasq instances (1.07 KB, patch)
2012-11-13 18:51 UTC, Laine Stump
[PATCH] fix dnsmasq initscript to only recognize/act on self-initiated dnsmasq instances (1.37 KB, patch)
2012-11-19 15:33 UTC, Tomáš Hozza
[PATCH] fix dnsmasq initscript to only recognize/act on self-initiated dnsmasq instances (1.30 KB, patch)
2012-12-19 15:02 UTC, Tomáš Hozza


Links
System: Red Hat Product Errata
ID: RHSA-2013:0277
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: dnsmasq security, bug fix and enhancement update
Last Updated: 2013-02-20 21:28:47 UTC

Description Laine Stump 2012-08-22 20:22:24 UTC
Description of problem:

On a system with the dnsmasq service disabled, if you run "service dnsmasq restart" all instances of dnsmasq started by libvirtd are killed.

Version-Release number of selected component (if applicable):

dnsmasq-2.48-6.el6.x86_64

How reproducible: 100%



Steps to Reproduce:
1. install libvirt and start libvirtd
2. ps -AlF | grep dnsmasq - notice the big long dnsmasq commandline
3. service dnsmasq restart
4. ps -AlF | grep dnsmasq - notice the big long dnsmasq command is GONE!
  
This happens because /etc/init.d/dnsmasq calls the killproc() function (from /etc/init.d/functions) with just the name of the process, i.e. "killproc dnsmasq".

Instead, it should find the pidfile of the system instance of dnsmasq, and use "killproc -p ${pidfile}".
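The pidfile-based stop suggested here can be sketched in plain POSIX shell. This is only an illustration of the idea, not the initscript or the real killproc: the helper names (pid_from_pidfile, stop_by_pidfile), the temporary pidfile and the "sleep" stand-in processes are all assumptions made for the demo.

```sh
# Sketch only: hypothetical stand-ins, not the /etc/init.d/functions helpers.

# Print the PID recorded in a pidfile, but only if that process is alive.
pid_from_pidfile() {
    [ -r "$1" ] || return 1
    pid=$(head -n 1 "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null || return 1
    printf '%s\n' "$pid"
}

# Stop only the instance named by the pidfile; never fall back to the name.
stop_by_pidfile() {
    pid=$(pid_from_pidfile "$1") || return 0   # nothing of ours is running
    kill "$pid"
}

# Demo: two processes with the same name, only one recorded in the pidfile.
demo_dir=$(mktemp -d)
sleep 60 &
system_pid=$!
sleep 60 &
other_pid=$!
echo "$system_pid" > "$demo_dir/demo.pid"

stop_by_pidfile "$demo_dir/demo.pid"
wait "$system_pid" 2>/dev/null || true     # reap the stopped process

if kill -0 "$system_pid" 2>/dev/null; then system_alive=yes; else system_alive=no; fi
if kill -0 "$other_pid" 2>/dev/null; then other_alive=yes; else other_alive=no; fi

# Clean up the survivor and the temporary directory.
kill "$other_pid"
wait "$other_pid" 2>/dev/null || true
rm -rf "$demo_dir"
```

In the demo, only the process recorded in the pidfile is signalled; the second "sleep", which plays the role of a libvirt-started instance, is left alone until the cleanup step.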

This is also a problem in Fedora 16 and older. It's not an issue in F17 and newer, since they use systemd, and the service initialization is being handled differently.

Bug 850606 is reporting the same behavior on F16.

Comment 2 Laine Stump 2012-11-13 18:47:35 UTC
It's worse than the initial description portrays - it turns out that if the dnsmasq package is updated (e.g. by "yum update"), all libvirt-initiated dnsmasq processes are summarily killed, and a system-wide dnsmasq process using the "default" dnsmasq configuration (which listens on all interfaces) is started. This prevents libvirt networks from being restarted (their dnsmasq fails to start because it finds the dhcp/dns ports on the interface "busy").

The reason for this is similar to the initial description - the dnsmasq specfile's %post script calls "service dnsmasq condrestart" which is supposed to restart the dnsmasq service *only if it is already running*. But instead of checking for a system-service initiated dnsmasq process, dnsmasq's initscript simply calls "pidof dnsmasq" (which looks for *any* dnsmasq processes), sees that there is at least one, and uses that as an indication to "killproc dnsmasq" then start the system instance of dnsmasq.

What dnsmasq's initscript *should* do is use "pidfileofproc dnsmasq" instead of "pidof dnsmasq" *in all cases*, and terminate/restart dnsmasq only if the system-service instance of dnsmasq is running.
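A minimal sketch of that condrestart rule, assuming a hypothetical helper (system_instance_running) that consults only the pidfile. It prints the decision instead of restarting anything, and the pidfile path is a temporary file created for the demo, not /var/run/dnsmasq.pid.

```sh
# Hypothetical helper: succeed only if the pidfile names a live process.
system_instance_running() {
    [ -r "$1" ] || return 1
    pid=$(head -n 1 "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# condrestart: act only when the initscript's own instance is running.
condrestart() {
    if system_instance_running "$1"; then
        echo "restart"      # a real script would call stop, then start
    else
        echo "skip"         # other same-named processes are not our business
    fi
}

demo_dir=$(mktemp -d)

# No pidfile yet: nothing was started by "the service", so do nothing.
without_pidfile=$(condrestart "$demo_dir/demo.pid")

# Pretend this shell is the service instance by recording its own PID.
echo "$$" > "$demo_dir/demo.pid"
with_pidfile=$(condrestart "$demo_dir/demo.pid")

rm -rf "$demo_dir"
```

The decision depends only on the pidfile, so unrelated processes that happen to share the service's name can never trigger a restart.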

I found a report of this same bug against Fedora 11/12 all the way back in December of 2009, with a (partial) patch included: Bug 547605.

I just made and tested a complete patch (and am attaching it to this bug) for the current RHEL6 dnsmasq initscript file which fixes all the problems outlined in this bug:

1) "service dnsmasq stop|restart|condrestart" will only stop dnsmasq processes
   that were originally started by the dnsmasq initscript.

2) "service dnsmasq condrestart" will only start the system dnsmasq if an initscript-initiated dnsmasq was already running

3) "service dnsmasq reload" will only send SIGHUP to the dnsmasq process started
   by the dnsmasq initscript.

To prevent unexpected disruption to libvirt users during an update, this patch should at the very least be included with the next update of dnsmasq, as it puts libvirt's networks in a state that requires undocumented manual intervention (or a host reboot) to escape from.

Comment 3 Laine Stump 2012-11-13 18:51:36 UTC
Created attachment 644338
fix dnsmasq initscript to only recognize/act on self-initiated dnsmasq instances

Comment 4 Tomáš Hozza 2012-11-19 15:33:01 UTC
Created attachment 647838
[PATCH] fix dnsmasq initscript to only recognize/act on self-initiated dnsmasq instances

Corrected version of the initial patch.

Comment 5 Laine Stump 2012-12-18 16:32:33 UTC
Since there will now be a dnsmasq update for RHEL6.4, this patch should be included (see the final paragraph of Comment 2).

Comment 6 Laine Stump 2012-12-18 16:48:50 UTC
Also, I'm not sure what the reason was for your change to the original patch I posted. "pidfileofproc" isn't a binary in /sbin, it is a shell function that is defined in /etc/init.d/functions (sourced right at the beginning of /etc/init.d/dnsmasq), and anyway you only changed one instance of that name in the patch, leaving the other 4 occurrences untouched. I believe the original patch is the correct patch to apply.

Comment 7 Tomáš Hozza 2012-12-19 08:04:43 UTC
(In reply to comment #6)
> Also, I'm not sure what the reason was for your change to the original patch
> I posted. "pidfileofproc" isn't a binary in /sbin, it is a shell function
> that is defined in /etc/init.d/functions (sourced right at the beginning of
> /etc/init.d/dnsmasq), and anyway you only changed one instance of that name
> in the patch, leaving the other 4 occurrences untouched. I believe the
> original patch is the correct patch to apply.

Yes, you are right that "pidfileofproc" isn't a binary in /sbin.

In the *original patch* you used:
-	    if test "x`/sbin/pidof dnsmasq`" != x; then
+	    if test "x`/sbin/pidfileofproc dnsmasq`" != x; then

Since "pidfileofproc" isn't a binary in /sbin (as I knew, and as you stated in Comment #6), I corrected it to:
-	    if test "x`/sbin/pidof dnsmasq`" != x; then
+	    if test "x`pidfileofproc dnsmasq`" != x; then
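Why the path prefix matters can be shown with a small sketch. The function name lookup_pid and the value it prints are made up for illustration; the point is only that a shell function is found by bare name and never through a filesystem path such as /sbin/.

```sh
# Hypothetical stand-in for a helper that a sourced file would define.
lookup_pid() { echo 4242; }

# Called by bare name, the shell finds the function.
by_name=$(lookup_pid)

# Called through a path, the shell looks only on disk and finds nothing
# (assuming no file named /sbin/lookup_pid exists on this system).
by_path_status=0
/sbin/lookup_pid >/dev/null 2>&1 || by_path_status=$?

# command -v prints the bare name for a function, a full path for a binary.
kind=$(command -v lookup_pid)

# The failed lookup yields empty output, so an "x..." != x style test
# would always compare "x" with "x" and never see a running instance.
guard="x$(/sbin/lookup_pid 2>/dev/null)" || true
```

Under these assumptions the path-prefixed call fails with the shell's "command not found" status, and the guard string stays "x", which is why the condition has to call the function by its bare name.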

Comment 9 RHEL Program Management 2012-12-19 11:01:32 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 11 Laine Stump 2012-12-19 12:54:06 UTC
Ah, right you are! I was looking at the patchfiles backwards for some reason. Sorry for the noise :-/

Comment 12 Tomáš Hozza 2012-12-19 15:02:41 UTC
Created attachment 666124
[PATCH] fix dnsmasq initscript to only recognize/act on self-initiated dnsmasq instances

This is the final patch I used. I changed "kill" back to "killproc" after consulting with a colleague who maintains init scripts. killproc does much more than just kill, and with plain kill the initscript output was incomplete: there was no [ OK ] or [ FAILED ].

After inspection, the main problem turned out to be the condition in the "stop" part of the initscript:

-        if test "x`pidof dnsmasq`" != x; then
+        if test "x`pidfileofproc dnsmasq`" != x; then

I tested the final patch and everything worked as it should. Calling "service dnsmasq stop" killed just the system instance with the PID in /var/run/dnsmasq.pid.

Comment 15 Laine Stump 2012-12-19 16:16:39 UTC
Ah, now I have a better understanding of what was causing libvirt's dnsmasq processes to be erroneously killed - it's not the mere act of using "killproc dnsmasq", but in doing that when there is currently no running system instance of dnsmasq.

If there is a system instance of dnsmasq running, killproc will kill only that process. But if no system instance can be found, "killproc dnsmasq" behaves identically to "killall dnsmasq", so all of libvirt's dnsmasq instances are killed.
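The behaviour described in this comment can be captured in a toy model. It is not the killproc source: it works on made-up PID numbers passed as arguments, and the function names are hypothetical. It only mirrors the two cases above and shows what the pidfile guard changes.

```sh
# Toy model only. Arguments: the PID from the pidfile (may be empty),
# followed by every PID whose process name matches.

# Without a guard: use the pidfile PID if there is one, else every match.
targets_without_guard() {
    pidfile_pid=$1
    shift
    if [ -n "$pidfile_pid" ]; then
        echo "$pidfile_pid"
    else
        echo "$*"           # no system instance: every match is hit
    fi
}

# With the guard: an empty pidfile PID means there is nothing to stop.
targets_with_guard() {
    pidfile_pid=$1
    shift
    [ -n "$pidfile_pid" ] && echo "$pidfile_pid"
    return 0
}

# 100 = system instance; 200 and 300 = instances started by libvirtd.
system_running=$(targets_without_guard 100 100 200 300)
system_absent=$(targets_without_guard "" 200 300)
guarded_absent=$(targets_with_guard "" 200 300)
```

With a system instance present both variants select only PID 100. When it is absent, the unguarded variant selects 200 and 300, while the guarded one selects nothing.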

I've tested dnsmasq-2.48-12 on a system with libvirt dnsmasq instances running, and all start/stop/restart operations are now behaving properly. Thanks!

Comment 17 errata-xmlrpc 2013-02-21 10:44:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0277.html

