Bug 1502081 - dnsmasq fails to start on boot
Summary: dnsmasq fails to start on boot
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: dnsmasq
Version: 26
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Petr Menšík
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-10-14 00:10 UTC by Louis van Dyk
Modified: 2021-07-21 19:48 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-11-15 15:15:05 UTC
Type: Bug
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Debian BTS 774970 0 None None None 2017-10-31 11:10:44 UTC
Launchpad 1531184 0 None None None 2017-10-31 11:12:58 UTC

Description Louis van Dyk 2017-10-14 00:10:56 UTC
Description of problem:
When booting up, dnsmasq fails to start because of the error: 
 unknown interface enp3s0
It seems that NetworkManager has not brought the Ethernet interface up in time, so dnsmasq fails to start.


Version-Release number of selected component (if applicable):
NetworkManager-1.8.2-1.fc26.x86_64
dnsmasq-2.76-5.fc26.x86_64


How reproducible:
It happens every time I reboot.  I have to start dnsmasq manually after logging in.


Steps to Reproduce:
1. Reboot
2. dnsmasq fails to start
3. Log in and start the service manually


Actual results:

[root@fedora ~]# systemctl status dnsmasq 
● dnsmasq.service - DNS caching server.
   Loaded: loaded (/usr/lib/systemd/system/dnsmasq.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sat 2017-10-14 01:21:07 SAST; 14min ago
  Process: 1235 ExecStart=/usr/sbin/dnsmasq -k (code=exited, status=2)
 Main PID: 1235 (code=exited, status=2)

Oct 14 01:21:04 fedora.localdomain systemd[1]: Started DNS caching server..
Oct 14 01:21:07 fedora.localdomain dnsmasq[1235]: dnsmasq: unknown interface enp3s0
Oct 14 01:21:07 fedora.localdomain systemd[1]: dnsmasq.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Oct 14 01:21:07 fedora.localdomain systemd[1]: dnsmasq.service: Unit entered failed state.
Oct 14 01:21:07 fedora.localdomain systemd[1]: dnsmasq.service: Failed with result 'exit-code'.

[root@fedora ~]# systemctl start dnsmasq 

[root@fedora ~]# systemctl status NetworkManager-wait-online
● NetworkManager-wait-online.service - Network Manager Wait Online
   Loaded: loaded (/usr/lib/systemd/system/NetworkManager-wait-online.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sat 2017-10-14 01:21:42 SAST; 15min ago
     Docs: man:nm-online(1)
  Process: 1212 ExecStart=/usr/bin/nm-online -s -q --timeout=60 (code=exited, status=2)
 Main PID: 1212 (code=exited, status=2)

Oct 14 01:21:03 fedora.localdomain systemd[1]: Starting Network Manager Wait Online...
Oct 14 01:21:29 fedora.localdomain nm-online[1212]: Error: Could not create NMClient object: Timeout was reached
Oct 14 01:21:42 fedora.localdomain systemd[1]: NetworkManager-wait-online.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Oct 14 01:21:42 fedora.localdomain systemd[1]: Failed to start Network Manager Wait Online.
Oct 14 01:21:42 fedora.localdomain systemd[1]: NetworkManager-wait-online.service: Unit entered failed state.
Oct 14 01:21:42 fedora.localdomain systemd[1]: NetworkManager-wait-online.service: Failed with result 'exit-code'.

[root@fedora ~]# cat /usr/lib/systemd/system/dnsmasq.service
[Unit]
Description=DNS caching server.
After=network.target

[Service]
ExecStart=/usr/sbin/dnsmasq -k

[Install]
WantedBy=multi-user.target

[root@fedora ~]# cat /usr/lib/systemd/system/NetworkManager-wait-online.service
[Unit]
Description=Network Manager Wait Online
Documentation=man:nm-online(1)
Requisite=NetworkManager.service
After=NetworkManager.service
Before=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/bin/nm-online -s -q --timeout=60
RemainAfterExit=yes

[Install]
WantedBy=network-online.target

[root@fedora ~]# grep enp /etc/dnsmasq.conf 
interface=enp3s0


Expected results:
dnsmasq should start.  It depends on the ethernet interface being ready.  Surely there should be an easy way to ensure this?


Additional info:

Comment 1 Dusty Mabe 2017-10-16 18:38:53 UTC
I'm seeing something similar, where the dnsmasq service does not wait long enough. I have `listen-address=x.x.x.x` in my config and I see:


```
Oct 16 17:59:10 origin-master-1.localdomain systemd[1]: Started DNS caching server..
Oct 16 17:59:10 origin-master-1.localdomain dnsmasq[812]: dnsmasq: failed to create listening socket for x.x.x.x: Cannot assign requested address
Oct 16 17:59:10 origin-master-1.localdomain dnsmasq[812]: failed to create listening socket for x.x.x.x: Cannot assign requested address
Oct 16 17:59:10 origin-master-1.localdomain systemd[1]: dnsmasq.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
```

If I restart dnsmasq after the system has been up for a while, it works fine. I've been seeing this
for a while and just hadn't got around to reporting it, which means I've seen it on
dnsmasq-2.76-3.fc26.x86_64 as well as the newer `2.76-5.fc26.x86_64`.

Comment 2 Dusty Mabe 2017-10-30 19:55:11 UTC
FYI: Ubuntu bug for a similar problem: https://bugs.launchpad.net/ubuntu/+source/dnsmasq/+bug/1531184

Petr, can we get you to look at this?

Comment 3 Petr Menšík 2017-10-31 11:10:44 UTC
Hi,

I think this issue is better solved by using bind-dynamic instead of bind-interfaces. network-online.target is intentionally not used. If the system is configured to use dnsmasq as its DNS resolver, network-online.target may not become active until dnsmasq is ready, so dnsmasq must not wait for network-online.target by default.

# dnsmasq.conf
interface=eth0
bind-dynamic

This allows dnsmasq to start even when eth0 is still down. As soon as the interface comes up, dnsmasq binds to it and listens on it. I think this is exactly what is required in this case.

Is there a reason why bind-dynamic is not suitable for your case? If it helps, I will close the bug.
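
Editor's sketch: one way to apply the suggestion above without editing the main dnsmasq.conf is a small drop-in file. This is a minimal sketch under assumptions that are not from this report: the function name `write_bind_dynamic_dropin` and the file name `10-bind-dynamic.conf` are hypothetical, and the drop-in directory is only read if dnsmasq.conf contains a matching `conf-dir=` line.

```shell
#!/bin/sh
# Sketch: write a dnsmasq drop-in that enables bind-dynamic for one interface.
# Assumptions (hypothetical, not from the bug report): the function name, the
# file name 10-bind-dynamic.conf, and that the directory is included through
# a conf-dir= line in dnsmasq.conf.
write_bind_dynamic_dropin() {
    dir=$1      # drop-in directory, e.g. /etc/dnsmasq.d
    iface=$2    # interface name, e.g. enp3s0
    {
        printf 'interface=%s\n' "$iface"
        printf 'bind-dynamic\n'
    } > "$dir/10-bind-dynamic.conf"
}

# Example (as root), followed by a syntax check and a restart:
#   write_bind_dynamic_dropin /etc/dnsmasq.d enp3s0
#   dnsmasq --test && systemctl restart dnsmasq
```

If bind-interfaces is already set elsewhere in the configuration, it should probably be removed, since bind-dynamic is meant to be used instead of it.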

Comment 4 Dusty Mabe 2017-10-31 12:35:03 UTC
Thanks, Petr. I'm going to work on an upstream issue [1] to see if we can use bind-dynamic.

[1] https://github.com/openshift/openshift-ansible/issues/5935

In the meantime, do you have any influence on dnsmasq upstream? It would be great if upstream added a helpful message at startup. For example, if dnsmasq can't bind to eth0 when it starts and bind-dynamic is not in the config, it could tell the user: 'couldn't bind to address on eth0, maybe you would like to use bind-dynamic in dnsmasq config'.
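
Editor's sketch: until such a message exists upstream, a similar hint can be approximated outside dnsmasq with a pre-flight check. The function name `check_dnsmasq_bind` is hypothetical and this is not dnsmasq behaviour. It reads only the one file it is given and does not follow `conf-file=` or `conf-dir=` includes.

```shell
#!/bin/sh
# Sketch of a pre-flight check; hypothetical helper, not part of dnsmasq.
# Prints a hint when the config pins an interface or address but does not
# set bind-dynamic. Only the single file given is inspected.
check_dnsmasq_bind() {
    conf=$1
    # Nothing to warn about if no uncommented interface= or listen-address=.
    grep -Eq '^[[:space:]]*(interface|listen-address)=' "$conf" || return 0
    # bind-dynamic present: dnsmasq can start before the interface is up.
    grep -Eq '^[[:space:]]*bind-dynamic[[:space:]]*$' "$conf" && return 0
    echo "hint: $conf names an interface or address without bind-dynamic;" \
         "dnsmasq may fail if it starts before the network is up"
    return 1
}

# Example:
#   check_dnsmasq_bind /etc/dnsmasq.conf
```

The function returns 0 when there is nothing to report and 1 when it prints the hint, so it can be used in a script condition.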

Comment 5 Louis van Dyk 2017-11-10 16:16:59 UTC
(In reply to Petr Menšík from comment #3)
> Hi,
> 
> I think this issue is better solved by using bind-dynamic instead of
> bind-interfaces. network-online.target is intentionally not used. If the
> system is configured to use dnsmasq as its DNS resolver,
> network-online.target may not become active until dnsmasq is ready, so
> dnsmasq must not wait for network-online.target by default.
> 
> # dnsmasq.conf
> interface=eth0
> bind-dynamic
> 
> This allows dnsmasq to start even when eth0 is still down. As soon as the
> interface comes up, dnsmasq binds to it and listens on it. I think this is
> exactly what is required in this case.
> 
> Is there a reason why bind-dynamic is not suitable for your case? If it
> helps, I will close the bug.


Hi Petr

FANTASTIC!!  Using "bind-dynamic" works for me.

To answer your question: the reason I didn't use bind-dynamic is because I didn't know it existed!!  It's not mentioned anywhere in the comments of the sample dnsmasq.conf file.  I would suggest that it be added, as it's most useful in my case.

Many thanks.

Comment 6 Petr Menšík 2017-11-15 15:15:05 UTC
Closing the bug; it can be fixed with an already supported configuration.

bind-dynamic is supported only on Linux; bind-interfaces works on any other platform. I think the default configuration and the manual page have room for improvement. I don't think the code is the best place to document this behaviour, especially since a failed bind to an interface might be caused by a misspelled interface name.
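
Editor's sketch: since the same startup error can come from a misspelled interface name, it may help to compare the names in the config with the interfaces the kernel currently knows about. This is a minimal sketch, assuming Linux, where each interface has an entry under /sys/class/net. The function name `missing_interfaces` is hypothetical, the optional second argument exists only so the check can be pointed at another directory, and only plain `interface=<name>` lines are handled.

```shell
#!/bin/sh
# Sketch: list interface= names from a dnsmasq config that do not currently
# exist. Hypothetical helper; assumes Linux sysfs (/sys/class/net/<name>).
missing_interfaces() {
    conf=$1
    sysnet=${2:-/sys/class/net}
    # Take the value of each uncommented interface= line.
    sed -n 's/^[[:space:]]*interface=//p' "$conf" |
    while IFS= read -r name; do
        # Print the name if there is no matching entry for it.
        [ -e "$sysnet/$name" ] || printf '%s\n' "$name"
    done
}

# Example:
#   missing_interfaces /etc/dnsmasq.conf
```

If a name is still listed long after boot, a misspelled interface name is more likely than a timing problem.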

Comment 7 Matthew Woehlke 2021-07-21 19:48:40 UTC
*** Bug 1984618 has been marked as a duplicate of this bug. ***

