Bug 1560223 - Regression: unbound unable to resolve when new WLAN connection is made
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: unbound
Version: 27
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Paul Wouters
QA Contact: Fedora Extras Quality Assurance
Reported: 2018-03-25 00:38 UTC by Dimitris
Modified: 2018-06-11 17:40 UTC (History)

Last Closed: 2018-06-11 17:40:47 UTC
Type: Bug

Attachments
unbound.conf under 1.6.8 (35.06 KB, text/plain), 2018-03-26 05:51 UTC, Dimitris
unbound.conf under 1.7.0 (36.22 KB, text/plain), 2018-03-26 05:52 UTC, Dimitris
unbound.conf under 1.7.0 with aggressive-nsec: yes and auth-zone: removed (36.23 KB, text/plain), 2018-03-27 14:39 UTC, Charles R. Anderson

Description Dimitris 2018-03-25 00:38:12 UTC
Description of problem:
Whenever I (re)connect to a WLAN (after a reboot, a manual disconnect/reconnect, or a resume from suspend), unbound is unable to resolve names.  I have to manually issue a reload command before it starts resolving again.

Version-Release number of selected component (if applicable):
Regression starts with 1.7.0-2.fc27

How reproducible:
Every time

Steps to Reproduce:

1. /etc/NetworkManager/NetworkManager.conf specifies dns=unbound

2. From disconnected state, connect to WLAN.

3. 'dig www.google.com' returns NOERROR but no answer records, only a referral to the com. name servers:
; <<>> DiG 9.11.3-RedHat-9.11.3-2.fc27 <<>> www.google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39856
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 27

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.google.com.			IN	A

;; AUTHORITY SECTION:
com.			172615	IN	NS	l.gtld-servers.net.
com.			172615	IN	NS	m.gtld-servers.net.
com.			172615	IN	NS	a.gtld-servers.net.
com.			172615	IN	NS	b.gtld-servers.net.
com.			172615	IN	NS	c.gtld-servers.net.
com.			172615	IN	NS	d.gtld-servers.net.
com.			172615	IN	NS	e.gtld-servers.net.
com.			172615	IN	NS	f.gtld-servers.net.
com.			172615	IN	NS	g.gtld-servers.net.
com.			172615	IN	NS	h.gtld-servers.net.
com.			172615	IN	NS	i.gtld-servers.net.
com.			172615	IN	NS	j.gtld-servers.net.
com.			172615	IN	NS	k.gtld-servers.net.

;; ADDITIONAL SECTION:
a.gtld-servers.net.	172615	IN	A	192.5.6.30
a.gtld-servers.net.	172615	IN	AAAA	2001:503:a83e::2:30
b.gtld-servers.net.	172615	IN	A	192.33.14.30
b.gtld-servers.net.	172615	IN	AAAA	2001:503:231d::2:30
c.gtld-servers.net.	172615	IN	A	192.26.92.30
c.gtld-servers.net.	172615	IN	AAAA	2001:503:83eb::30
d.gtld-servers.net.	172615	IN	A	192.31.80.30
d.gtld-servers.net.	172615	IN	AAAA	2001:500:856e::30
e.gtld-servers.net.	172615	IN	A	192.12.94.30
e.gtld-servers.net.	172615	IN	AAAA	2001:502:1ca1::30
f.gtld-servers.net.	172615	IN	A	192.35.51.30
f.gtld-servers.net.	172615	IN	AAAA	2001:503:d414::30
g.gtld-servers.net.	172615	IN	A	192.42.93.30
g.gtld-servers.net.	172615	IN	AAAA	2001:503:eea3::30
h.gtld-servers.net.	172615	IN	A	192.54.112.30
h.gtld-servers.net.	172615	IN	AAAA	2001:502:8cc::30
i.gtld-servers.net.	172615	IN	A	192.43.172.30
i.gtld-servers.net.	172615	IN	AAAA	2001:503:39c1::30
j.gtld-servers.net.	172615	IN	A	192.48.79.30
j.gtld-servers.net.	172615	IN	AAAA	2001:502:7094::30
k.gtld-servers.net.	172615	IN	A	192.52.178.30
k.gtld-servers.net.	172615	IN	AAAA	2001:503:d2d::30
l.gtld-servers.net.	172615	IN	A	192.41.162.30
l.gtld-servers.net.	172615	IN	AAAA	2001:500:d937::30
m.gtld-servers.net.	172615	IN	A	192.55.83.30
m.gtld-servers.net.	172615	IN	AAAA	2001:501:b1f9::30

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Sat Mar 24 17:32:18 PDT 2018
;; MSG SIZE  rcvd: 839

4. At this point "systemctl status unbound.service" shows several entries like:
unbound[11585]: [11585:3] info: validation failure <domain> A IN
unbound[11585]: [11585:3] info: validation failure <domain> AAAA IN

5. After "sudo unbound-control reload", I can resolve names again, and systemctl status no longer shows validation failures.
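
The symptom in step 3 (status NOERROR with ANSWER: 0) can be checked mechanically on saved dig output. The following is a minimal sketch, not part of the original report; the function names summarize_dig and is_empty_noerror are hypothetical, and the check is only a heuristic, since it does not distinguish a referral from a legitimate NODATA reply.

```python
import re

# Patterns for the two header lines that dig prints for every response.
STATUS_RE = re.compile(r"status:\s*(\w+)")
COUNTS_RE = re.compile(
    r"QUERY:\s*(\d+),\s*ANSWER:\s*(\d+),\s*AUTHORITY:\s*(\d+),\s*ADDITIONAL:\s*(\d+)"
)


def summarize_dig(output: str) -> dict:
    """Extract the status and section counts from dig output (hypothetical helper)."""
    status = STATUS_RE.search(output)
    counts = COUNTS_RE.search(output)
    if status is None or counts is None:
        raise ValueError("no dig response header found")
    query, answer, authority, additional = map(int, counts.groups())
    return {
        "status": status.group(1),
        "query": query,
        "answer": answer,
        "authority": authority,
        "additional": additional,
    }


def is_empty_noerror(output: str) -> bool:
    """True when the reply is NOERROR but carries no answer records."""
    summary = summarize_dig(output)
    return summary["status"] == "NOERROR" and summary["answer"] == 0


# Header lines copied from the dig output in step 3.
sample = """\
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39856
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 27
"""

print(is_empty_noerror(sample))  # True
```

Running the same check after "sudo unbound-control reload" would be expected to report False once resolution works again, assuming the name actually has A records.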

Actual results:
Cannot resolve hostnames

Expected results:
Up to and including the previous version (the one currently in stable), name resolution worked across network changes without manual intervention.

Additional info:

Comment 1 Paul Wouters 2018-03-25 21:07:33 UTC
As a workaround, please try setting

aggressive-nsec: no

in the server: section of unbound.conf.

Comment 2 Charles R. Anderson 2018-03-26 05:17:02 UTC
I get this with dnssec-trigger:

Mar 26 01:00:16 gauge unbound-checkconf[1335]: unbound-checkconf: no errors in /etc/unbound/unbound.conf
Mar 26 01:00:16 gauge unbound-anchor[1351]: [1522040416] libunbound[1351:0] error: can't bind socket: Permission denied for 0.0.0.0
Mar 26 01:00:16 gauge audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=unbound comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 26 01:00:16 gauge unbound[1372]: [1372:0] notice: init module 0: ipsecmod
Mar 26 01:00:16 gauge unbound[1372]: [1372:0] notice: init module 1: validator
Mar 26 01:00:16 gauge unbound[1372]: [1372:0] notice: init module 2: iterator
Mar 26 01:00:16 gauge unbound[1372]: [1372:0] info: start of service (unbound 1.7.0).
Mar 26 01:00:16 gauge unbound[1372]: [1372:0] error: .: failed lookup, cannot probe to master k.root-servers.net
Mar 26 01:00:16 gauge unbound[1372]: [1372:0] error: .: failed lookup, cannot probe to master g.root-servers.net
Mar 26 01:00:16 gauge unbound[1372]: [1372:0] error: .: failed lookup, cannot probe to master f.root-servers.net
Mar 26 01:00:16 gauge unbound[1372]: [1372:0] error: .: failed lookup, cannot probe to master e.root-servers.net
Mar 26 01:00:16 gauge unbound[1372]: [1372:0] error: .: failed lookup, cannot probe to master c.root-servers.net
Mar 26 01:00:16 gauge unbound[1372]: [1372:0] error: .: failed lookup, cannot probe to master b.root-servers.net
Mar 26 01:00:19 gauge unbound[1372]: [1372:0] error: .: failed lookup, cannot probe to master k.root-servers.net
Mar 26 01:00:19 gauge unbound[1372]: [1372:0] error: .: failed lookup, cannot probe to master g.root-servers.net
Mar 26 01:00:19 gauge unbound[1372]: [1372:0] error: .: failed lookup, cannot probe to master f.root-servers.net
Mar 26 01:00:19 gauge unbound[1372]: [1372:0] error: .: failed lookup, cannot probe to master e.root-servers.net
Mar 26 01:00:19 gauge unbound[1372]: [1372:0] error: .: failed lookup, cannot probe to master c.root-servers.net
Mar 26 01:00:19 gauge unbound[1372]: [1372:0] error: .: failed lookup, cannot probe to master b.root-servers.net
Mar 26 01:00:25 gauge unbound[1372]: [1372:1] info: generate keytag query _ta-4a5c-4f66. NULL IN
Mar 26 01:00:39 gauge unbound[1372]: [1372:1] info: generate keytag query _ta-4a5c-4f66. NULL IN

Comment 3 Dimitris 2018-03-26 05:50:51 UTC
Tried aggressive-nsec: no, didn't help.  Same steps as description.

Note for anyone trying this: save your old unbound.conf first.  After a distro-sync back to 1.6.8, unbound won't start, because the config file then contains options introduced in 1.7 that 1.6.8 does not recognize.

I'm attaching my 1.7 and 1.6.8 config files.

Comment 4 Dimitris 2018-03-26 05:51:48 UTC
Created attachment 1412968
unbound.conf under 1.6.8

Comment 5 Dimitris 2018-03-26 05:52:20 UTC
Created attachment 1412969
unbound.conf under 1.7.0

Comment 6 Dimitris 2018-03-26 06:10:53 UTC
FWIW, also running dnssec-triggerd here, and seeing the same as Charles:

Mar 25 22:33:46 vimes unbound[1318]: [1318:0] error: .: failed lookup, cannot probe to master k.root-servers.net
Mar 25 22:33:46 vimes unbound[1318]: [1318:0] error: .: failed lookup, cannot probe to master g.root-servers.net
Mar 25 22:33:46 vimes unbound[1318]: [1318:0] error: .: failed lookup, cannot probe to master f.root-servers.net
Mar 25 22:33:46 vimes unbound[1318]: [1318:0] error: .: failed lookup, cannot probe to master e.root-servers.net
Mar 25 22:33:46 vimes unbound[1318]: [1318:0] error: .: failed lookup, cannot probe to master c.root-servers.net
Mar 25 22:33:46 vimes unbound[1318]: [1318:0] error: .: failed lookup, cannot probe to master b.root-servers.net

Comment 7 Charles R. Anderson 2018-03-27 14:13:32 UTC
The failures above name exactly the same root servers that the 1.7.0 config lists as masters for the "." auth-zone, so this new stanza seems to be the cause of the problem:

+# Authority zones
+# The data for these zones is kept locally, from a file or downloaded.
+# The data can be served to downstream clients, or used instead of the
+# upstream (which saves a lookup to the upstream).  The first example
+# has a copy of the root for local usage.  The second serves example.org
+# authoritatively.  zonefile: reads from file (and writes to it if you also
+# download it), master: fetches with AXFR and IXFR, or url to zonefile.
+auth-zone:
+       name: "."
+       for-downstream: no
+       for-upstream: yes
+       fallback-enabled: yes
+       master: b.root-servers.net
+       master: c.root-servers.net
+       master: e.root-servers.net
+       master: f.root-servers.net
+       master: g.root-servers.net
+       master: k.root-servers.net
+# auth-zone:
+#      name: "example.org"
+#      for-downstream: yes
+#      for-upstream: yes
+#      zonefile: "example.org.zone"

Comment 8 Charles R. Anderson 2018-03-27 14:36:21 UTC
I confirmed that name resolution works again after commenting out the auth-zone: configuration.  This is with aggressive-nsec: yes.

I have a theory about why it works for some people: they may be using unbound as a forwarder to their ISP's or router's DNS server, whereas with dnssec-trigger it makes direct DNS queries starting at the root servers.

Comment 9 Charles R. Anderson 2018-03-27 14:39:06 UTC
Created attachment 1413755
unbound.conf under 1.7.0 with aggressive-nsec: yes and auth-zone: removed

Working unbound.conf under 1.7.0 with aggressive-nsec: yes and auth-zone: removed.
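
For anyone who wants to apply the same change without editing by hand, here is a minimal sketch that comments out every active auth-zone: clause in a config text. It is not the tool used to produce the attachment; the function name comment_out_auth_zones is hypothetical, and the clause detection is a heuristic that assumes a clause header is a keyword followed by a colon with no value on the same line, as in the stock config.

```python
import re

# Heuristic: a clause header ("server:", "auth-zone:", "remote-control:", ...)
# is a bare keyword plus colon, optionally followed by a trailing comment.
CLAUSE_RE = re.compile(r"^\s*[A-Za-z][A-Za-z-]*:\s*(#.*)?$")


def comment_out_auth_zones(conf_text: str) -> str:
    """Return conf_text with every active auth-zone: clause commented out."""
    out = []
    in_block = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("auth-zone:"):
            # Start of an active auth-zone clause: comment out its header.
            in_block = True
            out.append("# " + line)
            continue
        if in_block:
            if CLAUSE_RE.match(line):
                # Any other clause header ends the auth-zone clause.
                in_block = False
            elif stripped and not stripped.startswith("#"):
                # Active option inside the auth-zone clause.
                out.append("# " + line)
                continue
        out.append(line)
    trailing_newline = "\n" if conf_text.endswith("\n") else ""
    return "\n".join(out) + trailing_newline


example_conf = """\
server:
    aggressive-nsec: yes
auth-zone:
    name: "."
    for-downstream: no
    master: b.root-servers.net
# auth-zone:
#     name: "example.org"
remote-control:
    control-enable: yes
"""

print(comment_out_auth_zones(example_conf))
```

Write the result to a copy of the file and run unbound-checkconf on it before replacing /etc/unbound/unbound.conf.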

Comment 10 Christian Stadelmann 2018-03-29 18:25:27 UTC
(In reply to Charles R. Anderson from comment #8)
> I confirmed that after commenting out the auth-zone: configuration that name
> resolution once again works.  This is with aggressive-nsec: yes.

I can confirm this behavior. When the "auth-zone:" part is commented out, name resolution works fine. With it being present in the config file (i.e. not commented out) it breaks.

(In reply to Charles R. Anderson from comment #8)
> I have a theory why it works for some people--maybe some people are using
> unbound as a forwarder to their ISP or router's DNS server, but with
> dnssec-trigger it is making direct DNS queries starting with the
> root-servers.

Is there an easy command you could provide to check this theory?
According to GNOME Control Center, my DNS server is running at an address in the 192.168.0.0/16 range.

$ nmcli
[…]
DNS configuration:
	servers: 192.168.178.1
	domains: fritz.box
	interface: wlp3s0

	servers: fd00::eadf:70ff:fe4b:a52a
	interface: wlp3s0
[…]

Comment 11 Dimitris 2018-04-10 03:25:42 UTC
1.7.0-4.fc27 seems to fix this for me.  Using default config installed by `dnf upgrade`.

Comment 12 Christian Stadelmann 2018-04-14 20:00:40 UTC
(In reply to Dimitris from comment #11)
> 1.7.0-4.fc27 seems to fix this for me.  Using default config installed by
> `dnf upgrade`.

+1, works for me too.
