Bug 1647464

Summary: subdomain-delegation to dnsmasq no longer works
Product: [Fedora] Fedora
Reporter: Harald Reindl <h.reindl>
Component: dnsmasq
Assignee: Petr Menšík <pemensik>
Status: POST
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 30
CC: anon.amish, dougsland, itamar, jima, laine, mruprich, msehnout, pavlix, p, pemensik, pzhukov, thozza, veillard, vonsch, zdohnal
Keywords: Reopened
Hardware: Unspecified
OS: Unspecified
See Also: https://gitlab.isc.org/isc-projects/bind9/issues/668
Last Closed: 2019-05-28 23:55:08 UTC
Type: Bug
Attachments: restore replies to non-recursive queries (flags: none)

Description Harald Reindl 2018-11-07 14:24:25 UTC
for years it was no problem to run dnsmasq on developer machines with a zone delegation for subdomains, so that developers can maintain their development hostnames in /etc/hosts and expose them to the rest of the network with a local dnsmasq

for some Fedora releases now it gives SERVFAIL, and teaching the guys to maintain named on their local machines is really overkill


cat /var/named/chroot/var/named/zones/example.com.dns | grep openvpn-flow
openvpn-flow                    IN A  
flow-home                       IN NS           openvpn-flow

dig NS flow-home.example.com
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 44943

dig contentlounge.flow-home.example.com.
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 46063


as you can clearly see, asking the dnsmasq server directly from the host the main zone is running on works without issues - so what does named do wrong?

dig contentlounge.flow-home.example.com. @
contentlounge.flow-home.example.com. 604800 IN A

Comment 1 Harald Reindl 2018-11-07 14:31:29 UTC
see also https://gitlab.isc.org/isc-projects/bind9/issues/668

Comment 2 Petr Menšík 2018-11-07 17:40:38 UTC
Ok, I am not sure what exactly is

Comment 3 Harald Reindl 2018-11-07 17:49:28 UTC
obviously upstream decided to implement some arbitrary changes breaking setups which worked for years, and you can no longer delegate a subdomain to a machine running dnsmasq fed by /etc/hosts instead of fully implementing dns-zones for each and every development subdomain

it's perverse that upstream calls that "fixing something"

Comment 4 Petr Menšík 2018-11-08 10:31:27 UTC
What is the issue here? Could you explain your configuration in more detail? It is not clear to me what exactly you are trying to accomplish. I can see the guys at ISC have already locked your issue.

I can understand that using /etc/hosts is quite a nice feature. I use it to manage local VM instances on my machine. dnsmasq is somewhat deficient in its usual configuration, though. I still have to figure out how to make it work correctly.

If you choose to use name server delegation instead of forwarding, auth-server has to be used. If not, dnsmasq records have to be used carefully.

Could you share how dnsmasq is configured?

I think recursive servers are expected to send queries without the recursion desired flag set. That is exactly what BIND does, and I think that is exactly what dnsmasq does not accept. That is the difference between a forward zone configured in BIND (which sends queries with the recursion flag set) and authoritative delegation (which is not recursive).

Would this also return the requested address?
$ dig +norec contentlounge.flow-home.example.com. @
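The difference between the two query styles comes down to a single bit in the DNS header. As a minimal sketch (not taken from BIND or dnsmasq; the function names `build_query` and `has_rd` and the fixed query ID are made up for illustration), the following builds the same question with and without the recursion desired (RD) flag, which is what `dig` and `dig +norec` send respectively:

```python
import struct

RD_FLAG = 0x0100  # "recursion desired" bit in the DNS header flags word


def build_query(name: str, qtype: int = 1, recursion_desired: bool = True,
                query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (RFC 1035 wire format).

    Hypothetical helper for illustration only.
    """
    flags = RD_FLAG if recursion_desired else 0
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A, 28 = AAAA) followed by QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)


def has_rd(packet: bytes) -> bool:
    """Return True if the RD bit is set in a DNS packet header."""
    (flags,) = struct.unpack("!H", packet[2:4])
    return bool(flags & RD_FLAG)


# Forwarder-style query (RD set) versus delegation-style query (RD clear)
forwarded = build_query("contentlounge.flow-home.example.com.")
delegated = build_query("contentlounge.flow-home.example.com.",
                        recursion_desired=False)
print(has_rd(forwarded), has_rd(delegated))
```

The two packets are identical apart from that one bit, so a server that treats RD-clear queries differently will behave differently for delegation and forwarding even though the question is the same.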

Comment 5 Harald Reindl 2018-11-08 10:35:41 UTC
no, that is a SERVFAIL and i am really pissed that that worked for years without any issues

;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 41465
;; flags: qr ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

Comment 6 Harald Reindl 2018-11-08 10:37:00 UTC
[root@flow-home:~]$ cat /etc/dnsmasq.conf

the problem is obviously that dnsmasq gives back idiotic errors and named was "fixed to death"

Comment 7 Petr Menšík 2018-11-08 10:44:56 UTC
(In reply to Harald Reindl from comment #3)
> obviously upstream decided to implement some arbitrary changes breaking
> setups which worked for years, and you can no longer delegate a subdomain to
> a machine running dnsmasq fed by /etc/hosts instead of fully implementing
> dns-zones for each and every development subdomain
> it's perverse that upstream calls that "fixing something"

Depends on your setup. I am sure it can still be configured somehow; I myself have a test domain forwarded between unbound, bind and dnsmasq. You can delegate anything you want. The question is whether dnsmasq is able to handle such a delegation. AFAIK dnsmasq has no built-in concept of zones. It is more or less a smart caching daemon that forwards everything to a real resolver somewhere. I think it is a problem of dnsmasq or its configuration.

Can you record the queries bind sends to dnsmasq and the responses it gets?
Something like:
$ tcpdump -s0 -i tunX -v host and port domain

it would help if you had something like this in your bind configuration instead:

zone "flow-home.example.com." IN {
  type forward;
  forward only;
  forwarders {; };
};

I understand that is not as handy as one record in your zone, but it would work instead. It can be scripted to autogenerate into some file in /etc/named/vpn-forwards.conf for example, included from /etc/named.conf.
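The autogeneration mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the `delegations` mapping and the forwarder address `192.0.2.10` (a documentation-only address) are hypothetical and do not come from the reporter's setup.

```python
def forward_zone(zone: str, forwarders: list[str]) -> str:
    """Render one BIND 'type forward' zone stanza (hypothetical helper)."""
    fqdn = zone if zone.endswith(".") else zone + "."
    addrs = " ".join(f"{ip};" for ip in forwarders)
    return (
        f'zone "{fqdn}" IN {{\n'
        f"  type forward;\n"
        f"  forward only;\n"
        f"  forwarders {{ {addrs} }};\n"
        f"}};\n"
    )


def render_forwards(delegations: dict[str, list[str]]) -> str:
    """Render all stanzas, sorted by zone name for stable output."""
    return "\n".join(
        forward_zone(zone, ips) for zone, ips in sorted(delegations.items())
    )


# Assumed example data: zone name -> addresses of the dnsmasq hosts
delegations = {"flow-home.example.com": ["192.0.2.10"]}
config_text = render_forwards(delegations)
print(config_text)
```

The resulting text could be written to a file such as /etc/named/vpn-forwards.conf and pulled in with an include statement in /etc/named.conf, as suggested above.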

Comment 8 Harald Reindl 2018-11-08 11:00:02 UTC
the point is that it worked before without dnsmasq needing a "zone concept", and the whole configuration including 800 auth-domains *is* auto-generated

this is a subdomain of our MAIN-DOMAIN and when i tested it yesterday the forward-only didn't work either, don't know if it's because the main zone exists auto-generated in a different zone-file, anyway it sucks that things are breaking right and left all the time and developers at the same time call it a fix

Comment 9 Petr Menšík 2018-11-08 11:16:04 UTC
Can you try adding to dnsmasq


restart dnsmasq and retry dig with +norec flag?

Comment 10 Harald Reindl 2018-11-09 12:21:56 UTC
yeah that does the trick - you are my hero of the day


/etc/hosts and /etc/dnsmasq.conf are synced between two machines
both zones are now working again with a NS-record in named pointing to the host

"dig NS" returns only a dot but as long as clients get served that's fine :-)

content of the zone in named
flow                            IN A  
flow-home                       IN A  
flow                            IN NS           flow
flow-home                       IN NS           flow-home

Comment 11 Harald Reindl 2018-11-09 12:58:47 UTC
BTW: that is for sure a bug in "dnsmasq"

nslookup asks for both A and AAAA, and dnsmasq responds with REFUSED instead of NXDOMAIN when it is configured without forwarders (dnsmasq[140020]: warning: no upstream servers configured); it appears that otherwise it asks the upstream server and passes back the NXDOMAIN from upstream

[root@testserver:~]$ nslookup webmail.testserver.rhsoft.net
Name:   webmail.testserver.rhsoft.net
** server can't find webmail.testserver.rhsoft.net: REFUSED



[root@testserver:~]$ dig A webmail.testserver.rhsoft.net @

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-10.P2.fc28 <<>> A webmail.testserver.rhsoft.net @
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60626
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

; EDNS: version: 0, flags:; udp: 4096
;webmail.testserver.rhsoft.net. IN      A

webmail.testserver.rhsoft.net. 30 IN    A

;; Query time: 0 msec
;; WHEN: Fr Nov 09 13:58:10 CET 2018
;; MSG SIZE  rcvd: 74


[root@testserver:~]$ dig AAAA webmail.testserver.rhsoft.net @

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-10.P2.fc28 <<>> AAAA webmail.testserver.rhsoft.net @
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 6627
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;webmail.testserver.rhsoft.net. IN      AAAA

;; Query time: 0 msec
;; WHEN: Fr Nov 09 13:58:15 CET 2018
;; MSG SIZE  rcvd: 47
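To compare the two replies above side by side, the status and header flags can be pulled out of the dig output. The following is a rough sketch only; `summarize` and the two regular expressions are hypothetical helpers, and the sample strings are the header lines quoted in this comment.

```python
import re

STATUS_RE = re.compile(r"status:\s*(\w+)")
FLAGS_RE = re.compile(r"flags:\s*([a-z ]*);")


def summarize(dig_output: str) -> dict:
    """Extract the response status and header flags from dig output text."""
    status = STATUS_RE.search(dig_output)
    flags = FLAGS_RE.search(dig_output)  # first match is the header flags line
    return {
        "status": status.group(1) if status else None,
        "flags": set(flags.group(1).split()) if flags else set(),
    }


a_reply = (
    ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60626\n"
    ";; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1\n"
)
aaaa_reply = (
    ";; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 6627\n"
    ";; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n"
)

for label, text in (("A", a_reply), ("AAAA", aaaa_reply)):
    info = summarize(text)
    print(label, info["status"], sorted(info["flags"]))
```

The A reply carries the `aa` (authoritative answer) flag, while the AAAA reply for the same name is REFUSED and has no `aa` flag, which is the inconsistency being reported.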

Comment 12 Petr Menšík 2018-11-22 12:02:49 UTC
Non-recursive queries were disabled in dnsmasq to prevent cache snooping. That is the reason for REFUSED. Anything registered locally should be answered, but without the recursion desired flag no other reply is given.

Auth-zone is required to respond with NXDOMAIN to a query for a non-existent record in the given zone. It tells dnsmasq it is authoritative for all names under the given zone, which is exactly your use case here. It then knows there is no reason to forward such queries to upstream servers.

If no upstream servers are configured, it may help to configure it as authoritative for everything: auth-zone=.

But I would stick to limited zone.
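The policy described in this comment can be written down as a toy decision function. This is a model of the description above only, NOT dnsmasq's actual code; the function name, its parameters and the result strings are made up for illustration, and the NODATA branch follows general DNS convention rather than anything stated in this thread.

```python
def answer(name: str, qtype: str, rd: bool,
           local: set[tuple[str, str]], auth_zones: set[str]) -> str:
    """Toy model of the described policy (hypothetical, not dnsmasq code).

    local:      (name, type) pairs known locally, e.g. from /etc/hosts
    auth_zones: zones the server is told it is authoritative for
    """
    name = name.rstrip(".").lower()
    if (name, qtype) in local:
        return "NOERROR"  # locally registered records are always answered
    if any(name == zone or name.endswith("." + zone) for zone in auth_zones):
        # Authoritative: no need to ask anyone else.
        # Assumption: a name that exists with another type yields NODATA.
        known_names = {n for n, _ in local}
        return "NODATA" if name in known_names else "NXDOMAIN"
    if not rd:
        return "REFUSED"  # non-recursive query for non-local data
    return "FORWARD"      # recursive query goes to the upstream servers


local = {("webmail.testserver.rhsoft.net", "A")}
zones = {"testserver.rhsoft.net"}
print(answer("webmail.testserver.rhsoft.net.", "A", False, local, zones))
print(answer("missing.testserver.rhsoft.net.", "A", False, local, zones))
print(answer("missing.testserver.rhsoft.net.", "A", False, local, set()))
```

Without the zone in `auth_zones`, the same non-recursive query for an unknown name falls through to REFUSED, which matches the reason given above for why an auth-zone entry is needed for delegation.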

Comment 13 Harald Reindl 2018-11-22 12:05:15 UTC
but it should not respond with REFUSED when asked for AAAA while an A-record for the same name exists from /etc/hosts

Comment 14 Petr Menšík 2019-04-12 09:18:54 UTC
According to my testing, dnsmasq does not respond with REFUSED when responding to query without query. However, it responds strangely to dig +norec requests for records written in /etc/hosts. It seems non-recursive requests are not answered with records from the hosts file, but recursive ones are.

Comment 15 Petr Menšík 2019-04-12 14:33:14 UTC
Created attachment 1554819 [details]
restore replies to non-recursive queries

Comment 16 Petr Menšík 2019-04-12 14:42:55 UTC
It might be REFUSED because queries without the recursion desired flag set are always forwarded to "upstream" servers.

Heck, I would call this recursion, and it behaves in the exact opposite way to what the flag is for. I doubt such behaviour is intentional. With +rec, it serves local records, but with +norec it always serves upstream records if there are any. If the upstream is again dnsmasq, it would return REFUSED.

Comment 18 Ben Cotton 2019-05-02 19:40:31 UTC
This message is a reminder that Fedora 28 is nearing its end of life.
On 2019-May-28 Fedora will stop maintaining and issuing updates for
Fedora 28. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 28 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 19 Ben Cotton 2019-05-28 23:55:08 UTC
Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 20 Petr Menšík 2019-06-11 14:58:28 UTC
Reopening this issue; it is not fixed in dnsmasq yet, even in the latest version.