Bug 1674067 - dnsmasq 2.80 falsifies NXDOMAIN into NODATA
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: dnsmasq
Version: rawhide
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Petr Menšík
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-02-08 22:52 UTC by Maciej Żenczykowski
Modified: 2019-08-15 18:51 UTC (History)

Fixed In Version: dnsmasq-2.80-7.fc30 dnsmasq-2.79-9.fc29
Clone Of:
Environment:
Last Closed: 2019-08-03 01:17:05 UTC
Type: Bug
Embargoed:



Description Maciej Żenczykowski 2019-02-08 22:52:31 UTC
I'm filing this bug based on some testing with dnsmasq 2.80.

I believe dnsmasq is incorrectly converting NXDOMAIN responses from authoritative DNS servers into NODATA.

This can break resolution of IPv4-only or IPv6-only hostnames when the client walks its search path. A correct DNS client library stops the search at NODATA but moves on to the next search path element at NXDOMAIN; any other behaviour causes bugs (flakiness) when servers time out or return other errors.
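
To make the search path problem concrete, here is a minimal sketch in C of how a stub resolver might walk its search list. It is not taken from any real resolver library: `search_lookup`, the `lookup_fn` callback type, the two stub servers, and the names `tea`, `a.example.` and `b.example.` are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Outcome of one lookup, as a stub resolver would see it. */
enum dns_result { DNS_ANSWER, DNS_NODATA, DNS_NXDOMAIN };

/* Hypothetical callback: resolves one fully qualified candidate name. */
typedef enum dns_result (*lookup_fn)(const char *fqdn);

/*
 * Try "host.suffix" for each search suffix in order.
 *   NXDOMAIN         -> the name does not exist, try the next suffix.
 *   ANSWER or NODATA -> the name exists, so the search stops here.
 * Returns the index of the suffix where the search stopped, or -1 if
 * no candidate was found; *result receives the final outcome.
 */
int search_lookup(const char *host, const char *const *suffixes, size_t n,
                  lookup_fn lookup, enum dns_result *result)
{
  char fqdn[256];

  assert(host && suffixes && lookup && result);

  for (size_t i = 0; i < n; i++) {
    int len = snprintf(fqdn, sizeof fqdn, "%s.%s", host, suffixes[i]);
    if (len < 0 || (size_t)len >= sizeof fqdn)
      continue; /* candidate name too long, skip it */
    *result = lookup(fqdn);
    if (*result != DNS_NXDOMAIN)
      return (int)i;
  }
  *result = DNS_NXDOMAIN;
  return -1;
}

/* Stub upstream: only tea.b.example. exists (illustrative data). */
enum dns_result honest_lookup(const char *fqdn)
{
  return strcmp(fqdn, "tea.b.example.") == 0 ? DNS_ANSWER : DNS_NXDOMAIN;
}

/* Stub proxy that rewrites every NXDOMAIN into NODATA. */
enum dns_result falsifying_lookup(const char *fqdn)
{
  enum dns_result r = honest_lookup(fqdn);
  return r == DNS_NXDOMAIN ? DNS_NODATA : r;
}
```

With the search list `{ "a.example.", "b.example." }`, the honest stub lets the search reach index 1 and return `DNS_ANSWER`. Through the falsifying stub the search stops at index 0 with `DNS_NODATA` and never tries `tea.b.example.`.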

tea6.foo. and tea7.foo. don't exist.

athina:~$ for i in srv txt aaaa a aaaa a txt srv; do host -t $i tea6.foo. 127.0.0.1 | tail -n 1; done
Host tea6.foo. not found: 3(NXDOMAIN)
Host tea6.foo. not found: 3(NXDOMAIN)
Host tea6.foo. not found: 3(NXDOMAIN)
tea6.foo has no A record
Host tea6.foo. not found: 3(NXDOMAIN)
tea6.foo has no A record
tea6.foo has no TXT record
tea6.foo has no SRV record

athina:~$ for i in srv txt a aaaa a aaaa txt srv; do host -t $i tea7.foo. 127.0.0.1 | tail -n 1; done
Host tea7.foo. not found: 3(NXDOMAIN)
Host tea7.foo. not found: 3(NXDOMAIN)
Host tea7.foo. not found: 3(NXDOMAIN)
tea7.foo has no AAAA record
Host tea7.foo. not found: 3(NXDOMAIN)
tea7.foo has no AAAA record
tea7.foo has no TXT record
tea7.foo has no SRV record

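So A/AAAA are somehow special: after the first A or AAAA lookup, queries for every other record type on the same name come back as NODATA instead of NXDOMAIN (127.0.0.1 is dnsmasq 2.80).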
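
For reference, the two kinds of line in the output above differ only in the DNS response header. "not found: 3(NXDOMAIN)" is RCODE 3, while "has no X record" is RCODE 0 (NOERROR) with an empty answer section. Below is a rough sketch of that classification on a raw response. `classify_reply` and the enum names are invented for this example, and it is deliberately simplified: it ignores referrals and CNAME-only answers, which also need the authority section to classify properly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum reply_kind {
  REPLY_MALFORMED,    /* too short, or not a response at all */
  REPLY_NXDOMAIN,     /* RCODE 3: the name does not exist */
  REPLY_NODATA,       /* RCODE 0, no answers: name exists, type does not */
  REPLY_HAS_ANSWERS,  /* RCODE 0 with at least one answer record */
  REPLY_OTHER_ERROR   /* SERVFAIL, REFUSED, ... */
};

/* Classify a raw DNS message by looking only at its 12-byte header. */
enum reply_kind classify_reply(const uint8_t *msg, size_t len)
{
  assert(msg != NULL);

  if (len < 12)
    return REPLY_MALFORMED;           /* the fixed header is 12 bytes */
  if (!(msg[2] & 0x80))
    return REPLY_MALFORMED;           /* QR bit clear: this is a query */

  unsigned rcode = msg[3] & 0x0fu;    /* low 4 bits of the 4th byte */
  unsigned ancount = ((unsigned)msg[6] << 8) | msg[7]; /* big-endian */

  if (rcode == 3)
    return REPLY_NXDOMAIN;
  if (rcode != 0)
    return REPLY_OTHER_ERROR;
  return ancount == 0 ? REPLY_NODATA : REPLY_HAS_ANSWERS;
}
```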

Here's some more detail:

https://umbrella.cisco.com/blog/2014/06/23/nxdomain-nodata-debugging-dns-dual-stacked-hosts/

My guess (unverified) is that this bug was introduced by:

commit b6f926fbefcd2471699599e44f32b8d25b87b471
Author: Simon Kelley <simon.uk>
Date:   Tue Aug 21 17:46:52 2018 +0100

    Don't return NXDOMAIN to empty non-terminals.
    
    When a record is defined locally, eg an A record for one.two.example then
    we already know that if we forward, eg an AAAA query for one.two.example,
    and get back NXDOMAIN, then we need to alter that to NODATA. This is handled
    by  check_for_local_domain(). But, if we forward two.example, because
    one.two.example exists, then the answer to two.example should also be
    a NODATA.
    
    For most local records this is easy, just to substring matching.
    for A, AAAA and CNAME records that are in the cache, it's more difficult.
    The cache has no efficient way to find such records. The fix is to
    insert empty (none of F_IPV4, F_IPV6 F_CNAME set) records for each
    non-terminal.

Comment 1 Maciej Żenczykowski 2019-02-12 05:14:47 UTC
Norman Rasmussen says:

diff --git a/src/cache.c b/src/cache.c
index 713e58c..2ff05f7 100644
--- a/src/cache.c
+++ b/src/cache.c
@@ -790,6 +790,7 @@ int cache_find_non_terminal(char *name, time_t now)
     if (!is_outdated_cname_pointer(crecp) &&
        !is_expired(now, crecp) &&
        (crecp->flags & F_FORWARD) &&
+       !(crecp->flags & F_NXDOMAIN) &&
        hostname_isequal(name, cache_get_name(crecp)))
       return 1;
 
This seems to fix the bug, and doesn't seem to break the logic that the function was introduced for.

Comment 2 Maciej Żenczykowski 2019-02-12 05:20:21 UTC
And some additional comments from Norman:

I have more information about the trigger (using tcpdump, wireshark, dnsmasq --log-queries=extra -d -q --port 5553, and pkill -USR1 dnsmasq):

When the upstream server replies NXDOMAIN, that entry is cached:
e.g. the response for A is cached with flags: "4F   NX" (v4, forwarded, no reply, nxdomain)

The follow-up request sees a cached entry for the same name and concludes it MUST NOT return NXDOMAIN,
!!!because there is another cache entry for the same name!!!

My guess is that a check is missing: the code should test whether all other cached entries for the same name are themselves NXDOMAIN replies. As it stands, the second entry gets flags such as "6F   N " (v6, forwarded, no reply).

(Switching the order of A and AAAA only swaps the 4 and the 6, so it's symmetric.)
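
The mechanism described here, together with the one-line patch in comment 1, can be modelled with a toy cache. This is only a sketch and not dnsmasq code: `struct toy_crec`, the `TOY_F_*` flag values and both functions are invented for illustration. The expiry and outdated-CNAME checks from the real `cache_find_non_terminal()` are left out, and names are compared with plain `strcmp` rather than a case-insensitive hostname comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative flag bits; the values are NOT dnsmasq's real ones. */
#define TOY_F_FORWARD  0x1u
#define TOY_F_NXDOMAIN 0x2u
#define TOY_F_IPV4     0x4u
#define TOY_F_IPV6     0x8u

/* A drastically reduced cache record: just a name and its flags. */
struct toy_crec {
  const char *name;
  unsigned flags;
};

/* Pre-patch logic: ANY forward entry for the name makes it "exist",
   including an entry that only records an upstream NXDOMAIN. */
int toy_find_non_terminal_old(const struct toy_crec *cache, size_t n,
                              const char *name)
{
  assert(cache && name);
  for (size_t i = 0; i < n; i++)
    if ((cache[i].flags & TOY_F_FORWARD) &&
        strcmp(name, cache[i].name) == 0)
      return 1;
  return 0;
}

/* Patched logic: cached NXDOMAIN entries no longer count as evidence
   that the name exists. */
int toy_find_non_terminal_new(const struct toy_crec *cache, size_t n,
                              const char *name)
{
  assert(cache && name);
  for (size_t i = 0; i < n; i++)
    if ((cache[i].flags & TOY_F_FORWARD) &&
        !(cache[i].flags & TOY_F_NXDOMAIN) &&
        strcmp(name, cache[i].name) == 0)
      return 1;
  return 0;
}
```

With a cache holding only a negative A entry for tea6.foo (`TOY_F_FORWARD | TOY_F_IPV4 | TOY_F_NXDOMAIN`), the old function returns 1, which is the condition under which a forwarded NXDOMAIN for the follow-up AAAA query is rewritten to NODATA. The patched function returns 0, so the NXDOMAIN passes through unchanged.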

Comment 4 Petr Menšík 2019-04-12 08:22:21 UTC
Thanks for the fix pushed into upstream!

Comment 5 Fedora Update System 2019-07-31 18:46:27 UTC
FEDORA-2019-b0b2b9b380 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-b0b2b9b380

Comment 6 Fedora Update System 2019-07-31 19:36:59 UTC
FEDORA-2019-8ad16085e2 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2019-8ad16085e2

Comment 7 Fedora Update System 2019-08-01 03:28:47 UTC
dnsmasq-2.80-7.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-b0b2b9b380

Comment 8 Fedora Update System 2019-08-01 05:33:52 UTC
dnsmasq-2.79-9.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-8ad16085e2

Comment 9 Fedora Update System 2019-08-03 01:17:05 UTC
dnsmasq-2.80-7.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.

Comment 10 Fedora Update System 2019-08-15 18:51:39 UTC
dnsmasq-2.79-9.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.

