1141724 – Unbound cannot distinguish between a negative response, and dropped packets

Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	unbound
Sub Component:
Version:	26
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Assignee:	Paul Wouters
QA Contact:	Fedora Extras Quality Assurance

Reported:	2014-09-15 10:23 UTC by William Brown
Modified:	2018-05-29 12:37 UTC
Last Closed:	2018-05-29 12:37:09 UTC
Type:	Bug

Description William Brown 2014-09-15 10:23:32 UTC

Description of problem:
Using unbound with dnssec-trigger, I often need to flush the DNS cache to clear my negative cache. The reason: Unbound treats responses dropped on the wireless link as true negative entries. To browse the web I routinely have to flush the negative cache before DNS will resolve. (To get to Bugzilla I had to flush twice ...)

Now, this may look like I simply have a dodgy wireless access point (I don't; I just don't get good signal), but it is a COMMON issue for people. Worse, ISPs with low-quality links would produce the same symptoms.

Unbound should distinguish a response with no data from a missing response. It should retry the latter and give up if that still fails, but NOT cache the result.

Version-Release number of selected component (if applicable):
unbound-1.4.21-3.fc20.x86_64

How reproducible:
Always.


Steps to Reproduce:
1. Install dnssec-trigger
2. Drop DNS packets

Actual results:
Unbound caches dropped responses as true negative entries.

Expected results:
Unbound should treat dropped responses as exactly that: Dropped responses. It should retry when queried next rather than forcing the user to continually flush the cache.
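
The expected behaviour can be sketched as a toy cache in Python. This is only an illustration of the request, not Unbound's code: the names Outcome and ToyCache, the upstream callback, and the TTL values are all assumptions made up for the example.

```python
import enum
import time
from typing import Callable, Dict, Tuple


class Outcome(enum.Enum):
    ANSWER = "answer"      # upstream returned records
    NEGATIVE = "negative"  # upstream explicitly said "no such name / no data"
    TIMEOUT = "timeout"    # no reply arrived at all (e.g. packet dropped)


class ToyCache:
    """Hypothetical cache: stores answers and explicit negatives, never timeouts."""

    def __init__(
        self,
        upstream: Callable[[str], Outcome],
        answer_ttl: float = 300.0,    # assumed value, for illustration only
        negative_ttl: float = 60.0,   # assumed value, for illustration only
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._upstream = upstream
        self._answer_ttl = answer_ttl
        self._negative_ttl = negative_ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[Outcome, float]] = {}

    def resolve(self, name: str) -> Outcome:
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]  # still-valid cache entry

        outcome = self._upstream(name)
        if outcome is Outcome.TIMEOUT:
            # A missing reply says nothing about the name: do not cache it,
            # so the next query goes upstream again.
            self._entries.pop(name, None)
            return outcome

        ttl = self._answer_ttl if outcome is Outcome.ANSWER else self._negative_ttl
        self._entries[name] = (outcome, now + ttl)
        return outcome
```

With an upstream that times out once and then answers, the second resolve() call reaches the upstream again and returns the answer, with no cache flush in between.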

Comment 1 Pavel Šimerda (pavlix) 2014-09-15 13:12:13 UTC

(In reply to William Brown from comment #0)
> Unbound treats dropped (packets)
> responses from wireless as true negative entries.

That sounds fatally wrong, but I suspected something like that might be happening.

Comment 2 William Brown 2014-09-15 13:20:08 UTC

[root@ammy ~]# unbound-control flush_zone .
ok removed 119 rrsets, 126 messages and 7 key entries
[root@ammy ~]# dig @127.0.0.1 ark.intel.com

; <<>> DiG 9.9.4-P2-RedHat-9.9.4-15.P2.fc20 <<>> @127.0.0.1 ark.intel.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 18288
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;ark.intel.com.			IN	A

;; Query time: 27 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Sep 15 22:48:48 ACST 2014
;; MSG SIZE  rcvd: 42

[root@ammy ~]# dig @172.24.9.3 ark.intel.com

; <<>> DiG 9.9.4-P2-RedHat-9.9.4-15.P2.fc20 <<>> @172.24.9.3 ark.intel.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58004
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 9, ADDITIONAL: 9

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;ark.intel.com.			IN	A

;; ANSWER SECTION:
ark.intel.com.		261	IN	CNAME	ark.intel.com.edgesuite.net.
ark.intel.com.edgesuite.net. 9228 IN	CNAME	a947.b.akamai.net.
a947.b.akamai.net.	11	IN	A	150.101.98.17
a947.b.akamai.net.	11	IN	A	150.101.98.8

;; AUTHORITY SECTION:
b.akamai.net.		961	IN	NS	n3b.akamai.net.
b.akamai.net.		961	IN	NS	n6b.akamai.net.
b.akamai.net.		961	IN	NS	n0b.akamai.net.
b.akamai.net.		961	IN	NS	n8b.akamai.net.
b.akamai.net.		961	IN	NS	n7b.akamai.net.
b.akamai.net.		961	IN	NS	n5b.akamai.net.
b.akamai.net.		961	IN	NS	n1b.akamai.net.
b.akamai.net.		961	IN	NS	n2b.akamai.net.
b.akamai.net.		961	IN	NS	n4b.akamai.net.

;; ADDITIONAL SECTION:
n4b.akamai.net.		10151	IN	A	150.101.98.73
n0b.akamai.net.		2983	IN	A	88.221.81.193
n7b.akamai.net.		10151	IN	A	150.101.98.73
n5b.akamai.net.		10133	IN	A	150.101.98.74
n6b.akamai.net.		2983	IN	A	150.101.98.76
n3b.akamai.net.		2983	IN	A	184.51.124.26
n2b.akamai.net.		6279	IN	A	150.101.98.76
n1b.akamai.net.		10151	IN	A	88.221.81.194

;; Query time: 1 msec
;; SERVER: 172.24.9.3#53(172.24.9.3)
;; WHEN: Mon Sep 15 22:48:57 ACST 2014
;; MSG SIZE  rcvd: 433
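
The difference between the two transcripts above is the status field in the header line: SERVFAIL from the local Unbound at 127.0.0.1, NOERROR with four answers from 172.24.9.3. A minimal sketch for pulling that field out of saved dig output when comparing resolvers follows; the helper name dig_status is an assumption for this example, and the regular expression only targets the header format shown in this report.

```python
import re
from typing import Optional

# Matches the ";; ->>HEADER<<- ... status: XXX" line that dig prints.
_STATUS_RE = re.compile(r"->>HEADER<<-.*?status:\s*([A-Z]+)")


def dig_status(output: str) -> Optional[str]:
    """Return the response status from dig's text output, or None if absent."""
    match = _STATUS_RE.search(output)
    return match.group(1) if match else None


# Header lines copied from the two transcripts in this comment.
local = ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 18288"
upstream = ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58004"

print(dig_status(local), dig_status(upstream))  # prints: SERVFAIL NOERROR
```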

Comment 3 Paul Wouters 2014-09-15 14:15:15 UTC

I agree this is a bug. I have experienced similar failures.

What is really needed here is an option for unbound to be much more aggressive when it is deployed as a "smart stub", e.g. on a laptop or phone, especially in the face of packet loss.

Of course, that behaviour can never be made the default: on a network-wide cache it would open the cache up to a serious DDoS, simply by querying for known broken zones with bad nameservers.

Comment 4 Pavel Šimerda (pavlix) 2014-09-15 14:27:15 UTC

(In reply to Paul Wouters from comment #3)
> I agree this is a bug. I have experienced similar failures.
> 
> What is really needed here is an option for unbound to be much more
> aggressive when it is deployed as a "smart stub", eg on a laptop or phone.
> It should be a lot more aggressive, especially in the light of packet loss.
> 
> Of course, that behaviour can never be made the default because as a network
> wide cache, such behaviour would open up the cache to some serious DDOS, by
> simply querying for known broken zones with bad nameservers.

Sounds like a suggestion separate from fixing this particular bug report.

Comment 5 Paul Wouters 2014-09-15 14:36:46 UTC

Can you configure this in unbound.conf as a test:

val-bogus-ttl: 3

If that works, we could let dnssec-trigger send an unbound-control set_option for that.
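
A rough sketch of what such a call could look like if a helper program issued it, written in Python. It assumes the unbound-control subcommand is spelled set_option and takes the "option: value" form described in unbound-control(8); the helper name set_option_argv is made up for this example, and nothing here is taken from dnssec-trigger itself. The sketch only builds and prints the command line; running it would need a reachable unbound instance with remote control enabled.

```python
import shlex
from typing import List


def set_option_argv(option: str, value: str) -> List[str]:
    """Build the argv for a runtime option change (hypothetical helper).

    Assumes the 'set_option opt: val' syntax of unbound-control, where the
    option name keeps its trailing colon as in unbound.conf.
    """
    return ["unbound-control", "set_option", f"{option}:", value]


argv = set_option_argv("val-bogus-ttl", "3")
print(shlex.join(argv))

# To actually apply it (not executed in this sketch):
#   import subprocess
#   subprocess.run(argv, check=True)
```

Note that a value set this way would not survive a restart of unbound unless it is also written to unbound.conf.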

Comment 6 William Brown 2014-09-21 04:21:10 UTC

It seems to be better, but I would like to test for a few more days in a variety of circumstances before I claim this to be the case.

Comment 7 Pavel Šimerda (pavlix) 2014-09-21 12:29:02 UTC

Shouldn't we rather fix unbound to avoid caching non-answers, if that was the main issue? Also, the documentation (as well as the option name) speaks about bogus data, not about unanswered queries.

Comment 8 Tomáš Hozza 2014-09-22 07:48:50 UTC

(In reply to Pavel Šimerda (pavlix) from comment #7)
> Shouldn't we rather fix unbound to avoid caching non-answers if that was the
> main issue? Also the documentation (as well as the name) speaks about bogus
> data, not about unanswered queries.

I'm for fixing unbound to not interpret no answer as bogus data. Pavel, can you
please discuss the possible reasons behind this behavior with upstream first?

Thank you.

Comment 9 William Brown 2015-01-13 11:31:08 UTC

Hi,

Has this issue been resolved?

Comment 10 Pavel Šimerda (pavlix) 2015-01-26 14:49:12 UTC

Haven't looked at it yet. Any information is welcome.

Comment 11 Jan Kurik 2015-07-15 14:37:51 UTC

This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle.
Changing version to '23'.

(As we did not run this process for some time, it could also affect bugs from pre-Fedora 23
development cycles. We are very sorry. It will help us with cleanup during Fedora 23 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora23

Comment 12 Fedora End Of Life 2016-11-24 11:13:18 UTC

This message is a reminder that Fedora 23 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 23. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '23'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 23 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 13 Fedora End Of Life 2016-12-20 12:53:25 UTC

Fedora 23 changed to end-of-life (EOL) status on 2016-12-20. Fedora 23 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 14 Fedora End Of Life 2017-02-28 09:37:51 UTC

This bug appears to have been reported against 'rawhide' during the Fedora 26 development cycle.
Changing version to '26'.

Comment 15 Fedora End Of Life 2018-05-03 08:19:03 UTC

This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '26'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 26 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 16 Fedora End Of Life 2018-05-29 12:37:09 UTC

Fedora 26 changed to end-of-life (EOL) status on 2018-05-29. Fedora 26
is no longer maintained, which means that it will not receive any
further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
