Bug 1879028 - systemd: Caching DNS resolver is not DNSSEC-aware
Summary: systemd: Caching DNS resolver is not DNSSEC-aware
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 33
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedFreezeException
Duplicates: 1883090
Depends On:
Blocks: 1945309 F33FinalFreezeException 1884238
Reported: 2020-09-15 08:57 UTC by Florian Weimer
Modified: 2022-04-18 20:17 UTC
CC List: 28 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1884238
Environment:
Last Closed: 2021-03-27 12:53:55 UTC
Type: Bug
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | systemd/systemd/issues/10579 | 0 | None | open | First DNSSEC failure passed on even with allow-downgrade | 2021-02-20 14:38:01 UTC
Github | systemd/systemd/issues/17218 | 0 | None | closed | systemd-resolved's local DNS stub incorrectly returs NOTIMP on DO/CD queries | 2021-02-20 14:38:00 UTC
Github | systemd/systemd/issues/18714 | 0 | None | closed | resolved: DNSSEC records still not passed to applications | 2021-03-16 18:53:59 UTC
Github | systemd/systemd/issues/4621 | 0 | None | closed | resolved stub resolver doesn't provide RRSIG data in replies when DO/CD queries are sent to it | 2021-02-20 14:38:01 UTC

Description Florian Weimer 2020-09-15 08:57:10 UTC
How to reproduce:

$ dig +dnssec www.iana.org  

; <<>> DiG 9.11.22-RedHat-9.11.22-1.fc33 <<>> +dnssec www.iana.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41992
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 65494
; OPT=5: 05 07 08 0a 0d 0e 0f (".......")
; OPT=6: 01 02 04 ("...")
; OPT=7: 01 (".")
;; QUESTION SECTION:
;www.iana.org.                  IN      A

;; ANSWER SECTION:
www.iana.org.           2750    IN      CNAME   ianawww.vip.icann.org.
ianawww.vip.icann.org.  30      IN      A       192.0.32.8

;; Query time: 73 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Tue Sep 15 04:56:09 EDT 2020
;; MSG SIZE  rcvd: 112


This should show DNSSEC records, but it does not because systemd-resolved has stripped them.

Systems that rely on the availability of DNSSEC data will break as a result of upgrading to Fedora 33 if they use NetworkManager to manage /etc/resolv.conf.
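
For readers unfamiliar with what `dig +dnssec` changes on the wire: the option sets the DO ("DNSSEC OK") flag in the EDNS0 OPT pseudo-record of the query, and the OPT pseudosection of the reply above shows `flags: do`. The Python sketch below builds such a query by hand. The function names, the query ID, and the 1232-byte UDP payload size are illustrative assumptions, not values taken from dig or systemd-resolved.

```python
import struct

DO_BIT = 0x8000   # "DNSSEC OK" flag inside the OPT record's TTL field
TYPE_A = 1
TYPE_OPT = 41
CLASS_IN = 1


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if not label:          # the root name "." has no labels
            continue
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int = TYPE_A, want_dnssec: bool = True,
                query_id: int = 0x1234, udp_size: int = 1232) -> bytes:
    """Build a one-question DNS query carrying an EDNS0 OPT record."""
    flags = 0x0100  # RD: ask the resolver to recurse
    # Counts: 1 question, 0 answers, 0 authority, 1 additional (the OPT record).
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 1)
    question = encode_name(name) + struct.pack("!HH", qtype, CLASS_IN)
    # OPT record: root owner name, TYPE=41, CLASS=UDP payload size,
    # TTL=extended RCODE (8 bits) | version (8 bits) | flags (16 bits), RDLEN=0.
    opt_ttl = DO_BIT if want_dnssec else 0
    opt = b"\x00" + struct.pack("!HHIH", TYPE_OPT, udp_size, opt_ttl, 0)
    return header + question + opt


query = build_query("www.iana.org")
print(len(query), query.hex())
```

Sending these bytes over UDP to 127.0.0.53 port 53 should be roughly equivalent to the query in the report. The complaint is that the reply carries no RRSIG records even though DO is set.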

Comment 1 Michael Catanzaro 2020-09-28 16:58:58 UTC
See https://fedoraproject.org/wiki/Changes/systemd-resolved#DNSSEC for rationale behind disabling DNSSEC.

Comment 2 Simo Sorce 2020-09-28 19:02:52 UTC
(In reply to Michael Catanzaro from comment #1)
> See https://fedoraproject.org/wiki/Changes/systemd-resolved#DNSSEC for
> rationale behind disabling DNSSEC.

Michael,
that document is confusing DNSSEC validation with *forwarding* DNSSEC replies to clients.

There are plenty of clients that know how to do DNSSEC validation themselves, and do not care (or maybe do not even want) the resolver to do validation for them.

This bug is about systemd-resolved not returning the DNS records the client requested. It is not about asking resolved to validate them. This is a very important distinction. Currently, resolved incorrectly filters out records that it should return. Returning them would cause no problem whatsoever for clients that do not care about DNSSEC (as they will not ask for them), but filtering them completely breaks clients that *are* DNSSEC-aware and request this information.

This is a *severe* bug for a resolver to have in 2020.
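
The distinction between forwarding and validating can be put in code. What follows is a hypothetical sketch of the forwarding policy this comment asks for, not systemd-resolved's implementation: the stub never validates anything, and it drops DNSSEC-only record types only when the client did not set the DO bit. `Record` and `records_for_client` are made-up names, and the record data in the example is placeholder text.

```python
from typing import List, NamedTuple

CNAME, RRSIG, NSEC, NSEC3 = 5, 46, 47, 50
# Record types that exist only to carry DNSSEC signatures or denial proofs.
DNSSEC_ONLY_TYPES = {RRSIG, NSEC, NSEC3}


class Record(NamedTuple):
    name: str
    rtype: int
    rdata: str


def records_for_client(upstream: List[Record], client_set_do: bool,
                       qtype: int) -> List[Record]:
    """Choose which upstream records to hand back to the client.

    No validation happens here. A client that set DO gets everything;
    other clients get the answer without DNSSEC-only records, unless they
    asked for one of those types explicitly.
    """
    if client_set_do:
        return list(upstream)
    return [r for r in upstream
            if r.rtype not in DNSSEC_ONLY_TYPES or r.rtype == qtype]


upstream = [
    Record("www.iana.org.", CNAME, "ianawww.vip.icann.org."),
    Record("www.iana.org.", RRSIG, "<signature over the CNAME>"),
]
print(records_for_client(upstream, client_set_do=True, qtype=1))
print(records_for_client(upstream, client_set_do=False, qtype=1))
```

Under this policy a client that never sets DO sees the same answer as before, which is why passing the records through is described above as harmless to such clients.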

Comment 3 Michael Catanzaro 2020-09-28 21:25:37 UTC
Yeah, but the problem is that systemd-resolved doesn't have a knob for this (forward but don't validate), and it's rather late at this point to change that in time for F33. This was low priority because it usually only matters for certain server deployments, where you would presumably either (a) enable the DNSSEC option, or (b) disable systemd-resolved.

Testing this now, I find that (a) doesn't actually work like I had hoped. It looks like systemd-resolved strips the RRSIG records unconditionally, even with DNSSEC=yes in /etc/systemd/resolved.conf. I don't know much about DNSSEC, so we'll need to ask Zbigniew or Lennart why it does that, but I do notice the records are the same regardless of which DNS server I query (`dig @1.1.1.1 +dnssec www.iana.org` vs. `dig @8.8.8.8 +dnssec www.iana.org`), so it seems like it should be OK to just pass them along...?
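
One way to compare replies from the stub and from an upstream server without reading dig output by eye is to list the record types in the raw response and look for RRSIG (type 46). The sketch below is a minimal parser under simplifying assumptions: it expects a well-formed, untruncated message and does no bounds checking. `skip_name`, `record_types`, and `has_rrsig` are hypothetical helper names, and the example message is built by hand with a fake signature body.

```python
import struct

TYPE_RRSIG = 46


def skip_name(msg: bytes, offset: int) -> int:
    """Return the offset just past the (possibly compressed) name at offset."""
    while True:
        length = msg[offset]
        if length == 0:                 # root label terminates the name
            return offset + 1
        if length & 0xC0 == 0xC0:       # 2-byte compression pointer ends the name
            return offset + 2
        offset += 1 + length            # ordinary label: length byte + data


def record_types(msg: bytes) -> list:
    """List the TYPE of every record in the answer, authority and additional sections."""
    qdcount, ancount, nscount, arcount = struct.unpack("!HHHH", msg[4:12])
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(msg, offset) + 4      # skip QTYPE and QCLASS
    types = []
    for _ in range(ancount + nscount + arcount):
        offset = skip_name(msg, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", msg[offset:offset + 10])
        types.append(rtype)
        offset += 10 + rdlength                  # fixed fields plus RDATA
    return types


def has_rrsig(msg: bytes) -> bool:
    return TYPE_RRSIG in record_types(msg)


# Hand-built reply: one question, one A record, one (fake) RRSIG record.
question = b"\x03www\x04iana\x03org\x00" + struct.pack("!HH", 1, 1)
a_record = b"\xc0\x0c" + struct.pack("!HHIH", 1, 1, 30, 4) + bytes([192, 0, 32, 8])
rrsig = b"\xc0\x0c" + struct.pack("!HHIH", TYPE_RRSIG, 1, 30, 3) + b"abc"
reply = struct.pack("!HHHHHH", 1, 0x8180, 1, 2, 0, 0) + question + a_record + rrsig
print(record_types(reply), has_rrsig(reply))
```

Applied to real replies, `has_rrsig` returning true for an upstream server and false for 127.0.0.53 would correspond to the stripping observed in this comment.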

Comment 4 Paul Wouters 2020-09-29 02:46:35 UTC
*** Bug 1883090 has been marked as a duplicate of this bug. ***

Comment 5 Paul Wouters 2020-09-29 02:47:26 UTC
From my (closed duplicate) bug:

When a DNS library or application sends DNS requests with the DO bit set, systemd-resolved does not return the proper DNSSEC records because DNSSEC has been completely disabled.

This breaks any DNS library and program that depends on these records. It also undermines DNS security.

Two concrete examples:

libreswan links against libunbound and uses its own DNSSEC validation based on the forwarders specified in /etc/resolv.conf. With systemd-resolved populating resolv.conf with 127.0.0.53, libreswan is given a broken forwarder. DNSSEC fails and ALL RESOLVING within libreswan fails.

My Postfix server uses LetsEncrypt certificates and publishes and consumes TLSA records to validate the SMTP TLS channels. Postfix requests DNS data from the system resolver and expects the AD bit for DNSSEC-validated answers. systemd-resolved will never return these, so all TLSA query answers are ignored by Postfix, downgrading my email TLS security to anonymous TLS, which can now be MITM'ed by anyone.

This could be considered a CVE-magnitude issue.
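
The AD ("Authenticated Data") flag mentioned in the Postfix example is a single bit in the 16-bit flags word of the DNS header. A minimal sketch of reading it follows. The helper names are assumptions, and the two headers are built by hand: the first mirrors the `flags: qr rd ra` line of the dig output in the description, the second is the same header with AD added.

```python
import struct

# Bit masks within the 16-bit flags word of the DNS header,
# listed in the order dig prints them.
FLAG_BITS = [
    (0x8000, "qr"),
    (0x0400, "aa"),
    (0x0200, "tc"),
    (0x0100, "rd"),
    (0x0080, "ra"),
    (0x0020, "ad"),
    (0x0010, "cd"),
]


def header_flags(message: bytes) -> list:
    """Return the names of the header flags set in a DNS message."""
    if len(message) < 12:
        raise ValueError("DNS header is 12 bytes; message is too short")
    (flags,) = struct.unpack("!H", message[2:4])
    return [name for bit, name in FLAG_BITS if flags & bit]


def is_authenticated(message: bytes) -> bool:
    """True if the responder set the AD bit."""
    return "ad" in header_flags(message)


unvalidated = struct.pack("!HHHHHH", 41992, 0x8180, 1, 2, 0, 1)
validated = struct.pack("!HHHHHH", 41992, 0x81A0, 1, 2, 0, 1)
print(header_flags(unvalidated), is_authenticated(unvalidated))
print(header_flags(validated), is_authenticated(validated))
```

A client that acts on this bit is trusting whoever set it, so the check is only meaningful when the path to the resolver is itself trusted.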

Comment 6 Paul Wouters 2020-09-29 02:49:58 UTC
Note that this bug was already reported upstream 1.5 years ago: https://github.com/systemd/systemd/issues/12317

Comment 7 Petr Menšík 2020-09-29 12:08:43 UTC
If the command "delv fedoraproject.org" from bind-utils cannot return a fully validated answer, I think systemd-resolved is not fit to be the default system resolver. dnsmasq is not high-quality code, but it can do this successfully.

Even dnsmasq was fixed to pass DNSSEC queries correctly. systemd-resolved should be higher-quality code; please fix it too.

Comment 8 Zbigniew Jędrzejewski-Szmek 2020-09-30 15:22:12 UTC
This was discussed in the FESCo meeting today
(https://meetbot.fedoraproject.org/fedora-meeting-2/2020-09-30/fesco.2020-09-30-14.00.html):
* AGREED: Add #1879028 as FE, close 2476. (+5, 1, -0)

Comment 9 Björn Persson 2020-09-30 16:14:12 UTC
(In reply to Paul Wouters from comment #6)
> note this bug was already reported to upstream 1.5 years ago:
> https://github.com/systemd/systemd/issues/12317

That's not this bug though. That Github issue is about trusting the AD bit in responses from validating recursive servers. This Bugzilla issue is about systemd-resolved's refusal to honor the DO bit in requests from clients that want to do the validation themselves. The use cases are very different. Trusting the AD bit makes sense only in certain controlled circumstances. It can at most be a configuration option, never the default. The DO bit should be honored as a matter of course. If the client requests DNSSEC records, and the upstream server can provide them, then I see no reason for anyone but an attacker to withhold those records from the client.

Comment 10 Petr Menšík 2020-09-30 22:40:09 UTC
Okay, this issue might be more relevant [1]. Blind trust in the AD bit is not what we are interested in. The AD bit should be set only when the record comes from an authoritative server or it can be verified that it was not modified. It might work only with a TLS channel to the recursor.

1. https://github.com/systemd/systemd/issues/17218

Comment 11 Michael Catanzaro 2020-10-13 22:25:58 UTC
(I've removed a couple of the linked issues to try to simplify expectations for what upstream bugs need to be fixed here. systemd#4621 seems to be the primary issue.)

Comment 12 Petr Menšík 2020-10-29 09:18:36 UTC
Is there any progress on the support? Can we help somehow? Can we test a candidate? Make a recommendation on intended behaviour? Fedora 33 is out and this bug has not changed status. Is it considered important to fix?

Comment 13 Michael Catanzaro 2020-11-04 15:19:34 UTC
It sounds like we have a rough consensus that these bugs should be fixed, but nobody willing to do the work.

(In reply to Petr Menšík from comment #12)
> Can we help somehow?

If a systemd developer could confirm that pull requests would be accepted, then I bet development help would be appreciated.

Comment 14 Zbigniew Jędrzejewski-Szmek 2020-11-05 14:34:26 UTC
> If a systemd developer could confirm that pull requests would be accepted, then I bet development help would be appreciated.

Actually, the work on this is already progressing. Some preparatory PRs have been merged
(https://github.com/systemd/systemd/pull/17476, https://github.com/systemd/systemd/pull/17521),
and the PR that actually fixes the issue is in final stages of preparation.

If somebody wants to work on other issues, there's plenty:
https://github.com/systemd/systemd/issues?q=is%3Aopen+is%3Aissue+label%3Aresolve (123),
about 50 not marked as needing feedback or discussion.
It is quite likely that some of those issues are already resolved or invalid in another way,
so bug triage would be helpful too. If you want to work on code, unless it's an isolated bug,
it's probably best to drop a note in the ticket to avoid duplicate work.

Comment 15 Michael Catanzaro 2020-12-02 15:39:08 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #14)
> > If a systemd developer could confirm that pull requests would be accepted, then I bet development help would be appreciated.
> 
> Actually, the work on this is already progressing. Some preparatory PRs have
> been merged
> (https://github.com/systemd/systemd/pull/17476,
> https://github.com/systemd/systemd/pull/17521),
> and the PR that actually fixes the issue is in final stages of preparation.

This is great news. Any update on this?

Comment 16 Zbigniew Jędrzejewski-Szmek 2020-12-02 15:53:57 UTC
https://github.com/systemd/systemd/pull/17535 is a huge PR with a number of fixes.
But it's being split into smaller PRs, see the list near the end on GitHub.

Comment 17 Petr Menšík 2021-02-20 14:48:34 UTC
The upstream issues were closed without a good explanation or an actual fix. Is there a reason why they were marked as fixed? My testing indicates that validation on clients still does not work.

Is my testing wrong? Is a fix still planned?

Comment 18 Zbigniew Jędrzejewski-Szmek 2021-02-20 18:42:02 UTC
Hi Petr,

if there is some upstream discussion, please continue the discussion there. It's mostly the same people
in both places, but not all of them, so suddenly asking here just makes the whole thing harder
to follow. There is no patch in Fedora yet, and this ticket is not appropriate for upstream discussions.

Comment 19 Petr Menšík 2021-02-22 10:28:05 UTC
I have the impression that my remarks and questions are ignored upstream. I have no permission to reopen an unfixed upstream issue, as I have here on Bugzilla. I would reopen it myself, but I cannot.

I think I addressed you, Zbigniew, upstream 10 days ago in the closed ticket, without any response. Because no discussion continued there, and because that ticket got closed without any explanation by you, I tried here.

Could you tell me a better place for discussions with upstream then? Upstream issues don't work for me.

Is this bug expected to be fixed in F34 at least? Because it got an exception for F33, I would have thought it would be fixed already. Is its priority low? Is there a plan to remove systemd-resolved from the default installation, at least on Fedora Server?

Comment 20 Zbigniew Jędrzejewski-Szmek 2021-02-22 10:42:03 UTC
Oh, for heaven's sake. Bugtrackers allow asynchronous communication. Nobody has replied to your comment, yet.
Hopefully somebody will get to it at some point. You can't expect us to reply to your every comment with
"Communication received. Will reply at a later time".

And yes, I also have the impression that your remarks and questions are ignored at upstream. I can't
speak for anyone else authoritatively, but every pointless comment on any of the involved bugtrackers
(and at times you've posted in three or four different places about one issue), increases that impression.

Comment 21 Jan Pazdziora 2021-02-22 10:52:42 UTC
Zbigniew, irrespective of the upstream discussion and of whatever changes are implemented upstream, Petr raises a valid question in the context of the Fedora distribution, which is supposed to integrate technologies into a working, useful setup. That question is: what functionality can admins and users expect with respect to DNSSEC with the default settings, for example on Fedora Server?

Comment 22 Zbigniew Jędrzejewski-Szmek 2021-02-22 10:59:36 UTC
Yes, the plan is to fix that for F34. As the patches merged in upstream show, which Petr is very well aware
of, since he reviewed some of them, this is being fixed. The fix is not complete at this point, but we're
working on it.

Comment 23 Petr Menšík 2021-02-22 16:20:06 UTC
Is there a reason why the issue was closed when the fix is not yet complete? It makes tracking the progress of the fix harder, and not only for me. Since you are working on it, is there a specific reason why this bug is still in the NEW state?

My attempts to improve Fedora's DNS service are clearly not well received. How welcoming. As long as systemd-resolved is in the default installation, I will report any bug I find. I find doing so in systemd hard and unwelcoming, but necessary. I get no fun out of pestering you. I do it for Fedora's benefit, not for my own pleasure, I assure you.

Thanks Jan for support.

Comment 24 Zbigniew Jędrzejewski-Szmek 2021-02-22 22:44:29 UTC
> Is there a reason why the issue was closed when the fix is not yet complete?

It was closed automatically by GitHub because a patch which had 'Fixes #nnn' was merged.
But you opened https://github.com/systemd/systemd/issues/18714, so the fact that the issue is
not fully fixed is being tracked.

> Since you are working on it, is there a specific reason why this bug is still in the NEW state?

Because people generally don't bother with Status states in Fedora: if we set it to assigned,
that'd imply that I'm working on it, which is not really true, since other people are working on
it too. Once patches are available, Status will be updated.

>  As long as systemd-resolved is in the default installation, I will report any bug I find.

Yes, thank you for this. Your reports are useful.

Comment 25 Zbigniew Jędrzejewski-Szmek 2021-02-24 08:47:10 UTC
https://github.com/systemd/systemd/issues/18714#issuecomment-784623447:
> Confirmed, thank you. This time client validation works both for positive and negative answers.

Comment 26 Michael Catanzaro 2021-03-16 18:50:16 UTC
Hi Petr, has this issue been resolved to your satisfaction?

Comment 27 Michael Catanzaro 2021-03-16 18:57:19 UTC
(Discounting the issue with DNSSEC=allow-downgrade. We know that setting is unreliable.)

Comment 28 Petr Menšík 2021-03-27 10:02:04 UTC
(In reply to Michael Catanzaro from comment #26)
> Hi Petr, has this issue been resolved to your satisfaction?

Not completely, no. The default installation still prevents any verification. With DNSSEC=no, which is still the default even on F34, validation is not possible. Even if the upstream resolvers are all DNSSEC-enabled, systemd-resolved blocks validation by a third party behind the stub resolver. It should not validate by itself in that case, but I think it should not strip DNSSEC records requested by capable clients.

But I think I will create another bug report for this case, because this is a significant improvement over the previous state.

Comment 29 Michael Catanzaro 2021-03-27 12:53:55 UTC
OK, a matter for another bug report indeed.

