Bug 1879028
Summary: | systemd: Caching DNS resolver is not DNSSEC-aware | |||
---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Florian Weimer <fweimer> | |
Component: | systemd | Assignee: | systemd-maint | |
Status: | CLOSED NEXTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | |
Severity: | high | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 33 | CC: | accounts+fedora, amessina, bjorn, dominik, fedoraproject, filbranden, flepied, germano.massullo, johnh, jorti, jpazdziora, lnykryn, mcatanza, mh, msekleta, ngompa13, pbrobinson, pemensik, pwouters, redhat, ssahani, s, ssorce, systemd-maint, tomek, yuwatana, zbyszek, z | |
Target Milestone: | --- | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | AcceptedFreezeException | |||
: | 1884238 (view as bug list) | Environment: | ||
Last Closed: | 2021-03-27 12:53:55 UTC | Type: | Bug | |
Bug Blocks: | 1766778, 1884238, 1945309 |
Description
Florian Weimer
2020-09-15 08:57:10 UTC
See https://fedoraproject.org/wiki/Changes/systemd-resolved#DNSSEC for rationale behind disabling DNSSEC.

(In reply to Michael Catanzaro from comment #1)
> See https://fedoraproject.org/wiki/Changes/systemd-resolved#DNSSEC for
> rationale behind disabling DNSSEC.

Michael, that document is confusing DNSSEC validation with *forwarding* DNSSEC replies to clients. There are plenty of clients that know how to do DNSSEC validation themselves, and do not care (or maybe do not even want) the resolver to do validation for them.

This bug is about systemd-resolved not reporting the DNS records the client requested. It is not about asking resolved to validate them. It is a very important distinction. Currently, resolved is incorrectly filtering out responses that it should give. Passing them along would cause absolutely no problem whatsoever for clients that do not care about DNSSEC (as they will ask for nothing), but filtering them completely breaks clients that *are* DNSSEC-aware and are requesting information. This is a *severe* bug for a resolver to have in 2020.

Yeah, but the problem is systemd-resolved doesn't have a knob for this (forward but don't validate), and it's rather late at this point to change that in time for F33. This was low priority because it usually only matters for certain server deployments, where you would presumably either (a) enable the DNSSEC option, or (b) disable systemd-resolved.

Testing this now, I find that (a) doesn't actually work like I had hoped. Looks like systemd-resolved strips the RRSIG records unconditionally, even with DNSSEC=yes in /etc/systemd/resolved.conf. I don't know much about DNSSEC, so we'll need to ask Zbigniew or Lennart to know why it does that, but I do notice the records are the same regardless of which DNS server I query (`dig @1.1.1.1 +dnssec www.iana.org` vs. `dig @8.8.8.8 +dnssec www.iana.org`), so it seems like it should be OK to just pass them along...?

*** Bug 1883090 has been marked as a duplicate of this bug.
***

From my (closed duplicate) bug:

When a DNS library or application sends DNS requests with the DO bit set, systemd-resolved does not return the proper DNSSEC records because DNSSEC has been completely disabled. This breaks any DNS library and program that depends on these records. It also undermines DNS security. Two concrete examples:

libreswan links against libunbound and uses its own DNSSEC validation based on the forwarders specified in /etc/resolv.conf. With systemd-resolved populating resolv.conf with 127.0.0.53, libreswan is given a broken forwarder. DNSSEC fails and ALL RESOLVING within libreswan fails.

My Postfix server uses LetsEncrypt certificates and publishes and consumes TLSA records to validate the SMTP TLS channels. Postfix requests DNS records from the system resolver and expects the AD bit for DNSSEC-validated answers. systemd-resolved will never return these, so all TLSA query answers are ignored by Postfix, downgrading my email TLS security to anonymous TLS, which can now be MITM'ed by anyone.

This could be considered a CVE-magnitude issue.

Note this bug was already reported to upstream 1.5 years ago: https://github.com/systemd/systemd/issues/12317

If the command "delv fedoraproject.org" from bind-utils cannot return a fully validated answer, I think systemd-resolved is not fit to be the default system resolver. dnsmasq is not high-quality code, but it can do this successfully. Even dnsmasq was fixed to pass DNSSEC queries correctly. systemd-resolved should be higher-quality code, so please fix it too.

This was discussed in the FESCo meeting today (https://meetbot.fedoraproject.org/fedora-meeting-2/2020-09-30/fesco.2020-09-30-14.00.html):

* AGREED: Add #1879028 as FE, close 2476. (+5, 1, -0)

(In reply to Paul Wouters from comment #6)
> note this bug was already reported to upstream 1.5 years ago:
> https://github.com/systemd/systemd/issues/12317

That's not this bug though. That GitHub issue is about trusting the AD bit in responses from validating recursive servers.
This Bugzilla issue is about systemd-resolved's refusal to honor the DO bit in requests from clients that want to do the validation themselves. The use cases are very different. Trusting the AD bit makes sense only in certain controlled circumstances. It can at most be a configuration option, never the default. The DO bit should be honored as a matter of course. If the client requests DNSSEC records, and the upstream server can provide them, then I see no reason for anyone but an attacker to withhold those records from the client.

Okay, this issue might be more relevant [1]. Blind trust in the AD bit is not what we are interested in. The AD bit should be set only when the record is offered by an authoritative server or when it can be verified that it was not modified. It might work only over a TLS channel to the recursor.

1. https://github.com/systemd/systemd/issues/17218

(I've removed a couple of the linked issues to try to simplify expectations for what upstream bugs need to be fixed here. systemd#4621 seems to be the primary issue.)

Is there any progress on the support? Can we help somehow? Can we test a candidate? Make a recommendation on intended behaviour? Fedora 33 is out and this bug has not changed status. Is it considered important to fix?

It sounds like we have a rough consensus that these bugs should be fixed, but nobody willing to do the work.

(In reply to Petr Menšík from comment #12)
> Can we help somehow?

If a systemd developer could confirm that pull requests would be accepted, then I bet development help would be appreciated.

> If a systemd developer could confirm that pull requests would be accepted, then I bet development help would be appreciated.

Actually, the work on this is already progressing. Some preparatory PRs have been merged (https://github.com/systemd/systemd/pull/17476, https://github.com/systemd/systemd/pull/17521), and the PR that actually fixes the issue is in final stages of preparation.
If somebody wants to work on other issues, there's plenty: https://github.com/systemd/systemd/issues?q=is%3Aopen+is%3Aissue+label%3Aresolve (123), about 50 not marked as needing feedback or discussion. It is quite likely that some of those issues are already resolved or invalid in another way, so bug triage would be helpful too. If you want to work on code, unless it's an isolated bug, it's probably best to drop a note in the ticket to avoid duplicate work.

(In reply to Zbigniew Jędrzejewski-Szmek from comment #14)
> > If a systemd developer could confirm that pull requests would be accepted, then I bet development help would be appreciated.
>
> Actually, the work on this is already progressing. Some preparatory PRs have
> been merged
> (https://github.com/systemd/systemd/pull/17476,
> https://github.com/systemd/systemd/pull/17521),
> and the PR that actually fixes the issue is in final stages of preparation.

This is great news. Any update on this?

https://github.com/systemd/systemd/pull/17535 is a huge PR with a number of fixes. But it's being split into smaller PRs; see the list near the end on GitHub.

The upstream issue was closed without a good explanation or an actual fix. Is there a reason why it was marked as fixed? My testing indicates it still does not allow validation on clients. Is my checking wrong? Is it still planned to fix it?

Hi Petr, if there is some upstream discussion, please continue the discussion there. It's mostly the same people in both places, but not all of them, so suddenly asking here just makes the whole thing harder to follow. There is no patch in Fedora yet, and this ticket is not appropriate for upstream discussions.

I have the impression that my remarks and questions are ignored upstream. I have no access to reopen the unfixed upstream issue, like I have here on Bugzilla. I would reopen it myself, but I cannot. I think I targeted you, Zbigniew, upstream 10 days ago in the closed ticket, without any response.
Because no discussion continued there, I tried it here. Because that ticket got closed without any explanation by you. Could you tell me a better place for discussions with upstream then? Upstream issues don't work for me. Is this bug expected to be fixed in F34 at least? Because it got an exception for F33, I would think it would be fixed already. Is its priority low? Is there a plan to remove systemd-resolved by default, at least from Fedora Server?

Oh, for heaven's sake. Bugtrackers allow asynchronous communication. Nobody has replied to your comment, yet. Hopefully somebody will get to it at some point. You can't expect us to reply to your every comment with "Communication received. Will reply at a later time". And yes, I also have the impression that your remarks and questions are ignored at upstream. I can't speak for anyone else authoritatively, but every pointless comment on any of the involved bugtrackers (and at times you've posted in three or four different places about one issue) increases that impression.

Zbigniew, irrespective of the upstream discussion and implementation of whatever changes upstream, Petr raises a valid question in the context of the Fedora distribution, which is supposed to integrate technologies into a working, useful setup. And that is -- what functionality can admins and users expect WRT DNSSEC with the default settings, for example on Fedora Server.

Yes, the plan is to fix that for F34. As the patches merged in upstream show, which Petr is very well aware of, since he reviewed some of them, this is being fixed. The fix is not complete at this point, but we're working on it.

Is there a reason why the issue was closed when the fix is not yet complete? It makes tracking the progress of the fix harder (not only) for me. Was there some reason for it? If you are working on it, is there a specific reason why this bug is still in the NEW state? My attempts to improve Fedora's DNS service are clearly not well received. How welcoming.
As long as systemd-resolved is in the default installation, I will report any bug I find. I consider it hard and unwelcoming in systemd, but necessary. I take no fun in pestering you. I do that for Fedora's benefit, not my own pleasure, I assure you. Thanks, Jan, for the support.

> Is there reason, why was its issue closed, when is the fix not yet complete?

It was closed automatically by GitHub because a patch which had 'Fixes #nnn' was merged. But you opened https://github.com/systemd/systemd/issues/18714, so the fact that the issue is not fully fixed is being tracked.

> When you are working on it, is there specific reason, why this bug is still in a NEW state?

Because people generally don't bother with Status states in Fedora: if we set it to assigned, that'd imply that I'm working on it, which is not really true, since other people are working on it too. Once patches are available, Status will be updated.

> As long as systemd-resolved is in default installation, I would report any bug I found.

Yes, thank you for this. Your reports are useful.

https://github.com/systemd/systemd/issues/18714#issuecomment-784623447:
> Confirmed, thank you. This time client validation works both for positive and negative answers.

Hi Petr, has this issue been resolved to your satisfaction? (Discounting the issue with DNSSEC=allow-downgrade. We know that setting is unreliable.)

(In reply to Michael Catanzaro from comment #26)
> Hi Petr, has this issue been resolved to your satisfaction?

Not completely, no. The default installation still prevents any verification. With DNSSEC=no, which is still the default even on F34, validation is not possible. Even if the upstream resolvers are all DNSSEC-enabled, systemd-resolved would block validation by a third party behind the stub resolver. It should not validate itself in that case, but I think it should not strip DNSSEC records requested by capable clients.
But I think I will create another bug report for this case, because this is a significant improvement over the previous state.

OK, a matter for another bug report indeed.
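
For readers who want to see the DO bit and AD bit that this thread argues about at the wire level, here is a minimal sketch in Python using only the standard library. It is an illustration, not code from systemd or from any commenter: the function names (`build_query`, `ad_bit_set`, `count_rrsigs`) and the chosen UDP payload size of 1232 are assumptions made for the example. It builds a query that carries an EDNS0 OPT record with the DO ("DNSSEC OK") flag set, and it inspects a reply for the AD flag and for RRSIG records, which are the two things a DNSSEC-aware client would look for.

```python
import struct

TYPE_A = 1
TYPE_OPT = 41        # EDNS0 pseudo-record
TYPE_RRSIG = 46      # DNSSEC signature record
CLASS_IN = 1
RD_BIT = 0x0100      # header flag: recursion desired
AD_BIT = 0x0020      # header flag: authenticated data
DO_BIT = 0x8000      # EDNS0 flag: DNSSEC OK (client wants DNSSEC records)


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    labels = [part for part in name.rstrip(".").split(".") if part]
    encoded = b"".join(bytes([len(p)]) + p.encode("ascii") for p in labels)
    return encoded + b"\x00"


def build_query(name: str, qtype: int = TYPE_A, want_dnssec: bool = True,
                query_id: int = 0x1234) -> bytes:
    """Build a DNS query; with want_dnssec, append an OPT record with DO set."""
    arcount = 1 if want_dnssec else 0
    header = struct.pack("!HHHHHH", query_id, RD_BIT, 1, 0, 0, arcount)
    question = encode_name(name) + struct.pack("!HH", qtype, CLASS_IN)
    if not want_dnssec:
        return header + question
    # OPT record: root name, TYPE=41, CLASS=UDP payload size (assumed 1232),
    # TTL field = extended RCODE (8 bits) | version (8 bits) | flags (16 bits),
    # RDLENGTH=0.
    opt = b"\x00" + struct.pack("!HHIH", TYPE_OPT, 1232, DO_BIT, 0)
    return header + question + opt


def ad_bit_set(message: bytes) -> bool:
    """Return True if the AD flag is set in the header of a DNS message."""
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & AD_BIT)


def _skip_name(message: bytes, pos: int) -> int:
    """Return the offset just past the (possibly compressed) name at pos."""
    while True:
        length = message[pos]
        if length == 0:
            return pos + 1
        if (length & 0xC0) == 0xC0:  # two-byte compression pointer ends a name
            return pos + 2
        pos += 1 + length


def count_rrsigs(message: bytes) -> int:
    """Count RRSIG records in the answer, authority and additional sections."""
    _, _, qdcount, ancount, nscount, arcount = struct.unpack(
        "!HHHHHH", message[:12])
    pos = 12
    for _ in range(qdcount):
        pos = _skip_name(message, pos) + 4  # skip QTYPE and QCLASS
    found = 0
    for _ in range(ancount + nscount + arcount):
        pos = _skip_name(message, pos)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", message[pos:pos + 10])
        pos += 10 + rdlength
        if rtype == TYPE_RRSIG:
            found += 1
    return found
```

One way to use the sketch, assuming a reachable resolver, is to send the bytes from `build_query("www.iana.org")` over UDP to port 53 of the stub address mentioned above (127.0.0.53) and again to an upstream server, then compare `count_rrsigs()` and `ad_bit_set()` on the two replies. A resolver that honors the DO bit would be expected to return RRSIG records for a signed zone; whether the AD flag should be trusted is the separate question discussed in the comments.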