Description of problem:
Inability to resolve addresses where only a CNAME is present. This effectively breaks all direct usage of DNS beyond the glibc resolver, including flatpak runtimes that do not have systemd 247 or higher, and various host utilities.

Version-Release number of selected component (if applicable):
systemd-networkd-248~rc2-1.fc34.x86_64

How reproducible:
Always

Steps to Reproduce:
1. nslookup www.netflix.com 127.0.0.53

Actual results:
No IP addresses in the output, only CNAMEs

Expected results:
The CNAMEs and the IP addresses they resolve to are in the output

Additional info:
Upstream bug: https://github.com/systemd/systemd/issues/18819
Fedora 33 Workstation:

$ nslookup www.netflix.com 127.0.0.53
Server: 127.0.0.53
Address: 127.0.0.53#53

Non-authoritative answer:
www.netflix.com canonical name = www.dradis.netflix.com.
www.dradis.netflix.com canonical name = www.us-west-2.internal.dradis.netflix.com.
www.us-west-2.internal.dradis.netflix.com canonical name = dualstack.apiproxy-website-nlb-prod-2-e98cb8cf33ff3581.elb.us-west-2.amazonaws.com.
Name: dualstack.apiproxy-website-nlb-prod-2-e98cb8cf33ff3581.elb.us-west-2.amazonaws.com
Address: 44.237.234.25
Name: dualstack.apiproxy-website-nlb-prod-2-e98cb8cf33ff3581.elb.us-west-2.amazonaws.com
Address: 44.234.232.238
Name: dualstack.apiproxy-website-nlb-prod-2-e98cb8cf33ff3581.elb.us-west-2.amazonaws.com
Address: 44.242.60.85
Name: dualstack.apiproxy-website-nlb-prod-2-e98cb8cf33ff3581.elb.us-west-2.amazonaws.com
Address: 2600:1f14:62a:de82:822d:a423:9e4c:da8d
Name: dualstack.apiproxy-website-nlb-prod-2-e98cb8cf33ff3581.elb.us-west-2.amazonaws.com
Address: 2600:1f14:62a:de81:b848:82ee:2416:447e
Name: dualstack.apiproxy-website-nlb-prod-2-e98cb8cf33ff3581.elb.us-west-2.amazonaws.com
Address: 2600:1f14:62a:de80:69a8:7b12:8e5f:855d

$ rpm -q systemd
systemd-246.10-1.fc33.x86_64

Fedora 34 Workstation:

$ nslookup www.netflix.com 127.0.0.53
Server: 127.0.0.53
Address: 127.0.0.53#53

Non-authoritative answer:
www.netflix.com canonical name = www.dradis.netflix.com.
www.dradis.netflix.com canonical name = www.us-west-2.internal.dradis.netflix.com.

$ rpm -q systemd
systemd-248~rc2-1.fc34.x86_64
Proposed as a Freeze Exception for 34-beta by Fedora user chrismurphy using the blocker tracking app because: This is a regression, it'd be good to fix it. I'm not sure what criterion this would fall under for a blocker though.
Discussed during the 2021-03-01 blocker review meeting: [0] The decision to classify this bug as an "AcceptedFreezeException (Beta)" was made as it is a noticeable issue that cannot be fixed with an update. [0] https://meetbot.fedoraproject.org/fedora-blocker-review/2021-03-01/f34-blocker-review.2021-03-01-17.01.txt
This is effectively "the internet is broken," so we need to ensure it gets fixed no matter what, regardless of whether it meets any defined blocker criterion.
This must be caused by the recent rework to CNAME handling.
(In reply to Michael Catanzaro from comment #4) > This is effectively "the internet is broken," so we need to ensure it gets > fixed no matter what, regardless of whether it meets any defined blocker > criterion. I agree but could this be explained in both general and practical terms so we might figure out what criterion applies? Or alternatively get fesco to just say it's a blocker? Because I'm getting dnf and GNOME Software updates and package installs, and web browser is working. And yet flatpaks are all over the map, some have working internet others don't, with no discernible pattern.
CNAME records are... sort of like symlinks, but for DNS A and AAAA records instead of files. For example, say I have alias.example.com and want it to be served by the same server that handles foobar.example.com. You could configure it like this:

A alias.example.com 192.0.2.0
A foobar.example.com 192.0.2.0

Or you could configure it like this:

A foobar.example.com 192.0.2.0
CNAME alias.example.com foobar.example.com

They are equivalent. Sort of. (This explanation is probably not 100% technically correct, but that's more or less how it works.)

The latter configuration is currently broken, which means a large chunk of the internet will not work, with no easily discernible pattern as to why some websites work and others don't.

In the case of Netflix, if we run 'dig www.netflix.com' on Fedora 33, which is not broken, we see this:

;; ANSWER SECTION:
www.netflix.com. 11 IN CNAME www.dradis.netflix.com.
www.dradis.netflix.com. 59 IN CNAME www.us-west-2.internal.dradis.netflix.com.
www.us-west-2.internal.dradis.netflix.com. 59 IN CNAME dualstack.apiproxy-website-nlb-prod-2-e98cb8cf33ff3581.elb.us-west-2.amazonaws.com.
dualstack.apiproxy-website-nlb-prod-2-e98cb8cf33ff3581.elb.us-west-2.amazonaws.com. 59 IN A 44.234.232.238
dualstack.apiproxy-website-nlb-prod-2-e98cb8cf33ff3581.elb.us-west-2.amazonaws.com. 59 IN A 44.237.234.25
dualstack.apiproxy-website-nlb-prod-2-e98cb8cf33ff3581.elb.us-west-2.amazonaws.com. 59 IN A 44.242.60.85

Notice that there is a CNAME record for www.netflix.com pointing to www.dradis.netflix.com, and there is no A record for www.netflix.com, so it matches the second example from above. In Fedora 34, systemd-resolved does not handle the CNAMEs properly, and so we wind up with several "broken links" between www.netflix.com and dualstack.apiproxy-website-nlb-prod-2-e98cb8cf33ff3581.elb.us-west-2.amazonaws.com.

In short: the internet is broken.
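To make the symlink analogy concrete, here is a minimal Python sketch of how a resolver walks a CNAME chain until it reaches an address record. It is a toy model and not systemd-resolved's code: the `RECORDS` table, the `follow_cnames` name, and the hop limit are assumptions made up for illustration, using the example names and the documentation address from the comment above.

```python
# Toy zone data (hypothetical): name -> list of (record type, value).
RECORDS = {
    "alias.example.com": [("CNAME", "foobar.example.com")],
    "foobar.example.com": [("A", "192.0.2.0")],
}


def follow_cnames(name, records, max_hops=8):
    """Return (names visited, addresses found at the end of the chain)."""
    chain = [name]
    for _ in range(max_hops):
        rrset = records.get(chain[-1], [])
        addresses = [value for rtype, value in rrset if rtype in ("A", "AAAA")]
        if addresses:
            return chain, addresses
        targets = [value for rtype, value in rrset if rtype == "CNAME"]
        if not targets:
            # Dangling alias: the "broken link" case, no address to connect to.
            return chain, []
        chain.append(targets[0])
    # Gave up: the chain is longer than max_hops (or loops).
    return chain, []


if __name__ == "__main__":
    print(follow_cnames("alias.example.com", RECORDS))
```

A client that is handed only the CNAME records, as in the Fedora 34 nslookup output earlier in this bug, is left in the position of the dangling-alias branch: it knows the alias but has no address to connect to.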
If it's not fixed upstream ASAP, then the bad systemd update should be reverted in the meantime, because an issue like this would effectively block normal usage of Fedora 34. I'm sure not going to upgrade before it's fixed. :P Asking for a special FESCo blocker seems like a good idea to me, because I doubt this fails any of our existing release criteria. The release criteria try to anticipate as far as possible the sort of bugs that are likely to be so bad as to warrant blocking a release, but they can't predict everything. This is the sort of wild issue that's unlikely to ever happen again and is not worth adding to the release criteria unless it can be heavily generalized to something basic like "DNS should work."
(In reply to Chris Murphy from comment #6) > And yet flatpaks are all > over the map, some have working internet others don't, with no discernible > pattern. flatpak shouldn't have *any* impact on this. It may be that something unrelated is going wrong for flatpaks.
See the domain list here: https://pagure.io/fesco/issue/2585#comment-718637

The domains in that list resolve successfully in the Firefox (flathub) and Ungoogled Chromium (fedora) flatpaks on Fedora 33 with resolv.conf mode: stub, and on Fedora 34 with resolv.conf mode: foreign, managed by NetworkManager. Those domains all fail to resolve on Fedora 34 with resolv.conf mode: stub in those same flatpaks, but resolve with Fedora's rpm Firefox.

If flatpaks are excluded, the scope of "the internet is broken" is limited to Netflix. That's not likely worth blocking the release. If there were a better test to understand the scope of the problem, that would be nice.
So there are two cases here:

* freedesktop-sdk <= 20.08 flatpaks: these will use nss-dns, read 127.0.0.53 from /etc/resolv.conf, and speak DNS to systemd-resolved on the host without knowing anything about systemd-resolved
* Fedora 33/34 flatpaks, freedesktop-sdk 21.08 flatpaks: these will attempt to use nss-resolve and speak directly to systemd-resolved via varlink. This requires flatpak 1.10 or it will fail. If it fails, then it should fall back to nss-dns and then work the same as with older flatpaks.

I don't know how else to explain why names would be less likely to resolve in flatpaks, because regardless of how the flatpak app speaks to systemd-resolved, it should always receive the same results. I guess there is some other, separate bug that we don't understand and which has not been reported yet. E.g. the fallback from nss-resolve to nss-dns was broken in Fedora 33 for some time, eventually fixed by https://src.fedoraproject.org/rpms/systemd/c/779685bf4b1cdb74f6f20a6153299178a565e506?branch=f33. That particular issue could not have reappeared because the affected code no longer exists in Fedora 34, but it's possible that some sort of similar issue has appeared.
(In reply to Michael Catanzaro from comment #10) > So there are two cases here: Er, it's actually three cases: > * Fedora 33/34 flatpaks, freedesktop-sdk 21.08 flatpaks: these will attempt > to use nss-resolve and speak directly to systemd-resolved via varlink. This > requires flatpak 1.10 or it will fail. If it fails, then it should fall back > to nss-dns and then work the same as with older flatpaks. Because Fedora 33 flatpaks are different. There, the flatpak will attempt to use the older Fedora 33 version of nss-resolve, which will attempt to speak D-Bus to systemd-resolved. That will be blocked by xdg-dbus-proxy because the app will not have permission. Then it should fall back to nss-dns. Fedora 34 flatpaks and freedesktop-sdk 21.08 flatpaks have newer nss-resolve that will use varlink, which should hopefully work if you have flatpak 1.10. (But nobody has ever tested it before now, because the runtimes didn't exist yet. And it's not really possible to test if CNAMEs aren't working!)
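The three cases can be summarized as a small decision function. This is only a restatement of the reasoning in the comments above, not code from flatpak or systemd: the runtime labels, the `lookup_path` name, and the returned strings are hypothetical, and the varlink path is described above as something that "should hopefully work" rather than a verified fact.

```python
def parse_version(text):
    """Turn a dotted version such as '1.10.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def lookup_path(runtime, flatpak_version):
    """Describe how a flatpak app is expected to reach systemd-resolved.

    runtime is one of the hypothetical labels used below.
    """
    stub = "nss-dns -> stub resolver at 127.0.0.53"
    if runtime == "freedesktop-sdk<=20.08":
        # Knows nothing about systemd-resolved, just reads /etc/resolv.conf.
        return stub
    if runtime == "fedora-33":
        # Older nss-resolve speaks D-Bus, which xdg-dbus-proxy blocks.
        return "nss-resolve over D-Bus (blocked), then " + stub
    if runtime in ("fedora-34", "freedesktop-sdk-21.08"):
        if parse_version(flatpak_version) >= (1, 10):
            return "nss-resolve over varlink"
        return "nss-resolve over varlink (fails), then " + stub
    raise ValueError("unknown runtime label: " + runtime)


if __name__ == "__main__":
    for label in ("freedesktop-sdk<=20.08", "fedora-33", "fedora-34"):
        print(label, "=>", lookup_path(label, "1.10.1"))
```

Every path that ends at the stub resolver would be exposed to the CNAME problem in this bug, which is one way to read why only some flatpaks appear affected.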
(In reply to Chris Murphy from comment #9) > If flatpaks are excluded, the scope of "the internet is broken" is limited > to netflix. That's not likely worth blocking release. If there were a better > test to understand the scope of the problem, that would be nice. FWIW I assumed from the bug description that all CNAMEs were broken. It sounds like that is not the case after all....
(In reply to Michael Catanzaro from comment #11)
> (In reply to Michael Catanzaro from comment #10)
> > So there are two cases here:
>
> Er, it's actually three cases:
>
> > * Fedora 33/34 flatpaks, freedesktop-sdk 21.08 flatpaks: these will attempt
> > to use nss-resolve and speak directly to systemd-resolved via varlink. This
> > requires flatpak 1.10 or it will fail. If it fails, then it should fall back
> > to nss-dns and then work the same as with older flatpaks.

> Fedora 34 flatpaks and freedesktop-sdk 21.08 flatpaks have newer nss-resolve
> that will use varlink, which should hopefully work if you have flatpak 1.10.
> (But nobody has ever tested it before now, because the runtimes didn't exist
> yet. And it's not really possible to test if CNAMEs aren't working!)

FWIW I did test that freedesktop-sdk 21.08 successfully uses the varlink interface from flatpak; presumably the Fedora 34 runtime will work as well if the resolver is shipped.

As to the exact criteria for which CNAMEs fail to resolve, that is unknown. It might have to do with multiple levels of indirection, but I don't run a DNS server for a test setup. Some CNAMEs are fine, e.g. www.youtube.com (there is a simple CNAME setup there, not multiple levels of indirection).
For what it's worth, I'm currently using "ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf" as a workaround, which makes resolv.conf provide the real DNS servers instead of the stub resolver. That is possibly a less disruptive workaround in general than rolling back systemd. The systemd-resolved varlink interface is just fine; it is only the stub resolver that is working incorrectly.
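To check whether a given resolv.conf still points clients at the stub resolver, something like the following Python sketch could be used. It only looks for the 127.0.0.53 address mentioned in this bug; the function names are made up for illustration, and it does not try to reproduce the "stub"/"foreign" mode detection that resolvectl reports.

```python
STUB_ADDRESS = "127.0.0.53"


def nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines, in file order."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def uses_stub(resolv_conf_text):
    """True if clients reading this file would talk to the stub resolver."""
    return STUB_ADDRESS in nameservers(resolv_conf_text)
```

Feeding it the contents of /etc/resolv.conf before and after applying the symlink workaround should show `uses_stub` flip from True to False.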
OK, that explains the behavior difference from flatpak then. Anything using nss-dns or reading /etc/resolv.conf directly is broken, but anything using nss-resolve -- our default -- works properly. IMO this should still be a release blocker, but I would downgrade this to a final blocker instead of a beta blocker.
Actually that still doesn't *fully* explain what is going on. Why is www.netflix.com broken in system Firefox? That should be using nss-resolve, not the stub resolver, right? So something must be wrong with more than just the stub resolver?
This is a beta blocker now. https://pagure.io/fesco/issue/2585#comment-718986
>Why is www.netflix.com broken in system Firefox? And today it's working for me! On the same two laptops in the same configuration it had been failing for the past ~2 days.
I found another reproducer www.akamai.com. As said in upstream ticket, this (having CNAME1->CNAME2->A) seems to be a quite common pattern with Akamai sites so there are probably a lot of affected sites.
www.kernel.org has the same problem for me (several chained CNAMEs too).

Going back to systemd 247.3-3 fixed it for me.
Declared a F34 Beta blocker by FESCo with the caveat that "if the scope is really small or something we can revisit next week." https://pagure.io/fesco/issue/2585#comment-718986
Upstream fix: https://github.com/systemd/systemd/pull/18892/
Blocker status supersedes FE status, no need for both.
"And today it's working for me! On the same two laptops in the same configuration it had been failing for the past ~2 days." This is the pattern I'm seeing in the openQA case that I think is caused by this, too: *sometimes* it works, sometimes it doesn't, and it seems to go in patches (no test run will hit it for a while, then *every* test run will hit it for a while...)
Here's a scratch build with the patch backported: https://koji.fedoraproject.org/koji/taskinfo?taskID=63160807 please test and see how it goes for you, thanks.
Are CNAMEs typically ephemeral and this accounts for the variable behavior?
Tested with that scratch build: both www.akamai.com and www.netflix.com have their contents in the ANSWER SECTION as expected, and Netflix is accessible again with Firefox inside flatpak.
Works for me as well, and now Firefox and Ungoogled Chromium flatpaks are resolving all the previous sites that were failing.
"Are CNAMEs typically ephemeral and this accounts for the variable behavior?" Not typically, no, you don't usually want to change your DNS records too much. I'd think it must be something else, though I've no idea what. Round-robin responses, possibly.
First-level CNAMEs quite typically have a very short TTL in certain special scenarios such as CDNs. If you want to debug further, we need actual sample hostnames for the failures. But it seems the systemd-resolved fix was sufficient. The question is whether we will get a regular build with the fix backported, or wait for the next systemd RC, which presumably has the fix.

Note that for the openQA use cases it would be helpful to understand the architecture there: is the failing test using a DNS client that reads /etc/resolv.conf rather than using the glibc resolver? If yes, then it would likely be affected by this same thing.
(In reply to Adam Williamson from comment #25) > Here's a scratch build with the patch backported: > https://koji.fedoraproject.org/koji/taskinfo?taskID=63160807 > please test and see how it goes for you, thanks. Netflix, Akamai, Ask Fedora, Kernel.org, and all the other cases seem to be accessible from this build on Silverblue 34.
I'd rather not take a new RC. We're frozen. We need specific backports of specific fixes for the accepted FE and blocker bugs.
FEDORA-2021-ead59f24eb has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-ead59f24eb
I think it is incompletely fixed.

I am running F34 with systemd-248~rc2-3.fc34.x86_64. My internet connection is IPv4+IPv6 dual-stack.

On that host I have a Windows VM that I use with virt-manager. It uses systemd-resolved as its DNS server. On the previous buggy version I had the CNAME bug: nslookup www.intel.com gave me an empty result (because no IP was given with the CNAME). This now works.

But I think there's another bug concerning IPv6. Results from cmd:

C:\Users\lethalwp>nslookup
Default Server: little.lethalwp
Address: 192.168.122.1

> www.intel.com
Server: little.lethalwp
Address: 192.168.122.1

Non-authoritative answer:
Name: e11.dsca.akamaiedge.net ----> THIS ONE IS NOW OK.
Address: 23.61.4.6
Aliases: www.intel.com
         intel11.cn.edgekey.net

> mail.google.com
Server: little.lethalwp
Address: 192.168.122.1

Non-authoritative answer:
Name: googlemail.l.google.com
Address: 2a00:1450:400e:802::2005 ---> no V4 given?
Aliases: mail.google.com

C:\Users\lethalwp>ping mail.google.com
Ping request could not find host mail.google.com. Please check the name and try again.
<--- it tries to reach the IPv6 address, I suppose? No IPv4 address available.

This makes mail.google.com unavailable in the Windows VM. I can't compare with how it worked on F33 previously; I don't have that system available anymore.
dig shows both AAAA and A being returned; I don't know why Windows only shows the v6 address. My Windows VM is also dual-stack, but its v6 address is only link-local.

I also notice inconsistent results:

[lethalwp@little ~]$ dig @192.168.122.1 mail.google.com

; <<>> DiG 9.16.11-RedHat-9.16.11-5.fc34 <<>> @192.168.122.1 mail.google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28315
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;mail.google.com. IN A

;; ANSWER SECTION:
mail.google.com. 5951 IN CNAME googlemail.l.google.com.

;; AUTHORITY SECTION:
googlemail.l.google.com. 227 IN AAAA 2a00:1450:400e:802::2005
googlemail.l.google.com. 207 IN A 172.217.17.37

;; Query time: 0 msec
;; SERVER: 192.168.122.1#53(192.168.122.1)
;; WHEN: dim mar 07 14:35:27 CET 2021
;; MSG SIZE rcvd: 115

10 seconds later:

[lethalwp@little ~]$ dig @192.168.122.1 mail.google.com

; <<>> DiG 9.16.11-RedHat-9.16.11-5.fc34 <<>> @192.168.122.1 mail.google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13136
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;mail.google.com. IN A

;; ANSWER SECTION:
mail.google.com. 5849 IN CNAME googlemail.l.google.com.

;; Query time: 0 msec
;; SERVER: 192.168.122.1#53(192.168.122.1)
;; WHEN: dim mar 07 14:37:08 CET 2021
;; MSG SIZE rcvd: 81
Well it certainly looks like something is wrong... but that doesn't appear to be related to CNAMEs, does it? So please file a new bug. (Well, ideally two: an upstream bug is most important, but we also need a downstream bug for blocker or freeze exception purposes.)
(In reply to lethalwp from comment #35) > [lethalwp@little ~]$ dig @192.168.122.1 mail.google.com Wait, you're showing results from your router... this is not coming from systemd-resolved's stub resolver. The stub resolver would be 127.0.0.53. This bug report is for issues with CNAMEs using the stub resolver. But it looks like whatever is going wrong for you involves neither CNAMEs nor the stub resolver.
Well, I can reproduce the same thing here. nslookup gives both A and AAAA for mail.google.com from the real DNS server but only AAAA from the stub resolver. While the answer is incomplete, I don't think this is a blocker on the same level as the original CNAME issue. I didn't notice such issues because I have a fully functional native IPv6 stack, so the AAAA responses actually worked for me. File a separate bug *at least* on the systemd side. I think it would be clearer to have a separate bug in RHBZ for this as well.
Basically, compare the output of "dig @127.0.0.53 mail.google.com" vs "dig @127.0.0.53 mail.google.com A" vs "dig @127.0.0.53 mail.google.com AAAA" vs "dig @8.8.8.8 mail.google.com" vs "dig @8.8.8.8 mail.google.com AAAA". The responses are wildly different. Also compare "nslookup mail.google.com 127.0.0.53" vs "nslookup mail.google.com 8.8.8.8".

If you query the AAAA record, you get a sensible response from the stub resolver. But with the default type or A you again get a broken answer section.
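When comparing many dig runs by eye gets tedious, the sections can be pulled apart mechanically. The following Python sketch is one way to do that, assuming the usual dig text layout shown in this bug; `records_by_section` and `answer_has_type` are made-up helper names, and `SAMPLE` is trimmed from the mail.google.com output posted earlier in this bug.

```python
import re

SECTION_RE = re.compile(r"^;; (\w+) SECTION:$")

# Trimmed from the mail.google.com dig output posted earlier in this bug.
SAMPLE = """\
;; QUESTION SECTION:
;mail.google.com. IN A

;; ANSWER SECTION:
mail.google.com. 5951 IN CNAME googlemail.l.google.com.

;; AUTHORITY SECTION:
googlemail.l.google.com. 227 IN AAAA 2a00:1450:400e:802::2005
googlemail.l.google.com. 207 IN A 172.217.17.37

;; Query time: 0 msec
"""


def records_by_section(dig_output):
    """Map section name -> list of (owner name, record type, rdata)."""
    sections = {}
    current = None
    for raw in dig_output.splitlines():
        line = raw.strip()
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
            sections.setdefault(current, [])
            continue
        if not line:
            current = None  # a blank line ends the current section
            continue
        if current is None or line.startswith(";"):
            continue  # comment, question line, or text outside any section
        fields = line.split(None, 4)
        if len(fields) == 5:
            name, _ttl, _class, rtype, rdata = fields
            sections[current].append((name, rtype, rdata))
    return sections


def answer_has_type(dig_output, rtype):
    """True if the ANSWER section contains a record of the given type."""
    answer = records_by_section(dig_output).get("ANSWER", [])
    return any(found == rtype for _name, found, _rdata in answer)
```

For the sample, the ANSWER section holds only the CNAME while the A and AAAA records sit in the AUTHORITY section, so `answer_has_type(SAMPLE, "A")` is False even though an A record is present in the reply.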
I think the mail.google.com is something quite different though since the behaviour I was seeing before stopped reproducing later today. This specific issue I reported though which was reproducible is fixed.
FEDORA-2021-ead59f24eb has been pushed to the Fedora 34 stable repository. If problem still persists, please make note of it in this bug report.
I upgraded to F34, I have systemd-248~rc2-3, and anything using the stub resolver is almost unusable. The problem is slightly different now in that A records are broken, not missing CNAMEs, but the effect is the same: a large subset of the internet does not work. Reopening.
Right. In the scope of generic issues I have been also seeing various sites not resolve for a while only to start resolving just when I'm typing a bug report. There's definitely something wrong with systemd-resolved but I haven't yet been able to produce good reproducers so didn't file another bug report. Something that may affect these test cases is the DNS cache. Ensuring clean cache might help reproducing the issues better. Debug logging from systemd-resolved might be useful, iirc there was some environment variable to toggle it if you run a daemon manually.
Testers looking to get more info out of resolved without enabling full systemd debug...

$ sudo systemctl edit systemd-resolved

Add the following two lines in the section for overrides.

[Service]
Environment=SYSTEMD_LOG_LEVEL=debug

Save it.

$ sudo systemctl restart systemd-resolved
To support Michael and Seppi, even with systemd-248~rc2-3.fc34, openQA is running into issues resolving mirrors.fedoraproject.org quite a lot on F34 (but not on previous releases). That's from within Fedora infra, where the record should look like this:

;; QUESTION SECTION:
;mirrors.fedoraproject.org. IN A

;; ANSWER SECTION:
mirrors.fedoraproject.org. 300 IN CNAME wildcard.fedoraproject.org.
wildcard.fedoraproject.org. 60 IN A 10.3.163.75
wildcard.fedoraproject.org. 60 IN A 10.3.163.76
wildcard.fedoraproject.org. 60 IN A 10.3.163.77
wildcard.fedoraproject.org. 60 IN A 10.3.163.74

It returns the four A records for wildcard in a different order each time; it's a round-robin setup.
Created attachment 1762882 [details] debug log of resolve failure with latest systemd So I got a debug log from a resolve failure in openQA (thanks Chris), here it is.
https://koji.fedoraproject.org/koji/taskinfo?taskID=63668153 is a scratch build with a workaround mcatanzaro suggested: it includes a config snippet that should disable resolved's cache. If affected folks could test it out that'd be great.
FEDORA-2021-c2bfa5e4f6 has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-c2bfa5e4f6
systemd-248~rc2-6.fc34

F34 now matches F33 behavior; however, for www.vox.com I get different answer results using dig:

dig.0.53
;; ANSWER SECTION:
www.vox.com. 54460 IN CNAME vox-chorus.map.fastly.net.
vox-chorus.map.fastly.net. 21 IN A 151.101.69.52

dig.8.8
;; ANSWER SECTION:
www.vox.com. 13853 IN CNAME vox-chorus.map.fastly.net.
vox-chorus.map.fastly.net. 10 IN A 151.101.1.52
vox-chorus.map.fastly.net. 10 IN A 151.101.65.52
vox-chorus.map.fastly.net. 10 IN A 151.101.129.52
vox-chorus.map.fastly.net. 10 IN A 151.101.193.52
Does it work? As long as it's a fastly IP that successfully loads www.vox.com, that's probably fine. The stub resolver is not expected to return the same results as a real DNS server. (In reply to Adam Williamson from comment #47) > https://koji.fedoraproject.org/koji/taskinfo?taskID=63668153 is a scratch > build with a workaround mcatanzaro suggested: it includes a config snippet > that should disable resolved's cache. If affected folks could test it out > that'd be great. Oh good idea. That should at least significantly reduce the impact of this bug.
The scratch build worked well in some openQA tests I ran, so I sent out an official build and update with the same change. That's https://bodhi.fedoraproject.org/updates/FEDORA-2021-c2bfa5e4f6 . Please test it and see how it works for you. RC2 will include it.
FEDORA-2021-c2bfa5e4f6 has been pushed to the Fedora 34 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-c2bfa5e4f6` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-c2bfa5e4f6 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
(In reply to Adam Williamson from comment #47) > https://koji.fedoraproject.org/koji/taskinfo?taskID=63668153 is a scratch > build with a workaround mcatanzaro suggested: it includes a config snippet > that should disable resolved's cache. If affected folks could test it out > that'd be great. This build is good. Of course it would be better to not disable the DNS cache, but now we have downgraded a release blocker to just a regular bug. Excellent.
FEDORA-2021-c2bfa5e4f6 has been pushed to the Fedora 34 stable repository. If problem still persists, please make note of it in this bug report.
-6 went stable and does seem to work around this successfully, so I'm gonna call that good enough to drop blocker status. There's now a proper fix pending upstream, I will do a build+update for that tonight or tomorrow if no-one else gets to it first.
I am having various issues on F34 testing (systemd-248~rc2-8.fc34) but not stable (systemd-248~rc2-6.fc34). I am not technical enough to know if this is related, but I'll post it here anyway.

On systemd-248~rc2-8.fc34 I cannot access https://wiki.gnome.org/, many fonts and images are not rendered properly on various sites, I cannot accept my self-signed certificate in GNOME calendar flatpak (but I can in nautilus and Firefox), and I cannot access Flathub on the command line:

```
[andythurman@rockhopper ~]$ flatpak install --verbose --ostree-verbose foo
F: No installations directory in /etc/flatpak/installations.d. Skipping
F: Opening system flatpak installation at path /var/lib/flatpak
F: Opening user flatpak installation at path /var/home/andythurman/.local/share/flatpak
Looking for matches…
F: Calling system helper: GenerateOciSummary
F: Fetching summary index file for remote ‘flathub’
F: Loading https://dl.flathub.org/repo/summary.idx using libsoup
F: Failed to download optional summary index: Could not connect: Network is unreachable
F: An error was encountered searching remote ‘flathub’ for ‘foo’: Unable to load summary from remote flathub: Could not connect: Network is unreachable
F: Fetching summary index file for remote ‘flathub-beta’
F: Loading https://dl.flathub.org/beta-repo/summary.idx using libsoup
F: Failed to download optional summary index: Could not connect: Network is unreachable
F: An error was encountered searching remote ‘flathub-beta’ for ‘foo’: Unable to load summary from remote flathub-beta: Could not connect: Network is unreachable
F: Fetching summary index file for remote ‘gnome-nightly’
F: Loading https://nightly.gnome.org/repo/summary.idx using libsoup
F: Failed to download optional summary index: Could not connect: Network is unreachable
F: An error was encountered searching remote ‘gnome-nightly’ for ‘foo’: Unable to load summary from remote gnome-nightly: Could not connect: Network is unreachable
Found similar ref(s) for ‘foo’ in remote ‘fedora’ (system).
Use this remote? [Y/n]: n
error: No remote chosen to resolve matches for ‘foo’
```
Just note that I see perhaps related network issues in F34 toolbox as well: https://bugzilla.redhat.com/show_bug.cgi?id=1934788.
Thanks a lot for reporting. So either I muffed the patch backport, or we still have issues here. I'll ask Lennart to take a look at it.
(In reply to Adam Williamson from comment #58) > Thanks a lot for reporting. So either I muffed the patch backport, or we > still have issues here. I'll ask Lennart to take a look at it. Scratch that. This seems to be unrelated as when overriding the old systemd into 34-testing the issue persists. I'm going to dig a little deeper.
I've tested Adam's systemd-248~rc2-8.fc34 and it fixes this issue for me.
I've seen the same issues when using flatpak from the command line on systemd-248~rc2-8.fc34 as Andrew has... and systemd-248~rc2-6.fc34 is working fine here. I've rolled between versions using Silverblue on the same machine and also tested it on my non-Silverblue laptop, which is on F34 beta with systemd-248~rc2-6.fc34. It's even a problem outside of flatpak, as the issue arises when trying to use other network commands on dl.flathub.org, such as `ping dl.flathub.org` (and mtr / traceroute). Once in a while, ping worked, but most of the time, I'd get: $ ping dl.flathub.org /usr/bin/ping: connect: Network is unreachable Whereas it always works in systemd-248~rc2-6.fc34. Not sure if it matters, but my home network is IPv4 & IPv6 whereas my ISP only provides IPv4. (Wild guess: It could be trying to route to an external IPv6 address and failing?)
Hm... Garrett, could you please post the output of:

$ ping -v -c1 dl.flathub.org
$ resolvectl query dl.flathub.org
$ dig dl.flathub.org
$ dig @1.1.1.1 dl.flathub.org

At least the output when ping is failing, though if you're able to see a difference between good and bad output, that would be good too.

Adam, I think we'd better stick with -6 for F34 beta.
Yes, Beta RC3 has -6 still, we didn't pull in -8.
Looks like I'm having trouble with -8 also, with retrace.fedoraproject.org.

[adamw@xps13k ~]$ ping -v retrace.fedoraproject.org
ping: connect: Network is unreachable

[adamw@xps13k ~]$ resolvectl query retrace.fedoraproject.org
retrace.fedoraproject.org: 2620:52:3:1:dead:beef:cafe:c005 -- link: wlp58s0
                           (retrace03.rdu-cc.fedoraproject.org)

-- Information acquired via protocol DNS in 2.3ms.
-- Data is authenticated: no; Data was acquired via local or encrypted transport: no
-- Data from: cache

[adamw@xps13k ~]$ dig @127.0.0.53 retrace.fedoraproject.org

; <<>> DiG 9.16.11-RedHat-9.16.11-5.fc34 <<>> @127.0.0.53 retrace.fedoraproject.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47990
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;retrace.fedoraproject.org. IN A

;; ANSWER SECTION:
retrace.fedoraproject.org. 53 IN CNAME retrace03.rdu-cc.fedoraproject.org.
retrace03.rdu-cc.fedoraproject.org. 6953 IN A 8.43.85.61

;; AUTHORITY SECTION:
retrace03.rdu-cc.fedoraproject.org. 53 IN AAAA 2620:52:3:1:dead:beef:cafe:c005

;; Query time: 0 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Thu Mar 18 13:32:35 PDT 2021
;; MSG SIZE rcvd: 129

[adamw@xps13k ~]$ dig @1.1.1.1 retrace.fedoraproject.org

; <<>> DiG 9.16.11-RedHat-9.16.11-5.fc34 <<>> @1.1.1.1 retrace.fedoraproject.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5324
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;retrace.fedoraproject.org. IN A

;; ANSWER SECTION:
retrace.fedoraproject.org. 300 IN CNAME retrace03.rdu-cc.fedoraproject.org.
retrace03.rdu-cc.fedoraproject.org. 86400 IN A 8.43.85.61

;; Query time: 715 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Thu Mar 18 13:32:43 PDT 2021
;; MSG SIZE rcvd: 101
the output from 'resolvectl' compared to the result of 'dig' seems revealing. resolvectl is giving the IPv6 address. I do not have IPv6 connectivity.
You also proved it's not an issue with the stub resolver this time, though, because it happens with 'resolvectl query', which does not use the stub resolver, and the stub resolver returns only an A record in the ANSWER section. I suspected this already, since Garrett mentioned the issue is occurring outside flatpaks. So I think it's time to create a new (final blocker) bug report, as we seem to have finally reached the end of the stub resolver CNAME madness, and now have something totally different here. FWIW it seems like retrace.fedoraproject.org really is broken (or dropping ICMPv6), because it's not pingable via 2620:52:3:1:dead:beef:cafe:c005, even though I do have working IPv6.
I am experiencing a similar problem on a fresh Fedora 34 installation after all available updates are installed. My computer has a link-local IPv6 address, but I only have IPv4 access to the internet.

What I did:
1. Installed Fedora 34 (Fedora-Workstation-Live-x86_64-34-20210317.n.0.iso)
2. Installed updates (sudo dnf --refresh upgrade).

After that, I am occasionally getting errors when using dnf or flatpak to check for updates or to download new packages. But the problem happens only occasionally; sometimes it works without any problem.

Sometimes, "sudo dnf --refresh upgrade" cannot update repository metadata and gives me a bunch of Curl errors:

Errors during downloading metadata for repository 'fedora-modular':
  - Curl error (7): Couldn't connect to server for https://mirrors.fedoraproject.org/metalink?repo=fedora-modular-34&arch=x86_64 []

At the same time, "ping mirrors.fedoraproject.org" returns "Network is unreachable", I cannot access https://mirrors.fedoraproject.org from Firefox, and dig outputs:

[david@pc4 ~]$ dig mirrors.fedoraproject.org

; <<>> DiG 9.16.11-RedHat-9.16.11-5.fc34 <<>> mirrors.fedoraproject.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13998
;; flags: qr rd ra; QUERY: 1, ANSWER: 11, AUTHORITY: 6, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;mirrors.fedoraproject.org. IN A

;; ANSWER SECTION:
mirrors.fedoraproject.org. 37 IN CNAME wildcard.fedoraproject.org.
wildcard.fedoraproject.org. 37 IN A 209.132.190.2
wildcard.fedoraproject.org. 37 IN A 152.19.134.142
wildcard.fedoraproject.org. 37 IN A 38.145.60.21
wildcard.fedoraproject.org. 37 IN A 8.43.85.67
wildcard.fedoraproject.org. 37 IN A 38.145.60.20
wildcard.fedoraproject.org. 37 IN A 8.43.85.73
wildcard.fedoraproject.org. 37 IN A 140.211.169.206
wildcard.fedoraproject.org. 37 IN A 67.219.144.68
wildcard.fedoraproject.org. 37 IN A 152.19.134.198
wildcard.fedoraproject.org. 37 IN A 140.211.169.196

;; AUTHORITY SECTION:
wildcard.fedoraproject.org. 37 IN AAAA 2610:28:3090:3001:dead:beef:cafe:fed3
wildcard.fedoraproject.org. 37 IN AAAA 2620:52:3:1:dead:beef:cafe:fed7
wildcard.fedoraproject.org. 37 IN AAAA 2605:bc80:3010:600:dead:beef:cafe:feda
wildcard.fedoraproject.org. 37 IN AAAA 2605:bc80:3010:600:dead:beef:cafe:fed9
wildcard.fedoraproject.org. 37 IN AAAA 2620:52:3:1:dead:beef:cafe:fed6
wildcard.fedoraproject.org. 37 IN AAAA 2604:1580:fe00:0:dead:beef:cafe:fed1

;; Query time: 1 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Thu Mar 18 15:58:18 EDT 2021
;; MSG SIZE rcvd: 405

When I run "curl --verbose https://mirrors.fedoraproject.org/metalink?repo=fedora-modular-34&arch=x86_64", it seems to try those IPv6 addresses and fails with "Network is unreachable" because my internet connection is IPv4.

But sometimes everything works fine. I noticed that in such cases, "dig mirrors.fedoraproject.org" does not print the authority section with AAAA records and dnf can successfully download files from the internet; curl and ping are also successful.
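Since dig talks DNS to the stub resolver directly while curl, ping and dnf go through the system's name-service path, it may help to look at what that path returns per address family. A minimal Python sketch follows, assuming nothing beyond the standard library: `socket.getaddrinfo` calls the C library's getaddrinfo, and `addresses_by_family` is a made-up helper name.

```python
import socket

FAMILY_NAMES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}


def addresses_by_family(host, port=443):
    """Group the addresses getaddrinfo() returns for host by address family."""
    grouped = {}
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, _socktype, _proto, _canonname, sockaddr in infos:
        label = FAMILY_NAMES.get(family, str(family))
        address = sockaddr[0]
        bucket = grouped.setdefault(label, [])
        if address not in bucket:
            bucket.append(address)
    return grouped
```

Running `addresses_by_family("mirrors.fedoraproject.org")` while the failure is occurring, and again when things work, would show whether only IPv6 addresses come back in the bad case, which is what the "Network is unreachable" errors on an IPv4-only connection suggest.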
The output again confirms the stub resolver is working properly (only 'A' records in the ANSWER section), so it's time for a new bug report please.
(In reply to Michael Catanzaro from comment #66) > [snip] So I think it's time to create a new (final blocker) bug report, as we > seem to have finally reached the end of the stub resolver CNAME madness, and > now have something totally different here. Has anyone debugged the problems so far, and reported the blocker? I can't reproduce myself, but we have bug 1933506 that somewhat transitively links here.
I'm not aware of any downstream bug report yet. There is an upstream report at https://github.com/systemd/systemd/issues/19049.
I filed https://bugzilla.redhat.com/show_bug.cgi?id=1940715 .
Sorry, wrong link. I filed https://bugzilla.redhat.com/show_bug.cgi?id=1947214 .
goddamnit. wrong bug. i have too many tabs.