Bug 2182745 - [RFE] Make getaddrinfo() with hints ai_flags==AF_UNSPEC query only AF present on the system
Summary: [RFE] Make getaddrinfo() with hints ai_flags==AF_UNSPEC query only AF present...
Keywords:
Status: ASSIGNED
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Florian Weimer
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 844921 1027452
Blocks: 2182803
Reported: 2023-03-29 14:10 UTC by Petr Menšík
Modified: 2023-10-11 20:35 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:

Links
System ID Private Priority Status Summary Last Updated
Github systemd systemd pull 14125 0 None Merged resolved-dns-query: remove dns_query_candidate_is_routable 2023-06-01 09:02:43 UTC
Red Hat Bugzilla 844921 0 unspecified CLOSED AI_ADDRCONFIG does not suppress IN A lookups from IPv6-only hosts 2023-06-10 18:00:35 UTC
Red Hat Bugzilla 2211623 0 unspecified NEW resolv.conf man page needs update for no-aaaa from 2096189 2023-08-16 08:10:00 UTC
Sourceware 12377 0 P2 NEW getaddrinfo() with AI_ADDRCONFIG won't suppress AAAA DNS queries when only IPv6 loopback and link-local addresses are pr... 2023-06-10 18:00:35 UTC
Sourceware 19697 0 P2 NEW /etc/gai.conf option to configure AF_UNSPEC lookups 2023-06-12 15:02:32 UTC
Sourceware 30544 0 P2 NEW [RFE] Make getaddrinfo() with hints ai_flags==AF_UNSPEC query only AF present on the system 2023-06-12 15:02:32 UTC

Description Petr Menšík 2023-03-29 14:10:14 UTC
Description of problem:
Many networks still offer connectivity for only a single address family. Quite a lot of networks are still IPv4-only: they offer neither an IPv6 route nor IPv6 addresses. What sense does it make on such networks to request AAAA queries from the dns NSS plugin? Unless the application explicitly asks for IPv6, only families with working connectivity should be used. An application that does not set the AI_PASSIVE flag is very likely to call connect(2), which will try the IPv6 addresses for the hostname first. When an IPv6 route is missing, those attempts fail immediately. They always have to fail, and that is known before they are even started.


Version-Release number of selected component (if applicable):
glibc-2.37.9000-4.fc39.x86_64

How reproducible:
reliable

Steps to Reproduce:
1. have just IPv4 connectivity. ip -6 route reports just localhost and link-local networks
2. getent ahosts example.org

Actual results:
# getent ahosts example.org
93.184.216.34   STREAM example.org
93.184.216.34   DGRAM  
93.184.216.34   RAW    
2606:2800:220:1:248:1893:25c8:1946 STREAM 
2606:2800:220:1:248:1893:25c8:1946 DGRAM  
2606:2800:220:1:248:1893:25c8:1946 RAW 

Expected results:
# getent ahosts example.org
93.184.216.34   STREAM example.org
93.184.216.34   DGRAM  
93.184.216.34   RAW  
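
The same check can be made from a program instead of getent. The following is only a minimal sketch: count_families and struct family_counts are hypothetical names introduced for illustration, and only the standard getaddrinfo()/freeaddrinfo() calls are used. It counts how many results of each family a lookup returns, so the actual and expected results above correspond to v6 > 0 versus v6 == 0 for an AF_UNSPEC lookup.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper type: number of results per address family. */
struct family_counts {
    int v4; /* AF_INET results */
    int v6; /* AF_INET6 results */
};

/* Resolve `host` with the given hint family and flags and count the
   results per family. Returns the getaddrinfo() status (0 on success). */
int count_families(const char *host, int family, int flags,
                   struct family_counts *out)
{
    struct addrinfo hints, *res, *p;
    int rc;

    assert(out != NULL);
    out->v4 = 0;
    out->v6 = 0;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = family;        /* AF_UNSPEC, AF_INET or AF_INET6 */
    hints.ai_socktype = SOCK_STREAM; /* one entry per address, not three */
    hints.ai_flags = flags;

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0)
        return rc;

    for (p = res; p != NULL; p = p->ai_next) {
        if (p->ai_family == AF_INET)
            out->v4++;
        else if (p->ai_family == AF_INET6)
            out->v6++;
    }
    freeaddrinfo(res);
    return 0;
}
```

Calling count_families("example.org", AF_UNSPEC, 0, &c) on an IPv4-only host would be one way to observe whether AAAA results are still returned; the flags argument also allows comparing against AI_ADDRCONFIG.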

Additional info:
To keep backward compatibility and forward compatibility at the same time, I propose adding two new options to /etc/resolv.conf. The ipv4 option would be set when IPv4 connectivity is present. The ipv6 option would be set when an IPv6 route exists that is neither localhost nor link-local. There are modules capable of doing something interesting even if the network itself does not provide IPv6 connectivity. The files NSS module should work in any case. The mdns NSS plugin provided by nss-mdns could work in a usable way even with just link-local addresses. But the DNS protocol does not include the scope_id required for link-local addresses, which makes them unusable in the form of hostnames.
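
As a rough illustration of the "not localhost and not link-local" condition, here is a minimal sketch. It is a simplification: it inspects configured interface addresses with getifaddrs() rather than the routing table, and the function names ipv6_addr_is_routable_candidate and host_has_usable_ipv6 are hypothetical. A daemon maintaining resolv.conf would presumably use its own connectivity information instead.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* True when the address is neither unspecified, loopback nor link-local,
   i.e. it could plausibly be used to reach hosts named in DNS. */
bool ipv6_addr_is_routable_candidate(const struct in6_addr *a)
{
    assert(a != NULL);
    return !IN6_IS_ADDR_UNSPECIFIED(a)
        && !IN6_IS_ADDR_LOOPBACK(a)
        && !IN6_IS_ADDR_LINKLOCAL(a);
}

/* Assumption: "IPv6 connectivity" is approximated by the presence of at
   least one such address on any interface. */
bool host_has_usable_ipv6(void)
{
    struct ifaddrs *list, *it;
    bool found = false;

    if (getifaddrs(&list) != 0)
        return false;

    for (it = list; it != NULL && !found; it = it->ifa_next) {
        if (it->ifa_addr == NULL || it->ifa_addr->sa_family != AF_INET6)
            continue;
        const struct sockaddr_in6 *sin6 =
            (const struct sockaddr_in6 *) it->ifa_addr;
        found = ipv6_addr_is_routable_candidate(&sin6->sin6_addr);
    }
    freeifaddrs(list);
    return found;
}
```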

If both options ipv4 and ipv6 were set, or neither of them were present, the behaviour should be unchanged from now. This change would make AF_UNSPEC query only the address families usable at the moment. Unlike AI_ADDRCONFIG, it would not spend extra cycles before each connection, but would rely on an external service to watch for changing connectivity.
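
The rule described here can be written down as a small decision function. This is only a sketch of the proposal, not existing glibc code: struct family_opts, struct query_plan and plan_dns_queries are hypothetical names, and the ipv4/ipv6 fields stand for the proposed resolv.conf options. Explicit AF_INET and AF_INET6 requests are left untouched; only AF_UNSPEC is narrowed, and only when exactly one option is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Hypothetical: state of the proposed resolv.conf options. */
struct family_opts {
    bool ipv4; /* "options ipv4" present */
    bool ipv6; /* "options ipv6" present */
};

/* Which DNS record types the dns backend would ask for. */
struct query_plan {
    bool query_a;
    bool query_aaaa;
};

struct query_plan plan_dns_queries(int hint_family, struct family_opts opts)
{
    struct query_plan plan = { false, false };

    assert(hint_family == AF_UNSPEC || hint_family == AF_INET
           || hint_family == AF_INET6);

    if (hint_family == AF_INET) {
        plan.query_a = true;      /* explicit request: never filtered */
    } else if (hint_family == AF_INET6) {
        plan.query_aaaa = true;   /* explicit request: never filtered */
    } else if (opts.ipv4 == opts.ipv6) {
        /* Both options or neither: behaviour unchanged, dual query. */
        plan.query_a = true;
        plan.query_aaaa = true;
    } else {
        /* Exactly one option: query only the usable family. */
        plan.query_a = opts.ipv4;
        plan.query_aaaa = opts.ipv6;
    }
    return plan;
}
```

With only the ipv4 option set, an AF_UNSPEC lookup would send just an A query, while an explicit AF_INET6 lookup would still send AAAA.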

It is already common for /etc/resolv.conf to be maintained by an external service. Be it Network Manager or systemd-resolved, a daemon is better suited to monitor connectivity and inform applications by changing options as soon as it changes. This change would help both legacy IPv4-only networks and future IPv6-only networks. If the network does not provide IPv4 connectivity at all, a common application does not need A queries anyway. It is likely that some IPv6 translation is being done for legacy connectivity on such networks.

I started thinking about why end systems are doing unnecessary AAAA queries, which people in turn try to block on local DNS caches. This is just a ridiculous circle.

Comment 1 Florian Weimer 2023-04-18 15:13:21 UTC
The internal gethostbyname4 interface always makes a dual query (both address families) and is the only interface that can provide IPv6 scope IDs. There is also no way to turn off the dual queries. This is why I think we'd have to introduce another revision (gethostbyname5), and maybe that could receive the hints data structure.

Then the rest can probably be handled with one or two additional flags in /etc/gai.conf.

But all in all, I don't think this is a small project.

Comment 2 Petr Menšík 2023-05-23 12:33:40 UTC
I found out there is already some support for RES_NOAAAA, which attempts to do something similar, but only for suppressing IPv6. It is used in _nss_dns_gethostbyname4_r. But nothing similar is available for getting just IPv6 responses. I have not found any trace of no-aaaa in man resolv.conf. Is that an oversight or intentional?

It partially implements what I wanted, although it uses just a negative option. It seems to me it could be trivially extended. Does adding new RES_ options always require adding a new _nss_dns_gethostbynameX_r variant? It seems to me binary compatibility would not change, just the internal implementation of the function would. I do not see a reason to make a new variant. But it could be required, because it seems no-aaaa changes all queries, not just those with AF_UNSPEC hints. So maybe the difference is significant.

I do not want to handle this in /etc/gai.conf, because that file is not changed by common DHCP clients or similar daemons such as Network Manager. It needs just a change in the DNS nss plugin, which is the only one reading (and watching, I assume) /etc/resolv.conf. Also, gai.conf is not checked for modifications by default, whereas resolv.conf already is.

Comment 3 Petr Menšík 2023-05-23 14:27:42 UTC
Oh, I think I am getting the reason. _nss_dns_gethostbyname3_r has an af parameter, but _nss_dns_gethostbyname4_r does not have any similar parameter passed to the code. So it seems it cannot make different decisions in the latest version, because it does not know what exactly was requested by the caller.

enum nss_status
_nss_files_gethostbyname3_r (const char *name, int af, struct hostent *result,
			     char *buffer, size_t buflen, int *errnop,
			     int *herrnop, int32_t *ttlp, char **canonp);
// versus
enum nss_status
_nss_files_gethostbyname4_r (const char *name, struct gaih_addrtuple **pat,
			     char *buffer, size_t buflen, int *errnop,
			     int *herrnop, int32_t *ttlp);

But if I understand get_nss_addresses in sysdeps/posix/getaddrinfo.c correctly, gethostbyname4_r is called only for PF_UNSPEC. So this indeed should be the only place where the change is needed. Extending NOAAAA a bit might be a way to do that.

Comment 4 Petr Menšík 2023-06-01 09:02:44 UTC
There was a feature of this kind in systemd-resolved, but it was removed in PR https://github.com/systemd/systemd/pull/14125. I think it should have just been tweaked to behave correctly and kept there.

Created bug #2211623 to request documentation of no-aaaa in the resolv.conf manual page. If that is already implemented, maybe adding no-a would be sufficient, though that looks weird.

Comment 5 Florian Weimer 2023-06-01 09:14:25 UTC
(In reply to Petr Menšík from comment #2)
> Found out there is already some support for RES_NOAAAA, which kind of
> attempts to do something similar, just only for IPv6. That is used in
> _nss_dns_gethostbyname4_r. But nothing similar is available for just IPv6
> response. I have not found any trace in man resolv.conf about no-aaaa. Is
> that oversight or an intention?

The no-aaaa option is only intended as a diagnostic aid, not something for production use. It's specifically targeted at DNS requests only. We added it because people sometimes see the AAAA queries and think they are related to the problem they are experiencing even if that's rather unlikely. The no-aaaa option provides a way to quickly test that.

> I do not want to handle this in /etc/gai.conf, because that file is not
> changed by common DHCP clients or similar daemons, like is Network Manager.
> It needs just change in DNS nss plugin, which is the only one reading (and
> watching I assume) /etc/resolv.conf. Also gai.conf is not checked for
> modifications by default, where resolv.conf already is.

If it affects getaddrinfo for all NSS backends (not just nss_dns, but also nss_resolve and so on), then it really should go into gai.conf.

Comment 6 Petr Menšík 2023-06-01 09:53:58 UTC
(In reply to Florian Weimer from comment #5)
> If it affects getaddrinfo for all NSS backends (not just nss_dns, but also
> nss_resolve and so on), then it really should go into gai.conf.

I think I have clarified that already. This should affect the dns backend *only*. It should not affect the hosts database or mdns or other alternatives, because quite often they are able to provide useful link-local addresses, which would still work. If other alternatives want to limit their queries too, that should be configurable per backend. But this should be an nss_dns backend change only.

What I don't like about the current no-aaaa implementation is that it overrides even an explicit request for an AF_INET6 address. That is what I would not want. I want to modify just the AF_UNSPEC behaviour and keep AF_INET and AF_INET6 queries unmodified. But what no-aaaa does for AF_UNSPEC is how I would like the ipv4 option to behave.

Tested with a tool from netresolve-tools, getaddrinfo.

$ getaddrinfo -6 example.org
query:
  nodename = example.org
  servname = (null)
  family = 10
status = -5

# with option no-aaaa
$ getaddrinfo --dgram -6 example.org

query:
  nodename = example.org
  servname = (null)
  family = 10
  socktype = 2
status = -5

# I would like this behaviour with option ipv4
$ getaddrinfo --dgram example.org
query:
  nodename = example.org
  servname = (null)
  socktype = 2
status = 0
#0:
  family = 2
...

# without no-aaaa
$ getaddrinfo --dgram -6 example.org
query:
  nodename = example.org
  servname = (null)
  family = 10
  socktype = 2
status = 0
#0:
  family = 10
...

Comment 7 Florian Weimer 2023-06-01 10:10:36 UTC
(In reply to Petr Menšík from comment #6)
> (In reply to Florian Weimer from comment #5)
> > If it affects getaddrinfo for all NSS backends (not just nss_dns, but also
> > nss_resolve and so on), then it really should go into gai.conf.
> 
> I think I have clarified that already. This should affect dns backend
> *only*. It should not affect hosts database or mdns or other alternatives.
> Because quite often they are able to provide useful and working link-local
> addresses, which would be still working. If other alternatives wants to
> limit their queries too, it should be configurable per backend. But this
> should be nss_dns backend change only.

Oh. Then it's going to be largely a no-op on Fedora because Fedora does not use nss_dns by default.

> What I don't like on current no-aaaa implementation is it overrides even
> explicit request to get AF_INET6 address. That is what I would not want to.
> I want to modify just AF_UNSPEC behaviour and keep AF_INET and AF_INET6
> queries unmodified. But no-aaaa for AF_UNSPEC is what I would like it to
> behave for ipv4 option.

And only for nss_dns. This is a somewhat unusual use case, I think, and not something that's possible to support using the current interfaces.

Comment 8 Petr Menšík 2023-06-01 10:50:37 UTC
I know the resolve nss plugin would catch it first. But unfortunately systemd-resolved is not flawless and gets disabled by quite a few people, even though it is enabled in the default installation. This would improve running with alternative caches like dnsmasq or unbound, which provide a more predictable experience IMO. I maintain dnsmasq and unbound, which I consider higher-quality caches. They depend on the glibc client side and never know what AF was used in the client's calls.

I think the primary use case is to limit queries made by browsers and tools, which usually use just hints.ai_family == AF_UNSPEC or NULL hints.

Example of that is:
$ curl http://example.org

Some tools allow requesting a specific address family:
$ curl -4 http://example.org

If the user requests IPv6 explicitly by using:
$ curl -6 http://example.org

I do not want to refuse to ask for such a name. It quite likely won't work without a route, but let the user get the address. For example by:
$ ping -c1 -6 example.org

I think this is a pretty standard expectation and there is nothing unusual about it. Unless the program requests a family explicitly, offer just the addresses which will likely work.
Only DNS is changed, so that ping -c1 localhost keeps connecting to ::1 and not to a different address. That should minimize regressions or unwanted changes.

My expectation is that it does not lie. If you ask for IPv6 explicitly, it has to tell you. If you ask for IPv4, it would too. Only if you ask for nothing specific would it serve what will likely work. It does not tell you IPv6 does not exist when it does; it just does not ask for it unless told explicitly to do so.

Comment 9 Petr Menšík 2023-06-10 18:00:36 UTC
Found a reference on https://github.com/crossdistro/netresolve to ancient bugs related to this request.

Upstream bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=12377

Bug #844921 was filed years ago and never solved, neither upstream nor in Fedora. I think AI_ADDRCONFIG failed to become widely used or a sane default because it requires extra CPU work before every name lookup. Therefore performance-seeking browsers avoid using it, and many tools do not use NULL hints in getaddrinfo().

Comment 10 Jamie Bainbridge 2023-06-11 03:31:36 UTC
We have a 10 year history of the same RFE from many RHEL customers:

 Red Hat 1027452 - glibc: [RFE] Provide mechanism to disable AAAA queries when using AF_UNSPEC on IPv4-only configurations.
 https://bugzilla.redhat.com/show_bug.cgi?id=1027452

 Sourceware 19697 - /etc/gai.conf option to configure AF_UNSPEC lookups 
 https://sourceware.org/bugzilla/show_bug.cgi?id=19697

Comment 11 Petr Menšík 2023-06-12 15:02:32 UTC
Just filed an upstream Bugzilla bug with a copy of my request:
https://sourceware.org/bugzilla/show_bug.cgi?id=30544

Comment 12 Petr Menšík 2023-07-04 19:02:57 UTC
I have made an attempt to implement this feature the way I wanted it. I attached the patch to the upstream bug and also created a copr build at [1].
At first glance it seems to work as I expected. Testing with getent ahosts{,v4,v6} gives the expected results.

To get such a change accepted, I expect it would also require a decent unit test. I am not sure how to achieve that, since it would have to modify answers from a DNS server. Modifying the no-aaaa tests is probably the way to go. Would someone mind reviewing what I have already? Should I try to send the patch directly to the libc-alpha mailing list for review?

Pushed the code to a glibc fork on GitHub [2].

1. https://copr.fedorainfracloud.org/coprs/pemensik/glibc/package/glibc/
2. https://github.com/InfrastructureServices/glibc/tree/gai-ipv4-ipv6

