Bug 1027452 - glibc: [RFE] Provide mechanism to disable AAAA queries when using AF_UNSPEC on IPv4-only configurations.
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: glibc   
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 7.6
Assignee: glibc team
QA Contact: qe-baseos-tools
URL:
Whiteboard:
Keywords: FutureFeature, Reopened
Depends On:
Blocks: 1594286
 
Reported: 2013-11-06 21:27 UTC by Michal Bruncko
Modified: 2019-04-03 11:16 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-06-22 17:58:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
Strace for telnet command (19.25 KB, text/plain)
2013-11-11 14:49 UTC, Michal Bruncko
Packet capture from telnet command (1.68 KB, application/octet-stream)
2013-11-11 14:52 UTC, Michal Bruncko


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 15863 None None None 2017-08-28 20:49 UTC

Description Michal Bruncko 2013-11-06 21:27:34 UTC
Description of problem:
Maybe this is a stupid question and this bug will be closed immediately, but I need to ask. IPv6 is completely disabled on my system* (no IPv6 address is assigned to any network interface at all). Why does the resolver keep requesting AAAA records? An IPv6 address would never be used, since IPv6 is completely disabled. Is there any reason for this?


*:
in /etc/sysctl.conf  :  net.ipv6.conf.all.disable_ipv6 = 1
in /etc/sysconfig/network  : NETWORKING_IPV6=no
in /etc/sysconfig/network-scripts/ifcfg-eth0 : IPV6INIT="no"


Version-Release number of selected component (if applicable):
glibc-2.12-1.107.el6_4.5.x86_64

How reproducible:
always

Steps to Reproduce:
1. Disable IPv6 completely on the system
2. Try to resolve any DNS name (or simply ping some DNS name)

Actual results:
resolver will ask for both A and AAAA records

Expected results:
resolver will ask for the A record only

Additional info:
Yes, I understand that IPv6 is the future protocol and sooner or later will replace IPv4 completely. I personally implemented and use IPv6 in one organization and I am happy with it. Please consider this report a request for an explanation, for me and for everyone else with the same question.

Comment 2 Carlos O'Donell 2013-11-08 03:33:21 UTC
Thanks for the report. Which application is making the request, and how did you determine that it was the glibc resolver that made the request? Can you provide a step-by-step set of actions to reproduce the issue including any tcpdump logs?

Comment 3 Michal Bruncko 2013-11-11 14:46:32 UTC
The original problem was that the web page "www.shellcardonline.shell.com" was not reachable via our Squid proxy server: loading got stuck at "https://www.shellcardonline.shell.com/authenticateusertoken.aspx", although the site loads fine when the computer reaches it directly, without the proxy. So I started to investigate why this happens and found that AAAA name resolution times out for "www.shellcardonline.extha.shell.com", which is the CNAME target of "www.shellcardonline.shell.com", and for "www-cardauth-services-prd.extha.shell.com", which is the CNAME target of "www.shellcardonline.extha.shell.com" (example: http://www.dnswatch.info/dns/dnslookup?la=en&host=www-cardauth-services-prd.extha.shell.com&type=AAAA&submit=Resolve).

So I disabled IPv6 completely, as this proxy server does not have IPv6 connectivity, hoping that the AAAA requests would stop altogether. But AAAA resolution remains the same even with IPv6 disabled on the host.

> Which application is making the request?
It is squid (as the proxy server) and telnet (which I used to replicate the connection setup). In both cases, both A and AAAA records were requested.

> how did you determine that it was the glibc resolver that made the request?
I am attaching the strace output for telnet. As far as I can see, "/lib64/libresolv.so.2" is loaded to handle DNS requests. I think it is the same for the Squid process as well.

Comment 4 Michal Bruncko 2013-11-11 14:49:54 UTC
Created attachment 822445
Strace for telnet command

Comment 5 Michal Bruncko 2013-11-11 14:52:41 UTC
Created attachment 822446
Packet capture from telnet command

The capture was taken on a different computer, so please ignore the IP address differences from the previous strace output.
As you can see here, the delay between executing "telnet www.shellcardonline.shell.com 443" and getting the "Connected..." response is 15 seconds.

Comment 6 Siddhesh Poyarekar 2013-11-18 09:42:56 UTC
(In reply to Michal Bruncko from comment #3)
> So I disabled IPv6 completely, as this proxy server does not have IPv6
> connectivity, hoping that the AAAA requests would stop altogether. But AAAA
> resolution remains the same even with IPv6 disabled on the host.

I don't think we have a mechanism in place to disable AAAA lookups in glibc via a configuration - a program that makes an AF_UNSPEC or AF_INET6 request will get IPv6 results if the nameserver supports it.  Disabling IPv6 networking is something very different - it simply disables IPv6 support in the kernel and prevents the relevant network interfaces from being created.  It does not result in disabling IPv6 name lookups.  Maybe there should be a feature request for this.  This is probably another good use case for tunables.
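To illustrate the distinction, an application can avoid the AAAA lookup on its own side by requesting AF_INET instead of AF_UNSPEC. The following is only a minimal sketch in Python, whose socket.getaddrinfo() wraps the C library call; the helper name lookup() is hypothetical and not part of glibc:

```python
import socket

def lookup(host, port, family=socket.AF_UNSPEC):
    """Return the sorted, de-duplicated addresses getaddrinfo() reports."""
    infos = socket.getaddrinfo(host, port, family=family,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual address for both AF_INET and AF_INET6.
    return sorted({info[4][0] for info in infos})
```

Calling lookup(name, 443) uses AF_UNSPEC and so triggers both A and AAAA queries for a DNS name, whereas lookup(name, 443, socket.AF_INET) asks for A records only. This only helps for programs that can be changed; it is not the system-wide switch requested here.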

Comment 7 Michal Bruncko 2013-11-20 12:33:51 UTC
Hi Siddhesh,
yes, exactly: such an option ("inet4only") is what is missing in this situation. Currently the resolver asks for both records (two queries) even in scenarios where the AAAA records are not needed. This doubles every name resolution and can also increase the delay, since the caller waits for responses to both requests (as in the reported example).
This is what I was trying to discuss here. Such an option would not be a "feature" but a needed option for "legacy" servers on IPv4-only networks.

Comment 11 Carlos O'Donell 2014-06-04 04:30:42 UTC
Unfortunately the solution we were expecting to use to solve this issue has been shown to violate the POSIX standard wording for getaddrinfo. Therefore we have had to change the implementation plan. That places this solution outside the scope of rhel-6.6. I have moved this bug to rhel-6.7. In the meantime we will be working on an upstream solution to attempt to provide a glibc tunable to completely disable the IPv6 queries (orthogonal to the usage of AI_ADDRCONFIG). Such a tunable could be used to prevent AAAA queries from being issued by the glibc stub resolver when AF_UNSPEC queries are made, regardless of the state of the interfaces.

Comment 13 ozzzo 2016-06-05 02:41:48 UTC
Was this bug ever fixed? I am still seeing the unwanted AAAA queries in Centos 6.7 and 7.2 even after disabling ipv6.

Comment 14 Carlos O'Donell 2016-06-06 15:08:05 UTC
(In reply to ozzzo from comment #13)
> Was this bug ever fixed? I am still seeing the unwanted AAAA queries in
> Centos 6.7 and 7.2 even after disabling ipv6.

Thank you for your inquiry. This bug is not fixed in upstream, and is not fixed in RHEL6 or RHEL7 yet.

Comment 15 Chris Williams 2016-08-08 21:19:11 UTC
When Red Hat shipped 6.8 on May 10, 2016 RHEL 6 entered Production Phase 2. 
https://access.redhat.com/support/policy/updates/errata#Production_2_Phase
That means only "Critical and Important Security errata advisories (RHSAs) and Urgent Priority Bug Fix errata advisories (RHBAs) may be released"
That also means no new RFEs so this BZ is being moved to RHEL 7 which is still in Production Phase 1.

Comment 18 Rodrigo A B Freire 2017-08-28 20:43:47 UTC
1. Proposed title of this feature request
  * [RFE] glibc: implement GAI modifier for AAAA? DNS queries

2. Who is the customer behind the request?
Account name: Confidential
SRM customer: Confidential
TAM customer: Confidential
Strategic Customer: Confidential
 
3. What is the nature and description of the request?
  * Currently, in order to adhere fully to RFC 2553, getaddrinfo() sends both an AAAA query (IPv6 address) and an A query (IPv4 address) to its DNS server.
  * getaddrinfo() is fully blocking: the function will not return before it gets either a reply or a timeout for each of the A and AAAA queries.
  * This request aims to add new glibc functionality that mitigates scenarios where RFC 2553 adherence causes problems.
 
4. Why does the customer need this? (List the business requirements here)
  * There are scenarios wherein an AAAA lookup is not wanted and may cause problems. For example:
    - IPv6 stack for local-only traffic
      In this scenario, the client would connect to sites whose names resolve to AAAA addresses, which is not desired.
    - DNS servers that do not reply to AAAA queries
      In this scenario, getaddrinfo() calls take a long time, returning only after the resolver timeout.
  * There may be other scenarios not envisioned here; all of them would benefit.

5. How would the customer like to achieve this? (List the functional requirements here)
  * We would suggest adding a configuration clause to /etc/gai.conf. This flag would change getaddrinfo() behavior and if present, would NOT perform the AAAA queries.

6. For each functional requirement listed in question 5, specify how Red Hat and the customer can test to confirm the requirement is successfully implemented.
  * getaddrinfo() resolves BY DEFAULT both A and AAAA for a host name
  * IF present /etc/gai.conf and getaddrinfo() AAAA modifier clause ; THEN
    * DO NOT try AAAA name resolution
  * FI

7. Is there already an existing RFE upstream or in Red Hat bugzilla?
  * None known.

8. Does the customer have any specific timeline dependencies?
  * Desirable: RHEL 7.6
 
9. Is the sales team involved in this request and do they have any additional input?
  * No Sales knowledge.

10. List any affected packages or components.
  * glibc

11. Would the customer be able to assist in testing this functionality if implemented?
  * Yes.
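
The blocking behavior described in item 3 can be observed by timing the call. The following Python sketch is one way to compare an AF_UNSPEC lookup with an AF_INET lookup; the helper name timed_lookup() is hypothetical, and the timings depend entirely on the local resolver configuration:

```python
import socket
import time

def timed_lookup(host, family=socket.AF_UNSPEC):
    """Call getaddrinfo() and return (elapsed_seconds, families_seen)."""
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, None, family=family,
                                   type=socket.SOCK_STREAM)
    except socket.gaierror:
        infos = []  # a failed lookup still tells us how long we were blocked
    elapsed = time.monotonic() - start
    return elapsed, {info[0] for info in infos}
```

On a host whose DNS server does not answer AAAA queries, timed_lookup(name) would be expected to take roughly the resolver timeout longer than timed_lookup(name, socket.AF_INET), which matches the delay reported earlier in this bug.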

Comment 27 ozzzo 2018-08-07 01:42:06 UTC
Wow, 5 years later this bug is finally being fixed. I would be ecstatic if I still used RHEL.

Comment 28 Piotr Kierklo 2019-04-03 11:01:06 UTC
Hi, I'm very interested in this bug and in implementing a solution to it, on behalf of my organization. I'm not sure whether I can somehow vote for this using my Red Hat support contract.

I tested this on CentOS 7.6 and the vanilla build still has this issue. Of course, I don't know what modifications have to be made to /etc/gai.conf to test disabling IPv6 queries, if that was implemented in 7.6 as targeted.

Also, I have a few questions:
Why can't it be done in /etc/resolv.conf? All other options go there, so is it the "standard breaking of RFC 2553" that you mentioned in the comment above that prevents this option from going into /etc/resolv.conf? And is /etc/gai.conf not covered by this restriction?

We actually found this to be a problem in our environment, because by default the IPv6 and IPv4 queries are sent from the same source port. Microsoft DNS Server on Windows 2012 and Windows 2016 does not send a response to the first query that arrives (something to do with Windows waiting for the ARP response, we don't know yet), which causes the DNS request to time out. After the retry, it works fine until the ARP entry expires.
Testing on SLES (SUSE Linux) shows that it sends both DNS requests (IPv4 and IPv6) from different source ports and so avoids the above problem. This is probably equivalent to the "single-request-reopen" option that you can set in /etc/resolv.conf, and is probably a default option that SLES compiles into its glibc.
So setting that option could work around it (we tested it and it helps in the above scenario), but disabling the IPv6 query altogether would be a much better solution.

Similar request to this - https://sourceware.org/bugzilla/show_bug.cgi?id=14799
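
For anyone checking whether a host already has the workaround mentioned above, the following Python sketch collects the tokens from the "options" lines of resolv.conf-style text. The helper name resolver_options() is hypothetical, and the comment handling is simplified (it strips "#" and ";" anywhere on a line):

```python
def resolver_options(resolv_conf_text):
    """Collect the tokens from every 'options' line of resolv.conf text."""
    options = set()
    for line in resolv_conf_text.splitlines():
        # Simplification: treat '#' and ';' as comment starts anywhere.
        line = line.split("#", 1)[0].split(";", 1)[0]
        fields = line.split()
        if fields and fields[0] == "options":
            options.update(fields[1:])
    return options
```

Reading /etc/resolv.conf and testing whether "single-request-reopen" is in the returned set shows whether the option is configured there. It does not detect a default compiled into a distribution's glibc.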

Comment 29 Florian Weimer 2019-04-03 11:16:01 UTC
(In reply to Piotr Kierklo from comment #28)
> Hi, I'm very interested in this bug and in implementing a solution to it, on
> behalf of my organization. I'm not sure whether I can somehow vote for this
> using my Red Hat support contract.
> 
> I tested this on CentOS 7.6 and the vanilla build still has this issue. Of
> course, I don't know what modifications have to be made to /etc/gai.conf to
> test disabling IPv6 queries, if that was implemented in 7.6 as targeted.

This feature has not been implemented.  It is not currently targeted for any particular release of Red Hat Enterprise Linux.

> Also, I have a few questions:
> Why can't it be done in /etc/resolv.conf? All other options go there, so is
> it the "standard breaking of RFC 2553" that you mentioned in the comment
> above that prevents this option from going into /etc/resolv.conf? And is
> /etc/gai.conf not covered by this restriction?

We could offer a configuration knob to get different behavior for AF_UNSPEC, as long as we keep the default behavior as it exists today.

There are actually multiple related RFEs here.  Some people want to suppress all AAAA DNS queries, some want to change AF_UNSPEC to send only A queries.  Others want to filter out all AF_INET6 results from the name lookup results, not just DNS, and not just for getaddrinfo.
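
To make the last variant concrete: filtering results is different from suppressing queries, because the AAAA query has already been sent (and waited for) by the time there is a result list to filter. A minimal Python sketch, with the hypothetical helper name drop_ipv6():

```python
import socket

def drop_ipv6(addrinfos):
    """Keep only the non-AF_INET6 entries of a getaddrinfo()-style list."""
    return [info for info in addrinfos if info[0] != socket.AF_INET6]
```

Applied to the output of socket.getaddrinfo(), this keeps an application from connecting over IPv6, but it does nothing about the lookup delay that motivated the original report.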

It's probably best to continue the discussion on the upstream libc-alpha list.  You can subscribe here:

  <https://sourceware.org/lists.html#ml-requestor>

We will not implement a downstream-only change for this.

> We actually found this to be a problem in our environment, because by
> default the IPv6 and IPv4 queries are sent from the same source port.
> Microsoft DNS Server on Windows 2012 and Windows 2016 does not send a
> response to the first query that arrives (something to do with Windows
> waiting for the ARP response, we don't know yet), which causes the DNS
> request to time out. After the retry, it works fine until the ARP entry
> expires.
> Testing on SLES (SUSE Linux) shows that it sends both DNS requests (IPv4
> and IPv6) from different source ports and so avoids the above problem. This
> is probably equivalent to the "single-request-reopen" option that you can
> set in /etc/resolv.conf, and is probably a default option that SLES
> compiles into its glibc.
> So setting that option could work around it (we tested it and it helps in
> the above scenario), but disabling the IPv6 query altogether would be a
> much better solution.

Interesting.  Yes, single-request-reopen should work around this, but this looks to me like a bug in the server.  We do not want to enable single-request-reopen by default because it increases latency.  Changing the code to open two sockets and send those queries in parallel would help, but the code has proven to be rather resistant to change.
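
As a rough illustration of the two-socket idea (an application-level Python sketch, not glibc code), the A and AAAA queries can each be sent from their own UDP socket, and therefore from their own source port, before either reply is awaited. The function names are hypothetical, and the fixed transaction IDs and missing reply validation are simplifications that a real resolver must not make:

```python
import socket
import struct

QTYPE_A, QTYPE_AAAA = 1, 28  # DNS record type codes

def build_query(name, qtype, txid):
    """Encode a minimal single-question DNS query with recursion desired."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label
        for label in name.rstrip(".").encode("ascii").split(b".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)  # class IN
    return header + question

def send_on_separate_sockets(name, server, timeout=2.0):
    """Send A and AAAA queries from two UDP sockets; return raw replies."""
    socks = {}
    # Assumption for brevity: fixed transaction IDs 1 and 2.
    for txid, qtype in ((1, QTYPE_A), (2, QTYPE_AAAA)):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype, txid), (server, 53))
        socks[qtype] = sock
    replies = {}
    for qtype, sock in socks.items():
        try:
            replies[qtype] = sock.recv(512)
        except socket.timeout:
            replies[qtype] = None  # no answer for this query type
        finally:
            sock.close()
    return replies
```

Both queries are on the wire before the first recv() call, so a server that drops one of them delays the caller by at most the per-socket timeout for that query and does not affect the other.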

