Bug 1330973
Summary: | getaddrinfo hangs for 25 seconds if ipv6 is disabled | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Krzysztof Pawłowski <krzysztof.pawlowski> |
Component: | systemd | Assignee: | systemd-maint |
Status: | CLOSED ERRATA | QA Contact: | Robin Hack <rhack> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | | |
Version: | 7.3 | CC: | ashankar, bblaskov, dkochuka, dmoessne, fweimer, jsynacek, krzysztof.pawlowski, mnewsome, pfrankli, rhack, systemd-maint-list |
Target Milestone: | rc | | |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | systemd-219-22.el7 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2016-11-04 00:53:43 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Krzysztof Pawłowski
2016-04-27 11:37:32 UTC
I can reproduce this with the instructions provided. The hang is actually in nss_myhostname, which is provided by systemd:

#0 0x00007ffff6e1aca9 in ppoll () from /lib64/libc.so.6
#1 0x00007ffff7e5d1af in sd_rtnl_call.constprop.10 () from /lib64/libnss_myhostname.so.2
#2 0x00007ffff7e5ef72 in local_addresses.constprop.4 () from /lib64/libnss_myhostname.so.2
#3 0x00007ffff7e60717 in _nss_myhostname_gethostbyname3_r () from /lib64/libnss_myhostname.so.2
#4 0x00007ffff7e609e4 in _nss_myhostname_gethostbyname2_r () from /lib64/libnss_myhostname.so.2
#5 0x00007ffff6e0b612 in gaih_inet () from /lib64/libc.so.6
#6 0x00007ffff6e0e86d in getaddrinfo () from /lib64/libc.so.6

Krzysztof, you said that the problem appeared after a glibc upgrade. I tried downgrading to glibc-2.17-106.el7_2.1.x86_64, and the problem persists. Did you pick up a systemd update at the same time as you upgraded glibc?

Thanks.

I did only:

yum update glibc

Systemd version on broken host:
systemd-libs-219-19.el7.x86_64
systemd-219-19.el7.x86_64
systemd-python-219-19.el7.x86_64
systemd-sysv-219-19.el7.x86_64

Systemd version on good host:
systemd-sysv-208-20.el7_1.5.x86_64
systemd-208-20.el7_1.5.x86_64
systemd-python-208-20.el7_1.5.x86_64
systemd-libs-208-20.el7_1.5.x86_64

I have only upgraded glibc on the good host:
glibc-2.17-106.el7_2.4.x86_64
glibc-common-2.17-106.el7_2.4.x86_64

And the problem didn't appear. I then upgraded systemd to the same latest version as on the broken host, and the problem appeared.

I've checked another thing. I had also upgraded the kernel package, and it has a systemd dependency. So I made another test: I upgraded only systemd, without glibc, and the problem appeared. So you are right that the real source of the problem is inside systemd.

(In reply to Krzysztof Pawłowski from comment #2)
> So you are right that the real source of the problem is inside systemd.

Thanks, reassigning to systemd.

Just to be completely sure, can you try to remove nss_myhostname from /etc/nsswitch.conf?
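Before editing /etc/nsswitch.conf it can help to confirm which sources the hosts line currently lists. The following is a minimal sketch in Python 3, not part of the original report; the function name `hosts_sources` is made up for illustration, and it assumes the usual nsswitch.conf layout (one `hosts:` line, `#` comments, optional bracketed action items).

```python
import re


def hosts_sources(nsswitch_text):
    """Return the service names listed on the 'hosts:' line.

    Comments and bracketed action items such as [NOTFOUND=return]
    are ignored. Returns an empty list if there is no hosts line.
    """
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            rest = re.sub(r"\[[^\]]*\]", " ", rest)  # drop [STATUS=ACTION ...]
            return rest.split()
    return []
```

Reading the file and calling `hosts_sources(open("/etc/nsswitch.conf").read())` on the affected host would be expected to return a list containing `myhostname` if the module is enabled for host lookups.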
I've changed the line in /etc/nsswitch.conf:

From:
hosts: files dns myhostname

To:
hosts: files dns

And then the result is instant:

time python -c 'import socket; print socket.getaddrinfo("'`hostname -f`'", None, socket.AF_INET6)'
hostname: Name or service not known
Traceback (most recent call last):
  File "<string>", line 1, in <module>
socket.gaierror: [Errno -2] Name or service not known

real 0m0.071s
user 0m0.041s
sys 0m0.021s

Any chance of a fix?

I debugged this a bit and I wonder... Why does _nss_myhostname_gethostbyname3_r() get af=10 as its argument? The 10 means PF_INET6 (AF_INET6). Why is glibc passing that when it knows that IPv6 is disabled?

(In reply to Jan Synacek from comment #8)
> I debugged this a bit and I wonder... Why does
> _nss_myhostname_gethostbyname3_r() get af=10 as its argument? The 10 means
> PF_INET6 (AF_INET6). Why is glibc passing that when it knows that IPv6 is
> disabled?

The reproducer in comment 6 explicitly asks for an IPv6 address, so glibc tries to obtain it.

Oh, my bad. I forgot to mention that, as a reproducer, I simply used 'getent hosts $(hostname -f)', which triggers the timeout as well.

(In reply to Jan Synacek from comment #10)
> Oh, my bad. I forgot to mention that, as a reproducer, I simply used 'getent
> hosts $(hostname -f)', which triggers the timeout as well.

getent performs an AF_INET6 lookup followed by an AF_INET lookup, so that's not surprising at all.

QA acking.

pushed to staging -> https://github.com/lnykryn/systemd-rhel/commit/6e5117b83af5998359916f276a9b32f755c0e6f4 -> post

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2216.html
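For anyone checking whether an updated host is still affected, the lookups discussed in this thread can be timed per address family. This is a minimal sketch in Python 3, not taken from the report; `time_lookup`, `probe`, and the `resolver` parameter are illustrative names. It performs an AF_INET6 lookup followed by an AF_INET lookup, the order described above for getent.

```python
import socket
import time


def time_lookup(host, family, resolver=socket.getaddrinfo):
    """Time one name lookup and return (seconds, outcome).

    outcome is the number of results returned, or the gaierror
    text if the lookup failed.
    """
    start = time.monotonic()
    try:
        outcome = len(resolver(host, None, family))
    except socket.gaierror as exc:
        outcome = "gaierror: {}".format(exc)
    return time.monotonic() - start, outcome


def probe(host, resolver=socket.getaddrinfo):
    """Look up host for AF_INET6 first, then AF_INET, timing each."""
    report = {}
    for name, family in (("AF_INET6", socket.AF_INET6),
                         ("AF_INET", socket.AF_INET)):
        report[name] = time_lookup(host, family, resolver)
    return report
```

Calling `probe(...)` with the machine's fully qualified hostname returns a dict keyed by family name. On a host showing the behaviour reported here, the AF_INET6 entry would be expected to take many seconds, while a failed but fast lookup (as in the nsswitch.conf test above) returns a gaierror almost immediately.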