Bug 516179
Summary: | strange ping address resolving behavior | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | J. Randall Owens <jrowens.fedora> |
Component: | iputils | Assignee: | Jiri Skala <jskala> |
Status: | CLOSED NEXTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | medium | Docs Contact: | |
Priority: | low | | |
Version: | 11 | CC: | aglotov, jskala |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | i386 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2009-09-09 09:33:05 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 195271 | | |
Description
J. Randall Owens
2009-08-07 08:13:53 UTC
Oh, I forgot to include my resolv.conf (minus comments):

    search ghiapet.net dyn.ghiapet.net
    sortlist 10.XXX.XXX.0/22
    options timeout:3 inet6
    nameserver 127.0.0.1

As you can see, I put the dot at the end so the search domains shouldn't enter into it at all (indeed, earlier I used just 'linksys0' and got a six-packet exchange; I used the FQDN to shorten the output to paste here). And this behavior long precedes my use of the sortlist.

My memory's been refreshed now: this also happens with ncftp and traceroute, but not telnet (I think).

I have another interesting example. When I `ping moon.linux-ipv6.org.`, a host with both IPv4 and IPv6 addresses, it starts pinging 32.1.2.0. I discovered that the first four bytes of the host's IPv6 address in the response packet, 2001:200:0:1003:207:e9ff:fe04:9924, come to 32.1.2.0 when rendered in decimal.

I also notice in the example above that even though it's an IPv4-only ping, the second-to-last pair of octets of the first packet, x001c, shows it's requesting an AAAA record instead of an A record (and then acting shocked when it gets one!). The second packet, in reply, contains SOA information. The third packet is another query, this time for an A record. The fourth packet, the should-be-final response, returns an A record (the part with the Xed-out octets), class IN, using message compression to represent linksys0.ghiapet.net. with that xc00c. Then there are a pair of NS records, also using the compression, starting with xc015 to represent the shorter part of the domain, and also ending with the same, to shorten the nameservers' names. The xc042 represents ns1, and xc054 represents ns2. It returns an A and an AAAA record for each of these (and there's absolutely no reason not to; just because the ping is IPv4-only doesn't mean its resolver is), seen by the x0001 and x001c after the shortened forms.
Then, after a bit of other stuff (xe10 is a TTL of 1H, x0004 is the IPv4 record length, x0010 is the IPv6 record length), come the additional NS addresses themselves in each of those RRs. So apparently, in that case, it grabs an IPv4 address from one of the additional records, rather than assuming that the first four octets of an AAAA record are the IPv4 address it wants, as it did for moon.linux-ipv6.org. These may actually be two separate bugs, neither of which is really in ping.

Actually, no, traceroute isn't one that this happens to. So, so far, ping and ncftp definitely get it; traceroute and telnet seem clean. Firefox too.

    $ foreach i ( /bin/ping /usr/bin/ncftp /bin/traceroute /usr/bin/telnet )
    foreach? echo $i
    foreach? ldd $i | sort
    foreach? end
    /bin/ping
        libc.so.6 => /lib/libc.so.6 (0x00c06000)
        libidn.so.11 => /lib/libidn.so.11 (0x04162000)
        /lib/ld-linux.so.2 (0x00be2000)
        linux-gate.so.1 => (0x00e27000)
    /usr/bin/ncftp
        libc.so.6 => /lib/libc.so.6 (0x00c06000)
        /lib/ld-linux.so.2 (0x00be2000)
        libresolv.so.2 => /lib/libresolv.so.2 (0x00631000)
        linux-gate.so.1 => (0x005c1000)
    /bin/traceroute
        libc.so.6 => /lib/libc.so.6 (0x00c06000)
        /lib/ld-linux.so.2 (0x00be2000)
        libm.so.6 => /lib/libm.so.6 (0x00d79000)
        linux-gate.so.1 => (0x0014c000)
    /usr/bin/telnet
        libc.so.6 => /lib/libc.so.6 (0x00322000)
        libdl.so.2 => /lib/libdl.so.2 (0x001b1000)
        /lib/ld-linux.so.2 (0x00c77000)
        libncurses.so.5 => /lib/libncurses.so.5 (0x00843000)
        libtinfo.so.5 => /lib/libtinfo.so.5 (0x001e6000)
        libutil.so.1 => /lib/libutil.so.1 (0x00ddd000)
        linux-gate.so.1 => (0x00321000)

So, ping and ncftp use libidn and libresolv, respectively; traceroute seems to do it itself, or else finds something in libc it can use; and telnet, I guess, uses one of libdl, libtinfo, or libutil, or else does the resolution itself.
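As an aside on the packet walk-through above, the hex fields picked out of the capture can be decoded mechanically. The following is only a rough Python sketch, not anything taken from ping or the resolver; the names `RECORD_TYPES` and `pointer_offset` are made up for illustration. It relies on two facts from the DNS specifications: type code 1 is A and type code 28 (x001c) is AAAA, and a two-byte field whose top two bits are set is a compression pointer whose low 14 bits give an offset from the start of the message.

```python
# DNS record type codes: A is defined in RFC 1035, AAAA in RFC 3596.
RECORD_TYPES = {0x0001: "A", 0x001C: "AAAA"}


def pointer_offset(two_bytes):
    """Return the message offset a compression pointer refers to, else None.

    A compression pointer is a 16-bit field whose top two bits are both 1;
    the remaining 14 bits are an offset from the start of the DNS message.
    (Hypothetical helper, written for this illustration only.)
    """
    value = int.from_bytes(two_bytes, "big")
    if (value & 0xC000) != 0xC000:
        return None
    return value & 0x3FFF


# The four pointers mentioned in the walk-through.
for raw in (b"\xc0\x0c", b"\xc0\x15", b"\xc0\x42", b"\xc0\x54"):
    print(f"x{raw.hex()} -> offset {pointer_offset(raw)}")

# The query types seen in the first and third packets.
print(RECORD_TYPES[0x001C], RECORD_TYPES[0x0001])

# xe10 read as a TTL in seconds.
print(0xE10, "seconds")
```

Offset 12 is the first byte after the fixed 12-byte DNS header, which is where the question name begins, so xc00c stands for the whole queried name. Offset 21 skips the length byte and the eight letters of "linksys0", which fits the observation that xc015 stands for the shorter part of the domain.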
OK, I don't really know my C very well, but digging around in the source and man pages, it looks like ping (but not ping6) is using the deprecated gethostbyname() function at ping.c line 261, and then doing a memcpy() of the first four bytes of the h_addr member of the returned struct, which is basically an alias for the first item in h_addr_list. This definitely explains the 32.1.2.0 behaviour. I don't think it entirely explains the additional-records 127.0.0.1 bug, but I bet if gethostbyname() were replaced by something more current, it'd go away. (And should they be using memcpy() that way, without validation? Like I said, I don't know my C.)

I also notice that a quick grep of the iputils source shows that arping, clockdiff, rarpd, tracepath, and traceroute6 (but not traceroute, a totally separate package) all use gethostbyname() too, and probably have much the same issues.

Seems like it was the "options inet6" in resolv.conf that caused this. I'm not sure what the status or component should be at this point. With that option removed, pinging those hosts seems to work as expected now. (ncftp is working normally again, too.) I do think these utilities could handle the RES_USE_INET6 case more gracefully, though.

Hi,
you are right. Replacing gethostbyname by getaddrinfo fixes the problem.
Jiri
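For readers who want to reproduce the arithmetic, here is a minimal Python sketch of the two points discussed in this report. It is not the iputils patch, and the function names `first_four_bytes_as_ipv4` and `resolve_ipv4` are hypothetical. The first function mimics copying only four bytes of a 16-byte IPv6 address into an IPv4 slot, which is how 32.1.2.0 arises. The second uses Python's `socket.getaddrinfo` (a wrapper around the C library call of the same name) with the address family pinned to `AF_INET`, so only IPv4 results can come back. It is called here on a numeric address so that it runs without any DNS traffic.

```python
import ipaddress
import socket


def first_four_bytes_as_ipv4(ipv6_text):
    """Reinterpret the first 4 of an IPv6 address's 16 bytes as IPv4.

    This imitates a fixed 4-byte copy out of a buffer that actually
    holds an IPv6 address (hypothetical helper for illustration).
    """
    raw = ipaddress.IPv6Address(ipv6_text).packed  # 16 bytes, network order
    return str(ipaddress.IPv4Address(raw[:4]))


def resolve_ipv4(host):
    """Return IPv4 addresses for host, asking getaddrinfo for AF_INET only."""
    infos = socket.getaddrinfo(
        host, None, family=socket.AF_INET, type=socket.SOCK_DGRAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET the sockaddr is an (address, port) pair.
    return [sockaddr[0] for _family, _type, _proto, _canon, sockaddr in infos]


# The IPv6 address of moon.linux-ipv6.org quoted earlier in this report.
print(first_four_bytes_as_ipv4("2001:200:0:1003:207:e9ff:fe04:9924"))

# A numeric host needs no DNS lookup, so this works offline.
print(resolve_ipv4("127.0.0.1"))
```

Pinning the family in the request means the caller never has to guess how long the returned address is, which is the validation the memcpy() question above is getting at.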