From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2) Gecko/20021126

Description of problem:
When using "ping -I" on a secondary interface (eth0:1 in this example), ping does not use that interface's IP address as the source address, as it should.

From the man page:

    -I interface address
        Set source address to specified interface address. Argument may be numeric IP address or name of device. When pinging IPv6 link-local address this option is required.

In short: "numeric IP address" works, "name of device" does not.

(Real, live) Example:

    # # Note: Both 194.120.248.64/29 and 172.19.84.0/24
    # # are being masqueraded to the Internet on 194.120.248.65.
    # # "193.141.40.1" is just a nameserver (KPNQwest)
    # ifconfig eth0 194.120.248.66 netmask 255.255.255.248 broadcast 194.120.248.71 up
    # route add default gw 194.120.248.65
    # ifconfig eth0:1 172.19.84.66 netmask 255.255.255.0 broadcast 172.19.84.255 up

    # ping -c 1 193.141.40.1
    PING 193.141.40.1 (193.141.40.1) from 194.120.248.66 : 56(84) bytes of data.
    64 bytes from 193.141.40.1: icmp_seq=1 ttl=244 time=45.2 ms
    # # WORKS (eth0 address)

    # ping -c 1 -I eth0 193.141.40.1
    PING 193.141.40.1 (193.141.40.1) from 194.120.248.66 eth0: 56(84) bytes of data.
    64 bytes from 193.141.40.1: icmp_seq=1 ttl=244 time=41.1 ms
    # # WORKS (eth0 address)

    # ping -c 1 -I 172.19.84.66 193.141.40.1
    PING 193.141.40.1 (193.141.40.1) from 172.19.84.66 : 56(84) bytes of data.
    64 bytes from 193.141.40.1: icmp_seq=1 ttl=245 time=46.1 ms
    # # WORKS (eth0:1 address, given "by hand" as IP)

    # ping -c 1 -I eth0:1 193.141.40.1
    PING 193.141.40.1 (193.141.40.1) from 194.120.248.66 eth0:1: 56(84) bytes of data.
    64 bytes from 193.141.40.1: icmp_seq=1 ttl=244 time=40.3 ms
    # # DOESN'T WORK! It claims to be using eth0:1, but the
    # # "from <IP>" already shows that it does not use
    # # eth0:1's address as source IP.

strace's output confirms that ping really uses 194.120.248.66 (eth0) instead of 172.19.84.66 (eth0:1) to send its data when using "-I eth0:1":

    # strace ping -c 1 -I eth0:1 193.141.40.1
    [...]
    socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
    setsockopt(4, SOL_SOCKET, 0x19 /* SO_??? */, [812151909], 7) = -1 ENODEV (No such device)
    connect(4, {sin_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("193.141.40.1")}}, 16) = 0
    getsockname(4, {sin_family=AF_INET, sin_port=htons(32772), sin_addr=inet_addr("194.120.248.66")}}, [16]) = 0
    close(4) = 0
    ioctl(3, 0x8933, 0xbffff900) = 0
    bind(3, {sin_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("194.120.248.66")}}, 16) = 0
    setsockopt(3, SOL_RAW, ICMP_FILTER, ~(ICMP_ECHOREPLY|ICMP_DEST_UNREACH|ICMP_SOURCE_QUENCH|ICMP_REDIRECT|ICMP_TIME_EXCEEDED|ICMP_PARAMETERPROB), 4) = 0
    setsockopt(3, SOL_IP, IP_RECVERR, [1], 4) = 0
    setsockopt(3, SOL_SOCKET, SO_SNDBUF, [324], 4) = 0
    setsockopt(3, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0
    getsockopt(3, SOL_SOCKET, SO_RCVBUF, [131070], [4]) = 0
    [...]
    sendmsg(3, {msg_name(16)={sin_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("193.141.40.1")}}, msg_iov(1)=[{"\10\0\315\320\243P\1\0\201\250 >\370\364\0\0\10\t\n\v\f"..., 64}], msg_controllen=24, msg_control=0x804f8d8, , msg_flags=0}, 0) = 64
    [...]

Please note the ENODEV above, generated by the first "setsockopt" on the socket. I don't know if this could be the cause of the problem.

eth0:1 is up and running:

    # ifconfig eth0:1
    eth0:1    Link encap:Ethernet  HWaddr 00:E0:7D:8E:48:D2
              inet addr:172.19.84.66  Bcast:172.19.84.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              Interrupt:11 Base address:0x5000

So there's something broken here with "-I".
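The undecoded setsockopt() call in the strace output can be read by hand. The sketch below is only an illustration, not part of strace or iputils, and the helper name optval_to_text is made up for this note. It assumes a little-endian machine (as on the i686 box in the report) and unpacks the integer strace printed for the option value into its bytes in memory order:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: unpack a 32-bit value, as strace would print it on
 * a little-endian machine, into its four bytes in memory order.
 * "out" must have room for 5 bytes (4 characters plus the trailing NUL).
 */
void optval_to_text(uint32_t value, char out[5])
{
    size_t i;

    for (i = 0; i < 4; i++)
        out[i] = (char)((value >> (8 * i)) & 0xff); /* lowest byte first */
    out[4] = '\0';
}
```

Feeding it 812151909 (0x30687465) yields the string "eth0", the first four of the seven bytes passed in. Option number 0x19 is 25, which is SO_BINDTODEVICE on Linux/i386, so the failing call appears to be ping's attempt to bind the probe socket to "eth0:1".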
Tested with:

    iputils-20001110-5 on RH 7.3 (2.4.19-pre8 glibc-2.2.4-27)
    iputils-20020124-3 on RH 7.3 (2.4.19-pre8 glibc-2.2.4-27)
    iputils-20020124-8 on RH 8.0 (2.4.18-19.8.0 glibc-2.2.93-5)

-20001110 just barks that it cannot find "eth0:1" as an interface. Both 2002 packages trigger the bug and use the wrong source IP address.

Should be easy to reproduce. If you need further information about my configurations, don't hesitate to contact me.

Version-Release number of selected component (if applicable):

How reproducible: Always

Steps to Reproduce:
1. ifconfig eth0 192.168.0.2 up
   ifconfig eth0:1 172.16.0.2 up
   route add default gw 192.168.0.1
2. ping -I eth0:1 193.141.40.1    # or some other external host

Actual Results: Source address of eth0 (NOT eth0:1) will be used (confirm using strace).

Expected Results: Source address of eth0:1 should be used.

Additional info:
Works if you replace "eth0:1" with its actual IP address. But I don't think this is the way -I is supposed to work.
Please note that 194.120.248.64/29 isn't really private IP space. It was disconnected, so before renumbering my network I masqueraded it. In the process of renumbering to 172.16.x.x, I triggered this bug while testing.
Another two things, sorry. :-)

I just checked iputils-ss020927.tar.gz and iputils-ss021109-try.tar.bz2 from ftp://ftp.inr.ac.ru/ip-routing/ - same problem. But maybe it's still interesting for RedHat to update the iputils package, since, for example, the latest "RELNOTES" reads:

    "[021108] * Noah L. Meyerhans <frodo> Wow. == instead = in traceroute6."

Ahem. :-)

Routing on my box is okay:

    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    194.120.248.64  0.0.0.0         255.255.255.248 U     0      0        0 eth0
    172.19.84.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
    127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
    0.0.0.0         194.120.248.65  0.0.0.0         UG    0      0        0 eth0

(Note that /sbin/route (net-tools[*]) lists eth0:1's address as belonging to eth0, too. This might be correct since it's the same physical NIC, but it may also be worth a fix to reduce possible confusion.)

[*] net-tools-1.60-7 on RH 8.0 and net-tools-1.60-4 on RH 7.3
After unsuccessful attempts with recent RedHat strace packages and the latest strace available in a tarball, I just fetched CVS strace and worked out a little patch to decode those SO_??? calls that are visible in the above strace output.

Using my patched strace, it turned out that it's indeed a problem with ping(8) trying to bind to an interface it cannot reach - for whatever reason. But it doesn't display any error at all and silently goes on - with the wrong IP bound.

In strace, it now looks like this (note the SO_BINDTODEVICE call and that the newer strace(8) shows the actual string "eth0:1\0" instead of a memory pointer):

    socket(PF_INET, SOCK_RAW, IPPROTO_ICMP) = 3
    getuid32() = 0
    setuid32(0) = 0
    socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
    setsockopt(4, SOL_SOCKET, SO_BINDTODEVICE, "eth0:1\0", 7) = -1 ENODEV (No such device)
    connect(4, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("193.141.40.1")}, 16) = 0
    getsockname(4, {sa_family=AF_INET, sin_port=htons(32773), sin_addr=inet_addr("194.120.248.66")}, [16]) = 0
    close(4) = 0
    ioctl(3, 0x8933, 0xbffff8e0) = 0
    bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("194.120.248.66")}, 16) = 0
    setsockopt(3, SOL_RAW, ICMP_FILTER, ~(ICMP_ECHOREPLY|ICMP_DEST_UNREACH|ICMP_SOURCE_QUENCH|ICMP_REDIRECT|ICMP_TIME_EXCEEDED|ICMP_PARAMETERPROB), 4) = 0
    setsockopt(3, SOL_IP, IP_RECVERR, [1], 4) = 0
    setsockopt(3, SOL_SOCKET, SO_SNDBUF, [324], 4) = 0
    setsockopt(3, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0
    getsockopt(3, SOL_SOCKET, SO_RCVBUF, [131070], [4]) = 0
    [...]

The code in question in iputils is in iputils/ping.c around line 275:

    if (setsockopt(probe_fd, SOL_SOCKET, SO_BINDTODEVICE, device, strlen(device)+1) == -1) {
        if (IN_MULTICAST(ntohl(dst.sin_addr.s_addr))) {
            if (ioctl(probe_fd, SIOCGIFINDEX, &ifr) < 0) {
                fprintf(stderr, "ping: unknown iface %s\n", device);
                exit(2);
            }
            /* ... */
        }
    }

No "else", nothing.
So, if the interface binding *fails* and the destination address is *not* a multicast address (the check is on dst), no error is issued and ping(8) silently continues.

However, I don't have the slightest idea why a (root) process should not be allowed to bind to a subinterface like "eth0:1". Also I'm not at all sure whether the IN_MULTICAST check is in the right place here. Seems kind of wrong to me.

However, I'll attach my little patch, which at least outputs a message and terminates if the interface specified is unusable for the task. With this patch, ping barks if used on things like eth0:1 which it cannot bind to; it doesn't bark if it just doesn't have enough permissions. However, the message is also displayed if a _really_ non-existent interface is specified (NOTE: There is a difference here between "eth0:1" and, for example, "bla0". Different OS reaction!).

The problem remains. Any ideas?
Created attachment 89313
Patch that adds error reporting to ping(8) if it cannot bind to an EXISTING interface

Against iputils-ss021109; should be easy to apply.
Patch included in iputils-20020927-4 to appear in rawhide soon. Read ya, Phil