175499 – ARP sent with wrong source address

Summary: ARP sent with wrong source address

Keywords:
Status:	CLOSED INSUFFICIENT_DATA
Alias:	None
Product:	Red Hat Enterprise Linux 4
Classification:	Red Hat
Component:	iproute
Sub Component:
Version:	4.0
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Marcela Mašláňová
QA Contact:	Brock Organ
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2005-12-11 23:44 UTC by Geoff Kingsmill
Modified:	2007-11-30 22:07 UTC (History)
CC List:	0 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2007-10-22 10:31:57 UTC
Target Upstream Version:
Embargoed:

Description Geoff Kingsmill 2005-12-11 23:44:59 UTC

I am investigating a problem where network access to a machine with
multiple interfaces is intermittently failing.
 
There appears to be a network ARP problem in Linux when a packet
arrives on one interface/subnet and the reply is returned on a different
interface/subnet. In this configuration, when Linux needs to send
out an ARP request, the ARP packet contains the correct Source MAC
address of the interface which sent the ARP request, but it carries
the wrong Source IP address. Instead of the IP address of the
interface which transmitted the ARP packet, it contains the IP
address of the interface which received the incoming packet.
The CISCO router does not respond to the ARP request because
the Source IP address is on a different subnet.
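
The check the router appears to apply can be sketched in a few lines of
Python. This is only an illustration of the subnet test, not the router's
actual logic; the function name sender_is_on_link is made up for this sketch,
and the addresses and /24 masks are taken from the ifconfig output further
down in this report.

```python
import ipaddress

# Addresses from the report; the /24 masks match the ifconfig output.
ETH0 = ipaddress.ip_interface("10.20.102.245/24")  # interface that transmits the ARP request
ETH1 = ipaddress.ip_interface("10.20.105.245/24")  # interface that received the ping


def sender_is_on_link(sender_ip, out_iface):
    """Return True if the ARP sender IP lies in the transmitting interface's subnet.

    Hypothetical helper: it models the assumption that the router only
    answers ARP requests whose sender IP is on the segment they arrived on.
    """
    return ipaddress.ip_address(sender_ip) in out_iface.network


# Sender IP seen in the trace versus the one that was expected on eth0.
print("observed:", sender_is_on_link("10.20.105.245", ETH0))
print("expected:", sender_is_on_link("10.20.102.245", ETH0))
```

With the observed sender address the test fails, which is consistent with the
router staying silent.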


The following simplified network diagram illustrates the problem.
 
+------------------------------------------------------------------+
|                        CISCO Router                              |
|                         10.20.x.x/24                             | 
|                                                                  |
|    101.253                                 102.253     105.253   |
+-------+---------------------------------------+-----------+------+
        |                                       |           |
        |                                       |           |
        |                                       |           |
+-------+-------+                       +-------+-----------+------+
|    101.1      |                       |    102.245     105.245   |
|               |                       |      eth0        eth1    |
|    HostA      |                       |           HostB          |
+---------------+                       +-------+-----+-----+------+
                                                |     |     |
                                                |     |     |
                                              other interfaces
      
-- HostA pings HostB 105.245 
 
[HostA]# ping 10.20.105.245
PING 10.20.105.245 (10.20.105.245) 56(84) bytes of data.
<hangs here>
 
 
-- The network interfaces on HostB
 
[HostB]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:11:0A:53:7A:3A
          inet addr:10.20.102.245  Bcast:10.20.102.255  Mask:255.255.255.0
 
eth1      Link encap:Ethernet  HWaddr 00:11:0A:53:7A:3B
          inet addr:10.20.105.245  Bcast:10.20.105.255  Mask:255.255.255.0
 
eth2      Link encap:Ethernet  HWaddr 00:11:85:66:E7:5D
          inet addr:10.20.202.245  Bcast:10.20.202.255  Mask:255.255.255.0
 
eth3      Link encap:Ethernet  HWaddr 00:11:85:66:E7:5C
          inet addr:10.20.205.245  Bcast:10.20.205.255  Mask:255.255.255.0
 
eth4      Link encap:Ethernet  HWaddr 00:00:80:22:89:BC
          inet addr:10.20.15.5  Bcast:10.20.15.255  Mask:255.255.255.0
 
eth5      Link encap:Ethernet  HWaddr 00:02:A5:45:00:BA
          inet addr:10.20.104.201  Bcast:10.20.104.255  Mask:255.255.255.0
 
-- The routing table on HostB shows that packets destined for the 
-- 101 subnet will be directed to the eth0 102 subnet gateway.
 
[HostB]# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.21.121.100   10.20.205.253   255.255.255.255 UGH       0 0          0 eth3
10.20.15.0      0.0.0.0         255.255.255.0   U         0 0          0 eth4
10.20.201.0     10.20.202.253   255.255.255.0   UG        0 0          0 eth2
10.20.203.0     10.20.202.253   255.255.255.0   UG        0 0          0 eth2
10.20.202.0     0.0.0.0         255.255.255.0   U         0 0          0 eth2
10.20.205.0     0.0.0.0         255.255.255.0   U         0 0          0 eth3
10.20.104.0     0.0.0.0         255.255.255.0   U         0 0          0 eth5
10.20.204.0     10.20.205.253   255.255.255.0   UG        0 0          0 eth3
10.20.105.0     0.0.0.0         255.255.255.0   U         0 0          0 eth1
10.20.223.0     10.20.205.253   255.255.255.0   UG        0 0          0 eth3
10.20.222.0     10.20.202.253   255.255.255.0   UG        0 0          0 eth2
10.20.101.0     10.20.102.253   255.255.255.0   UG        0 0          0 eth0
10.20.102.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
10.20.103.0     10.20.102.253   255.255.255.0   UG        0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth5
0.0.0.0         10.20.15.253    0.0.0.0         UG        0 0          0 eth4
[HostB]#
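
The route selection described above can be reproduced with a small
longest-prefix-match sketch. It is a simplified model, not the kernel's
lookup code: ROUTES holds only a subset of the table shown above, and the
function name lookup is made up for this illustration.

```python
import ipaddress

# Subset of HostB's routing table from the netstat output:
# (destination network, gateway or None for directly connected, interface).
ROUTES = [
    ("10.20.101.0/24", "10.20.102.253", "eth0"),
    ("10.20.102.0/24", None, "eth0"),
    ("10.20.105.0/24", None, "eth1"),
    ("0.0.0.0/0", "10.20.15.253", "eth4"),
]


def lookup(dst):
    """Return (gateway, interface) of the most specific route matching dst."""
    addr = ipaddress.ip_address(dst)
    candidates = [(ipaddress.ip_network(net), gw, dev) for net, gw, dev in ROUTES]
    matches = [c for c in candidates if addr in c[0]]
    # The default route always matches, so matches is never empty here.
    best = max(matches, key=lambda c: c[0].prefixlen)
    return best[1], best[2]


# The echo reply to HostA (10.20.101.1) leaves via eth0 and needs the MAC
# address of 10.20.102.253, even though the request arrived on eth1.
print(lookup("10.20.101.1"))
```

This is why the reply to a ping received on eth1 triggers an ARP request on
eth0.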
 
-- During the failure HostB does not have a MAC address for the 102.253
-- gateway
 
[HostB]# arp -a | grep 102
t2cat1-h102 (10.20.102.253) at <incomplete> on eth0
[HostB]#
 
-- If on HostB I then ping the 102.253 gateway the ARP request completes
-- and the ping on HostA starts working
 
[HostB]# ping -c 5 10.20.102.253
PING 10.20.102.253 (10.20.102.253) 56(84) bytes of data.
From 10.20.102.245 icmp_seq=0 Destination Host Unreachable
From 10.20.102.245 icmp_seq=3 Destination Host Unreachable
64 bytes from 10.20.102.253: icmp_seq=4 ttl=255 time=2.00 ms
 
--- 10.20.102.253 ping statistics ---
5 packets transmitted, 1 received, +2 errors, 80% packet loss, time 4001ms
rtt min/avg/max/mdev = 2.005/2.005/2.005/0.000 ms, pipe 2
[HostB]#
 
-- The ping on HostB to 102.253 triggered an ARP request which worked 
-- as expected.
 
[HostB]# arp -a | grep 102
t2cat1-h102 (10.20.102.253) at 00:00:0C:07:AC:66 [ether] on eth0
[HostB]#
 
-- When the ARP entry times out, the ping on HostA again fails.
 
[HostB]# arp -a | grep 102
t2cat1-h102 (10.20.102.253) at <incomplete> on eth0
[HostB]#
 
-- So the question is why doesn't the ping from HostA cause the ARP entry on 
-- HostB to be refreshed?
 
-- A network trace on the 102 subnet shows that HostB sent out an ARP 
-- request with the correct "Target IP address" and "Sender MAC address"
-- but with the WRONG "Sender IP address". The ARP packet contained the
-- "Sender IP address" of the 105.245 receiving interface rather than the 
-- 102.245 IP address of the returning interface. The CISCO router ignores
-- the ARP request because the sender IP address is on a different subnet.
 
Ethereal Packet Capture
No. Time     Source            Destination Protocol Info
140 0.129057 HewlettP_53:7a:3a Broadcast   ARP      Who has 10.20.102.253?  
                                                    Tell 10.20.105.245
Address Resolution Protocol (request)
    Hardware type: Ethernet (0x0001)
    Protocol type: IP (0x0800)
    Hardware size: 6
    Protocol size: 4
    Opcode: request (0x0001)
    Sender MAC address: HewlettP_53:7a:3a (00:11:0a:53:7a:3a)
    Sender IP address: 10.20.105.245 (10.20.105.245)
                       ^^^^^^^^^^^^^^ WRONG - needs to be 10.20.102.245
    Target MAC address: 00:00:00_00:00:00 (00:00:00:00:00:00)
    Target IP address: 10.20.102.253 (10.20.102.253)
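
For anyone who wants to check a raw capture by hand, the fields decoded above
can be packed and unpacked with a short Python sketch. It assumes the standard
28-byte ARP layout for Ethernet/IPv4 (RFC 826); the function names are made up
for this illustration, and the request is rebuilt from the values in the trace
rather than read from the original capture file.

```python
import socket
import struct

# ARP body for Ethernet/IPv4: htype, ptype, hlen, plen, opcode,
# sender MAC, sender IP, target MAC, target IP (28 bytes, network byte order).
ARP_IPV4_ETHERNET = "!HHBBH6s4s6s4s"


def build_arp_request(sender_mac, sender_ip, target_ip):
    """Pack an ARP request body from a colon-separated MAC and dotted-quad IPs."""
    mac = bytes.fromhex(sender_mac.replace(":", ""))
    return struct.pack(
        ARP_IPV4_ETHERNET,
        0x0001, 0x0800, 6, 4, 0x0001,
        mac, socket.inet_aton(sender_ip),
        bytes(6), socket.inet_aton(target_ip),
    )


def parse_arp(payload):
    """Unpack an ARP body and return the fields relevant to this report."""
    (_htype, _ptype, _hlen, _plen, opcode,
     sha, spa, _tha, tpa) = struct.unpack(ARP_IPV4_ETHERNET, payload)
    return {
        "opcode": opcode,
        "sender_mac": ":".join(f"{b:02x}" for b in sha),
        "sender_ip": socket.inet_ntoa(spa),
        "target_ip": socket.inet_ntoa(tpa),
    }


# Rebuild the request seen in the trace and decode it again.
request = build_arp_request("00:11:0a:53:7a:3a", "10.20.105.245", "10.20.102.253")
fields = parse_arp(request)
print(fields["sender_mac"], fields["sender_ip"], "asks for", fields["target_ip"])
```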
 
-- If I define a 101.0 network route to go via the 102.245 interface and
-- also specify a src address of 102.245, I still see the ARP request
-- sent with a Sender IP address of the receiving interface 105.245 rather
-- than 102.245, which is both the IP address of the interface which sent
-- out the ARP request and the source address defined on the route.
 
[HostB]# ip route add 10.20.101.0/24 via 10.20.102.253 \
                           dev eth0 src 10.20.102.245
[HostB]# ip route show
10.21.121.100 via 10.20.205.253 dev eth3
10.20.15.0/24 dev eth4  proto kernel  scope link  src 10.20.15.5
10.20.201.0/24 via 10.20.202.253 dev eth2
10.20.203.0/24 via 10.20.202.253 dev eth2
10.20.202.0/24 dev eth2  proto kernel  scope link  src 10.20.202.245
10.20.205.0/24 dev eth3  proto kernel  scope link  src 10.20.205.245
10.20.104.0/24 dev eth5  proto kernel  scope link  src 10.20.104.201
10.20.105.0/24 dev eth1  proto kernel  scope link  src 10.20.105.245
10.20.204.0/24 via 10.20.205.253 dev eth3
10.20.223.0/24 via 10.20.205.253 dev eth3
10.20.222.0/24 via 10.20.202.253 dev eth2
10.20.101.0/24 via 10.20.102.253 dev eth0  src 10.20.102.245
                                           ^^^^^^^^^^^^^^^^^
10.20.102.0/24 dev eth0  proto kernel  scope link  src 10.20.102.245
10.20.103.0/24 via 10.20.102.253 dev eth0
169.254.0.0/16 dev eth5  scope link
default via 10.20.15.253 dev eth4
[HostB]#
 
-- This is on a RedHat WS4-U1 system
 
[HostB]# lsb_release -a
LSB Version:    1.3
Distributor ID: RedHatEnterpriseWS
Description:    Red Hat Enterprise Linux WS release 4 (Nahant Update 1)
Release:        4
Codename:       NahantUpdate1
[HostB]#

I looked at the tuning parameters arp_filter and rp_filter in
/proc/sys/net/ipv4/conf/*/ but changing them made no difference. They
appear to relate to how incoming ARP requests are handled rather than
changing the behaviour of outgoing ARP requests.

Is this a bug in ARP or is there a configuration parameter to change
the current behaviour?
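
One parameter that may be relevant here, assuming the running kernel provides
it, is arp_announce in the same /proc/sys/net/ipv4/conf/*/ directory. The
kernel's ip-sysctl documentation describes it as controlling which local
source address is announced in outgoing ARP requests. The Python below is only
a simplified model of the three documented levels, written for illustration;
it is not the kernel code, it has not been tested against this system, and the
function name arp_sender_ip is made up.

```python
import ipaddress


def arp_sender_ip(packet_src, target, out_iface_addrs, arp_announce=0):
    """Pick the ARP sender IP under a simplified model of arp_announce.

    packet_src      -- source IP of the packet that triggered the ARP request
                       (assumed to be a local address of this host)
    target          -- IP address being resolved
    out_iface_addrs -- "address/prefix" strings configured on the outgoing interface
    """
    target_ip = ipaddress.ip_address(target)
    on_link = [
        iface for iface in map(ipaddress.ip_interface, out_iface_addrs)
        if target_ip in iface.network
    ]
    if arp_announce == 0:
        # Level 0: any local address may be used, so the packet's source is kept.
        return packet_src
    if arp_announce == 1:
        # Level 1: keep the packet's source only if it shares a subnet with the target.
        src = ipaddress.ip_address(packet_src)
        if any(src in iface.network for iface in on_link):
            return packet_src
    # Level 2, and the level 1 fallback: prefer an address on the target's subnet.
    if on_link:
        return str(on_link[0].ip)
    return out_iface_addrs[0].split("/")[0]


for level in (0, 1, 2):
    print(level, arp_sender_ip("10.20.105.245", "10.20.102.253",
                               ["10.20.102.245/24"], arp_announce=level))
```

Under this model only level 0 reproduces the sender address seen in the trace,
which suggests that setting arp_announce to 1 or 2 for eth0 might be worth
trying.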

Thanks,
Geoff Kingsmill

Comment 1 Geoff Kingsmill 2005-12-12 22:10:47 UTC

I should also mention that I can work around the problem by adding a permanent
ARP entry or by using arptables (although this machine is WS, which does not
include arptables).

Comment 2 Radek Vokál 2006-02-03 14:28:53 UTC

I've tried to reproduce this issue but with no luck. This is more a question for
our support team. I guess you're a subscriber, so if you can, please contact someone
from http://www.redhat.com/apps/support/; they'll be glad to help you.
