Bug 21887 - Can't connect to other network
Summary: Can't connect to other network
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.0
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Michael K. Johnson
QA Contact: David Lawrence
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2000-12-07 11:57 UTC by Bjorn Karlsen
Modified: 2007-04-18 16:30 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2000-12-18 15:19:24 UTC
Embargoed:



Description Bjorn Karlsen 2000-12-07 11:57:09 UTC
We have 2 IP-segments. x.y.211.0 and x.y.212.0. My Red Hat 7.0 machine
has IP x.y.212.93.

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
x.y.212.0       0.0.0.0         255.255.255.0   U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         x.y.212.1       0.0.0.0         UG    0      0        0 eth0   
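
To make the routing table above concrete, here is a minimal sketch of the longest-prefix lookup the kernel performs for the two ping targets. The "10.0" prefix is a hypothetical stand-in for the masked "x.y" in this report, and lookup() is an illustrative helper, not a kernel or library function.

```python
import ipaddress

# Hypothetical stand-in for the masked "x.y" prefix in the report.
PREFIX = "10.0"

# (destination network, gateway, interface), mirroring the three `route -n` rows.
ROUTES = [
    (ipaddress.ip_network(f"{PREFIX}.212.0/24"), None, "eth0"),
    (ipaddress.ip_network("127.0.0.0/8"), None, "lo"),
    (ipaddress.ip_network("0.0.0.0/0"), f"{PREFIX}.212.1", "eth0"),
]


def lookup(dest):
    """Return (gateway, interface) of the most specific matching route."""
    addr = ipaddress.ip_address(dest)
    matches = [route for route in ROUTES if addr in route[0]]
    # The default route always matches, so `matches` is never empty.
    _, gateway, iface = max(matches, key=lambda route: route[0].prefixlen)
    return gateway, iface


print(lookup(f"{PREFIX}.212.12"))  # same segment: delivered directly on eth0
print(lookup(f"{PREFIX}.211.12"))  # other segment: sent via the default gateway
```

Under these assumptions the table itself looks sound: traffic for the 211 segment is handed to x.y.212.1, which matches the path shown by ping -R.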

There's a router between the nets (an RPC-module in our Cisco switch).

ALL OTHER machines (Solaris 2.x, Red Hat 6.2) can access everything.

But my Red Hat 7.0 can only access machines on the x.y.212.0 segment:

# ping x.y.212.12
PING x.y.212.12 (x.y.212.12) from x.y.212.93 : 56(84) bytes of data.
64 bytes from edh3 (x.y.212.12): icmp_seq=0 ttl=255 time=275 usec
64 bytes from edh3 (x.y.212.12): icmp_seq=1 ttl=255 time=261 usec
64 bytes from edh3 (x.y.212.12): icmp_seq=2 ttl=255 time=224 usec
 
--- x.y.212.12 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max/mdev = 0.224/0.253/0.275/0.025 ms
# ping x.y.211.12
PING x.y.211.12 (x.y.211.12) from x.y.212.93 : 56(84) bytes of data.
64 bytes from dritern (x.y.211.12): icmp_seq=0 ttl=254 time=771 usec
 
--- x.y.211.12 ping statistics ---
5 packets transmitted, 1 packets received, 80% packet loss
round-trip min/avg/max/mdev = 0.771/0.771/0.771/0.000 ms  

As you can see, only the first ping gets through. All ping packets are
answered by x.y.211.12 (confirmed using Solaris' snoop).
However, if I use ping -R, it works:

# ping -R x.y.211.12
PING x.y.211.12 (x.y.211.12) from x.y.212.93 : 56(124) bytes of data.
64 bytes from dritern (x.y.211.12): icmp_seq=0 ttl=254 time=1.024 msec
NOP
RR:     raymon (x.y.212.93)
        x.y.211.1
        dritern (x.y.211.12)
        triahal (x.y.212.1)
        raymon (x.y.212.93)
 
64 bytes from dritern (x.y.211.12): icmp_seq=1 ttl=254 time=789 usec
NOP     (same route)
64 bytes from dritern (x.y.211.12): icmp_seq=2 ttl=254 time=816 usec
NOP     (same route)
64 bytes from dritern (x.y.211.12): icmp_seq=3 ttl=254 time=849 usec
NOP     (same route)
 
--- x.y.211.12 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max/mdev = 0.789/0.869/1.024/0.096 ms   

I've tried to update to latest NIC driver (eepro100), but it didn't help.

Comment 1 Pekka Savola 2000-12-10 21:50:44 UTC
Looks more like a kernel issue.

Does 'tcpdump -n' (run on the RHL7 box) show anything for pings 2+?


Comment 2 Bjorn Karlsen 2000-12-11 08:28:03 UTC
'tcpdump -n' on the RHL7 box:
09:14:12.627112 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:12.627386 eth0 < x.y.211.12 > x.y.212.93: icmp: echo reply (DF)
09:14:13.625249 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:14.625292 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:15.625345 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:16.625422 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:17.625492 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:18.625570 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
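
A small sketch for tallying the capture above, so the mismatch between requests sent and replies seen on the RHL7 box is explicit. The regular expression assumes the old tcpdump line format shown in this comment, and count_echo() is a hypothetical helper.

```python
import re

# Lines copied from the tcpdump output above (RHL7 box).
CAPTURE = """\
09:14:12.627112 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:12.627386 eth0 < x.y.211.12 > x.y.212.93: icmp: echo reply (DF)
09:14:13.625249 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:14.625292 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:15.625345 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:16.625422 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:17.625492 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:18.625570 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
"""

# timestamp, interface, direction marker, "src > dst:", then the ICMP type.
LINE = re.compile(r"^\S+ \S+ [<>] \S+ > \S+: icmp: echo (request|reply)")


def count_echo(capture):
    """Count ICMP echo requests and replies in a tcpdump text capture."""
    counts = {"request": 0, "reply": 0}
    for line in capture.splitlines():
        match = LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


print(count_echo(CAPTURE))
```

Seven requests leave the interface, but only the first reply is ever seen by tcpdump on this host.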

'snoop -r' on the Solaris 8 box I ping (snoop is Solaris' version of tcpdump):
x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 0)
x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 0)
x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 256)
x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 256)
x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 512)
x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 512)
x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 768)
x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 768)
x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 1024)
x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 1024)
x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 1280)
x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 1280)
x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 1536)
x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 1536)

As you can see, the Solaris box is receiving and answering the ping requests.
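
A side note on the snoop output: the sequence numbers rise in steps of 256 (0, 256, 512, ...) while ping reports icmp_seq 0, 1, 2. One likely explanation, which is an assumption and not confirmed anywhere in this report, is that the i386 sender writes the 16-bit sequence field in little-endian host byte order and snoop decodes it in big-endian network order. The sketch below reproduces the numbers under that assumption; as_seen_in_network_order() is a hypothetical helper.

```python
import struct


def as_seen_in_network_order(seq):
    """Pack seq as a little-endian 16-bit value, then read the same bytes as big-endian."""
    (swapped,) = struct.unpack(">H", struct.pack("<H", seq))
    return swapped


print([as_seen_in_network_order(n) for n in range(7)])
# [0, 256, 512, 768, 1024, 1280, 1536]
```

If that reading is right, the odd-looking numbers are cosmetic. The RHL6.2 capture shows the same pattern and works fine.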

If I do the same from a RHL6.2 box, everything works:

'tcpdump -n':
09:24:56.733451 eth0 > x.y.212.91 > x.y.211.12: icmp: echo request
09:24:56.733771 eth0 < x.y.211.12 > x.y.212.91: icmp: echo reply (DF)
09:24:57.730856 eth0 > x.y.212.91 > x.y.211.12: icmp: echo request
09:24:57.731043 eth0 < x.y.211.12 > x.y.212.91: icmp: echo reply (DF)
09:24:58.730871 eth0 > x.y.212.91 > x.y.211.12: icmp: echo request
09:24:58.731061 eth0 < x.y.211.12 > x.y.212.91: icmp: echo reply (DF)
09:24:59.730892 eth0 > x.y.212.91 > x.y.211.12: icmp: echo request
09:24:59.731082 eth0 < x.y.211.12 > x.y.212.91: icmp: echo reply (DF)

'snoop -r':
x.y.212.91 -> x.y.211.12 ICMP Echo request (ID: 38761 Sequence number: 0)
x.y.211.12 -> x.y.212.91 ICMP Echo reply (ID: 38761 Sequence number: 0)
x.y.212.91 -> x.y.211.12 ICMP Echo request (ID: 38761 Sequence number: 256)
x.y.211.12 -> x.y.212.91 ICMP Echo reply (ID: 38761 Sequence number: 256)
x.y.212.91 -> x.y.211.12 ICMP Echo request (ID: 38761 Sequence number: 512)
x.y.211.12 -> x.y.212.91 ICMP Echo reply (ID: 38761 Sequence number: 512)
x.y.212.91 -> x.y.211.12 ICMP Echo request (ID: 38761 Sequence number: 768)
x.y.211.12 -> x.y.212.91 ICMP Echo reply (ID: 38761 Sequence number: 768)


Comment 3 Pekka Savola 2000-12-11 08:47:27 UTC
Does it work if you increase the packet size with -s, to e.g. 60 or 100? If so,
what's the critical threshold? (This should simulate -R quite closely in the
aspects that count.)

Have you tried pinging other than solaris boxes, e.g. RHL?
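
For reference, the sizes ping prints, 56(84) normally and 56(124) with -R, can be reproduced with a little arithmetic. This is a sketch assuming standard IPv4 and ICMP header sizes and a full nine-slot record-route option padded with one NOP; ip_total_length() is a hypothetical helper.

```python
ICMP_HEADER = 8          # ICMP echo header, bytes
IP_HEADER = 20           # IPv4 header without options, bytes
RR_OPTION = 3 + 9 * 4    # type, length, pointer + nine 4-byte address slots = 39
NOP_PAD = 1              # one NOP to pad the options to a 4-byte boundary


def ip_total_length(payload, record_route=False):
    """Total IP datagram length for an ICMP echo with `payload` data bytes."""
    options = RR_OPTION + NOP_PAD if record_route else 0
    return IP_HEADER + options + ICMP_HEADER + payload


print(ip_total_length(56))                     # plain ping
print(ip_total_length(56, record_route=True))  # ping -R
print(ip_total_length(100))                    # ping -s 100
```

Note that -s only grows the payload. The IP header stays at 20 bytes, so a larger -s matches -R in overall length but not in the presence of IP options.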

Comment 4 Bjorn Karlsen 2000-12-11 09:03:19 UTC
I tried with packet size 60 and 100 (I even tried 200 and 300), but it did not
help.

I don't have any RHL boxes on x.y.211.0, only Solaris.

Comment 5 Pekka Savola 2000-12-11 09:15:22 UTC
This is with the 2.2.16-22 kernel?  Has the iputils package on RHL62 been
updated to the errata version?
If it hasn't, does it still behave the same way after you update it?

You could try dumping the whole packet and seeing where it differs from RHL62,
but I don't know if anything really useful could be found that way.

FWIW, I can ping solaris boxes just fine off my RHL7.


Comment 6 Bjorn Karlsen 2000-12-11 10:54:01 UTC
It is the 2.2.16-22 kernel.

iputils on the RHL6.2 box is not updated to errata status. Since it's our fax
server, I can't use it for testing purposes.

There is another RHL7 box on our network which is UPGRADED from RHL6.2. From
that machine I can ping all networks. It has another network card.

Pinging Solaris machines is not the problem. The problem is ping/telnet/rsh
to machines on other subnets.


Comment 7 Pekka Savola 2000-12-11 11:06:04 UTC
Running out of ideas :-/.  Replacing hardware (esp. the network card, preferably
with one that uses a different driver), checking configuration (speed, duplex;
shouldn't be a problem since connections work locally) or trying a different
switch port might help.


Comment 8 Bjorn Karlsen 2000-12-18 14:13:36 UTC
I have replaced the NIC with an old 3Com 3c900 10Mbps (PCI). I can now connect
to (and ping) all our networks. It seems to be a problem with RHL7 and the NIC
using the eepro100 driver. Can you recommend any 100Mbps PCI NICs for use
with RHL7 (not using the eepro100 driver)?

Comment 9 Pekka Savola 2000-12-18 14:23:39 UTC
3Coms and tulips have done the job well for me, as well as Intel EtherExpress Pros.
I haven't noticed any problems with eepro100 + RHL7, myself.

Comment 10 Bjorn Karlsen 2000-12-18 15:02:51 UTC
It has to be something other than the NIC. The 3Com NIC worked for a while, but
now I have the same problem with the 3Com as I had with the eepro100.
Unable to ping (or telnet/rsh) machines on the x.y.211.0 network. ping -R is
working as before...
Even if I reboot the machine, I'm not able to ping other networks...

Comment 11 Pekka Savola 2000-12-18 15:19:21 UTC
FWIW, I've had very weird problems with NICs once.  The BIOS had some power
management options on and the card put itself in some weird mode after a day.
This happened twice.  Shutting the system down and pulling the power plug for a
few minutes (cold restart) helped a bit.  I don't know what it was, but turning
off all power management and changing the NIC helped. It was a 3c905, I think.
I doubt this is what's bugging you, but you never know...

HTH.


Comment 12 Bjorn Karlsen 2000-12-20 06:56:55 UTC
I switched to another port on our switch and now everything is working.
The problem with the troublesome port will be handed over to our Cisco dealer.

