We have 2 IP segments: x.y.211.0 and x.y.212.0. My Red Hat 7.0 machine has IP x.y.212.93.

    # route -n
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    x.y.212.0       0.0.0.0         255.255.255.0   U     0      0        0 eth0
    127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
    0.0.0.0         x.y.212.1       0.0.0.0         UG    0      0        0 eth0

There's a router between the nets (an RPC-module in our Cisco switch). ALL OTHER machines (Solaris 2.x, Red Hat 6.2) can access everything. But my Red Hat 7.0 machine can only access machines on the x.y.212.0 segment:

    # ping x.y.212.12
    PING x.y.212.12 (x.y.212.12) from x.y.212.93 : 56(84) bytes of data.
    64 bytes from edh3 (x.y.212.12): icmp_seq=0 ttl=255 time=275 usec
    64 bytes from edh3 (x.y.212.12): icmp_seq=1 ttl=255 time=261 usec
    64 bytes from edh3 (x.y.212.12): icmp_seq=2 ttl=255 time=224 usec

    --- x.y.212.12 ping statistics ---
    3 packets transmitted, 3 packets received, 0% packet loss
    round-trip min/avg/max/mdev = 0.224/0.253/0.275/0.025 ms

    # ping x.y.211.12
    PING x.y.211.12 (x.y.211.12) from x.y.212.93 : 56(84) bytes of data.
    64 bytes from dritern (x.y.211.12): icmp_seq=0 ttl=254 time=771 usec

    --- x.y.211.12 ping statistics ---
    5 packets transmitted, 1 packets received, 80% packet loss
    round-trip min/avg/max/mdev = 0.771/0.771/0.771/0.000 ms

As you can see, only the first ping reply gets through. x.y.211.12 does answer every ping packet (confirmed using Solaris' snoop). However, if I use ping -R, it works:

    # ping -R x.y.211.12
    PING x.y.211.12 (x.y.211.12) from x.y.212.93 : 56(124) bytes of data.
    64 bytes from dritern (x.y.211.12): icmp_seq=0 ttl=254 time=1.024 msec
    NOP
    RR:     raymon (x.y.212.93)
            x.y.211.1
            dritern (x.y.211.12)
            triahal (x.y.212.1)
            raymon (x.y.212.93)

    64 bytes from dritern (x.y.211.12): icmp_seq=1 ttl=254 time=789 usec
    NOP     (same route)
    64 bytes from dritern (x.y.211.12): icmp_seq=2 ttl=254 time=816 usec
    NOP     (same route)
    64 bytes from dritern (x.y.211.12): icmp_seq=3 ttl=254 time=849 usec
    NOP     (same route)

    --- x.y.211.12 ping statistics ---
    4 packets transmitted, 4 packets received, 0% packet loss
    round-trip min/avg/max/mdev = 0.789/0.869/1.024/0.096 ms

I've tried updating to the latest NIC driver (eepro100), but it didn't help.
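For readers following the routing table in the first post, here is a minimal Python sketch of the lookup it implies: a destination on x.y.212.0 is reached directly on eth0, while anything else (including x.y.211.12) is sent to the default gateway x.y.212.1. The thread masks the first two octets as x.y, so the 10.0 prefix below is only a placeholder, and the function name lookup is made up for this illustration. The real kernel lookup is more involved; this only shows the longest-prefix-match idea.

```python
import ipaddress

# Placeholder addresses: the thread hides the first two octets as "x.y",
# so "10.0" is substituted here purely for illustration.
ROUTES = [
    # (destination network, gateway or None if directly connected, interface)
    (ipaddress.ip_network("10.0.212.0/24"), None, "eth0"),
    (ipaddress.ip_network("127.0.0.0/8"), None, "lo"),
    (ipaddress.ip_network("0.0.0.0/0"), ipaddress.ip_address("10.0.212.1"), "eth0"),
]


def lookup(dest):
    """Return (gateway, interface) for dest using longest-prefix match."""
    addr = ipaddress.ip_address(dest)
    matches = [route for route in ROUTES if addr in route[0]]
    # The most specific matching route (longest prefix) wins.
    _net, gateway, iface = max(matches, key=lambda route: route[0].prefixlen)
    return gateway, iface


print(lookup("10.0.212.12"))  # same segment: no gateway involved
print(lookup("10.0.211.12"))  # other segment: goes through the default gateway
```

This is why only traffic to the x.y.211.0 segment depends on the router module and on whatever sits between the machine and it.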
This looks more like a kernel issue. Does 'tcpdump -n' (run on the RHL7 box) show anything for the second and later pings?
'tcpdump -n' on the RHL7 box:

    09:14:12.627112 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
    09:14:12.627386 eth0 < x.y.211.12 > x.y.212.93: icmp: echo reply (DF)
    09:14:13.625249 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
    09:14:14.625292 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
    09:14:15.625345 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
    09:14:16.625422 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
    09:14:17.625492 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
    09:14:18.625570 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request

'snoop -r' on the Solaris 8 box I ping (snoop is Solaris' version of tcpdump):

    x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 0)
    x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 0)
    x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 256)
    x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 256)
    x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 512)
    x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 512)
    x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 768)
    x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 768)
    x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 1024)
    x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 1024)
    x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 1280)
    x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 1280)
    x.y.212.93 -> x.y.211.12 ICMP Echo request (ID: 62009 Sequence number: 1536)
    x.y.211.12 -> x.y.212.93 ICMP Echo reply (ID: 62009 Sequence number: 1536)

As you can see, the Solaris box is receiving and answering the ping requests.
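The comparison made above (every request leaves the RHL7 box, but only the first reply arrives back) can also be done mechanically. The following is a rough Python sketch that counts echo requests and replies in tcpdump text of the format shown in this comment. The sample is the first four lines of the RHL7 capture; the regular expression and the name count_echo are assumptions written for this old tcpdump line format, not part of any tool.

```python
import re

# First four lines of the RHL7 tcpdump capture quoted in this thread.
CAPTURE = """\
09:14:12.627112 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:12.627386 eth0 < x.y.211.12 > x.y.212.93: icmp: echo reply (DF)
09:14:13.625249 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
09:14:14.625292 eth0 > x.y.212.93 > x.y.211.12: icmp: echo request
"""

# timestamp, interface, direction marker, source, destination, ICMP type
LINE = re.compile(r"^\S+ \S+ [<>] (\S+) > (\S+): icmp: echo (request|reply)")


def count_echo(text):
    """Count ICMP echo requests and replies in tcpdump text output."""
    counts = {"request": 0, "reply": 0}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            counts[match.group(3)] += 1
    return counts


counts = count_echo(CAPTURE)
print(counts)
print("unanswered:", counts["request"] - counts["reply"])
```

Running the same count on a capture taken at the far end shows whether the replies are lost before or after the target answers; here the snoop output shows they are sent, so they are lost on the way back.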
If I do the same from a RHL6.2 box, everything works:

'tcpdump -n':

    09:24:56.733451 eth0 > x.y.212.91 > x.y.211.12: icmp: echo request
    09:24:56.733771 eth0 < x.y.211.12 > x.y.212.91: icmp: echo reply (DF)
    09:24:57.730856 eth0 > x.y.212.91 > x.y.211.12: icmp: echo request
    09:24:57.731043 eth0 < x.y.211.12 > x.y.212.91: icmp: echo reply (DF)
    09:24:58.730871 eth0 > x.y.212.91 > x.y.211.12: icmp: echo request
    09:24:58.731061 eth0 < x.y.211.12 > x.y.212.91: icmp: echo reply (DF)
    09:24:59.730892 eth0 > x.y.212.91 > x.y.211.12: icmp: echo request
    09:24:59.731082 eth0 < x.y.211.12 > x.y.212.91: icmp: echo reply (DF)

'snoop -r':

    x.y.212.91 -> x.y.211.12 ICMP Echo request (ID: 38761 Sequence number: 0)
    x.y.211.12 -> x.y.212.91 ICMP Echo reply (ID: 38761 Sequence number: 0)
    x.y.212.91 -> x.y.211.12 ICMP Echo request (ID: 38761 Sequence number: 256)
    x.y.211.12 -> x.y.212.91 ICMP Echo reply (ID: 38761 Sequence number: 256)
    x.y.212.91 -> x.y.211.12 ICMP Echo request (ID: 38761 Sequence number: 512)
    x.y.211.12 -> x.y.212.91 ICMP Echo reply (ID: 38761 Sequence number: 512)
    x.y.212.91 -> x.y.211.12 ICMP Echo request (ID: 38761 Sequence number: 768)
    x.y.211.12 -> x.y.212.91 ICMP Echo reply (ID: 38761 Sequence number: 768)
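A side note on reading the snoop output: it reports sequence numbers 0, 256, 512, 768, while ping itself prints icmp_seq=0, 1, 2. That pattern is consistent with a 16-bit counter being read in the opposite byte order. Which side is responsible is not established in this thread, and since the same pattern appears for the working RHL6.2 box, it is probably unrelated to the lost replies. A small Python sketch of the arithmetic (swap16 is a name made up here):

```python
import struct


def swap16(value):
    """Exchange the two bytes of a 16-bit unsigned value."""
    # Pack as big-endian, then read the same two bytes back as little-endian.
    return struct.unpack("<H", struct.pack(">H", value))[0]


snoop_seq = [0, 256, 512, 768]
print([swap16(seq) for seq in snoop_seq])
```

With the bytes swapped, the snoop values line up with the icmp_seq numbers that ping prints.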
Does it work if you increase the packet size with -s, e.g. to 60 or 100? If so, what's the critical threshold? (This should simulate -R quite closely in the aspects that count.) Have you tried pinging anything other than Solaris boxes, e.g. RHL?
I tried with packet size 60 and 100 (I even tried 200 and 300), but it did not help. I don't have any RHL boxes on x.y.211.0, only Solaris.
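For reference, the sizes behind the -s test can be worked out from the ping output earlier in the thread: "56(84)" is 56 bytes of payload plus an 8-byte ICMP header and a 20-byte IP header, and "56(124)" with -R implies a further 40 bytes of IP options. The Python sketch below just does that arithmetic; total_size is a made-up helper name, and the header sizes are the standard minimum IPv4 and ICMP echo header lengths.

```python
IP_HEADER = 20    # minimum IPv4 header, no options
ICMP_HEADER = 8   # ICMP echo header


def total_size(payload, ip_options=0):
    """Size of the IP packet for a ping with the given payload (-s value)."""
    return IP_HEADER + ip_options + ICMP_HEADER + payload


print(total_size(56))                  # plain ping, the "56(84)" case
print(total_size(56, ip_options=40))   # ping -R, the "56(124)" case
for size in (60, 100, 200, 300):       # the -s values tried above
    print(size, total_size(size))
```

The tried sizes bracket the 124-byte packet that ping -R sends, so packet length alone does not seem to explain why -R works. What -s cannot reproduce is the presence of IP options in the header.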
Is this with the 2.2.16-22 kernel? Has the iputils package on the RHL6.2 box been updated to the errata version? If it hasn't, does that box still behave the same way once you update it? You could try dumping the whole packet and seeing where it differs from RHL6.2, but I don't know if anything really useful would be found. FWIW, I can ping Solaris boxes just fine from my RHL7 machine.
It is the 2.2.16-22 kernel. iputils on the RHL6.2 box is not updated to errata status. Since it's our fax server, I can't use it for testing purposes. There is another RHL7 box on our network which was UPGRADED from RHL6.2. From that machine I can ping all networks. It has a different network card. Pinging Solaris machines is not the problem. The problem is ping/telnet/rsh to machines on other subnets.
Running out of ideas :-/. Replacing hardware (especially the network card, preferably with one that uses a different driver), checking the configuration (speed, duplex; this shouldn't be a problem since connections work locally) or trying a different switch port might help.
I have replaced the NIC with an old 3Com 3c900 10Mbps (PCI). I can now connect to (and ping) all our networks. It seems to be a problem with RHL7 and the NIC using the eepro100 driver. Can you recommend any 100Mbps PCI NICs for use with RHL7 (not using the eepro100 driver)?
3Coms and tulips have done the job well for me, as have Intel EtherExpress Pros. I haven't noticed any problems with eepro100 + RHL7 myself.
It has to be something other than the NIC. The 3Com NIC worked for a while, but now I have the same problem with the 3Com as I had with the eepro100: I am unable to ping (or telnet/rsh to) machines on the x.y.211.0 network. ping -R is working as before... Even if I reboot the machine, I'm not able to ping other networks...
FWIW, I once had very weird problems with NICs. The BIOS had some power management options on and the card put itself in some weird mode after a day. This happened twice. Shutting the system down and pulling the power plug for a few minutes (cold restart) helped a bit. I don't know what it was, but turning off all power management and changing the NIC helped. It was a 3c905, I think. I doubt this is what's bugging you, but you never know... HTH.
I switched to another port on our switch and now everything is working. The problem with the troublesome port will be handed over to our Cisco dealer.