Bug 8538

Summary:	All linux machine failed to arp caused by one machine
Product:	[Retired] Red Hat Linux
Reporter:	Peter Liu <pliu>
Component:	net-tools
Assignee:	Crutcher Dunnavant <crutcher>
Status:	CLOSED WORKSFORME
Severity:	high
Priority:	medium
Version:	6.1
Hardware:	i386
OS:	Linux
Doc Type:	Bug Fix
Last Closed:	2000-05-15 11:33:37 UTC

Description Peter Liu 2000-01-17 15:55:20 UTC

We have a machine running Red Hat Linux 6.1. Its primary purpose is
DNS, and it has been running great for the last 3 months.  It's running on a
P2-200 with 64 MB of RAM and a 3Com 3c509 10bT card.

Last night something happened and I was unable to access the network at all.
The first thing we noticed was that DNS was down.  I was able to ping
the DNS machine but couldn't telnet or HTTP into it.  I logged in to the
console as root and tried to run "arp", and the command just hung.  I
pressed control-C and tried "ifdown eth0" then "ifup eth0", but that didn't
fix the problem.  If that were all, it would not be a big deal, but I
was also unable to run "arp" on the Linux on my workstation.  The "arp" on NT
systems ran just fine.  I rebooted the DNS machine and everything was fine
after that.  After the reboot, my workstation was able to "arp"
again.

What I don't understand is why, if the DNS machine fails to "arp", it would
cause other Linux machines to fail to "arp".  The fact that we can't "arp"
could mean something more serious, such as a failure in the network
protocol that was sending out bad packets.  I'm not sure.

Could it be a bad NIC?  We also have a Watch Guard firewall box, but
that has been running for some time now.  Could there be a bug in the
kernel that has already been fixed?  I would really like to know if other
people are having this problem. I searched all over the web and couldn't
find it.

Thank you.

Comment 1 Peter Liu 2000-01-17 16:11:59 UTC

By the way, I looked at /var/log/messages and /var/log/security and was not able
to find anything useful.  The only entry that might be worth mentioning is:

Jan 16 23:31:51 dns telnetd[3655]: ttloop:  peer died: Invalid or incomplete multibyte or wide character
Jan 17 09:09:46 dns telnetd[3886]: ttloop:  peer died: Invalid or incomplete multibyte or wide character

Both of these lines are next to each other in /var/log/messages.

Comment 2 Jeff Johnson 2000-01-17 19:29:59 UTC

The ARP protocol (as opposed to the arp command) must be functioning or
you would not be able to ping your DNS box. The arp command relies on
the ability to dynamically load modules AFAIR, so you might want to look
at what modules were loaded using lsmod if your problem re-occurs.
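
If the arp command itself hangs, the kernel's ARP cache and the loaded module
list can still be inspected by reading /proc directly. The following is a
minimal Python sketch of that idea, not part of the original report. It
assumes the usual six-column layout of /proc/net/arp and that the first field
of each /proc/modules line is the module name (the list lsmod prints); on a
kernel as old as the one discussed here the layout may differ. The function
names are hypothetical.

```python
from pathlib import Path


def parse_arp_table(text):
    """Parse the contents of /proc/net/arp into a list of dicts.

    Assumes the columns: IP address, HW type, Flags, HW address, Mask, Device.
    """
    entries = []
    for line in text.splitlines()[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) != 6:
            continue  # skip lines that do not match the assumed layout
        ip, _hw_type, flags, mac, _mask, device = fields
        entries.append(
            {"ip": ip, "flags": int(flags, 16), "mac": mac, "device": device}
        )
    return entries


def parse_modules(text):
    """Return the module names found in /proc/modules content."""
    return [line.split()[0] for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    for path, parser in (
        ("/proc/net/arp", parse_arp_table),
        ("/proc/modules", parse_modules),
    ):
        try:
            print(path, parser(Path(path).read_text()))
        except OSError as exc:
            print(path, "could not be read:", exc)
```

Reading the files this way avoids any name lookups, so it should return
promptly even while the name server is unreachable.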

Sanity: If your DNS server was unavailable, many client commands/services
would require a long time, as host resolution time-outs are painfully
long. If that's the problem, you should think about setting up a secondary
name server and configuring your clients to use a 2nd server. FWIW, I usually
find that a 2nd name server is often not worth the effort since the time-out
on a single server is unacceptably slow, and adding a 2nd server just doubles
the time before a command/service on a client machine gets around to printing
an error message. Far easier to reboot the primary ...
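
To check whether slow commands are really waiting on host resolution, the
lookup can be timed on a client. This is a rough sketch in Python, offered
only as an illustration: time_lookup is a hypothetical helper, and it measures
whatever the system resolver does (including /etc/hosts and any configured
name servers), not the DNS server in isolation.

```python
import socket
import time


def time_lookup(hostname):
    """Time one resolver call; return (seconds_elapsed, addresses or None)."""
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(hostname, None)
        # Each result is (family, type, proto, canonname, sockaddr);
        # the address string is the first element of sockaddr.
        addresses = sorted({info[4][0] for info in infos})
    except socket.gaierror:
        addresses = None  # resolution failed or timed out
    return time.monotonic() - start, addresses


if __name__ == "__main__":
    elapsed, addresses = time_lookup("localhost")
    print(f"localhost resolved in {elapsed:.3f}s: {addresses}")
```

A lookup that takes many seconds before failing points at resolver time-outs
rather than at the command being run.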

As for the telnetd messages, you might want to try the latest telnet*0.16*
rpms (there are two packages now, since the server/client have been separated).
The message appears (I can't find the exact string) to be due to a read
with a return code <= 0 which shouldn't be a problem (and is probably
only indirectly related to the other problem). Just guessing ...

Comment 3 Jeff Johnson 2000-05-15 11:33:59 UTC

This problem appears to be resolved. Please reopen if I'm wrong.