Bug 168166 - ping prints nonsensical round-trip times on AMD64
Summary: ping prints nonsensical round-trip times on AMD64
Alias: None
Product: Fedora
Classification: Fedora
Component: iputils
Version: rawhide
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: Radek Vokal
QA Contact: Mike McLean
Depends On:
TreeView+ depends on / blocked
Reported: 2005-09-13 04:28 UTC by Alexandre Oliva
Modified: 2007-11-30 22:11 UTC (History)
1 user (show)

Clone Of:
Last Closed: 2005-09-30 08:22:42 UTC

Attachments (Terms of Use)
memset message structure before using it (407 bytes, patch)
2005-09-22 19:25 UTC, Bill Nottingham
no flags Details | Diff

Description Alexandre Oliva 2005-09-13 04:28:40 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8b3) Gecko/20050827 Fedora/1.1-0.2.8.deerpark.alpha2 Firefox/1.0+

Description of problem:
ping has recently started displaying nonsensical ping times.  Pinging a server that's 20-30ms away, network-wise, will often print the right round-trip time for a number of iterations, but then it will start printing times in the range of 4295000ms.  Sometimes it starts with the incorrect times already; sometimes it takes more than 60 seconds to begin corrupting (?) the time output.  I've left ping running on an i686 box for several minutes and it never experience the problem, but this AMD64 box has never failed to display it after some time.  The same box running FC4 doesn't experience the problem, so it's certainly something in the software.

I've never seen ping recover once it begins printing these odd numbers, except once while running ping under strace, as root.

Here's partial output from `ping emilia' on the AMD64 box:
64 bytes from emilia.lsd.ic.unicamp.br ( icmp_seq=21 ttl=52 time=23.6 ms
64 bytes from emilia.lsd.ic.unicamp.br ( icmp_seq=22 ttl=52 time=25.3 ms
64 bytes from emilia.lsd.ic.unicamp.br ( icmp_seq=23 ttl=52 time=25.3 ms
64 bytes from emilia.lsd.ic.unicamp.br ( icmp_seq=24 ttl=52 time=4294992 ms
64 bytes from emilia.lsd.ic.unicamp.br ( icmp_seq=25 ttl=52 time=4294991 ms
64 bytes from emilia.lsd.ic.unicamp.br ( icmp_seq=26 ttl=52 time=4294992 ms

Heck, even ping localhost will display the problem, but it seems to be more difficult to trigger.  I've also experienced the problem with ping times of about 200ms.

Version-Release number of selected component (if applicable):
iputils-20020927-26 glibc-2.3.90-11 kernel-2.6.13-1.1549_FC5

How reproducible:

Steps to Reproduce:
1.run ping somehost, and leave it running for one or two minutes

Actual Results:  It will at some point start printing lines with round-trip times of 4295 seconds, even though the lines are still printed one second apart from each other, including those before the times get corrupted.

Expected Results:  Corrent round-trip times.

Additional info:

Comment 1 Radek Vokal 2005-09-13 07:35:17 UTC
I didn't manage to reproduce this issue on my AM64 box. Unfortunatelly I don't
have rawhide installed on it but just FC4 with the latest kernel. Can you please
retest ping on another kernel? 

Comment 2 Alexandre Oliva 2005-09-13 15:13:34 UTC
Err...  I wrote the problem wouldn't happen on FC4.  What is it that you want me
to retest, FC4 with rawhide kernel or rawhide with FC4 kernel?

Comment 3 Radek Vokal 2005-09-14 06:01:32 UTC
Aha, sorry, I've overlooked that you've already test it on FC4. I'll try to
update my x86_64 box later this week and reproduce this issue. Is it only this
specific kernel you see the above problem or is it happenning also with another
FC5 kernels? 

Comment 4 Alexandre Oliva 2005-09-14 14:55:58 UTC
I started noticing it about a week ago, and since we've pretty much had daily
kernel updates in rawhide, several different kernels have exposed the issue. 
That said, yesterday's kernel wouldn't boot on my Athlon64 notebook, so I
couldn't test that, so I'm still runing the kernel mentioned above.  At least
until today's rawhide kernel gets installed and a get a chance to reboot :-)

Comment 5 Bill Nottingham 2005-09-22 18:50:51 UTC
I see this as well on recent rawhide.

Comment 6 Bill Nottingham 2005-09-22 19:25:46 UTC
Created attachment 119163 [details]
memset message structure before using it

This fixes it for me. However, this does make me question why this never showed
up before. :)

Comment 7 Pekka Pietikäinen 2005-09-29 17:51:13 UTC
Interesting... I was seeing this too, and it appears quite random too. Only
64-bit is affected, 32-bit ping works a-ok for me. I also saw similar things
with tcpdump, it sometimes got timestamps in year 2142 or so. And sometimes the
timestamps were just ok, same with ping. 

The behaviour changed some time between 1535 and 1582 (anything between that
crash and burned all the time, so pinpointing the exact change is hard).

There seem to have been some timestamp-related changes in the kernel networking
code, but I think ping actually just uses a timestamp embedded in the packet
data to do the measurements?

Comment 8 Radek Vokal 2005-09-30 08:22:42 UTC
(In reply to comment #6)
> This fixes it for me. However, this does make me question why this never showed
> up before. :)

Thanks, patch checked in. 

Note You need to log in before you can comment on or make changes to this bug.