Bug 168166

Summary: ping prints nonsensical round-trip times on AMD64
Product: [Fedora] Fedora Reporter: Alexandre Oliva <oliva>
Component: iputilsAssignee: Radek Vokál <rvokal>
Status: CLOSED RAWHIDE QA Contact: Mike McLean <mikem>
Severity: low Docs Contact:
Priority: medium    
Version: rawhideCC: notting
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: 20020927-28 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2005-09-30 08:22:42 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
memset message structure before using it none

Description Alexandre Oliva 2005-09-13 04:28:40 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8b3) Gecko/20050827 Fedora/1.1-0.2.8.deerpark.alpha2 Firefox/1.0+

Description of problem:
ping has recently started displaying nonsensical ping times.  Pinging a server that's 20-30ms away, network-wise, will often print the right round-trip time for a number of iterations, but then it will start printing times in the range of 4295000ms.  Sometimes it starts with the incorrect times already; sometimes it takes more than 60 seconds to begin corrupting (?) the time output.  I've left ping running on an i686 box for several minutes and it never experience the problem, but this AMD64 box has never failed to display it after some time.  The same box running FC4 doesn't experience the problem, so it's certainly something in the software.

I've never seen ping recover once it begins printing these odd numbers, except once while running ping under strace, as root.

Here's partial output from `ping emilia' on the AMD64 box:
[...]
64 bytes from emilia.lsd.ic.unicamp.br (143.106.24.129): icmp_seq=21 ttl=52 time=23.6 ms
64 bytes from emilia.lsd.ic.unicamp.br (143.106.24.129): icmp_seq=22 ttl=52 time=25.3 ms
64 bytes from emilia.lsd.ic.unicamp.br (143.106.24.129): icmp_seq=23 ttl=52 time=25.3 ms
64 bytes from emilia.lsd.ic.unicamp.br (143.106.24.129): icmp_seq=24 ttl=52 time=4294992 ms
64 bytes from emilia.lsd.ic.unicamp.br (143.106.24.129): icmp_seq=25 ttl=52 time=4294991 ms
64 bytes from emilia.lsd.ic.unicamp.br (143.106.24.129): icmp_seq=26 ttl=52 time=4294992 ms
[...]

Heck, even ping localhost will display the problem, but it seems to be more difficult to trigger.  I've also experienced the problem with ping times of about 200ms.

Version-Release number of selected component (if applicable):
iputils-20020927-26 glibc-2.3.90-11 kernel-2.6.13-1.1549_FC5

How reproducible:
Always

Steps to Reproduce:
1.run ping somehost, and leave it running for one or two minutes

Actual Results:  It will at some point start printing lines with round-trip times of 4295 seconds, even though the lines are still printed one second apart from each other, including those before the times get corrupted.

Expected Results:  Corrent round-trip times.

Additional info:

Comment 1 Radek Vokál 2005-09-13 07:35:17 UTC
I didn't manage to reproduce this issue on my AM64 box. Unfortunatelly I don't
have rawhide installed on it but just FC4 with the latest kernel. Can you please
retest ping on another kernel? 

Comment 2 Alexandre Oliva 2005-09-13 15:13:34 UTC
Err...  I wrote the problem wouldn't happen on FC4.  What is it that you want me
to retest, FC4 with rawhide kernel or rawhide with FC4 kernel?

Comment 3 Radek Vokál 2005-09-14 06:01:32 UTC
Aha, sorry, I've overlooked that you've already test it on FC4. I'll try to
update my x86_64 box later this week and reproduce this issue. Is it only this
specific kernel you see the above problem or is it happenning also with another
FC5 kernels? 

Comment 4 Alexandre Oliva 2005-09-14 14:55:58 UTC
I started noticing it about a week ago, and since we've pretty much had daily
kernel updates in rawhide, several different kernels have exposed the issue. 
That said, yesterday's kernel wouldn't boot on my Athlon64 notebook, so I
couldn't test that, so I'm still runing the kernel mentioned above.  At least
until today's rawhide kernel gets installed and a get a chance to reboot :-)

Comment 5 Bill Nottingham 2005-09-22 18:50:51 UTC
I see this as well on recent rawhide.

Comment 6 Bill Nottingham 2005-09-22 19:25:46 UTC
Created attachment 119163 [details]
memset message structure before using it

This fixes it for me. However, this does make me question why this never showed
up before. :)

Comment 7 Pekka Pietikäinen 2005-09-29 17:51:13 UTC
Interesting... I was seeing this too, and it appears quite random too. Only
64-bit is affected, 32-bit ping works a-ok for me. I also saw similar things
with tcpdump, it sometimes got timestamps in year 2142 or so. And sometimes the
timestamps were just ok, same with ping. 

The behaviour changed some time between 1535 and 1582 (anything between that
crash and burned all the time, so pinpointing the exact change is hard).

There seem to have been some timestamp-related changes in the kernel networking
code, but I think ping actually just uses a timestamp embedded in the packet
data to do the measurements?

Comment 8 Radek Vokál 2005-09-30 08:22:42 UTC
(In reply to comment #6)
> This fixes it for me. However, this does make me question why this never showed
> up before. :)

Thanks, patch checked in.