Bug 15223
Summary: | ping gives invalid timings | ||||||
---|---|---|---|---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Neil Darlow <neil> | ||||
Component: | iputils | Assignee: | Crutcher Dunnavant <crutcher> | ||||
Status: | CLOSED RAWHIDE | QA Contact: | |||||
Severity: | medium | Docs Contact: | |||||
Priority: | high | ||||||
Version: | 7.0 | CC: | alan, pekkas, twaugh | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | i386 | ||||||
OS: | Linux | ||||||
Whiteboard: | Winston gold | ||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2000-08-05 17:40:50 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Neil Darlow
2000-08-03 12:20:49 UTC
I can't reproduce this here, but I bet the problem could be in ping.c lines with: delta = delta*1000 + (now.tv_usec - prev.tv_usec)/1000; iputils-20000121-1 from Beta1 works ok. iputils-20000418-1 from Beta2 and later display the fault. This defect is considered MUST-FIX for Winston Gold-release ... output is misleading enough this should be fixed I can't reproduce this problem, so there's more than just the information given above that's gonna be needed. Suggestions? I was able to reproduce this on a RH6.2 w/ latest iputils package rebuilt. I pinged localhost. The key was that it has to be 486. Pentiums and later work fine; 386 is probably also affected. This happens with and without Redhat patches. [the "normal" response on 486/66 without patches.. see jump to 2300 usec below] [root@linux SOURCES]# ping localhost PING localhost (127.0.0.1) from 127.0.0.1 : 56(84) bytes of data. 64 bytes from localhost (127.0.0.1): icmp_seq=0 ttl=255 time=-1623 usec 64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=255 time=-405 usec 64 bytes from localhost (127.0.0.1): icmp_seq=2 ttl=255 time=-440 usec 64 bytes from localhost (127.0.0.1): icmp_seq=3 ttl=255 time=-361 usec With small investigatory patch as follows, I could see some interesting results: --- iputils/ping.c Tue Apr 18 20:42:50 2000 +++ ip/ping.c Fri Aug 4 23:45:47 2000 @@ -765,11 +765,15 @@ struct timeval now; long delta = 0; int waittime; + + long diff=0; gettimeofday(&now, NULL); if (signo) { delta = now.tv_sec - prev.tv_sec; + diff=now.tv_usec - prev.tv_usec; +printf("\nDiff: %lu, now: %lu, prev: %lu\n", diff, now.tv_usec, prev.tv_usec); delta = delta*1000 + (now.tv_usec - prev.tv_usec)/1000; if (delta >= (interval-slop)) signo = 0; ----- With that, interesting things seem to be happening. Also note that latency increased from 330 usec to 2.3 msec. I guess the algorithm isn't too reliable for low end systems.. ---- [root@linux iputils]# ./ping localhost PING localhost (127.0.0.1) from 127.0.0.1 : 56(84) bytes of data. 64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=-5518 usec Diff: 4294962156, now: 122552, prev: 127692 64 bytes from 127.0.0.1: icmp_seq=1 ttl=255 time=-2327 usec Diff: 31, now: 122583, prev: 122552 64 bytes from 127.0.0.1: icmp_seq=2 ttl=255 time=-2401 usec Diff: 4294967267, now: 122554, prev: 122583 64 bytes from 127.0.0.1: icmp_seq=3 ttl=255 time=-2243 usec ---- While on K6-2/300 the output is: ---- PING localhost (127.0.0.1) from 127.0.0.1 : 56(84) bytes of data. 64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=137 usec Diff: 4294960853, now: 882392, prev: 888835 64 bytes from 127.0.0.1: icmp_seq=1 ttl=255 time=92 usec Diff: 4294967287, now: 882383, prev: 882392 64 bytes from 127.0.0.1: icmp_seq=2 ttl=255 time=90 usec ---- Err, mixups with long/unsigned long, but that should still be of some help :) This smells like a kernel problem, but could be a gcc problem. Opinions? Hard to tell. I tried i386 kernel 2.2.16-17 on P2, works fine (Pinstripe) i586 kernel 2.2.16 (stock) on K6-2, works fine (RH6.2) i386 kernel 2.2.16-3 on 486/66, failure (RH6.2) HTH. You said 486 works. Does your 486 have TSC or not (cat /proc/cpuinfo) Could be related to timing accuracy and behaviour of tsc versus usec counter accuracy Err, 486 does not specifically _work_. It was just the only platform I could reproduce this on. I don't have access to any other 486's or 386's, or other oldish systems. /proc/cpuinfo doesn't tell much (no flags, lots of unknown :)..: ----- [root@linux /root]# more /proc/cpuinfo processor : 0 vendor_id : unknown cpu family : 4 model : 0 model name : unknown stepping : unknown fdiv_bug : no hlt_bug : no sep_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : -1 wp : yes flags : bogomips : 33.08 ----- Linux version 2.2.16-3 (root.redhat.com) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #1 Mon Jun 19 18:10:14 EDT 2000 Do code changes between iputils-20000121-1 and iputils-20000418-1 give any clues? The contents of my /proc/cpuinfo are: processor : 0 vendor_id : AuthenticAMD cpu family : 4 model : 9 model name : 486 DX/4-WB stepping : 4 fdiv_bug : no hlt_bug : no sep_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu bogomips : 60.01 I'll also take a look at the source code. I've tracked this down to SIOCGSTAMP returning wrong information. Created attachment 2094 [details]
Here is a patch that works around the problem.
SIOCGSTAMP was returning a time that was slightly behind the time that gettimeofday returned before sendmsg. This is the _real_ bug. I don't know why it happens; I can reproduce it 100% of the time on my 486: $ grep flags /proc/cpuinfo flags : fpu vme Seems like a kernel bug to me. I take it back. Looking at the kernel code I don't see a way to fix it in the kernel. Jeff, what do you think of the patch? Too icky? SIOCGSTAMP is low precision. If you want to use it with gettimeofday use gettimeofday just on the receive as well. This would also explain why only a non TSC chip sees the problem (There is a -U option to turn on this [older] behaviour, which the above patch turns off if it detets thsi situation.) Fixed in iputils-20000418-5. |