Bug 174390
Summary: | gettimeofday goes backward randomly with or without ntpd | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | claudiu pop <claudiu.pop> |
Component: | kernel | Assignee: | Brian Maly <bmaly> |
Status: | CLOSED NOTABUG | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Priority: | medium |
Version: | 4.0 | CC: | jbaron, k.georgiou, prarit, zing |
Hardware: | i686 | OS: | Linux |
Doc Type: | Bug Fix | Last Closed: | 2006-08-18 18:47:50 UTC |
Bug Blocks: | 203374 | | |
Description
claudiu pop
2005-11-28 19:10:57 UTC
*** Bug 174401 has been marked as a duplicate of this bug. ***

*** Bug 174401 has been marked as a duplicate of this bug. ***

Probably your program is bouncing between CPUs and each core is a few seconds different. I've seen that before with BIOS bugs. If you run "while true ; do date ; done | uniq -c", does it go back and forth in time? If you have schedutils installed, try setting the process affinity with something like:

    taskset -c 0 bash -c "while true ; do date ; done | uniq -c"

Comment #3 would only be true if TSC is being used for timekeeping. TSCs between processors are not synchronized on AMD-based systems. As a result, the kernel might read from different TSCs each time and introduce error into the timekeeping math. Use "notsc" at boot to prevent this problem. This is a known issue that will be fixed in future releases.

Even with "notsc clock=pmtmr" I get

    time.c: Using 1.193182 MHz PIT timer.
    time.c: Detected 2002.665 MHz processor.
    time.c: Using PIT/TSC based timekeeping.

This happens even with the 2.6.9-42.ELsmp kernel. Then again, I am running on x86_64, so notsc might be working under i686.

Re: Comment #5, does the system keep time properly with "notsc clock=pmtmr" on x86_64? i686 has different code for timekeeping anyway.

No it doesn't :( I am getting messages like this from ntp

    ntpd: time reset -0.828991 s

for a while and then

    ntpd: frequency error -512 PPM exceeds tolerance 500 PPM

Running "while true ; do date ; done | uniq -c" also shows that time sometimes goes backward :( Is there a similar patch to the one from comment 12 in #172199 for x86_64? I'll try to have a look at the code during the weekend. Should I add a new bug for x86_64?

Anyway, from what I can see in the code (and I know almost nothing about the kernel) in arch/x86_64/kernel/time.c, I can't see anywhere that pmtmr_ioport gets initialized, unless arch/i386/kernel/acpi/boot.c is also used under x86_64, which I find unlikely.
So if there is no hpet timer we end up in tsc :(

Well, arch/i386/kernel/acpi/boot.c is used, so it seems that somehow pm-timer doesn't get detected. The same motherboard at home, an A8V Deluxe (with the beta BIOS), detects the pm-timer under Fedora5, so I'll upgrade the BIOS in one of the machines at work to see whether that helps or whether it is a problem with the detection code under RHEL4.

i686 timekeeping code only supports TSC, which won't work properly on multiprocessor AMD machines. A patch is currently being tested which disables C1 clock ramping so the system can use TSC safely.

To avoid this issue, use "notsc" at boot on x86_64 to disable TSC-based timekeeping. On i386 (i686), use "clock=hpet", "clock=pmtmr" or "clock=pit" to select a different time source.

notsc doesn't do anything for the A8V Deluxe under x86_64: since the system doesn't have HPET and the pm-timer doesn't get detected, the system still falls back to tsc :(

Re: Comment #13, looks like we will need to disable C1 clock ramping so TSC can be used on i386. Red Hat is working on a patch for this.

What chipset does the A8V Deluxe have? Seems like it should probably have a PMTimer... maybe that's a separate issue worth filing a new and separate Bugzilla?

(In reply to comment #14)
> Re: Comment #13, looks like we will need to disable C1 clock ramping so TSC can
> be used on i386. Red Hat is working on a patch for this.
>
> What chipset does the A8V Deluxe have? Seems like it should probably have a
> PMTimer... maybe that's a separate issue worth filing a new and separate Bugzilla?

If this can help, I retried the very same test on a different version of RHEL:

    >cat /etc/redhat-release
    Red Hat Enterprise Linux WS release 4 (Nahant Update 3)

    Linux version 2.6.9-34.ELsmp (bhcompile.redhat.com) (gcc version 3.4.5 20051201 (Red Hat 3.4.5-2)) #1 SMP Fri Feb 24 16:54:53 EST 2006

and my little program no longer produces incorrect results. Besides, in the dmesg logs I now see this line:

    checking TSC synchronization across 4 CPUs: passed.
So I guess this means that the synchronization issues between the 4 cores have been solved.

Re Comment #15

There were a few changes that went into RHEL4 U3 and U4: one to check TSC synchronization and another to move TSC to last in the picking order of timers.

Are you using i686 or x86_64?

Re Comment #15

There were a few changes that went into RHEL4 U3 and U4: one to check TSC synchronization and another to move TSC to last in the picking order of timers.

Are you using i686 or x86_64?

(In reply to comment #17)
> Re Comment #15
>
> There were a few changes that went into RHEL4 U3 and U4: one to check TSC
> synchronization and another to move TSC to last in the picking order of timers.
>
> Are you using i686 or x86_64?

i686 right now, but I'm pretty sure that among many other tests I also ran this one on x86_64 some time ago and it worked.

Can the reporter try to reproduce this issue with the newest kernel (RHEL4.4)? We believe this is a regression that was fixed in the last release.

Red Hat is unable to reproduce this issue with the current U4 kernel.

Comment #14

Yes, the non-detection of PM-Timer should be a different bug report; I'll add one shortly. Testing the system again with a RHEL4.4 userland as well as kernel seems to have fixed the "time goes backward" problem as far as I can tell. I remember reading about changes in the way glibc calls gettimeofday, now that I think about it.

I suspect that the bug can be closed if Claudiu doesn't see any problems.

(In reply to comment #23)
> Comment #14
>
> Yes, the non-detection of PM-Timer should be a different bug report; I'll add one
> shortly. Testing the system again with a RHEL4.4 userland as well as kernel
> seems to have fixed the "time goes backward" problem as far as I can tell. I
> remember reading about changes in the way glibc calls gettimeofday, now that I
> think about it.
>
> I suspect that the bug can be closed if Claudiu doesn't see any problems.

I have no objections to closing this case.
(I can't test RHEL4.4 but I'll do it as soon as I can).

Closing this bug because it was fixed in a previous release.

Hmm, after a few hours running RHEL4.4 x86_64:

    $ while true ; do date ; done | uniq -c
        318 Sat Aug 19 18:46:54 BST 2006
          1 Sat Aug 19 18:46:55 BST 2006
          2 Sat Aug 19 18:46:54 BST 2006
          1 Sat Aug 19 18:46:55 BST 2006
          2 Sat Aug 19 18:46:54 BST 2006
          1 Sat Aug 19 18:46:55 BST 2006
          2 Sat Aug 19 18:46:54 BST 2006
          1 Sat Aug 19 18:46:55 BST 2006
          1 Sat Aug 19 18:46:54 BST 2006
          1 Sat Aug 19 18:46:55 BST 2006
          1 Sat Aug 19 18:46:54 BST 2006
        312 Sat Aug 19 18:46:55 BST 2006
          1 Sat Aug 19 18:46:56 BST 2006
          1 Sat Aug 19 18:46:55 BST 2006
          1 Sat Aug 19 18:46:56 BST 2006
          1 Sat Aug 19 18:46:55 BST 2006
          1 Sat Aug 19 18:46:56 BST 2006
          1 Sat Aug 19 18:46:55 BST 2006
          1 Sat Aug 19 18:46:56 BST 2006
          1 Sat Aug 19 18:46:55 BST 2006
          1 Sat Aug 19 18:46:56 BST 2006

I added #203236 for the non-detection of pmtimer in the Asus A8V Deluxe mb.

Okay -- this bug is against i686 ... I'll test against x86_64. Could you send me the command line you used to boot?

P.
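
The "date | uniq -c" loop used in this thread only resolves one-second jumps. For anyone wanting a finer-grained check, here is a minimal sketch, assuming Python is available on the test machine. The function names are made up for illustration and are not taken from the reporter's program, which was never posted here. It reads the wall clock back to back and lists every reading that is lower than the one before it.

```python
import time


def find_backward_steps(samples):
    """Return (index, previous, current) for each reading lower than the one before it."""
    steps = []
    for i in range(1, len(samples)):
        if samples[i] < samples[i - 1]:
            steps.append((i, samples[i - 1], samples[i]))
    return steps


def sample_wall_clock(count):
    """Take `count` back-to-back wall-clock readings, in seconds since the epoch."""
    return [time.time() for _ in range(count)]


if __name__ == "__main__":
    # 200000 readings is an arbitrary choice; a rare jump may need a much longer run.
    steps = find_backward_steps(sample_wall_clock(200000))
    for index, previous, current in steps:
        print("sample %d: %.6f -> %.6f (%.6f s backward)"
              % (index, previous, current, previous - current))
    print("%d backward step(s) found" % len(steps))
```

Running it once normally and once under "taskset -c 0", as suggested earlier in the thread, should show whether the jumps disappear when the process stays on one CPU. Keep in mind that a genuine clock step by ntpd (like the "time reset" messages above) would also be reported as a backward step.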