Bug 203374

Summary: gettimeofday goes backward randomly with or without ntpd
Product: Red Hat Enterprise Linux 4
Reporter: Prarit Bhargava <prarit>
Component: kernel
Assignee: Prarit Bhargava <prarit>
Status: CLOSED DUPLICATE
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 4.0
CC: jbaron, k.georgiou, prarit, zing
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2006-12-19 15:39:05 UTC
Bug Depends On: 174390    
Bug Blocks:    
Attachments:
Description                          Flags
Patch to disable C1 clock ramping    none
New patch to fix this issue.         none

Comment 1 Prarit Bhargava 2006-08-21 15:25:00 UTC
I'll try disabling the C1 clock and see what happens.

Odd that this issue is only on x86_64 and is not seen on 32 bit.

P.

Comment 2 Prarit Bhargava 2006-08-22 13:09:45 UTC
Okay -- operator error here.  I accidentally used a different test that didn't
zero out the variables prior to using them -- which caused an initial error
message to be output.  Oops.

I'll re-run with the correct test.

P.
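
Note: the test program itself is not attached to this bug, so the following is only a minimal sketch of what a "does gettimeofday go backward" check might look like. The function names tv_went_backward and count_backward_steps are hypothetical. The sketch seeds the previous sample from a real gettimeofday() call before the loop, which avoids the spurious first-iteration error described above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Return nonzero if `cur` is strictly earlier than `prev`. */
int tv_went_backward(const struct timeval *prev, const struct timeval *cur)
{
    assert(prev != NULL && cur != NULL);
    if (cur->tv_sec != prev->tv_sec)
        return cur->tv_sec < prev->tv_sec;
    return cur->tv_usec < prev->tv_usec;
}

/*
 * Sample gettimeofday() `iterations` times and return how many samples
 * were earlier than the one before them, or -1 if gettimeofday() fails.
 */
long count_backward_steps(long iterations)
{
    struct timeval prev, cur;
    long backward = 0;

    /* Seed `prev` with a real sample so the first comparison is valid. */
    if (gettimeofday(&prev, NULL) != 0)
        return -1;

    for (long i = 0; i < iterations; i++) {
        if (gettimeofday(&cur, NULL) != 0)
            return -1;
        if (tv_went_backward(&prev, &cur))
            backward++;
        prev = cur;
    }
    return backward;
}
```

A caller would typically run count_backward_steps() with a large iteration count and report any nonzero result. Since the clocks here take hours to drift apart after a reboot, a short run that returns 0 does not prove the problem is absent.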

Comment 3 Kostas Georgiou 2006-08-23 16:07:42 UTC
I tried notsc, clock=pmtmr, etc. with no effect (which makes sense because of
bug #203235). The question is: is it a bug that gettimeofday goes backwards
when you use tsc, or not?

Comment 4 Prarit Bhargava 2006-08-23 16:50:22 UTC
It is a bug when gettimeofday goes backwards when you use the tsc.

I'm running on an AMD RevF chipset, and cannot reproduce this on an x86_64 kernel.

Kostas, could you provide me with the EXACT cmdline that you're booting with?

P.

Comment 5 Kostas Georgiou 2006-08-23 17:37:23 UTC
Here is what's in grub.conf

title Red Hat Enterprise Linux WS (2.6.9-42.ELsmp)
        root (hd0,0)
        kernel /vmlinuz-2.6.9-42.ELsmp ro root=/dev/rootvg/rootvol notsc
        initrd /initrd-2.6.9-42.ELsmp.img

Keep in mind that the clocks take a while (a few hours) to get out of sync
after a reboot.

Comment 6 Prarit Bhargava 2006-08-24 15:03:02 UTC
Created attachment 134824 [details]
Patch to disable C1 clock ramping

Kostas,

I can't seem to reproduce this issue on x86_64 RHEL4U4, with notsc.  But I do
have a theory as to what the issue is.  Could you apply the following patch and
run your test again?

Thanks,

P.

Comment 7 Kostas Georgiou 2006-08-26 17:20:37 UTC
Prarit, 

Remember that on the motherboard I am testing, notsc doesn't do anything,
since tsc is the only option (bug #203235).

I'm running a test kernel (with the "if ((eax & CPUID_XMOD) >=
CPUID_XMOD_REV_F)" check disabled), and that seems to have fixed the problem.

The ntp drift in /var/lib/ntp/drift is down to 0.898 from -500 (the maximum),
the kernel doesn't print "Losing some ticks" any more, and gettimeofday seems
fine so far.

I'll try the patch on some of the Opteron machines (which use HPET) to see
if the following message disappears there as well (the clock is stable
there, though).
  Losing some ticks... checking if CPU frequency changed.
  warning: many lost ticks.
  Your time source seems to be instable or some driver is hogging interupts
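
Note: the drift figures in this comment can be checked mechanically. ntpd stores the frequency offset in its drift file as a single number in parts per million and will not correct beyond 500 PPM in either direction, so a value sitting at the limit suggests the clock is off by more than ntpd can compensate for. The following is a minimal sketch only; parse_drift_ppm, drift_is_pegged, and the 0.001 PPM tolerance are hypothetical choices for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* ntpd limits its frequency correction to +/-500 PPM. */
#define NTP_MAX_DRIFT_PPM 500.0

/*
 * Parse the text of a drift file (one floating-point PPM value).
 * Returns 0 on success, -1 if no number could be parsed.
 */
int parse_drift_ppm(const char *text, double *ppm_out)
{
    char *end = NULL;
    double value;

    assert(text != NULL && ppm_out != NULL);
    errno = 0;
    value = strtod(text, &end);
    if (end == text || errno == ERANGE)
        return -1;
    *ppm_out = value;
    return 0;
}

/* Return nonzero if the drift is at (or within 0.001 PPM of) the limit. */
int drift_is_pegged(double ppm)
{
    double magnitude = ppm < 0 ? -ppm : ppm;
    return magnitude >= NTP_MAX_DRIFT_PPM - 0.001;
}
```

With the values reported above, -500 counts as pegged and 0.898 does not.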


Comment 8 Prarit Bhargava 2006-08-28 17:31:11 UTC
> I'm running a test kernel (with the "if ((eax & CPUID_XMOD) >=
> CPUID_XMOD_REV_F)" check disabled and it seems that has fixed the problem

Er ... I'm confused?  You commented out the patch's CPUID check and your problem
was solved?

P.

Comment 9 Prarit Bhargava 2006-08-28 17:32:46 UTC
Created attachment 135062 [details]
New patch to fix this issue.

Comment 10 Prarit Bhargava 2006-08-28 17:34:02 UTC
Kostas,

I've uploaded a new patch into this BZ.  Would you mind testing with that?

There was an issue with the way C1 clock ramping was handled in the first patch
I sent you.

Thanks :)

P.

Comment 11 Kostas Georgiou 2006-10-20 12:35:13 UTC
About comment #8:
What I am saying is that the patch worked, but I had to drop the "if ((eax &
CPUID_XMOD) >= CPUID_XMOD_REV_F)" check because I am not using a rev F Opteron :)

Here is the relevant info from /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 43
model name      : AMD Athlon(tm) 64 X2 Dual Core Processor 3800+
stepping        : 1

Unfortunately the patch doesn't apply cleanly, and I haven't had the time yet
to check what changed since I've been quite busy :(. I'll try to merge it into
kernel-smp-2.6.9-42.0.3 next week and I'll let you know. I will have to take out
the revision check, since I have no idea what I should be using to include my
CPU in it.
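
Note: a rough sketch of why the revision check excludes this CPU. The attached patch is not reproduced in this bug, so the two mask values below are assumptions (CPUID_XMOD taken to be the extended-model field, EAX bits 19:16 of CPUID function 1, and CPUID_XMOD_REV_F taken to be extended model 4). The helper names make_signature and passes_rev_f_check are hypothetical. For family 15 parts, the "model" shown in /proc/cpuinfo is (extended model << 4) | base model, so model 43 (0x2B) has extended model 2.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; the patch in this bug may define them differently. */
#define CPUID_XMOD        0x000f0000u  /* extended model, EAX bits 19:16 */
#define CPUID_XMOD_REV_F  0x00040000u  /* extended model 4 */

/*
 * Rebuild the CPUID function 1 EAX signature from the family, model and
 * stepping shown in /proc/cpuinfo. Only family 15 (extended family 0)
 * is handled in this sketch.
 */
uint32_t make_signature(unsigned family, unsigned model, unsigned stepping)
{
    assert(family == 15 && model <= 0xffu && stepping <= 0xfu);

    uint32_t base_model = model & 0xfu;        /* EAX bits 7:4 */
    uint32_t ext_model = (model >> 4) & 0xfu;  /* EAX bits 19:16 */

    return (ext_model << 16) | ((uint32_t)family << 8) |
           (base_model << 4) | (uint32_t)stepping;
}

/* The check quoted in this bug: true only for extended model >= 4. */
int passes_rev_f_check(uint32_t eax)
{
    return (eax & CPUID_XMOD) >= CPUID_XMOD_REV_F;
}
```

Under these assumptions, family 15 / model 43 / stepping 1 gives the signature 0x00020FB1, whose extended model of 2 fails the check. That is consistent with the patch only taking effect once the check is removed.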

Comment 12 Prarit Bhargava 2006-10-23 13:43:35 UTC
(In reply to comment #11)
> Unfortunately the patch doesn't apply cleanly and I didn't had the time yet to
> check what changed since I've been quite busy :(. I'll try to merge it to
> kernel-smp-2.6.9-42.0.3 next week and I'll let you know. I will have to take out
> the revision check since I have no idea what I should be using to include my cpu
>  in it.

Ah ... that should be okay.  We've seen Rev E's and Rev F's with the problem.

P.

Comment 16 Prarit Bhargava 2006-12-19 15:39:05 UTC

*** This bug has been marked as a duplicate of 196868 ***