Bug 168255 - powernow clock instability: warning: many lost ticks.
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: x86_64 Linux
Priority: medium
Severity: medium
Assigned To: Brian Maly
QA Contact: Brian Brock
Reported: 2005-09-13 20:35 EDT by Christopher P Johnson
Modified: 2007-11-30 17:07 EST

Doc Type: Bug Fix
Last Closed: 2005-09-29 10:29:50 EDT

Attachments:
program to print time jumps (909 bytes, text/plain)
2005-09-13 20:35 EDT, Christopher P Johnson
Results showing time jumps (1.20 KB, text/plain)
2005-09-13 20:37 EDT, Christopher P Johnson

Description Christopher P Johnson 2005-09-13 20:35:21 EDT
Description of problem:

When AMD PowerNow is enabled on x86-64 platforms without BIOS HPET timer
support, the clock jumps forward and back by several seconds or more.

Presumably the clock code isn't prepared for CPU frequency changes.

Version-Release number of selected component (if applicable):

RHEL4, RHEL4 U1 (x86-64)

How reproducible:

Enable PowerNow. Run the system under a varying mix of load and idle. The clock
will begin to jump forward and back (symptoms: "warning: many lost ticks"
messages, the screen saver starting after a few seconds, SCSI timeout
messages). See the attached program, which prints large time jumps.

The system is a Sun Fire X4100 (an x86-64 system that does not provide BIOS
HPET support). Note that the problem appears to be masked on systems with
HPET.
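
The attached program is not reproduced in this report. As a rough illustration of the idea only, a jump detector could sample the wall clock at a fixed interval and flag any step that is negative or much larger than the interval. The Python sketch below is not the attachment; the names `find_jumps` and `watch_clock` and the 1-second threshold are hypothetical choices made for this example.

```python
import time


def find_jumps(samples, interval, threshold=1.0):
    """Return (index, step) pairs where consecutive wall-clock samples
    differ from the expected interval by more than `threshold` seconds.
    Backward steps show up as negative values."""
    jumps = []
    for i in range(1, len(samples)):
        step = samples[i] - samples[i - 1]
        if abs(step - interval) > threshold:
            jumps.append((i, step))
    return jumps


def watch_clock(interval=0.1, threshold=1.0, count=100):
    """Sample time.time() `count` times and print any large jumps."""
    samples = []
    for _ in range(count):
        samples.append(time.time())
        time.sleep(interval)
    for index, step in find_jumps(samples, interval, threshold):
        print("sample %d: clock stepped by %+.3f s" % (index, step))
```

Calling `watch_clock()` while the machine alternates between load and idle would print one line per suspicious step. The threshold should stay well above normal scheduling delays, otherwise a busy system produces false positives.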
  
Actual results:

Bad time: the system clock jumps forward and back.

Expected results:

System time should advance smoothly.

Additional info:
Comment 1 Christopher P Johnson 2005-09-13 20:35:21 EDT
Created attachment 118780
program to print time jumps
Comment 2 Christopher P Johnson 2005-09-13 20:37:15 EDT
Created attachment 118781
Results showing time jumps
Comment 3 Andrius Benokraitis 2005-09-14 00:14:50 EDT
This is a known issue and a fix has been proposed for (late) inclusion in RHEL4
U2. It is not known if this fix is confirmed to go into RHEL4 U2 or U3 due to
testing and QA timelines. Are you willing to assist in testing a beta kernel if
possible?
Comment 4 Andrius Benokraitis 2005-09-14 00:55:35 EDT

*** This bug has been marked as a duplicate of 158847 ***
Comment 5 Milan Kerslager 2005-09-23 05:06:41 EDT
This bug is marked as a duplicate of private bug #158847, so I am reopening this one.

My BIOS only offers AMD Cool'n'Quiet, and losing ticks makes no sense here.
I've got these messages from the kernel (included so they can be searched for):

Losing some ticks... checking if CPU frequency changed.
...
warning: many lost ticks.
Your time source seems to be instable or some driver is hogging interupts
rip default_idle+0x20/0x23

I'm able to test a beta kernel. I'll try the one from the U2 Beta channel. Or
is there another kernel available for testing?
Comment 6 Jason Baron 2005-09-23 13:52:21 EDT
No, that's the one.
Comment 7 Brett Morrow 2005-09-28 15:20:04 EDT
I have the same problem, and I do have the option to turn off PowerNow, but that
does not help any.  I still get the problem and the errors.  

(With the released WS4 and Beta WS4)
Comment 8 Brian Maly 2005-09-28 15:26:03 EDT
Which time source is being used (PMTimer, TSC, PIT)?

Run "dmesg | grep time.c".

PMTimer is the preferred timekeeping mechanism on AMD systems.
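
For anyone collecting this from several machines, the same check can be scripted. The sketch below only parses text that has already been captured from dmesg; the pattern is based on the "time.c: Using ..." lines quoted in this report, and other kernel versions may print different text. The function name `timesource_lines` is hypothetical.

```python
import re

# Pattern based on the time.c lines quoted in this report; this is an
# assumption about the message format, not a documented interface.
_TIME_C = re.compile(r"^time\.c: Using (.+?)\.?$")


def timesource_lines(dmesg_text):
    """Return the 'time.c: Using ...' descriptions found in dmesg output."""
    found = []
    for line in dmesg_text.splitlines():
        match = _TIME_C.match(line.strip())
        if match:
            found.append(match.group(1))
    return found
```

Given the output of `dmesg` as a string, the function returns entries such as "PM based timekeeping", which is enough to tell at a glance which machines are still on the PIT/TSC path.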
Comment 9 Milan Kerslager 2005-09-28 17:15:12 EDT
Here: time.c: Using PIT/TSC based timekeeping (2.6.9-11.ELsmp, x86_64 kernel).
Comment 10 Brett Morrow 2005-09-28 17:54:23 EDT
dmesg | grep time.c
time.c: Using 1.193182 MHz PIT timer.
time.c: Detected 2600.048 MHz processor.

uname -a
Linux ori.protect.nssl 2.6.9-17.EL #1 Fri Aug 26 10:54:28 EDT 2005 x86_64 x86_64
x86_64 GNU/Linux
Comment 11 Milan Kerslager 2005-09-29 05:47:41 EDT
With 2.6.9-17.ELsmp it seems to have stopped losing ticks. It seems to solve the
problem completely here (kernel-smp-2.6.9-17.EL.x86_64 still detects PIT/TSC based
timekeeping). Thank you!
Comment 12 Brian Maly 2005-09-29 10:29:50 EDT
It's worth mentioning that dual-core SMP PowerNow support made it into the U2
kernel rather late. This included a PowerNow driver update for dual-core SMP
support, as well as a handful of timer fixes for PMTimer, HPET, TSC and PIT.
You likely need a 2.6.9-22.EL kernel to solve all PowerNow and/or timekeeping
issues. The 2.6.9-17.EL kernel did not include these timer mods. Also, the timer
code works differently under SMP.

BTW, feedback on the 2.6.9-22.EL kernel (whether or not it solves
the problem) would be helpful.

Comment 13 Brett Morrow 2005-09-29 13:39:49 EDT
Running kernel 2.6.9-22.ELsmp with PowerNow turned on, I still get the messages:

kernel: warning: many lost ticks.
Sep 29 12:33:37 ori kernel: Your time source seems to be instable or some driver is hogging interupts
Sep 29 12:33:37 ori kernel: rip __do_softirq+0x4d/0xd0

(although not as frequently.)

Comment 14 Brian Maly 2005-09-29 14:11:47 EDT
Does the system keep good time with the .22 kernel regardless of the "lost ticks"
message?

Usually the "many lost ticks" message is a symptom of another problem. The
message is printed when the Linux kernel corrects for lost ticks (which is what
you want to happen). The point is that an occasional "many lost ticks" message is
probably not much to worry about, but if it recurs very often, it's a
concern.
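
As a rough illustration of what "correcting for lost ticks" means, here is a simplified model (not the kernel's actual code): if a timer interrupt is expected once per tick period but an independent counter shows that more than one period elapsed since the previous interrupt, the extra whole periods are counted as lost ticks and added back. All names and the 1000 Hz tick rate below are assumptions for the example.

```python
TICK_US = 1000  # assumed tick period: a 1000 Hz timer, i.e. 1000 microseconds


def lost_ticks(elapsed_us, tick_us=TICK_US):
    """Number of whole ticks missed if `elapsed_us` microseconds passed
    between two consecutive timer interrupts."""
    if elapsed_us < tick_us:
        return 0
    return elapsed_us // tick_us - 1


def advance(jiffies, elapsed_us, tick_us=TICK_US):
    """Advance a tick counter by the current tick plus any lost ticks."""
    return jiffies + 1 + lost_ticks(elapsed_us, tick_us)
```

In this model, if `elapsed_us` is derived from a cycle counter whose assumed frequency no longer matches the CPU after a frequency change, the computed count is wrong in one direction or the other. That would be consistent with the reporter's guess, although this report does not confirm the mechanism.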
Comment 15 Milan Kerslager 2005-09-29 16:47:24 EDT
Well, I have no access to the .22 kernel. Also, I have had Cool'n'Quiet
(IIRC the name) disabled the whole time, and there is no PowerNow setting in the BIOS.
The .17 kernel completely avoids the 'lost ticks' messages in my kernel log (although
there is no load on this machine at the moment, as its deployment has been
postponed due to the bugs), while .11 always has the problem (a minute or so after
boot, messages about lost ticks appear in the kernel log). I have no physical
access to the machine at present :-(

I'm waiting for U2, but if I get access to a .22 kernel for testing, I can
reboot and test (I'll try to generate some load for .17, though).

AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ on 2.6.9-17.ELsmp x86_64, ASUS A8N-E.
Comment 16 Brett Morrow 2005-09-29 17:26:33 EDT
Yes, the system does keep the proper time.  I wanted to make sure by giving it
several hours to go wrong if it was going to happen.  

Here is the link to the kernel I am using:

http://people.redhat.com/~jbaron/rhel4/
Comment 17 Milan Kerslager 2005-09-30 05:12:39 EDT
After moving from the .17 kernel to the .22 kernel from the URL above,
timekeeping changed from PIT/TSC based to PM based. No lost ticks so far, so
everything has been OK here since .17. The bug could be closed now or just after
the U2 release (how long will that take?).

time.c: Using PM based timekeeping.
