Bug 65106

Summary: kernel experiences strange clock glitches
Product: [Retired] Red Hat Linux
Component: kernel
Version: 7.3
Hardware: athlon
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: medium
Reporter: Eric Seppanen <eds>
Assignee: Arjan van de Ven <arjanv>
QA Contact: Brian Brock <bbrock>
CC: dedourek, redhat
Doc Type: Bug Fix
Last Closed: 2004-09-30 15:39:35 UTC

Attachments:
Output of date command (flags: none)

Description Eric Seppanen 2002-05-17 17:07:28 UTC
Description of Problem:

The update kernel 2.4.18-4 causes my machine to experience strange jumps in the
clock (yes, the computer thinks the time is jumping forward and backward) that
cause the following strange behaviors:
- the screen saver suddenly activates while I'm using the keyboard/mouse.
- video players such as Xine and Realplayer8 stutter and hang.
- the clock displayed on the X login screen hops around, displaying odd times,
as the username and password are typed (and these jumps often cause a timeout
where the entered info is erased).

If I boot up with the old kernel, 2.4.18-3, these problems go away.  Most of the
symptoms are intermittent but using Xine fails pretty much all the time.

This is an Athlon 900 on a Tyan motherboard (Via chipset). It's been running 7.2
stably for a long time.

Version-Release number of selected component (if applicable):
2.4.18-4.athlon

Comment 1 Arjan van de Ven 2002-05-21 09:54:12 UTC
Interesting behavior; time to read through all the changes again to search for 
something that could cause this

Comment 2 Arjan van de Ven 2002-05-21 10:06:02 UTC
ok can you try to get the output of
rpm -q --qf %{arch} kernel-2.4.18-3
and
rpm -q --qf %{arch} kernel-2.4.18-4

and see whether the output differs?
(Note that the output does not end with a newline, so the prompt will appear on the same line.)

Comment 3 Eric Seppanen 2002-05-21 16:57:27 UTC
Both are athlon packages.

[eds@kong eds]$ rpm -q --qf %{arch} kernel-2.4.18-3
athlon[eds@kong eds]$ rpm -q --qf %{arch} kernel-2.4.18-4
athlon
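
The fused `athlon[eds@kong eds]$` prompt above is just the missing newline mentioned in Comment 2. A minimal sketch of the same query with a newline added to the format string, assuming `rpm` is on the PATH; the `kernel_arch` helper name is hypothetical and not part of rpm:

```shell
# kernel_arch PACKAGE: print the architecture of an installed package,
# followed by a newline. The helper name is illustrative only.
kernel_arch() {
    # Single quotes keep the shell from touching the backslash;
    # rpm itself expands \n inside the query format.
    rpm -q --qf '%{ARCH}\n' "$1"
}

# Usage (requires the packages to be installed):
#   kernel_arch kernel-2.4.18-3
#   kernel_arch kernel-2.4.18-4
```

With the newline in place, each result lands on its own line, which makes the two outputs easier to compare.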

Comment 4 Eric Seppanen 2002-05-21 17:01:18 UTC
Check out this linux-kernel thread, this sounds like _exactly_ what I'm experiencing:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0203.3/0557.html

Comment 5 Eric Seppanen 2002-05-22 02:21:08 UTC
I've been banging on this system a bit and have a bit more info:

1. It also happens (but is a little harder to trigger) on the shipping kernel
(2.4.18-3).

2. It's definitely not a userspace bug.  I've added a small bit of logging code
to the kernel and I can see that there's a problem where do_gettimeofday goes
nuts and returns strange values.  I'm proceeding on the assumption that this is
a via chipset hardware bug that the kernel will need to work around.  Mail me
offline if you want to see the test code that I'm working with.

3. It seems to be triggered by IO, processor, or memory load.  My current method
of reproducing it is to play back an MPEG file in Xine, and then create a new
5000x5000 image in Gimp.

4. Once triggered, it's suddenly very persistent.  It recurs every few seconds
once I've reproduced it, with symptoms so severe (e.g. the screensaver kicks in
every few seconds) that I need to reboot.

5. I have seen several reports of similar symptoms on linux-kernel in the last
2-3 months.  The Via chipset seems to be the common factor.  Most generate little
discussion, and though there are a few patches floating around, most are
variations on some code already present in 2.4.18 (which does not fix the problem).

I'm going to continue to work on this problem.  I fear there could be a good
number of machines out there that might be affected (I have three myself).

Comment 6 Arjan van de Ven 2002-06-10 10:06:08 UTC
Are you running NTP by chance?

Comment 7 Eric Seppanen 2002-06-10 16:48:47 UTC
I do not run NTP while I'm debugging this problem.  I've banged on it enough to
see that there's something haywire in the kernel, but I'm unable to nail it
down.  The timer interrupt is still alive, but there's something wrong with the
kernel's ability to figure out the time offset since the last interrupt.  I
still think there's some sort of via-chipset hardware issue involved, because
reinitializing the timer is necessary in order to get the system working
correctly after the problem has occurred.

I posted more details (and a patch) to linux-kernel:
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0205.2/1405.html

Comment 8 William Hooper 2003-01-14 23:53:07 UTC
At the risk of sounding like "me too".... me too!

I see the same jump (71 minutes into the future), but unfortunately have 
different hardware.

PC Chips M748LMRT Motherboard, completely SiS based
300 MHz Celeron
3Com 905 PCI Nic

I have checked and the hardware clock works fine, but the Linux clock jumps 
around.

I've tried RH 7.3 and 8.0 with their respective default kernels and newest
errata kernels.

FreeBSD 5.0RC2 seems to be immune (so maybe some code comparison can be made?)
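
For scale, a jump of about 71 minutes is close to 2^32 microseconds. Whether a wrapping 32-bit microsecond value is actually behind the jump is an assumption that nothing in this report confirms; the arithmetic below only shows how close the two numbers are (variable names are illustrative):

```shell
# 2^32 microseconds expressed in whole seconds and minutes (integer math).
wrap_us=$(( 1 << 32 ))             # 4294967296 microseconds
wrap_s=$(( wrap_us / 1000000 ))    # whole seconds
wrap_min=$(( wrap_s / 60 ))        # whole minutes
wrap_rem=$(( wrap_s % 60 ))        # leftover seconds
echo "${wrap_min} min ${wrap_rem} s"
```

This prints `71 min 34 s`, which is at least consistent with the size of the jump described above.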

Comment 9 John DeDourek 2003-01-15 15:40:31 UTC
Is it possible that this bug is related to bug 76499?  Note that in
76499 the time jumps are NOT always 71 minutes.

Comment 10 William Hooper 2003-01-15 23:32:04 UTC
Created attachment 89397
Output of date command

Output of

$ while (true); do date; done
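
Scanning that output by eye gets tedious. A minimal sketch of a filter that flags the jumps automatically, assuming timestamps are fed in as whole epoch seconds, one per line (for example from `date +%s`); the `detect_jumps` name and the default threshold of 5 seconds are illustrative choices, not anything taken from the attachment:

```shell
# detect_jumps [THRESHOLD]
# Reads one epoch timestamp (whole seconds) per line on stdin and prints a
# line for every step that goes backwards or advances by more than
# THRESHOLD seconds (default 5).
detect_jumps() {
    threshold=${1:-5}
    prev=
    while read -r now; do
        if [ -n "$prev" ]; then
            delta=$(( now - prev ))
            if [ "$delta" -lt 0 ] || [ "$delta" -gt "$threshold" ]; then
                echo "jump: $prev -> $now (${delta}s)"
            fi
        fi
        prev=$now
    done
}

# Usage (runs until interrupted):
#   while true; do date +%s; sleep 1; done | detect_jumps 5
```

Because it only compares consecutive readings, it reports both the forward jump and the later jump back, and stays silent while the clock advances normally.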

Comment 11 William Hooper 2003-01-15 23:34:34 UTC
In my case it is not like bug 76499.  The clock JUMPS, it doesn't drift.  Then
it jumps back (I do have ntpd running, though I believe the behavior is the same
either way).  See the attachment for an example.

It is also weird because the Gnome clock will start reading 71 minutes fast, 
but be right when I bring up the properties, then jump forward after closing 
the properties.

Comment 12 William Hooper 2003-01-25 20:51:52 UTC
Another update:

I've been running the Phoebe2 beta on this hardware for the past couple of days 
and the problem seems to be fixed.  The standard 2.4.20 kernel (with RH 7.3) 
had the problem, but the 2.4.20 kernel with the beta has it fixed.  Any ideas 
what changed that might have fixed it?

Comment 13 Bugzilla owner 2004-09-30 15:39:35 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/