Bug 65106
Summary: | kernel experiences strange clock glitches | ||||||
---|---|---|---|---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Eric Seppanen <eds> | ||||
Component: | kernel | Assignee: | Arjan van de Ven <arjanv> | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Brian Brock <bbrock> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 7.3 | CC: | dedourek, redhat | ||||
Hardware: | athlon | ||||||
OS: | Linux | ||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Last Closed: | 2004-09-30 15:39:35 UTC | Type: | --- | ||||
Description
Eric Seppanen
2002-05-17 17:07:28 UTC
Interesting behavior; time to read through all the changes again to search for something that could cause this.

OK, can you try to get the output of
rpm -q --qf %{arch} kernel-2.4.18-3
and
rpm -q --qf %{arch} kernel-2.4.18-4
and see if the output is different? (Note that the output will not end with a newline, so the prompt will appear on the same line.)

Both are athlon packages.
[eds@kong eds]$ rpm -q --qf %{arch} kernel-2.4.18-3
athlon[eds@kong eds]$ rpm -q --qf %{arch} kernel-2.4.18-4
athlon

Check out this linux-kernel thread; this sounds like _exactly_ what I'm experiencing:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0203.3/0557.html

I've been banging on this system a bit and have a bit more info:
1. It also happens (but is a little harder to trigger) on the shipping kernel (2.4.18-3).
2. It's definitely not a userspace bug. I've added a small bit of logging code to the kernel and I can see that there's a problem where do_gettimeofday goes nuts and returns strange values. I'm proceeding on the assumption that this is a VIA chipset hardware bug that the kernel will need to work around. Mail me offline if you want to see the test code that I'm working with.
3. It seems to be triggered by I/O, processor, or memory load. My current method of reproducing it is to play back an MPEG file in Xine, and then create a new 5000x5000 image in Gimp.
4. Once triggered, it's suddenly very persistent. It recurs every few seconds once I've reproduced it, with symptoms so severe (e.g. the screensaver kicks in every few seconds) that I need to reboot.
5. I have seen several reports of similar symptoms on linux-kernel in the last 2-3 months. The VIA chipset seems to be the common factor. Most generate little discussion, and though there are a few patches floating around, most are variations on some code already present in 2.4.18 (that does not fix the problem).

I'm going to continue to work on this problem.
I fear there could be a good number of machines out there that might be affected (I have three myself).

Are you running NTP by chance?

I do not run NTP while I'm debugging this problem. I've banged on it enough to see that there's something haywire in the kernel, but I'm unable to nail it down. The timer interrupt is still alive, but there's something wrong with the kernel's ability to figure out the time offset since the last interrupt. I still think there's some sort of VIA chipset hardware issue involved, because reinitializing the timer is necessary in order to get the system working correctly after the problem has occurred. I posted more details (and a patch) to linux-kernel:
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0205.2/1405.html

At the risk of sounding like "me too"... me too! I see the same jump (71 minutes into the future), but unfortunately have different hardware:
PC Chips M748LMRT motherboard, completely SiS based
300 MHz Celeron
3Com 905 PCI NIC
I have checked and the hardware clock works fine, but the Linux clock jumps around. I've tried RH 7.3 and 8.0 with their respective default kernels and newest errata kernels. FreeBSD 5.0RC2 seems to be immune (so maybe some code comparison can be made?).

Is it possible that this bug is related to bug 76499? Note that there (76499) the time jumps are NOT always 71 minutes.

Created attachment 89397
Output of date command
Output of
$ while (true); do date; done
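
The date loop above prints every reading and leaves spotting the glitch to the eye. A minimal sketch of the same idea in Python is given here: it samples the wall clock and reports only the steps that go backwards or exceed a threshold. The function names, the 1-second threshold, and the 5-second tolerance are assumptions made for illustration and are not taken from the reporter's test code. The sketch also flags steps close to 2**32 microseconds (4294.967296 seconds, about 71.6 minutes), because that value is near the 71-minute jumps reported here; this is an observation about the arithmetic, not a confirmed diagnosis of this bug.

```python
import time

# 2**32 microseconds expressed in seconds (about 71.58 minutes).
WRAP_SECONDS = 2**32 / 1_000_000


def find_jumps(samples, max_step=1.0):
    """Return (index, delta) for each pair of consecutive samples whose
    difference is negative or larger than max_step seconds."""
    jumps = []
    for i in range(1, len(samples)):
        delta = samples[i] - samples[i - 1]
        if delta < 0 or delta > max_step:
            jumps.append((i, delta))
    return jumps


def near_wrap(delta, tolerance=5.0):
    """True if the size of delta is within `tolerance` seconds of
    2**32 microseconds, regardless of direction."""
    return abs(abs(delta) - WRAP_SECONDS) <= tolerance


def poll_clock(count, interval=0.01):
    """Collect `count` wall-clock readings, sleeping `interval` seconds
    between readings."""
    samples = []
    for _ in range(count):
        samples.append(time.time())
        time.sleep(interval)
    return samples
```

Running find_jumps(poll_clock(1000)) while the machine is under load would list any suspicious steps, and near_wrap can then be applied to each reported delta. On a healthy clock the list should normally be empty, although an NTP step adjustment during the run could also show up as a jump.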
In my case it is not like bug 76499. The clock JUMPS, it doesn't drift. Then it jumps back (I do have ntpd running, though I believe the behavior is the same either way). See the attachment for an example. It is also weird because the GNOME clock will start reading 71 minutes fast, but be right when I bring up the properties, then jump forward after closing the properties.

Another update: I've been running the Phoebe2 beta on this hardware for the past couple of days and the problem seems to be fixed. The standard 2.4.20 kernel (with RH 7.3) had the problem, but the 2.4.20 kernel with the beta has it fixed. Any ideas what changed that might have fixed it?

Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/