From Bugzilla Helper: User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.8) Gecko/20061105 Red Hat/1.5.0.8-0.1.el4 Firefox/1.5.0.8 pango-text Description of problem: When I boot the xen kernel 2.6.18-1.2961.el5xen, there are many printk message console: Jan 17 18:59:27 localhost kernel: Timer ISR/0: Time went backwards: delta=-14251821 delta_cpu=69748179 shadow=14423492596895 off=785179739 processed=14424292000000 cpu_processed=14424208000000 Jan 17 18:59:27 localhost kernel: 0: 14424208000000 Jan 17 18:59:27 localhost kernel: 1: 14424292000000 Jan 17 18:59:32 localhost kernel: printk: 55 messages suppressed. Jan 17 18:59:32 localhost kernel: Timer ISR/1: Time went backwards: delta=-57603086 delta_cpu=82396914 shadow=14428863196453 off=423229749 processed=14429344000000 cpu_processed=14429204000000 Jan 17 18:59:32 localhost kernel: 0: 14429344000000 Jan 17 18:59:32 localhost kernel: 1: 14429204000000 These messages shows once in every seconds. And the system clock run about 3 times faster then normal. The same problem happened when I install FC6 xen kernel. My machines is a ASUS A8F series laptop, with T2060 1.6G Centrino Duo CPU. Version-Release number of selected component (if applicable): kernel-xen-2.6.18-1.2961.el5 How reproducible: Always Steps to Reproduce: 1. Install RHEL5 beta 2 2. Boot with xen kernel 3. Wait a few seconds Actual Results: There are many printk message console (the numbers are different every time): Jan 17 18:59:27 localhost kernel: Timer ISR/0: Time went backwards: delta=-14251821 delta_cpu=69748179 shadow=14423492596895 off=785179739 processed=14424292000000 cpu_processed=14424208000000 Jan 17 18:59:27 localhost kernel: 0: 14424208000000 Jan 17 18:59:27 localhost kernel: 1: 14424292000000 Jan 17 18:59:32 localhost kernel: printk: 55 messages suppressed. Jan 17 18:59:32 localhost kernel: Timer ISR/1: Time went backwards: delta=-57603086 delta_cpu=82396914 shadow=14428863196453 off=423229749 processed=14429344000000 cpu_processed=14429204000000 Jan 17 18:59:32 localhost kernel: 0: 14429344000000 Jan 17 18:59:32 localhost kernel: 1: 14429204000000 Expected Results: No such messages. And system clock run correctly. Additional info: May be similar to this: http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=195
Created attachment 145779 [details] My sysreport It is the sysreport of my laptop. Don't know if it is useful.
Could you try the latest release (>3002) to see if it makes a difference.
I update the xen kernel to a nightly build 20070120. The kernel version kernel-xen-2.6.18-4.el5xen. The problem still exist. [root@localhost Server]# rpm -qpi kernel-xen-2.6.18-4.el5.i686.rpm warning: kernel-xen-2.6.18-4.el5.i686.rpm: Header V3 DSA signature: NOKEY, key ID 37017186 Name : kernel-xen Relocations: (not relocatable) Version : 2.6.18 Vendor: Red Hat, Inc. Release : 4.el5 Build Date: Thu 18 Jan 2007 12:50:41 PM HKT Install Date: (not installed) Build Host: ls20-bc1-14.build.redhat.com Group : System Environment/Kernel Source RPM: kernel-2.6.18-4.el5.src.rpm Size : 36768128 License: GPLv2 Signature : DSA/SHA1, Thu 18 Jan 2007 03:36:19 PM HKT, Key ID 5326810137017186 Packager : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla> Summary : The Linux kernel compiled for Xen VM operations Description : This package includes a version of the Linux kernel which runs in Xen VM. It works for both priviledged and unpriviledged guests. [root@localhost Server]# uname -a Linux localhost.localdomain 2.6.18-4.el5xen #1 SMP Wed Jan 17 23:31:02 EST 2007 i686 i686 i386 GNU/Linux
change QA contact
Seeing this on RHEL5.1-Server-20071017.0 (kernel-xen-2.6.18-53.el5, xen-3.0.3-41.el5) on an Intel Bensley (quad-core harpertown). [ cpu flags ] flags : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm [ /var/log/messages output ] Oct 23 16:09:24 localhost kernel: printk: 88 messages suppressed. Oct 23 16:09:24 localhost kernel: Timer ISR/1: Time went backwards: delta=-16292353 delta_cpu=11707647 shadow=12259248375707 off=431476910 processed=12259696000000 cpu_processed=12259668000000 Oct 23 16:09:24 localhost kernel: 0: 12259692000000 Oct 23 16:09:24 localhost kernel: 1: 12259668000000 Oct 23 16:09:24 localhost kernel: Timer ISR/1: Time went backwards: delta=-16968342 delta_cpu=11031658 shadow=12259248375707 off=502656161 processed=12259768000000 cpu_processed=12259740000000 Oct 23 16:09:24 localhost kernel: 0: 12259764000000 Oct 23 16:09:24 localhost kernel: 1: 12259740000000 Oct 23 16:09:24 localhost kernel: Timer ISR/1: Time went backwards: delta=-14239850 delta_cpu=13760150 shadow=12259248375707 off=601385025 processed=12259864000000 cpu_processed=12259836000000 Oct 23 16:09:24 localhost kernel: 0: 12259860000000 Oct 23 16:09:24 localhost kernel: 1: 12259836000000 Oct 23 16:09:24 localhost kernel: Timer ISR/1: Time went backwards: delta=-12237891 delta_cpu=11762109 shadow=12259248375707 off=611386633 processed=12259872000000 cpu_processed=12259848000000 Oct 23 16:09:24 localhost kernel: 0: 12259872000000 Oct 23 16:09:24 localhost kernel: 1: 12259848000000
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
OK. It looks like this problem still persists through to 5.1....any chance you can try out 5.2 on these boxes and see if there is any difference? There have been a few timer related fixes that went in, but I'm not sure if they will help or not. If 5.2 does *not* help, please get the output from "xm dmesg" and "dmesg" on the dom0, and put it here as attachments. Thanks, Chris Lalancette
This request was previously evaluated by Red Hat Product Management for inclusion in the current Red Hat Enterprise Linux release, but Red Hat was unable to resolve it in time. This request will be reviewed for a future Red Hat Enterprise Linux release.
Several timer fixes have been made in RHEL 5.2 and this problem should be fixed. Can you please try the -90 kernel to verify? Thanks.
I try -90 kernel. The result is interesting that the system clock run about 2 times SLOWER then normal this time. And similar messages are printed: Apr 22 23:21:39 localhost kernel: Warning Timer ISR/0: Time went backwards: delta=-25347748 delta_cpu=26652252 shadow=80668539637 off=278114507 processed=80972000000 cpu_processed=80920000000 Apr 22 23:21:39 localhost kernel: 0: 80920000000 Apr 22 23:21:39 localhost kernel: 1: 80972000000 Apr 22 23:21:44 localhost kernel: printk: 241 messages suppressed. Apr 22 23:21:44 localhost kernel: Warning Timer ISR/1: Time went backwards: delta=-16039717 delta_cpu=95960283 shadow=85611897209 off=576064630 processed=86204000000 cpu_processed=86092000000 Apr 22 23:21:44 localhost kernel: 0: 86204000000 Apr 22 23:21:44 localhost kernel: 1: 86092000000 Apr 22 23:21:50 localhost kernel: printk: 107 messages suppressed. FYI, I try using Fedora kernel 2.6.21-2957.fc8xen before but it suffered from the "system clock 3 times faster" problem. But recently I try 2.6.21.7-3.fc8xen and the problem seems fixed, the system clock run normally. In the boot process of the -90 kernel, I found this line: Apr 22 23:20:36 localhost kernel: Calibrating delay using timer specific routine.. 3991.91 BogoMIPS (lpj=7983834) while in the boot process of 2.6.21.7-3.fc8xen kernel, I found these 2 lines: Apr 22 23:23:16 localhost kernel: Calibrating delay using timer specific routine.. 3194.11 BogoMIPS (lpj=1597057) <snip> Apr 22 23:23:16 localhost kernel: Initializing CPU#1 Apr 22 23:23:16 localhost kernel: Calibrating delay using timer specific routine.. 3191.58 BogoMIPS (lpj=1595790) I am not sure if it is related this problem.
Oh yeah, that is a big problem; being 800MHz off will cause all kinds of havoc. And that might also be related to another problem we saw calibrating the timers. 2 questions: 1) Can you upload the full dmesg for the 2.6.18-90.el5xen kernel and the 2.6.21.7-3.fc8xen kernel, as well as the "xm dmesg" from both? 2) Do you know what the actual MHz of the machine is supposed to be? I'm assuming it should be around 3.2GHz, but confirming that might be good. You can either go by the hardware description, or boot a bare-metal (non-xen) kernel and get the output from /proc/cpuinfo Thanks, Chris Lalancette
Created attachment 303423 [details] cpuinfo from 2.6.23.1-42.fc8 non-xen kernel
Created attachment 303433 [details] dmesg and xm dmesg of 2.6.18-90.el5xen and 2.6.21.7-3.fc8xen kernel
Took another quick look at this. My guess is that the difference in behavior between the F-8 kernel and the RHEL-5 kernel has to do with the fact that the F-8 kernel you are using is choosing the ACPI PM Timer, while RHEL-5 is choosing the PIT timer. Now, at one point we tried to turn on the ACPI PM Timer in RHEL-5 as well, but we found that it caused some problems on certain boxes. It's something we'll want to investigate again, but until we fix those problems we can't turn it on in RHEL-5. Chris Lalancette
*** Bug 445363 has been marked as a duplicate of this bug. ***
Is there a workaround for this issue? I am seeing it on all of my 32-bit servers and it is creating an issue with Likewise Enterprise--the clock skew is preventing some Active Directory-related processes from operating correctly.
Another note of relevance... I am seeing this issue on 5.4.
(In reply to comment #23) > Is there a workaround for this issue? I am seeing it on all of my 32-bit > servers and it is creating an issue with Likewise Enterprise--the clock skew is > preventing some Active Directory-related processes from operating correctly. Hi Thomas, I'm sorry there's no work-around that I know of at this time. Since the time of the reporting of this bug (over 3 years ago), it's very rarely been seen. I've only seen it once and that was a machine that had flakey hardware. I saw lots of MCEs in /var/log/messages and when I ran the mcelog command periodically on a bare-metal kernel. Is your system behaving ok other than this problem? What exactly do you have for hardware? Andrew
Both servers are operating correctly otherwise, and both are Dell PowerEdge 2600s. I updated the BIOS and some other firmware on both to rule out problems there, the problem reoccurred on both after the updates. Interestingly, this issue doesn't occur when ntpd is disabled. As a workaround, I disabled ntpd and added a cronjob to run ntpdate every 4 hours--the time is never more than milliseconds off.
*** Bug 551153 has been marked as a duplicate of this bug. ***
Has an interested parties been able to reproduce this lately? If not, then I would like to close it out until we can.
Closing as insufficient data, since we don't have a reliable reproducer. If we cross it's path again, and can reproduce it, then we can reopen the bug at that time.
Clearing out old flags for reporting purposes. Chris Lalancette