Escalated to Bugzilla from IssueTracker
Hi We have just built 4 new DL580 G5 servers with Red Hat Enterprise Linux AS release 4 Update 6 and appear to be having an issue with the system time accelerating forward. To rule out the ntp time source I stopped the ntpd service, reset the system time and synchronised the hardware clock with the system time. After approximately 5 minutes the system time began to rapidly jumps forward(after 30min the time jumped 2 hours 5 minutes). This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
File uploaded: sosreport-lteena04-640901-7d1b11.tar.bz2 This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958 it_file 132940
Hi Sebastien, As requested I have temporarily removed the "clock=tsc" entry from the grub.conf file and restarted the server, I have monitored the system time on the box for the past 20 minutes and the time is still correct. Lee Internal Status set to 'Waiting on Support' Status set to: Waiting on Tech This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
Hello, My customer uses a HP DL580 G5 with Xeon X7350 processor. This machine runs under RHEL4.6 and shortly after boot they see that system clock drifts forward. They have to use TSC as software clock and using another clock source isn't a reliable solution. I just asked them as a test purpose to boot with PIT and with this clock source system was fine. Apart from the sosreport attached, do you need something else to work on this issue ? Cheers, Sebastien. Issue escalated to Support Engineering Group by: saime. Internal Status set to 'Waiting on SEG' Summary edited. This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
Sebastien With clock=tsc removed from the kernel command line, the DL580 G5 appears to default to HPET rather than PM-Timer. [root@ltedla02 ~]# zegrep -i "clock|timer|hpet|tsc|pmtmr" /var/log/messages.2.gz May 14 02:45:44 ltedla02 kernel: ServerWorks chipset detected. Disabling timer routing over 8254. May 14 02:45:44 ltedla02 kernel: ACPI: PM-Timer IO Port: 0x908 May 14 02:45:44 ltedla02 kernel: ACPI: HPET id: 0x8086a201 base: 0xfed00000 May 14 02:45:44 ltedla02 kernel: Using HPET for base-timer May 14 02:45:44 ltedla02 kernel: Using HPET for gettimeofday May 14 02:45:44 ltedla02 kernel: Using hpet for high-res timesource May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5870.28 BogoMIPS (lpj=2935142) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5866.72 BogoMIPS (lpj=2933362) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5866.85 BogoMIPS (lpj=2933428) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5866.68 BogoMIPS (lpj=2933340) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5866.62 BogoMIPS (lpj=2933311) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5866.77 BogoMIPS (lpj=2933387) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5866.66 BogoMIPS (lpj=2933334) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5964.63 BogoMIPS (lpj=2982315) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5866.67 BogoMIPS (lpj=2933335) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5981.83 BogoMIPS (lpj=2990916) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5866.80 BogoMIPS (lpj=2933400) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5866.84 BogoMIPS (lpj=2933422) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5866.73 BogoMIPS (lpj=2933368) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5866.68 BogoMIPS (lpj=2933340) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5866.73 BogoMIPS (lpj=2933368) May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific routine.. 5866.88 BogoMIPS (lpj=2933441) May 14 02:45:44 ltedla02 kernel: ..TIMER: vector=0x31 pin1=2 pin2=-1 May 14 02:45:44 ltedla02 kernel: checking TSC synchronization across 16 CPUs: May 14 02:45:44 ltedla02 kernel: CPU#7 had 1099354 usecs TSC skew, fixed it up. May 14 02:45:44 ltedla02 kernel: CPU#15 had 0 usecs TSC skew, fixed it up. May 14 02:45:45 ltedla02 kernel: Real Time Clock Driver v1.12 May 14 02:45:16 ltedla02 rc.sysinit: Setting clock (utc): Wed May 14 02:45:16 BST 2008 succeeded [root@ltedla02 ~]# However the software supplier claims to have tested TSC, HPET and pmtmr and claims that TSC still delivers significantly better performance. As this configuration is being used to size future configurations/purchases, we have been requested to try regressing to 4u5 with clock=tsc. We are about to try testing this configuration (personally I suspect we may have exactly the same problem) but unless you have some performance tests on comparative clock scalability that prove otherwise, I believe we will need 4u6 fixed to support clock=tsc "sooner rather than later". Regards, Phil This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
Sebastien I've installed kernel-smp 2.6.9-55.0.12 (the latest RHEL 4u5 errata SMP kernel) on ltedla02 and configured this to boot with "clock=tsc" as discussed. I've also repeated a clock reset sequence on all four EMDS Integration 2 systems to ensure the system times at least start from a known position. First impressions are the 4u5 suffers the same problem. allboxes@sphssa03:/export/home/allboxes $ for X in ltedla02 ltedla03 lteena03 lteena04; do echo -e "\\n<<<$X>>>"; ssh $X "date; uname -a; cat /proc/cmdline; ntpq -np"; date; done <<<ltedla02>>> Wed May 21 16:50:19 BST 2008 Linux ltedla02 2.6.9-55.0.12.ELsmp #1 SMP Wed Oct 17 08:19:30 EDT 2007 i686 i686 i386 GNU/Linux ro root=LABEL=/1 console=tty0 console=ttyS1,115200 clock=tsc remote refid st t when poll reach delay offset jitter ============================================================================== 10.230.202.1 10.230.31.255 5 u 27 64 377 0.219 1.805 636306. Wed May 21 16:30:19 BST 2008 <<<ltedla03>>> Wed May 21 17:35:39 BST 2008 Linux ltedla03 2.6.9-67.0.4.ELsmp #1 SMP Fri Jan 18 05:00:58 EST 2008 i686 i686 i386 GNU/Linux ro root=LABEL=/ console=tty0 console=ttyS1,115200 clock=tsc remote refid st t when poll reach delay offset jitter ============================================================================== 10.230.202.1 10.230.31.255 5 u 649 64 377 0.242 -299919 346306. Wed May 21 16:30:40 BST 2008 <<<lteena03>>> Wed May 21 16:46:01 BST 2008 Linux lteena03 2.6.9-67.0.4.ELsmp #1 SMP Fri Jan 18 05:00:58 EST 2008 i686 i686 i386 GNU/Linux ro root=LABEL=/ console=tty0 console=ttyS1,115200 clock=tsc remote refid st t when poll reach delay offset jitter ============================================================================== 10.230.202.1 10.230.31.255 5 - 9 64 377 0.240 -899706 636211. Wed May 21 16:31:01 BST 2008 <<<lteena04>>> Wed May 21 17:11:21 BST 2008 Linux lteena04 2.6.9-67.0.4.ELsmp #1 SMP Fri Jan 18 05:00:58 EST 2008 i686 i686 i386 GNU/Linux ro root=LABEL=/ console=tty0 console=ttyS1,115200 clock=tsc remote refid st t when poll reach delay offset jitter ============================================================================== 10.230.202.1 10.230.31.255 5 u 27 64 377 0.257 -239932 1481485 Wed May 21 16:31:21 BST 2008 allboxes@sphssa03:/export/home/allboxes $ Do you believe that doing a full 4u5 kickstart will change the behaviour of clock=tsc on a DL580 G5? Regards, Phil This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
--- System reserved --- ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** This System is reserved by saime. To return this system early. You can run the command: return2rhts.sh Ensure you have your logs off the system before returning to RHTS To extend your reservation time. You can run the command: extendtesttime.sh This is an interactive script. You will be prompted for how many hours you would like to extend the reservation. Please use this command responsibly, Everyone uses these machines. For ssh, kvm, serial and power control operations please look here: http://intranet.corp.redhat.com/ic/intranet/RHTSSystemList.html RHTS Test information: HOSTNAME=hp-dl580g5-01.rhts.bos.redhat.com JOBID=23504 RECIPEID=83901 LAB_SERVER=rhts.redhat.com RESULT_SERVER=rhts.redhat.com DISTRO=RHEL4-U6 ARCHITECTURE=i386 ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
--- Internal Note --- By mistake I've installed kernel-2.6.9-70.EL (not smp) and let the machine runnning with it few hours. Results : [root@hp-dl580g5-01 ~]# hwclock Thu 12 Jun 2008 11:58:07 PM EDT -0.015999 seconds [root@hp-dl580g5-01 ~]# date Fri Jun 13 23:03:13 EDT 2008 [root@hp-dl580g5-01 ~]# [root@hp-dl580g5-01 ~]# ntpq -p remote refid st t when poll reach delay offset jitter ============================================================================== 10.16.71.254 66.187.224.4 2 u 1809 1024 377 0.510 -154753 2999118 *LOCAL(0) 73.78.73.84 5 l 25 64 377 0.000 0.000 0.001 I've seen dozens of messages like this int /var/log/messages : Jun 13 05:04:40 hp-dl580g5-01 kernel: set_rtc_mmss: can't update from 54 to 4 This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
It seems to be drifting quickly. Let's do a more accurate test. @ 4:33pm BST # service ntpd stop Shutting down ntpd: [ OK ] # ntpdate clock.redhat.com 12 Jun 11:33:22 ntpdate[5374]: step time server 66.187.224.4 offset -899.758768 sec # service ntpd start ntpd: Synchronizing with time server: [ OK ] Starting ntpd: [ OK ] This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
@ 9:12 am BST [root@hp-dl580g5-01 ~]# date Sun Jun 15 15:21:17 EDT 2008 [root@hp-dl580g5-01 ~]# hwclock Fri 13 Jun 2008 11:46:20 AM EDT -0.016410 seconds [root@hp-dl580g5-01 ~]# ntpq -p remote refid st t when poll reach delay offset jitter ============================================================================== 10.16.71.254 66.187.224.4 2 u 39m 1024 377 1.602 -210839 2399276 *LOCAL(0) 73.78.73.84 5 l 15 64 377 0.000 0.000 0.001 This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
Curious... the drift seems to be happening in increments of 300 seconds. No idea why, at present. This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
Drift is happening more frequently than every 5 minutes, so we can rule out periodic events like SMIs, but it's still highly suspicious that the error is always just short of an exact multiple of 300 seconds. (This is on RHEL 4.6 i386, with nptd and cpuspeed disabled.) This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
Clock is completely stable on RHEL 4.6 x86_64, with and without clock=tsc. This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
Problem reproduces with ntpd, cpuspeed, and acpid disabled. Moving to x86_64 fixes it, as does moving to RHEL 5 i686. This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
Problem reproduces on UP kernel as well. This event sent from IssueTracker by csnook [SEG - Kernel] issue 179958
Escalation summary: When running RHEL 4 i686 on an HP DL580 G5, time spontaneously (and frequently) skips ahead in increments of 300 seconds if the box is booted with 'clock=tsc', even when the system is completely idle. This is independent of ntpd, irqbalance, acpid, and UP/SMP kernel. RHEL 4 x86_64 does not exhibit this problem, nor does RHEL 5 i686.
This is just a shot in the dark, but I bet this is related somehow: include/linux/time.h line 40 /* * Have the 32 bit jiffies value wrap 5 minutes after boot * so jiffies wrap bugs show up earlier. */ #define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))
How many entries are in /proc/cpuinfo (i.e. how many CPU cores)? Also, I assume this is an AMD based system? If so, the "powernow_k8.tscsync=1" flag must be used to prevent TSC drift.
This is a quad-core, quad-socket Intel system. /proc/cpuinfo shows all 16 cores under the smp kernel. The problem reproduces under the UP kernel as well.
Is the skew with SMP the same as the skew on UP? If the system is still reserved, perhaps I can debug this instead of asking for debug info.
The skew is random, but it's still in 300-second increments when it happens, and it seems to occur just as frequently.
Hi, Do you have some news regarding this issue ? Thanks, Seb. This event sent from IssueTracker by saime issue 179958
Continuing to debug this... Looks like we are using HPET as a base timer and TSC for high-res timer. Does hpet=disable boot arg make a difference?
Hi, Thank you for this update. Instead of asking to my customer to do this test, I'll do it myself by reserving hp-dl580g5-01.rhts.bos.redhat.com in RHTS. As far as this time drift issue is fairly easy to reproduce I expect to get the result of the test you asked today, depending on the availability of the machine. If I can't get access to it today, I'll ask to my customer to do the test. Thanks, Sebastien. This event sent from IssueTracker by saime issue 179958
Hi, I got access to hp-dl580g5-01.rhts.bos.redhat.com. To reproduce the issue, start the machine with 'clock=tsc' kernel parameter. Even with ntpd running, time drift occurs quickly : # for i in $(seq 1 10); do date ; sleep 60 ; done Thu Jul 24 04:48:36 EDT 2008 Thu Jul 24 04:53:58 EDT 2008 Thu Jul 24 04:54:58 EDT 2008 Thu Jul 24 05:00:23 EDT 2008 Thu Jul 24 05:01:23 EDT 2008 Thu Jul 24 05:02:23 EDT 2008 Thu Jul 24 05:03:23 EDT 2008 Thu Jul 24 05:09:06 EDT 2008 Thu Jul 24 05:10:06 EDT 2008 Thu Jul 24 05:11:06 EDT 2008 Now let's see what happens if you hpet=disable in combination with clock=tsc. I boot the machine with this line : ... kernel /vmlinuz-2.6.9-67.ELsmp ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200 clock=tsc hpet=disable ... few minutes of break... Now let's check the clock : # for i in $(seq 1 10); do date ; sleep 60 ; done Thu Jul 24 05:11:55 EDT 2008 Thu Jul 24 05:12:55 EDT 2008 Thu Jul 24 05:13:55 EDT 2008 Thu Jul 24 05:14:55 EDT 2008 Thu Jul 24 05:15:55 EDT 2008 Thu Jul 24 05:16:55 EDT 2008 Thu Jul 24 05:17:55 EDT 2008 Thu Jul 24 05:18:55 EDT 2008 Thu Jul 24 05:19:55 EDT 2008 Thu Jul 24 05:20:55 EDT 2008 It looks far better. I'll monitor the machine few hours to make sure system clock is consistent. Now I'd like to know something. Customer is currently running DL580 G5 with 'clock=pit' but it's not suitable for them. Can I suggest them to use 'clock=tsc' and 'hpet=disable' as a better workaround ? What are the drawbacks of this configuration ? Thanks, Sebastien. This event sent from IssueTracker by saime issue 179958
Hi, Few hours later, let's check the system clock. # for i in $(seq 1 10); do date ; sleep 60 ; done Thu Jul 24 09:23:16 EDT 2008 Thu Jul 24 09:24:16 EDT 2008 Thu Jul 24 09:25:16 EDT 2008 Thu Jul 24 09:26:16 EDT 2008 Thu Jul 24 09:27:16 EDT 2008 Thu Jul 24 09:28:16 EDT 2008 Thu Jul 24 09:29:16 EDT 2008 Thu Jul 24 09:30:16 EDT 2008 Thu Jul 24 09:31:16 EDT 2008 Thu Jul 24 09:32:16 EDT 2008 It seems that your suggestion helps a lot. Thanks, Sebastien. This event sent from IssueTracker by saime issue 179958
> Can I suggest them to use 'clock=tsc' and 'hpet=disable' as a better workaround Yes, its a good workaround with no real drawbacks. The issue here seems to be that the base timer and TSC get out of sync.
Updating PM score.
Created attachment 328479 [details] production ready patch that resolves the issue patch to properly account for hpet rollover. this patch corrects the computed offset when hpet rolls over (300 second skew) and additionally fixes a 1 second per hour drift in the lost_ticks accounting.
Committed in 78.29.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-1024.html