Bug 451827

Summary: [rhel 4.6] System Time drifts forward with TSC
Product: Red Hat Enterprise Linux 4 Reporter: Issue Tracker <tao>
Component: kernelAssignee: Brian Maly <bmaly>
Status: CLOSED ERRATA QA Contact: Martin Jenner <mjenner>
Severity: high Docs Contact:
Priority: high    
Version: 4.6CC: mpoole, prarit, tao, vgoyal
Target Milestone: rc   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-05-18 19:15:26 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 461297    
Attachments:
Description Flags
production ready patch that resolves the issue none

Description Issue Tracker 2008-06-17 17:25:24 UTC
Escalated to Bugzilla from IssueTracker

Comment 1 Issue Tracker 2008-06-17 17:25:26 UTC
Hi 
We have just built 4 new DL580 G5 servers with Red Hat Enterprise Linux AS release 4 Update 6 and appear to be having an issue with the system time accelerating forward.
To rule out the ntp time source I stopped the ntpd service, reset the system time and synchronised the hardware clock with the system time. After approximately 5 minutes the system time began to rapidly jumps forward(after 30min the time jumped 2 hours 5 minutes).
This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 2 Issue Tracker 2008-06-17 17:25:27 UTC
File uploaded: sosreport-lteena04-640901-7d1b11.tar.bz2
This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958
it_file 132940

Comment 3 Issue Tracker 2008-06-17 17:25:28 UTC
Hi Sebastien,

As requested I have temporarily removed the "clock=tsc" entry from the
grub.conf file and restarted the server, I have monitored the system time
on the box for the past 20 minutes and the time is still correct.

Lee

Internal Status set to 'Waiting on Support'
Status set to: Waiting on Tech

This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 4 Issue Tracker 2008-06-17 17:25:29 UTC
Hello,

My customer uses a HP DL580 G5 with Xeon X7350 processor. This machine
runs under RHEL4.6 and shortly after boot they see that system clock
drifts forward.
They have to use TSC as software clock and using another clock source
isn't a reliable solution. I just asked them as a test purpose to boot
with PIT and with this clock source system was fine.
Apart from the sosreport attached, do you need something else to work on
this issue ?

Cheers,

Sebastien.


Issue escalated to Support Engineering Group by: saime.
Internal Status set to 'Waiting on SEG'
Summary edited.

This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 5 Issue Tracker 2008-06-17 17:25:30 UTC
Sebastien

With clock=tsc removed from the kernel command line, the DL580 G5 appears
to default to HPET rather than PM-Timer.

[root@ltedla02 ~]# zegrep -i "clock|timer|hpet|tsc|pmtmr"
/var/log/messages.2.gz
May 14 02:45:44 ltedla02 kernel: ServerWorks chipset detected. Disabling
timer routing over 8254.
May 14 02:45:44 ltedla02 kernel: ACPI: PM-Timer IO Port: 0x908
May 14 02:45:44 ltedla02 kernel: ACPI: HPET id: 0x8086a201 base:
0xfed00000
May 14 02:45:44 ltedla02 kernel: Using HPET for base-timer
May 14 02:45:44 ltedla02 kernel: Using HPET for gettimeofday
May 14 02:45:44 ltedla02 kernel: Using hpet for high-res timesource
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5870.28 BogoMIPS (lpj=2935142)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5866.72 BogoMIPS (lpj=2933362)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5866.85 BogoMIPS (lpj=2933428)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5866.68 BogoMIPS (lpj=2933340)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5866.62 BogoMIPS (lpj=2933311)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5866.77 BogoMIPS (lpj=2933387)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5866.66 BogoMIPS (lpj=2933334)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5964.63 BogoMIPS (lpj=2982315)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5866.67 BogoMIPS (lpj=2933335)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5981.83 BogoMIPS (lpj=2990916)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5866.80 BogoMIPS (lpj=2933400)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5866.84 BogoMIPS (lpj=2933422)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5866.73 BogoMIPS (lpj=2933368)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5866.68 BogoMIPS (lpj=2933340)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5866.73 BogoMIPS (lpj=2933368)
May 14 02:45:44 ltedla02 kernel: Calibrating delay using timer specific
routine.. 5866.88 BogoMIPS (lpj=2933441)
May 14 02:45:44 ltedla02 kernel: ..TIMER: vector=0x31 pin1=2 pin2=-1
May 14 02:45:44 ltedla02 kernel: checking TSC synchronization across 16
CPUs:
May 14 02:45:44 ltedla02 kernel: CPU#7 had 1099354 usecs TSC skew, fixed
it up.
May 14 02:45:44 ltedla02 kernel: CPU#15 had 0 usecs TSC skew, fixed it
up.
May 14 02:45:45 ltedla02 kernel: Real Time Clock Driver v1.12
May 14 02:45:16 ltedla02 rc.sysinit: Setting clock  (utc): Wed May 14
02:45:16 BST 2008 succeeded
[root@ltedla02 ~]#

However the software supplier claims to have tested TSC, HPET and pmtmr
and claims that TSC still delivers significantly better performance.

As this configuration is being used to size future
configurations/purchases, we have been requested to try regressing to 4u5
with clock=tsc. We are about to try testing this configuration (personally
I suspect we may have exactly the same problem) but unless you have some
performance tests on comparative clock scalability that prove otherwise, I
believe we will need 4u6 fixed to support clock=tsc "sooner rather than
later".

Regards, Phil


This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 6 Issue Tracker 2008-06-17 17:25:31 UTC
Sebastien

I've installed kernel-smp 2.6.9-55.0.12 (the latest RHEL 4u5 errata SMP
kernel) on ltedla02 and configured this to boot with "clock=tsc" as
discussed.
 
I've also repeated a clock reset sequence on all four EMDS Integration 2
systems to ensure the system times at least start from a known position.

First impressions are the 4u5 suffers the same problem.

allboxes@sphssa03:/export/home/allboxes $ for X in ltedla02 ltedla03
lteena03 lteena04; do echo -e "\\n<<<$X>>>"; ssh $X "date; uname -a;
cat /proc/cmdline; ntpq -np"; date; done

<<<ltedla02>>>
Wed May 21 16:50:19 BST 2008
Linux ltedla02 2.6.9-55.0.12.ELsmp #1 SMP Wed Oct 17 08:19:30 EDT 2007
i686 i686 i386 GNU/Linux
ro root=LABEL=/1 console=tty0 console=ttyS1,115200 clock=tsc
     remote           refid      st t when poll reach   delay   offset 
jitter
==============================================================================
 10.230.202.1    10.230.31.255    5 u   27   64  377    0.219    1.805
636306.
Wed May 21 16:30:19 BST 2008

<<<ltedla03>>>
Wed May 21 17:35:39 BST 2008
Linux ltedla03 2.6.9-67.0.4.ELsmp #1 SMP Fri Jan 18 05:00:58 EST 2008 i686
i686 i386 GNU/Linux
ro root=LABEL=/ console=tty0 console=ttyS1,115200 clock=tsc
     remote           refid      st t when poll reach   delay   offset 
jitter
==============================================================================
 10.230.202.1    10.230.31.255    5 u  649   64  377    0.242  -299919
346306.
Wed May 21 16:30:40 BST 2008

<<<lteena03>>>
Wed May 21 16:46:01 BST 2008
Linux lteena03 2.6.9-67.0.4.ELsmp #1 SMP Fri Jan 18 05:00:58 EST 2008 i686
i686 i386 GNU/Linux
ro root=LABEL=/ console=tty0 console=ttyS1,115200 clock=tsc
     remote           refid      st t when poll reach   delay   offset 
jitter
==============================================================================
 10.230.202.1    10.230.31.255    5 -    9   64  377    0.240  -899706
636211.
Wed May 21 16:31:01 BST 2008

<<<lteena04>>>
Wed May 21 17:11:21 BST 2008
Linux lteena04 2.6.9-67.0.4.ELsmp #1 SMP Fri Jan 18 05:00:58 EST 2008 i686
i686 i386 GNU/Linux
ro root=LABEL=/ console=tty0 console=ttyS1,115200 clock=tsc
     remote           refid      st t when poll reach   delay   offset 
jitter
==============================================================================
 10.230.202.1    10.230.31.255    5 u   27   64  377    0.257  -239932
1481485
Wed May 21 16:31:21 BST 2008
allboxes@sphssa03:/export/home/allboxes $

Do you believe that doing a full 4u5 kickstart will change the behaviour
of clock=tsc on a DL580 G5?

Regards, Phil


This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 7 Issue Tracker 2008-06-17 17:25:32 UTC
--- System reserved ---

**  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **
                 This System is reserved by saime.             
 
                                                                      
 To return this system early. You can run the command: return2rhts.sh 
  Ensure you have your logs off the system before returning to RHTS   
                                                                      
 To extend your reservation time. You can run the command:            
  extendtesttime.sh                                                   
 This is an interactive script. You will be prompted for how many     
  hours you would like to extend the reservation.                     
  Please use this command responsibly, Everyone uses these machines.  
                                                                      
 For ssh, kvm, serial and power control operations please look here:  
  http://intranet.corp.redhat.com/ic/intranet/RHTSSystemList.html     
                                                                      
                                                                      
      RHTS Test information:                                          
                         HOSTNAME=hp-dl580g5-01.rhts.bos.redhat.com       
                   
                            JOBID=23504                              
                         RECIPEID=83901                           
                       LAB_SERVER=rhts.redhat.com                        

                    RESULT_SERVER=rhts.redhat.com                      
                           DISTRO=RHEL4-U6                             
                     ARCHITECTURE=i386                               
**  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **



This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 8 Issue Tracker 2008-06-17 17:25:33 UTC
--- Internal Note ---

By mistake I've installed kernel-2.6.9-70.EL (not smp) and let the
machine runnning with it few hours.

Results :
[root@hp-dl580g5-01 ~]# hwclock
Thu 12 Jun 2008 11:58:07 PM EDT  -0.015999 seconds
[root@hp-dl580g5-01 ~]# date
Fri Jun 13 23:03:13 EDT 2008
[root@hp-dl580g5-01 ~]# 
[root@hp-dl580g5-01 ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset 
jitter
==============================================================================
 10.16.71.254    66.187.224.4     2 u 1809 1024  377    0.510  -154753
2999118
*LOCAL(0)        73.78.73.84      5 l   25   64  377    0.000    0.000  
0.001

I've seen dozens of messages like this int /var/log/messages :

Jun 13 05:04:40 hp-dl580g5-01 kernel: set_rtc_mmss: can't update from 54
to 4


This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 9 Issue Tracker 2008-06-17 17:25:34 UTC
It seems to be drifting quickly. Let's do a more accurate test.

@ 4:33pm BST

# service ntpd stop
Shutting down ntpd: [  OK  ]
# ntpdate clock.redhat.com
12 Jun 11:33:22 ntpdate[5374]: step time server 66.187.224.4 offset
-899.758768 sec
# service ntpd start
ntpd: Synchronizing with time server: [  OK  ]
Starting ntpd: [  OK  ]




This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 10 Issue Tracker 2008-06-17 17:25:35 UTC
@ 9:12 am BST

[root@hp-dl580g5-01 ~]# date
Sun Jun 15 15:21:17 EDT 2008
[root@hp-dl580g5-01 ~]# hwclock
Fri 13 Jun 2008 11:46:20 AM EDT  -0.016410 seconds
[root@hp-dl580g5-01 ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset 
jitter
==============================================================================
 10.16.71.254    66.187.224.4     2 u  39m 1024  377    1.602  -210839
2399276
*LOCAL(0)        73.78.73.84      5 l   15   64  377    0.000    0.000  
0.001



This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 11 Issue Tracker 2008-06-17 17:25:36 UTC
Curious... the drift seems to be happening in increments of 300 seconds. 
No idea why, at present.


This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 12 Issue Tracker 2008-06-17 17:25:37 UTC
Drift is happening more frequently than every 5 minutes, so we can rule out
periodic events like SMIs, but it's still highly suspicious that the error
is always just short of an exact multiple of 300 seconds.

(This is on RHEL 4.6 i386, with nptd and cpuspeed disabled.)



This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 13 Issue Tracker 2008-06-17 17:25:38 UTC
Clock is completely stable on RHEL 4.6 x86_64, with and without clock=tsc.


This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 14 Issue Tracker 2008-06-17 17:25:39 UTC
Problem reproduces with ntpd, cpuspeed, and acpid disabled.  Moving to
x86_64 fixes it, as does moving to RHEL 5 i686.


This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 15 Issue Tracker 2008-06-17 17:25:40 UTC
Problem reproduces on UP kernel as well.


This event sent from IssueTracker by csnook  [SEG - Kernel]
 issue 179958

Comment 16 Chris Snook 2008-06-17 17:32:11 UTC
Escalation summary:

When running RHEL 4 i686 on an HP DL580 G5, time spontaneously (and frequently)
skips ahead in increments of 300 seconds if the box is booted with 'clock=tsc',
even when the system is completely idle.  This is independent of ntpd,
irqbalance, acpid, and UP/SMP kernel.  RHEL 4 x86_64 does not exhibit this
problem, nor does RHEL 5 i686.

Comment 17 Chris Snook 2008-06-18 16:17:36 UTC
This is just a shot in the dark, but I bet this is related somehow:

include/linux/time.h line 40
/*
 * Have the 32 bit jiffies value wrap 5 minutes after boot
 * so jiffies wrap bugs show up earlier.
 */
#define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))


Comment 18 Brian Maly 2008-06-18 16:43:19 UTC
How many entries are in /proc/cpuinfo (i.e. how many CPU cores)?

Also, I assume this is an AMD based system? If so, the "powernow_k8.tscsync=1"
flag must be used to prevent TSC drift.



Comment 19 Chris Snook 2008-06-18 21:23:37 UTC
This is a quad-core, quad-socket Intel system.  /proc/cpuinfo shows all 16 cores
under the smp kernel.  The problem reproduces under the UP kernel as well.

Comment 20 Brian Maly 2008-06-18 22:09:36 UTC

Is the skew with SMP the same as the skew on UP? 

If the system is still reserved, perhaps I can debug this instead of asking for
debug info.

Comment 21 Chris Snook 2008-06-19 16:45:58 UTC
The skew is random, but it's still in 300-second increments when it happens, and
it seems to occur just as frequently.

Comment 22 Issue Tracker 2008-07-08 10:29:52 UTC
Hi,

Do you have some news regarding this issue ?

Thanks,

Seb.


This event sent from IssueTracker by saime 
 issue 179958

Comment 25 Brian Maly 2008-07-24 04:33:03 UTC
Continuing to debug this...

Looks like we are using HPET as a base timer and TSC for high-res timer. 

Does hpet=disable boot arg make a difference? 

 

Comment 26 Issue Tracker 2008-07-24 08:11:34 UTC
Hi,

Thank you for this update. Instead of asking to my customer to do this
test, I'll do it myself by reserving hp-dl580g5-01.rhts.bos.redhat.com in
RHTS. As far as this time drift issue is fairly easy to reproduce I expect
to get the result of the test you asked today, depending on the
availability of the machine.

If I can't get access to it today, I'll ask to my customer to do the
test.

Thanks,

Sebastien.


This event sent from IssueTracker by saime 
 issue 179958

Comment 27 Issue Tracker 2008-07-24 09:25:39 UTC
Hi,

I got access to hp-dl580g5-01.rhts.bos.redhat.com. To reproduce the issue,
start the machine with 'clock=tsc' kernel parameter. Even with ntpd
running, time drift occurs quickly :

# for i in $(seq 1 10); do date ; sleep 60 ; done
Thu Jul 24 04:48:36 EDT 2008
Thu Jul 24 04:53:58 EDT 2008
Thu Jul 24 04:54:58 EDT 2008
Thu Jul 24 05:00:23 EDT 2008
Thu Jul 24 05:01:23 EDT 2008
Thu Jul 24 05:02:23 EDT 2008
Thu Jul 24 05:03:23 EDT 2008
Thu Jul 24 05:09:06 EDT 2008
Thu Jul 24 05:10:06 EDT 2008
Thu Jul 24 05:11:06 EDT 2008

Now let's see what happens if you hpet=disable in combination with
clock=tsc.
I boot the machine with this line :

...
kernel /vmlinuz-2.6.9-67.ELsmp ro root=/dev/VolGroup00/LogVol00
console=ttyS0,115200 clock=tsc hpet=disable
...

few minutes of break...

Now let's check the clock :

# for i in $(seq 1 10); do date ; sleep 60 ; done
Thu Jul 24 05:11:55 EDT 2008
Thu Jul 24 05:12:55 EDT 2008
Thu Jul 24 05:13:55 EDT 2008
Thu Jul 24 05:14:55 EDT 2008
Thu Jul 24 05:15:55 EDT 2008
Thu Jul 24 05:16:55 EDT 2008
Thu Jul 24 05:17:55 EDT 2008
Thu Jul 24 05:18:55 EDT 2008
Thu Jul 24 05:19:55 EDT 2008
Thu Jul 24 05:20:55 EDT 2008

It looks far better. I'll monitor the machine few hours to make sure
system clock is consistent.

Now I'd like to know something. Customer is currently running DL580 G5
with 'clock=pit' but it's not suitable for them. Can I suggest them to use
'clock=tsc' and 'hpet=disable' as a better workaround ? What are the
drawbacks of this configuration ?

Thanks,

Sebastien.


This event sent from IssueTracker by saime 
 issue 179958

Comment 28 Issue Tracker 2008-07-24 13:36:44 UTC
Hi,

Few hours later, let's check the system clock.

# for i in $(seq 1 10); do date ; sleep 60 ; done
Thu Jul 24 09:23:16 EDT 2008
Thu Jul 24 09:24:16 EDT 2008
Thu Jul 24 09:25:16 EDT 2008
Thu Jul 24 09:26:16 EDT 2008
Thu Jul 24 09:27:16 EDT 2008
Thu Jul 24 09:28:16 EDT 2008
Thu Jul 24 09:29:16 EDT 2008
Thu Jul 24 09:30:16 EDT 2008
Thu Jul 24 09:31:16 EDT 2008
Thu Jul 24 09:32:16 EDT 2008

It seems that your suggestion helps a lot.

Thanks,

Sebastien.


This event sent from IssueTracker by saime 
 issue 179958

Comment 29 Brian Maly 2008-07-24 19:13:04 UTC
> Can I suggest them to use 'clock=tsc' and 'hpet=disable' as a better workaround 

Yes, its a good workaround with no real drawbacks. The issue here seems to be
that the base timer and TSC get out of sync. 

Comment 30 RHEL Program Management 2008-09-03 13:00:32 UTC
Updating PM score.

Comment 33 Brian Maly 2009-01-08 18:18:05 UTC
Created attachment 328479 [details]
production ready patch that resolves the issue

patch to properly account for hpet rollover. this patch corrects the computed offset when hpet rolls over (300 second skew) and additionally fixes a 1 second per hour drift in the lost_ticks accounting.

Comment 35 Vivek Goyal 2009-01-15 14:03:26 UTC
Committed in 78.29.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/

Comment 40 errata-xmlrpc 2009-05-18 19:15:26 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html