Bug 110999 - clock is running too fast on IBM x445
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i686 Linux
Priority: medium  Severity: high
Assigned To: Doug Ledford
Reported: 2003-11-26 08:43 EST by Sebastian Wenner
Modified: 2007-11-30 17:06 EST (History)
Doc Type: Bug Fix
Last Closed: 2004-05-11 21:07:50 EDT

Attachments
lsmod (1.02 KB, text/plain)
2003-11-26 08:45 EST, Sebastian Wenner
/var/log/messages (27.84 KB, text/plain)
2003-11-26 08:48 EST, Sebastian Wenner
Good dmesg, no clock drift (24.30 KB, text/plain)
2004-01-24 15:34 EST, Jim Richard
Messages.log documenting clock drift on boot (593.69 KB, text/plain)
2004-01-24 15:36 EST, Jim Richard
DMESG from runaway clock after boot # 152 (30.45 KB, text/plain)
2004-01-31 20:39 EST, Jim Richard
lsmod - runaway clock boot # 152 (1.01 KB, text/plain)
2004-01-31 20:40 EST, Jim Richard
Contents of /proc/interrupts - runaway clock boot #152 (1.69 KB, text/plain)
2004-01-31 20:42 EST, Jim Richard

Description Sebastian Wenner 2003-11-26 08:43:45 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5)
Gecko/20030925

Description of problem:
The clock on our IBM x445 is running too fast. The hardware clock is
running OK, but the system clock is running way too fast. One minute of
real time is about eight (!) minutes on the machine.

Here's the output of an ntpdate sync:

[root@holldbv1 root]# ntpdate cws1
26 Nov 14:14:00 ntpdate[2960]: step time server 10.170.26.105 offset -11.339696 sec
[root@holldbv1 root]# ntpdate cws1
26 Nov 14:15:01 ntpdate[2961]: step time server 10.170.26.105 offset -429.999019 sec

As you can see, the machine was about 7 minutes ahead after roughly
one minute of real time.
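
The drift rate implied by the two syncs above can be checked with a short calculation. This is only a minimal Python sketch, assuming that the timestamp ntpdate prints is the system time just after the step, and that the year is 2003 (taken from the report date). The names parse_sync and drift_rate are hypothetical helpers, not part of any tool.

```python
import re
from datetime import datetime

# The two ntpdate lines from this report, with the line wraps rejoined.
LOG = [
    "26 Nov 14:14:00 ntpdate[2960]: step time server 10.170.26.105 offset -11.339696 sec",
    "26 Nov 14:15:01 ntpdate[2961]: step time server 10.170.26.105 offset -429.999019 sec",
]

PATTERN = re.compile(
    r"^(\d+ \w+ \d+:\d+:\d+) ntpdate\[\d+\]: .* offset (-?\d+\.\d+) sec$"
)


def parse_sync(line, year=2003):
    """Return (timestamp after the step, offset in seconds) for one line."""
    match = PATTERN.match(line)
    if match is None:
        raise ValueError(f"not an ntpdate step line: {line!r}")
    stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %d %b %H:%M:%S")
    return stamp, float(match.group(2))


def drift_rate(first, second):
    """System seconds that elapsed per real second between two syncs."""
    # Both timestamps are post-step, so their difference is real time.
    real_elapsed = (second[0] - first[0]).total_seconds()
    # A negative offset means the system clock was ahead by that amount.
    system_elapsed = real_elapsed - second[1]
    return system_elapsed / real_elapsed


first, second = parse_sync(LOG[0]), parse_sync(LOG[1])
print(f"system clock ran about {drift_rate(first, second):.1f}x real time")
```

Under these assumptions the result comes out at roughly eight system seconds per real second, which matches the description.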

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Install RHAS 3.0 in text mode
2. Run date several times, or sync to an NTP server
    

Additional info:

This happens if I install in text mode and use the onboard SCSI
controller.

I tested various other possibilities:

- Same system, graphical install -> time runs even faster, no
login possible
- ServeRAID 4Lx used instead of onboard SCSI -> smp/hugemem does not
boot, kernel panic
Comment 1 Sebastian Wenner 2003-11-26 08:45:29 EST
Created attachment 96211 [details]
lsmod
Comment 2 Arjan van de Ven 2003-11-26 08:45:56 EST
what does /proc/interrupts look like ?
does irqbalance run ?
how many cpus are there?
can you attach a full boot log ?
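
For readers who want to pull the timer row out of /proc/interrupts when answering questions like these, here is a minimal Python sketch. The SAMPLE text is invented for illustration and is not taken from the attachments in this bug; timer_counts is a hypothetical helper name. It assumes the usual layout: a header row of CPU names, then one row per interrupt with a count per CPU and the device name last.

```python
# Hypothetical sample in the usual /proc/interrupts layout (not real data
# from this report).
SAMPLE = """\
           CPU0       CPU1
  0:     123456          0    IO-APIC-edge  timer
  1:         10          0    IO-APIC-edge  keyboard
"""


def timer_counts(text):
    """Return {cpu_name: timer interrupt count} from /proc/interrupts text."""
    lines = text.splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        fields = line.split()
        # The device name is the last field; the per-CPU counts follow
        # the "N:" interrupt label.
        if fields and fields[-1] == "timer":
            counts = [int(value) for value in fields[1:1 + len(cpus)]]
            return dict(zip(cpus, counts))
    return {}


print(timer_counts(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/interrupts; reading it twice a known interval apart gives the timer interrupt rate per CPU.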
Comment 3 Sebastian Wenner 2003-11-26 08:48:17 EST
Created attachment 96212 [details]
/var/log/messages
Comment 4 Tim Burke 2003-12-15 20:20:11 EST
We included a change in the RHEL3 U1 (just out for beta today) which
utilizes the tsc to synchronize clocks among processors.  I wonder if
that would help the situation.
Comment 5 Tim Burke 2003-12-15 20:20:55 EST
Oh, sorry, I'm confused. That tsc change was for the x450, the IPF box.
Comment 6 Jim Richard 2004-01-24 15:34:24 EST
Created attachment 97239 [details]
Good dmesg, no clock drift
Comment 7 Jim Richard 2004-01-24 15:36:48 EST
Created attachment 97240 [details]
Messages.log documenting clock drift on boot
Comment 8 Jim Richard 2004-01-24 15:46:34 EST
Seeing the same problem here; see the two attachments. The problem here
is intermittent, ~= 1 in 16 boots. In the message log above, the drift
starts on the boot beginning at Jan 20 14:15:24 and runs through
Jan 20 15:05:43. You'll see that just prior to shutdown I reset the
clock to Jan 20 14:25:41. So in about 10 minutes of real-world time the
clock advanced by 50 minutes (~= 5 seconds for each real-world
second).
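
The figure of about 5 seconds per real-world second can be reproduced from the three timestamps quoted above. This is a rough Python sketch, assuming the clock was correct at the start of the boot, that the reset happened right after the last log entry, and that the year is 2004 (taken from the comment date).

```python
from datetime import datetime

FMT = "%Y %b %d %H:%M:%S"

# Timestamps quoted in the comment; the year is an assumption.
boot_start = datetime.strptime("2004 Jan 20 14:15:24", FMT)  # drifting boot begins
last_entry = datetime.strptime("2004 Jan 20 15:05:43", FMT)  # last runaway timestamp
reset_to = datetime.strptime("2004 Jan 20 14:25:41", FMT)    # correct time at reset

system_elapsed = (last_entry - boot_start).total_seconds()  # what the clock counted
real_elapsed = (reset_to - boot_start).total_seconds()      # what actually passed

rate = system_elapsed / real_elapsed
print(f"{system_elapsed:.0f} s counted in {real_elapsed:.0f} s real: {rate:.1f}x")
```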

I've updated one machine to Update 1 and haven't seen the problem there
yet. We saw the problem with both the AS 3.0 base kernel and the last
errata kernel prior to Update 1, vmlinuz-2.4.21-4.ELsmp and
vmlinuz-2.4.21-4.0.2.ELsmp respectively.

Comment 9 Jim Richard 2004-01-30 23:31:08 EST
FYI: We saw it again last night on Update 1. 
BTW: Is this related to the bug described in Bug#110170?
Comment 10 Jim Richard 2004-01-31 20:39:36 EST
Created attachment 97396 [details]
DMESG from runaway clock after boot # 152 

Runaway clock, /var/log/dmesg
Comment 11 Jim Richard 2004-01-31 20:40:38 EST
Created attachment 97397 [details]
lsmod - runaway clock boot # 152

Loaded modules from boot #152
Comment 12 Jim Richard 2004-01-31 20:42:03 EST
Created attachment 97398 [details]
Contents of /proc/interrupts - runaway clock boot #152

/proc/interrupts - on runaway clock on boot #152
Comment 13 Jim Richard 2004-01-31 20:53:55 EST
Clock for last 3 attachments was running about 4 seconds for every 1 
second of real world time.

Determined using the following procedure


ntpdate localntpserver.localdomain.com     # sync to my local ntp
sleep 60            # sleep 60 runaway seconds
date                # display current date time
ntpdate localntpserver.localdomain.com     # sync to my local ntp
date                # display updated date time

Look at offset for last ntpdate
and last date output

Here's the output:

31 Jan 20:39:27 ntpdate[4235]: step time server 10.0.0.67 offset -146.105867 sec
...(sleep 60 here)
Sat Jan 31 20:40:27 EST 2004
31 Jan 20:39:42 ntpdate[4239]: step time server 10.0.0.67 offset -45.107956 sec
Sat Jan 31 20:39:42 EST 2004
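
The 4:1 ratio follows from the sleep length and the second offset. Here is a minimal Python sketch of that arithmetic, assuming the time taken by the ntpdate and date commands themselves is negligible; drift_rate_from_sleep is a hypothetical helper name.

```python
def drift_rate_from_sleep(slept_system_seconds, offset_seconds):
    """System seconds per real second for the sync / sleep / sync procedure.

    offset_seconds is the offset reported by the second ntpdate run;
    a negative value means the system clock was ahead of the server.
    """
    # The system clock counted the full sleep, but only part of it
    # really elapsed; the rest is the amount ntpdate had to step back.
    real_elapsed = slept_system_seconds + offset_seconds
    if real_elapsed <= 0:
        raise ValueError("offset is larger than the sleep interval")
    return slept_system_seconds / real_elapsed


rate = drift_rate_from_sleep(60, -45.107956)
print(f"about {rate:.1f} system seconds per real second")
```

With the values above, 60 system seconds correspond to roughly 15 real seconds, which agrees with the two date outputs (20:39:27 after the first sync, 20:39:42 after the second).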
Comment 14 Sebastian Wenner 2004-02-02 04:05:59 EST
This seems to be directly tied to the use of hyperthreading.
If we turn on hyperthreading in the BIOS, the effect vanishes.
Comment 15 Sebastian Wenner 2004-04-28 17:38:20 EDT
[sarcasm]
Thanks for your great help, guys!
I can really see the effort you are putting into an _enterprise_ class
Linux distribution and the support connected to it.
This response time really gives me the confidence to advise customers
to build their environment on a Red Hat distribution...
[/sarcasm]

*** This bug has been marked as a duplicate of 110170 ***
Comment 16 John Flanagan 2004-05-11 21:07:50 EDT
An errata has been issued which should help the problem described in this bug report. 
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen 
this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2004-188.html
