Bug 163347 - Clock runs twice as fast as expected
Status: CLOSED DUPLICATE of bug 173236
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: x86_64 Linux
Priority: medium
Severity: high
Assigned To: Jim Paradis
QA Contact: Brian Brock
Reported: 2005-07-15 08:54 EDT by John Haxby
Modified: 2013-08-05 21:15 EDT
Doc Type: Bug Fix
Last Closed: 2006-08-23 17:44:35 EDT

Description John Haxby 2005-07-15 08:54:09 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.7.8) Gecko/20050512 Red Hat/1.0.4-1.4.1 Firefox/1.0.4

Description of problem:
The clock runs exactly twice as fast as expected.   This looks like bug 152630 except that the 2.6.7 fix mentioned early on in that bug has already been applied (at least to the kernel I was looking at).

Version-Release number of selected component (if applicable):
kernel-2.6.9-11.EL.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Boot machine
2. "watch -n1 date"
3. (from another machine) time ssh <host> sleep 100

Actual Results:  See time running at twice normal rate.  The ssh command takes a shade over 50 seconds.

Expected Results:  Time should run normally.

Additional info:

This is a Compaq Presario SR1440UK (http://tinyurl.com/d3zzx) with an ATI Radeon Xpress 200 chipset (just like bug 152630), but, as I said, the 2.6.7 patch appears to have already been applied.   I get the same problem with the 2.6.13-rc3 kernel.

I have hacked a workaround: in kernel/timer.c, update_times now has "ticks>>1".  Time now runs normally which leads me to believe that I'm just getting twice as many interrupts as I should get.   (Indeed, on the unpatched kernel, looking at /proc/interrupts either side of a sleep 10 shows 10000 interrupts, exactly as expected except that, of course, it only takes five seconds.)
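
The arithmetic in the paragraph above can be checked with a short sketch. The function name `clock_speedup` and the `hz` parameter are illustrative only; the value of 1000 ticks per second is inferred from the "10000 interrupts" seen across a "sleep 10", not quoted from the kernel configuration.

```python
def clock_speedup(interrupts: int, wall_seconds: float, hz: int = 1000) -> float:
    """Return how many times faster than real time the kernel clock advances.

    interrupts   -- timer interrupts counted over the interval
    wall_seconds -- real elapsed time, measured by an external clock
    hz           -- ticks the kernel expects per second (assumed to be 1000)
    """
    if wall_seconds <= 0 or hz <= 0:
        raise ValueError("wall_seconds and hz must be positive")
    kernel_seconds = interrupts / hz  # time the kernel believes has passed
    return kernel_seconds / wall_seconds


# Figures from the report: 10000 interrupts during a "sleep 10" that took 5 s.
print(clock_speedup(10000, 5.0))       # 2.0

# The "ticks>>1" hack halves the tick count before it is accounted for.
print(clock_speedup(10000 >> 1, 5.0))  # 1.0
```

Halving the tick count only hides the symptom; the interrupts themselves still arrive at twice the expected rate.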
Comment 1 John Haxby 2005-07-15 11:20:50 EDT
Many thanks to Milan Keršláger who pointed me at the no_timer_check in 2.6.12
(it's in arch/x86_64/kernel/io_apic.c) which, if I understand it correctly,
turns off the IO-APIC timer interrupt so that we just get the interrupt through
IRQ0 (which should have been disabled -- hardware bug, I guess).
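
When testing the option, it can help to confirm that it really reached the kernel. A minimal sketch, assuming the boot command line is readable from /proc/cmdline; the helper names and the sample command line are made up for illustration and are not taken from this report.

```python
def has_boot_param(cmdline: str, name: str) -> bool:
    """True if `name` appears as a bare flag or as name=value in the command line."""
    for token in cmdline.split():
        if token == name or token.startswith(name + "="):
            return True
    return False


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()


# Hypothetical command line, for illustration only.
sample = "ro root=/dev/VolGroup00/LogVol00 rhgb quiet no_timer_check"
print(has_boot_param(sample, "no_timer_check"))  # True
```

On a live machine, `has_boot_param(read_cmdline(), "no_timer_check")` performs the same check against the real command line.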
Comment 2 Kyle Gonzales 2005-07-30 01:35:42 EDT
I've verified this as well.  Seems to be an issue with ATI-based
Athlon64/Opteron/Turion64 motherboards.

Any chance we can backport the no_timer_check from FC4 into RHEL4U3?  Is it
something that is difficult, or breaks kABI?
Comment 3 Alexandre Oliva 2005-08-22 10:01:28 EDT
It's probably worth noting that the clock sometimes runs too fast on boxes with
the nForce chipset, such as my Compaq Presario R3004US.  The difference is that,
whenever it happened to me, the clock ran not 2, but 3 times as fast as it
should.  A number of people have reported similar problems on mailing lists
devoted to running GNU/Linux on the r3000z series.
Comment 4 linuxacct 2005-08-26 04:12:56 EDT
FWIW, this is also present in a late 2.4 kernel:
Linux testbox 2.4.21-27.ELsmp #1 SMP Wed Dec 1 21:59:02 EST 2004 i686 i686 i386 GNU/Linux

I have this kernel running on ~15 PCs, 1 Opteron and the rest Intel Xeon and P4
chips.  Only two PCs exhibit this problem:  both are IBMs with Xeon 2.8GHz HT
CPUs.  The only property unique to the 2 afflicted machines is that both have
3rd party PCI SCSI cards.

System 1:
# lspci
00:00.0 Host bridge: Intel Corp. 82865G/PE/P DRAM Controller/Host-Hub Interface (rev 02)
00:02.0 VGA compatible controller: Intel Corp. 82865G Integrated Graphics Device (rev 02)
00:1d.0 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #3 (rev 02)
00:1d.3 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corp. 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corp. 82801EB/ER (ICH5/ICH5R) IDE Controller (rev 02)
00:1f.3 SMBus: Intel Corp. 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 02)
00:1f.5 Multimedia audio controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) AC'97 Audio Controller (rev 02)
03:0b.0 Ethernet controller: Intel Corp. 82541EI Gigabit Ethernet Controller (Copper)
03:0c.0 SCSI storage controller: Adaptec AHA-2940U/UW/D / AIC-7881U

# cat /proc/interrupts ; sleep 10; cat /proc/interrupts 
           CPU0       CPU1       
  0:   50450426   52259012    IO-APIC-edge  timer
  1:         13         17    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  8:          1          0    IO-APIC-edge  rtc
 12:        610        802    IO-APIC-edge  PS/2 Mouse
 14:    2308811    3072192    IO-APIC-edge  ide0
 15:     993140     978828    IO-APIC-edge  ide1
 16:   81799327          0   IO-APIC-level  usb-uhci, usb-uhci, eth0
 18:          0          0   IO-APIC-level  usb-uhci
 19:          0          0   IO-APIC-level  usb-uhci
 20:   28205543   36703620   IO-APIC-level  aic7xxx
 23:          0          0   IO-APIC-level  ehci-hcd
NMI:          0          0 
LOC:   91137783   91137782 
ERR:          0
MIS:          0
           CPU0       CPU1       
  0:   50450581   52259862    IO-APIC-edge  timer
  1:         13         17    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  8:          1          0    IO-APIC-edge  rtc
 12:        610        802    IO-APIC-edge  PS/2 Mouse
 14:    2308819    3072192    IO-APIC-edge  ide0
 15:     993140     979854    IO-APIC-edge  ide1
 16:   81799340          0   IO-APIC-level  usb-uhci, usb-uhci, eth0
 18:          0          0   IO-APIC-level  usb-uhci
 19:          0          0   IO-APIC-level  usb-uhci
 20:   28206393   36703775   IO-APIC-level  aic7xxx
 23:          0          0   IO-APIC-level  ehci-hcd
NMI:          0          0 
LOC:   91138265   91138263 
ERR:          0
MIS:          0
 (that took 5 seconds of wall time)
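
The two snapshots above can be compared mechanically. The following is a rough sketch, not a general-purpose parser: `parse_interrupts` and `delta` are names invented here, and only the timer and LOC rows from the System 1 listing are pasted in.

```python
def parse_interrupts(text: str) -> dict:
    """Map each row label (e.g. "0", "LOC") to its list of per-CPU counts."""
    counts = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # the "CPU0 CPU1" header line has no colon
        values = []
        for field in rest.split():
            if not field.isdigit():
                break  # reached the controller/device description
            values.append(int(field))
        if values:
            counts[label.strip()] = values
    return counts


def delta(before: dict, after: dict, label: str) -> int:
    """Total increase across all CPUs for one row between two snapshots."""
    return sum(a - b for a, b in zip(after[label], before[label]))


# Timer and local APIC rows copied from the System 1 listing.
before = parse_interrupts("""\
           CPU0       CPU1
  0:   50450426   52259012    IO-APIC-edge  timer
LOC:   91137783   91137782
""")
after = parse_interrupts("""\
           CPU0       CPU1
  0:   50450581   52259862    IO-APIC-edge  timer
LOC:   91138265   91138263
""")

print(delta(before, after, "0"))           # 1005 timer interrupts on both CPUs
print(after["LOC"][0] - before["LOC"][0])  # 482 local APIC ticks on CPU0
```

If this kernel ticks about 100 times per second (an assumption; the tick rate is not stated in the comment), 1005 timer interrupts correspond to roughly 10 seconds of kernel time, while the 482 local APIC ticks are closer to the 5 seconds of wall time reported.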


System 2, sometimes afflicted, but not always:
# lspci
00:00.0 Host bridge: Intel Corp. 82865G/PE/P DRAM Controller/Host-Hub Interface (rev 02)
00:01.0 PCI bridge: Intel Corp. 82865G/PE/P PCI to AGP Controller (rev 02)
00:1d.0 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #3 (rev 02)
00:1d.3 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corp. 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corp. 82801EB/ER (ICH5/ICH5R) IDE Controller (rev 02)
00:1f.3 SMBus: Intel Corp. 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 02)
00:1f.5 Multimedia audio controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) AC'97 Audio Controller (rev 02)
01:00.0 VGA compatible controller: ATI Technologies Inc RV350 AQ [Radeon 9600]
01:00.1 Display controller: ATI Technologies Inc RV350 AQ [Radeon 9600] (Secondary)
03:0a.0 SCSI storage controller: LSI Logic / Symbios Logic 53c895 (rev 02)
03:0b.0 Ethernet controller: Intel Corp. 82541EI Gigabit Ethernet Controller (Copper)
03:0c.0 RAID bus controller: Promise Technology, Inc. PDC20619 (FastTrak TX4000) (rev 02)

# cat /proc/interrupts ; sleep 10; cat /proc/interrupts 
           CPU0       CPU1       
  0:   54441184   54436058    IO-APIC-edge  timer
  1:     320783     349490    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  8:          1          0    IO-APIC-edge  rtc
 16:    9141098          0   IO-APIC-level  usb-uhci, usb-uhci, eth0
 17:        524        620   IO-APIC-level  Intel ICH5
 18:          0          0   IO-APIC-level  usb-uhci
 19:          0          0   IO-APIC-level  usb-uhci
 20:    1343037    1933515   IO-APIC-level  ft3xx
 22:         62          0   IO-APIC-level  sym53c8xx
 23:   28409010   29189783   IO-APIC-level  ehci-hcd
NMI:          0          0 
LOC:  108879749  108879777 
ERR:          0
MIS:          0
           CPU0       CPU1       
  0:   54441799   54436446    IO-APIC-edge  timer
  1:     320783     349490    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  8:          1          0    IO-APIC-edge  rtc
 16:    9141133          0   IO-APIC-level  usb-uhci, usb-uhci, eth0
 17:        524        620   IO-APIC-level  Intel ICH5
 18:          0          0   IO-APIC-level  usb-uhci
 19:          0          0   IO-APIC-level  usb-uhci
 20:    1343039    1933522   IO-APIC-level  ft3xx
 22:         62          0   IO-APIC-level  sym53c8xx
 23:   28409297   29190214   IO-APIC-level  ehci-hcd
NMI:          0          0 
LOC:  108880751  108880779 
ERR:          0
MIS:          0
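
One way to read the two pairs of snapshots in this comment without an external clock is to compare the IRQ0 timer count against the local APIC (LOC) count over the same interval. This is only a heuristic sketch: it assumes both sources are programmed for the same nominal rate, which the comment does not state, and the function name is invented here.

```python
def timer_to_loc_ratio(timer_before, timer_after, loc_before, loc_after):
    """Ratio of IRQ0 timer interrupts (summed over CPUs) to LOC ticks on one CPU."""
    timer_delta = sum(a - b for a, b in zip(timer_after, timer_before))
    loc_delta = loc_after - loc_before
    if loc_delta <= 0:
        raise ValueError("LOC counter did not advance")
    return timer_delta / loc_delta


# Counts copied from the System 1 and System 2 listings (CPU0 column for LOC).
system1 = timer_to_loc_ratio((50450426, 52259012), (50450581, 52259862),
                             91137783, 91138265)
system2 = timer_to_loc_ratio((54441184, 54436058), (54441799, 54436446),
                             108879749, 108880751)

print(round(system1, 2), round(system2, 2))  # 2.09 1.0
```

Under that assumption, System 1 shows about two timer interrupts per local APIC tick, while System 2 shows about one, which would fit System 2 being afflicted only some of the time.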
Comment 5 Alexandre Oliva 2005-08-26 09:55:51 EDT
FWIW, no_timer_check breaks USB2- and IEEE1394-connected hard drives for me. 
They come up ok after boot up, but after some time they stop responding.
Comment 6 linuxacct 2005-08-26 23:01:50 EDT
Interesting symptom: on 1 of the boxes, when the heavy SCSI disk I/O finished,
the system clock also STOPPED altogether...  Actually, it's alternating between
two seconds:

[root@testbox root]# date
Fri Aug 26 20:11:02 EDT 2005
[root@testbox root]# date
Fri Aug 26 20:11:02 EDT 2005
[root@testbox root]# date
Fri Aug 26 20:11:02 EDT 2005
[root@testbox root]# date
Fri Aug 26 20:11:01 EDT 2005
[root@testbox root]# date
Fri Aug 26 20:11:01 EDT 2005
[root@testbox root]# date
Fri Aug 26 20:11:01 EDT 2005
[root@testbox root]# date
Fri Aug 26 20:11:02 EDT 2005

Running cat /proc/interrupts shows that only "IO-APIC-level  usb-uhci, usb-uhci,
eth0" for CPU0 and "LOC:" for both CPUs are incrementing.  All other counters,
including the IO-APIC-edge timer, are not incrementing!

At this point, "reboot", "shutdown -r now", and even "ntpdate
tick.usno.navy.mil" stall forever (they hang, but can be interrupted with ^C).
The PC has been like this for the past 2 hours.  I won't have physical access to
the PC again until Monday.

Not sure what else to try now; I'm a sysadmin, not a kernel hacker.
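
The alternating timestamps in the transcript above can be flagged automatically. A small sketch, assuming `date` output in the same format and time zone as shown in this comment; the function name is illustrative, and "EDT" is matched as literal text rather than parsed as a time zone.

```python
from datetime import datetime

DATE_FORMAT = "%a %b %d %H:%M:%S EDT %Y"  # "EDT" treated as a literal


def backward_steps(lines):
    """Count samples whose timestamp is earlier than the one before it."""
    stamps = [datetime.strptime(s, DATE_FORMAT) for s in lines]
    return sum(1 for prev, cur in zip(stamps, stamps[1:]) if cur < prev)


# The seven "date" outputs from the transcript, in order.
samples = [
    "Fri Aug 26 20:11:02 EDT 2005",
    "Fri Aug 26 20:11:02 EDT 2005",
    "Fri Aug 26 20:11:02 EDT 2005",
    "Fri Aug 26 20:11:01 EDT 2005",
    "Fri Aug 26 20:11:01 EDT 2005",
    "Fri Aug 26 20:11:01 EDT 2005",
    "Fri Aug 26 20:11:02 EDT 2005",
]

print(backward_steps(samples))  # 1
```

A wall clock that is merely stopped would give zero backward steps; the single step back from :02 to :01 is what shows it is flipping between two values.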
Comment 7 John Haxby 2005-08-29 10:56:03 EDT
The no_timer_check problem is specific to ATI chipset machines.   The problem
described in comment #4 and comment #6 would appear to be something rather
different.

The problem of USB and IEEE1394 devices stopping working in comment #5 /might/ be
related, but it seems unlikely.

The triple speed clock on the R3000Z is a problem I came across while
researching this one.   It seems to be a different one: the clock running at
exactly twice the expected speed is simply a result of servicing each clock
interrupt twice.

I do hope that the no_timer_check fix will appear in RHEL4 U2.
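
The "each clock interrupt serviced twice" explanation can be illustrated with a toy model. This is not kernel code: `simulate_jiffies`, `delivery_paths` and the HZ value of 1000 are assumptions made purely for the illustration.

```python
def simulate_jiffies(hardware_ticks: int, delivery_paths: int) -> int:
    """Toy model: every hardware tick is delivered once per path, and the
    handler advances the tick counter on every delivery."""
    jiffies = 0
    for _ in range(hardware_ticks):
        for _ in range(delivery_paths):
            jiffies += 1
    return jiffies


HZ = 1000          # assumed tick rate
real_seconds = 5   # real time covered by the simulation

normal = simulate_jiffies(HZ * real_seconds, delivery_paths=1)
doubled = simulate_jiffies(HZ * real_seconds, delivery_paths=2)

# Seconds the kernel would believe have passed in each case.
print(normal / HZ, doubled / HZ)  # 5.0 10.0
```

With two delivery paths the counter reaches ten seconds' worth of ticks in five real seconds, which matches the exact factor of two in the original report and would not explain a factor of three.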
Comment 8 Ken Roberts 2005-09-12 17:10:15 EDT
I have the same issue on RHEL 3: the clock advances 2.5 hours in 1.5 hours of
real time.  I have a Radeon graphics card, but the system clock only speeds up
if I'm using the SMP kernel.  If I switch to the single-CPU kernel, the system
clock seems manageable by NTP.
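
To put the drift described above in the units NTP works with, here is a quick calculation. The 500 ppm figure is the frequency error ntpd is normally able to compensate for; the names in the sketch are illustrative.

```python
NTP_MAX_SLEW_PPM = 500  # frequency error ntpd can normally correct


def drift_ppm(clock_elapsed_hours: float, real_elapsed_hours: float) -> float:
    """Clock frequency error in parts per million relative to real time."""
    return (clock_elapsed_hours / real_elapsed_hours - 1.0) * 1_000_000


ppm = drift_ppm(2.5, 1.5)
print(round(ppm), ppm > NTP_MAX_SLEW_PPM)  # 666667 True
```

An error of roughly 667000 ppm is more than a thousand times what NTP can slew away, so on the SMP kernel NTP cannot be expected to hold the clock.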

My kernel and processor info:
%cat /proc/version
Linux version 2.4.21-27.ELsmp (bhcompile@bugs.build.redhat.com) (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-47)) #1 SMP Wed Dec 1 21:59:02 EST 2004

%cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.60GHz
stepping        : 9
cpu MHz         : 2593.696
cache size      : 512 KB
physical id     : 0
siblings        : 2
runqueue        : 0
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 5177.34

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.60GHz
stepping        : 9
cpu MHz         : 2593.696
cache size      : 512 KB
physical id     : 0
siblings        : 2
runqueue        : 0
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 5177.34

Comment 9 Henrik Nordstrom 2006-02-01 21:12:24 EST
For additional info on this ATI problem see Bug #152170 and the upstream kernel
bug http://bugme.osdl.org/show_bug.cgi?id=4442 referenced from there.

There is apparently a patch being pushed to the mainline kernel to work around
this hardware issue by implementing the timer using the APIC alone.  It seems to
be unrelated to, or at least different from, the no_timer_check option.
Comment 10 Mark A. Baldridge 2006-02-07 13:26:20 EST
With the ATI Radeon XPress 200 video chipset (eMachines T6520 with jusb2
unplugged to make *that* work ;-) I see this at 2.6.15-1.1830-FC4, but the clock
ran normally at 2.6.14-1.1656-FC4.  I have also noticed another issue which I
will go hunt for:  using the UniVerse database (although I no longer get to see
the code), I do a series of lseek/fwrite calls to the same disk block about
23000 times.  This takes about 1 second of both CPU and elapsed time, as
reported under the 2.6.14 kernel.  Under the 2.6.15 kernel, 8 of about 60
one-second bursts report 1.7 CPU seconds and 20 to 60 elapsed seconds.  (The
elapsed seconds are high, obviously.)
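
For anyone who wants to try a similar measurement without UniVerse, the following is a rough stand-in. It is not the database's code: it uses unbuffered `os.write` rather than `fwrite`, a temporary file, and an assumed 512-byte block.

```python
import os
import tempfile
import time


def time_rewrites(iterations: int = 23000, block_size: int = 512):
    """Rewrite the same block `iterations` times.

    Returns (cpu_seconds, elapsed_seconds) as reported by the local clocks.
    """
    payload = b"x" * block_size
    fd, path = tempfile.mkstemp()
    try:
        cpu_start = time.process_time()
        wall_start = time.perf_counter()
        for _ in range(iterations):
            os.lseek(fd, 0, os.SEEK_SET)  # always go back to the same block
            os.write(fd, payload)
        cpu = time.process_time() - cpu_start
        elapsed = time.perf_counter() - wall_start
    finally:
        os.close(fd)
        os.unlink(path)
    return cpu, elapsed


cpu, elapsed = time_rewrites()
print(f"cpu={cpu:.3f}s elapsed={elapsed:.3f}s")
```

Both figures come from the machine's own timekeeping, so on a kernel whose clock runs fast they are skewed as well; timing the run from another machine gives an independent check.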

cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 12
model name      : AMD Athlon(tm) 64 Processor 3400+
stepping        : 0
cpu MHz         : 2393.076
cache size      : 512 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow
bogomips        : 4795.40
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp
Comment 11 Eric Sandeen 2006-08-22 23:49:06 EDT
FWIW, this seems fixed for me in kernel-2.6.9-42 from the latest RHEL4 release.

I have a Compaq Presario 1710nx with the ATI chipset.
Comment 12 Jim Paradis 2006-08-23 17:44:35 EDT
I believe this one is a duplicate of Bug 173236, which was fixed for U4.

*** This bug has been marked as a duplicate of 173236 ***
Comment 13 John Haxby 2006-08-29 13:38:51 EDT
I can confirm that the original problem is fixed in the U4 kernel (2.6.9-42) and
that the fix in bug 173236 fixes the problem properly.
Comment 14 linuxacct 2006-09-28 18:55:34 EDT
(In reply to comment #12)
> I believe this one is a duplicate of Bug 173236, which was fixed for U4.
> 
> *** This bug has been marked as a duplicate of 173236 ***

I get "You are not authorized to access bug #173236." when trying to access this
bug, even when logged in. What's the problem?
