Bug 521517 - kvmclock sometimes jumps 4397 seconds into the future
Summary: kvmclock sometimes jumps 4397 seconds into the future
Keywords:
Status: CLOSED DUPLICATE of bug 531268
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.4
Hardware: All
OS: Linux
high
medium
Target Milestone: rc
: ---
Assignee: Glauber Costa
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 528898
TreeView+ depends on / blocked
 
Reported: 2009-09-06 16:46 UTC by Rik van Riel
Modified: 2013-01-09 21:54 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-12-28 12:56:18 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
systemtap script to measure time skew in do_gettimeofday (574 bytes, text/plain)
2009-09-10 01:40 UTC, Rik van Riel
no flags Details
program to test for time going backwards (595 bytes, application/octet-stream)
2009-09-17 22:17 UTC, Rik van Riel
no flags Details

Description Rik van Riel 2009-09-06 16:46:11 UTC
Description of problem:

With a RHEL 5 single CPU guest running on top of Fedora 11 + virt-preview RPMs, sometimes gettimeofday in RHEL 5, with kvmclock clock source, jumps 4397 seconds into the future and then right back.

This backwards move ends up causing the tragic death of my dovecot IMAP daemon:

Sep  6 10:20:32 shelob dovecot: Time just moved backwards by 4397 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards

Version-Release number of selected component (if applicable):

kernel-2.6.18-165.el5

Steps to Reproduce:
1. run a RHEL 5 kernel with the -165 kernel on Fedora 11
2. watch the fireworks
3.
  
Additional info:

I found some additional info at these URLs.  It is unclear whether the bug fix would touch just virtio kvmclock (though I have never seen it on Xen PV guests) or whether it would touch more of the timer code...

http://www.dovecot.org/list/dovecot/2008-June/031548.html

http://lkml.org/lkml/2007/8/23/96

Comment 1 Glauber Costa 2009-09-08 13:33:41 UTC
Rik, is your VM a i386 or x86_64 VM ?

Comment 2 Rik van Riel 2009-09-08 14:34:55 UTC
x86-64

Comment 3 Rik van Riel 2009-09-08 18:18:53 UTC
Glauber, your test kernel reduces the problem to:

Sep  8 14:13:02 shelob dovecot: auth(default): Time just moved backwards by 5 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards
Sep  8 14:13:10 shelob dovecot: imap-login: Time just moved backwards by 4 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards
Sep  8 14:15:20 shelob dovecot: Time just moved backwards by 4 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards
Sep  8 14:15:48 shelob dovecot: Time just moved backwards by 5 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards

So we went from the occasional gigantic time jump to regular small time jumps.  Of course, I did boot my guest with 3 VCPUs :)

Comment 4 Rik van Riel 2009-09-08 18:22:43 UTC
Btw, it looks like 5 seconds is the dovecot threshold:

Sep  8 14:18:56 shelob dovecot: pop3-login: Time just moved backwards by 6 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards

Comment 5 Rik van Riel 2009-09-08 22:20:39 UTC
Glauber, I have bad news.


With your test kernel and echo 0 > /proc/sys/kernel/vsyscall64 I get small time jumps.

However, with kernel -165 (which does not have your patch) and echo 0 > /proc/sys/kernel/vsyscall64 I get no time jumps at all.  

Your patch seems to make things worse. I have not yet figured out why, though.

Comment 6 Rik van Riel 2009-09-08 23:36:16 UTC
Never mind that, I didn't wait long enough.  The first half hour of uptime had no time jumps, but now I'm seeing them in the -165 kernel, too.  It looks like your patch doesn't change things.

Sep  8 19:34:53 shelob dovecot: pop3-login: Time just moved backwards by 1 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards
Sep  8 19:34:53 shelob dovecot: pop3-login: Time just moved backwards by 2 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards

Comment 7 Rik van Riel 2009-09-10 01:32:31 UTC
Instrumenting do_gettimeofday (the one in arch/x86_64/kernel/time.c), keeping track of xtime.tv_sec, tv_sec in the returned tv and the last second count that was returned results in a lot of output like this:

do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546033, lastsecond = 1252546035, cpu = 0
do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546035, lastsecond = 1252546033, cpu = 2
do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546033, lastsecond = 1252546035, cpu = 0
do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546035, lastsecond = 1252546033, cpu = 1
do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546033, lastsecond = 1252546035, cpu = 0
do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546035, lastsecond = 1252546033, cpu = 1
do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546033, lastsecond = 1252546035, cpu = 0

It looks like cpu0 is "behind" the other two, with cpu0 staying in sync with xtime and cpu1 and cpu2 getting larger and larger "corrections" from do_gettimeoffset_kvm().

static inline long do_gettimeoffset_kvm(void)
{
        return (kvm_clock_read() - vxtime.last_kvm);
}

That means either kvm_clock_read() is returning something wrong, or vxtime.last_kvm should be per cpu.  I suspect kvm_clock_read() is returning incorrect results.

Comment 8 Rik van Riel 2009-09-10 01:40:11 UTC
Created attachment 360397 [details]
systemtap script to measure time skew in do_gettimeofday

Comment 9 Rik van Riel 2009-09-10 14:49:05 UTC
In related news, after 16 hours of uptime:

1) vcpu0 is now 32 seconds behind the host clock

2) the time skew between vcpu0 and vcpu2 has increased to 12 seconds

Once the slashdot madness to kernelnewbies.org quiets down, I will reboot the host into a Fedora 2.6.30 kernel, to see if that makes the bug go away.  At this point it looks like it could definately be a host side bug, since kvmclock pretty much just gets the time from the host.

Can't rule anything out yet though - I will dig deeper.

Comment 10 Rik van Riel 2009-09-11 19:30:47 UTC
Upgrading the host to 2.6.30 did not fix the bug. Time for yet more instrumentation :)

Comment 11 Rik van Riel 2009-09-15 19:12:17 UTC
As requested:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 1596.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4533.36
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 1596.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 1
cpu cores       : 4
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4532.64
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 1596.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 2
cpu cores       : 4
apicid          : 4
initial apicid  : 4
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4532.64
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 1596.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 6
initial apicid  : 6
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4532.64
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 4
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 1596.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4532.63
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 5
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 1596.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 1
cpu cores       : 4
apicid          : 3
initial apicid  : 3
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4532.64
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 6
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 1596.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 2
cpu cores       : 4
apicid          : 5
initial apicid  : 5
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4532.64
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 1596.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips        : 4532.64
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

Comment 12 Rik van Riel 2009-09-17 22:17:03 UTC
Created attachment 361574 [details]
program to test for time going backwards

With vsyscall gettimeofday this happens all the time, with echo 0 > /proc/sys/kernel/vsyscall64 it happens less often.

Comment 13 RHEL Program Management 2009-09-25 17:41:15 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 15 Dor Laor 2009-12-27 08:53:06 UTC
Can you please retest with latest 5.5 -183 kernel ?

Comment 17 Glauber Costa 2009-12-28 12:56:18 UTC

*** This bug has been marked as a duplicate of bug 531268 ***


Note You need to log in before you can comment on or make changes to this bug.