Description of problem: With a RHEL 5 single CPU guest running on top of Fedora 11 + virt-preview RPMs, sometimes gettimeofday in RHEL 5, with kvmclock clock source, jumps 4397 seconds into the future and then right back. This backwards move ends up causing the tragic death of my dovecot IMAP daemon: Sep 6 10:20:32 shelob dovecot: Time just moved backwards by 4397 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards Version-Release number of selected component (if applicable): kernel-2.6.18-165.el5 Steps to Reproduce: 1. run a RHEL 5 kernel with the -165 kernel on Fedora 11 2. watch the fireworks 3. Additional info: I found some additional info at these URLs. It is unclear whether the bug fix would touch just virtio kvmclock (though I have never seen it on Xen PV guests) or whether it would touch more of the timer code... http://www.dovecot.org/list/dovecot/2008-June/031548.html http://lkml.org/lkml/2007/8/23/96
Rik, is your VM a i386 or x86_64 VM ?
x86-64
Glauber, your test kernel reduces the problem to: Sep 8 14:13:02 shelob dovecot: auth(default): Time just moved backwards by 5 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards Sep 8 14:13:10 shelob dovecot: imap-login: Time just moved backwards by 4 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards Sep 8 14:15:20 shelob dovecot: Time just moved backwards by 4 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards Sep 8 14:15:48 shelob dovecot: Time just moved backwards by 5 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards So we went from the occasional gigantic time jump to regular small time jumps. Of course, I did boot my guest with 3 VCPUs :)
Btw, it looks like 5 seconds is the dovecot threshold: Sep 8 14:18:56 shelob dovecot: pop3-login: Time just moved backwards by 6 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards
Glauber, I have bad news. With your test kernel and echo 0 > /proc/sys/kernel/vsyscall64 I get small time jumps. However, with kernel -165 (which does not have your patch) and echo 0 > /proc/sys/kernel/vsyscall64 I get no time jumps at all. Your patch seems to make things worse. I have not yet figured out why, though.
Never mind that, I didn't wait long enough. The first half hour of uptime had no time jumps, but now I'm seeing them in the -165 kernel, too. It looks like your patch doesn't change things. Sep 8 19:34:53 shelob dovecot: pop3-login: Time just moved backwards by 1 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards Sep 8 19:34:53 shelob dovecot: pop3-login: Time just moved backwards by 2 seconds. I'll sleep now until we're back in present. http://wiki.dovecot.org/TimeMovedBackwards
Instrumenting do_gettimeofday (the one in arch/x86_64/kernel/time.c), keeping track of xtime.tv_sec, tv_sec in the returned tv and the last second count that was returned results in a lot of output like this: do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546033, lastsecond = 1252546035, cpu = 0 do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546035, lastsecond = 1252546033, cpu = 2 do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546033, lastsecond = 1252546035, cpu = 0 do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546035, lastsecond = 1252546033, cpu = 1 do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546033, lastsecond = 1252546035, cpu = 0 do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546035, lastsecond = 1252546033, cpu = 1 do_gettimeofday: xtime.tv_sec = 1252546033, sec = 1252546033, lastsecond = 1252546035, cpu = 0 It looks like cpu0 is "behind" the other two, with cpu0 staying in sync with xtime and cpu1 and cpu2 getting larger and larger "corrections" from do_gettimeoffset_kvm(). static inline long do_gettimeoffset_kvm(void) { return (kvm_clock_read() - vxtime.last_kvm); } That means either kvm_clock_read() is returning something wrong, or vxtime.last_kvm should be per cpu. I suspect kvm_clock_read() is returning incorrect results.
Created attachment 360397 [details] systemtap script to measure time skew in do_gettimeofday
In related news, after 16 hours of uptime: 1) vcpu0 is now 32 seconds behind the host clock 2) the time skew between vcpu0 and vcpu2 has increased to 12 seconds Once the slashdot madness to kernelnewbies.org quiets down, I will reboot the host into a Fedora 2.6.30 kernel, to see if that makes the bug go away. At this point it looks like it could definately be a host side bug, since kvmclock pretty much just gets the time from the host. Can't rule anything out yet though - I will dig deeper.
Upgrading the host to 2.6.30 did not fix the bug. Time for yet more instrumentation :)
As requested: processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 26 model name : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz stepping : 5 cpu MHz : 1596.000 cache size : 8192 KB physical id : 0 siblings : 8 core id : 0 cpu cores : 4 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 11 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid bogomips : 4533.36 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 26 model name : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz stepping : 5 cpu MHz : 1596.000 cache size : 8192 KB physical id : 0 siblings : 8 core id : 1 cpu cores : 4 apicid : 2 initial apicid : 2 fpu : yes fpu_exception : yes cpuid level : 11 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid bogomips : 4532.64 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 2 vendor_id : GenuineIntel cpu family : 6 model : 26 model name : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz stepping : 5 cpu MHz : 1596.000 cache size : 8192 KB physical id : 0 siblings : 8 core id : 2 cpu cores : 4 apicid : 4 initial apicid : 4 fpu : yes fpu_exception : yes cpuid level : 11 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid bogomips : 4532.64 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 3 vendor_id : GenuineIntel cpu family : 6 model : 26 model name : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz stepping : 5 cpu MHz : 1596.000 cache size : 8192 KB physical id : 0 siblings : 8 core id : 3 cpu cores : 4 apicid : 6 initial apicid : 6 fpu : yes fpu_exception : yes cpuid level : 11 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid bogomips : 4532.64 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 4 vendor_id : GenuineIntel cpu family : 6 model : 26 model name : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz stepping : 5 cpu MHz : 1596.000 cache size : 8192 KB physical id : 0 siblings : 8 core id : 0 cpu cores : 4 apicid : 1 initial apicid : 1 fpu : yes fpu_exception : yes cpuid level : 11 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid bogomips : 4532.63 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 5 vendor_id : GenuineIntel cpu family : 6 model : 26 model name : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz stepping : 5 cpu MHz : 1596.000 cache size : 8192 KB physical id : 0 siblings : 8 core id : 1 cpu cores : 4 apicid : 3 initial apicid : 3 fpu : yes fpu_exception : yes cpuid level : 11 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid bogomips : 4532.64 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 6 vendor_id : GenuineIntel cpu family : 6 model : 26 model name : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz stepping : 5 cpu MHz : 1596.000 cache size : 8192 KB physical id : 0 siblings : 8 core id : 2 cpu cores : 4 apicid : 5 initial apicid : 5 fpu : yes fpu_exception : yes cpuid level : 11 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid bogomips : 4532.64 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 7 vendor_id : GenuineIntel cpu family : 6 model : 26 model name : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz stepping : 5 cpu MHz : 1596.000 cache size : 8192 KB physical id : 0 siblings : 8 core id : 3 cpu cores : 4 apicid : 7 initial apicid : 7 fpu : yes fpu_exception : yes cpuid level : 11 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 lahf_lm ida tpr_shadow vnmi flexpriority ept vpid bogomips : 4532.64 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management:
Created attachment 361574 [details] program to test for time going backwards With vsyscall gettimeofday this happens all the time, with echo 0 > /proc/sys/kernel/vsyscall64 it happens less often.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Can you please retest with latest 5.5 -183 kernel ?
*** This bug has been marked as a duplicate of bug 531268 ***