Bug 520685

Summary: use KVM pvclock code to detect/correct lost ticks [rhel-5.4.z]
Product: Red Hat Enterprise Linux 5 Reporter: Benjamin Kahn <bkahn>
Component: kernel Assignee: Jiri Pirko <jpirko>
Status: CLOSED ERRATA QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 5.4 CC: bfox, dhoward, dzickus, gcosta, jpirko, lihuang, llim, michen, ovirt-maint, pm-eus, rdoty, riek, rkhan, rlerch, sburgess, sgrinber, tao, tburke, vanhoof
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: kernel-2.6.18-164.1.1.el5 Doc Type: Bug Fix
Doc Text:
The KVM pvclock code provides a stable source of timing for KVM guests that support it. However, there are rare occasions where the kvmclock does not function well when the physical CPU does not support constant_tsc and the guest is i386 (32 bit), causing the kvmclock to skew. To work around this issue, view the Red Hat Knowledgebase article located at <URL to be advised>
Last Closed: 2009-09-29 19:33:54 UTC Type: ---
Bug Depends On: 476075    
Bug Blocks: 518493    

Description Benjamin Kahn 2009-09-01 19:41:30 UTC
This bug has been copied from bug #476075 and has been proposed
to be backported to 5.4 z-stream (EUS).

Comment 3 Jiri Pirko 2009-09-08 11:08:22 UTC
in kernel-2.6.18-164.1.1.el5

Comment 13 lihuang 2009-09-25 06:02:24 UTC
Test Reboot on AMD Hosts:

  After rebooting the guest, the guest is <0.1s faster than the host.
  i686   : PASS ( offset is -0.569655 , -0.569714 , -0.569160 )
 x86_64  : PASS ( offset is -0.466835 , -0.587773 , -0.675310 )

raw log is here :
  http://focus.bne.redhat.com/~lihuang/timekeeping/amd-pv-i686-reboot.txt
  http://focus.bne.redhat.com/~lihuang/timekeeping/amd-pv-x86_64-reboot.txt

Comment 15 lihuang 2009-09-25 06:24:45 UTC
======  Test summary  ======
 1 . time drift after boot : 
   (1~2s later than host after starting up; this is a known issue, bug 523478)
 2 . time drift after migration :
   PASS
 3 . time drift after reboot :
   PASS
 4 . Measure time drift in 12 hours.
   i686 : PASS
 x86_64 : AMD - offset increases by 0.1 sec every 10 min
        Intel - offset increases by 0.01 sec every 10 min
============================
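
To put the x86_64 figures from the 12-hour test in perspective, here is a small worked calculation. It is only a sketch: it assumes the offset grows linearly at the reported rate for the whole run, which the summary above does not state explicitly, and the function names are made up for illustration.

```python
# Sketch: convert the reported offset growth into a drift rate and a
# projected total offset. Linear growth over the full run is an assumption.

def drift_ppm(offset_growth_s, interval_s):
    """Return clock drift in parts per million."""
    return offset_growth_s / interval_s * 1e6


def projected_offset(offset_growth_s, interval_s, duration_s):
    """Return the offset (seconds) accumulated over duration_s."""
    return offset_growth_s * duration_s / interval_s


TEN_MIN = 10 * 60          # measurement interval quoted in the summary
TWELVE_H = 12 * 60 * 60    # length of the drift test

# Growth per 10 minutes as reported for x86_64 guests.
for host, growth in (("AMD", 0.1), ("Intel", 0.01)):
    print(f"{host}: {drift_ppm(growth, TEN_MIN):.1f} ppm, "
          f"{projected_offset(growth, TEN_MIN, TWELVE_H):.2f} s over 12 hours")
```

Under that assumption the AMD host works out to roughly 167 ppm (about 7.2 s over 12 hours) and the Intel host to roughly 17 ppm (about 0.72 s).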

Comment 22 errata-xmlrpc 2009-09-29 19:33:54 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1455.html

Comment 23 Ryan Lerch 2009-09-29 22:57:42 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
The KVM pvclock code provides a stable source of timing for KVM guests. pvclock does not function on machines that lack the constant_tsc flag. However, some machines that do have the constant_tsc flag may not have a stable Time Stamp Counter (TSC), causing the kvmclock to skew. To work around this issue, boot the kernel with the parameter "clock=pmtimer".

Comment 25 Ryan Lerch 2009-09-29 23:41:44 UTC
Release note updated.

Diffed Contents:
@@ -1 +1 @@
-The KVM pvclock code provides a stable source of timing for KVM guests. pvclock does not function on machines that lack the constant_tsc flag. However, some machines that do have the constant_tsc flag may not have a stable Time Stamp Counter (TSC), causing the kvmclock to skew. To work around this issue, boot the kernel with the parameter "clock=pmtimer".
+The KVM pvclock code provides a stable source of timing for KVM guests. However, on x86_64 based machines that have the constant_tsc flag, the Time Stamp Counter (TSC) may not be stable, causing the kvmclock to skew. To work around this issue, view the Red Hat Knowledgebase article located at <URL to be advised>

Comment 26 Dor Laor 2009-09-30 09:17:40 UTC
Here is a summary of what we agreed upon:
 + General

   - When the physical cpu on the host has stable_tsc capability (all
     new hosts, > AMD rev-F) it should work well, apart from the issues
     listed below.

   - There is a workaround of the guest kernel cmdline to avoid using
     pv clock: http://cleo.tlv.redhat.com/qumrawiki/KVM/TimeKeeping
     It was validated by QE.

 + Problems

   - Live migration: We need to make sure that the guest won't notice
     the clock going backwards. We have agreed upon a solution. Glauber
     is working on patches.

   - i386 rhel5.4 guest
     Works fine!

   - 64 bit rhel5.4 guest
     There is a potential pitfall with non-stable_tsc hosts.
     Details: last_tsc variable in the guest is too hard to keep. As
     long as the tsc on the host (cpu) is stable, there won't be
     surprises.

 + Recommendations

   - Option 1: Cancel pv clock in rhel5.4 guest
     I don't recommend it since it does the job and we can always
     cancel it. The only breakage danger is guest time which might
     happen without the clock.

   - Option 2: Allow it only on stable_tsc hosts.
     It needs migration to be fixed.
     Most logical.

Option 2 was chosen.
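
As a rough illustration of how a host could be screened for Option 2, the sketch below looks for the constant_tsc flag in /proc/cpuinfo-style text. The function name and the sample text are hypothetical, not taken from any patch in this bug. Note also that an earlier release-note draft in this report says the flag alone may not guarantee a stable TSC, so this is at best a first-pass check.

```python
# Sketch: list processors whose "flags" line lacks a given CPU flag.
# cpus_missing_flag and SAMPLE are illustrative names, not part of any fix.

def cpus_missing_flag(cpuinfo_text, flag="constant_tsc"):
    """Return the processor numbers whose 'flags' line lacks `flag`."""
    missing = []
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processors
        key = key.strip()
        if key == "processor":
            current = int(value)
        elif key == "flags" and flag not in value.split():
            missing.append(current)
    return missing


# Made-up excerpt in /proc/cpuinfo format, for demonstration only.
SAMPLE = """\
processor\t: 0
flags\t\t: fpu tsc msr constant_tsc

processor\t: 1
flags\t\t: fpu tsc msr
"""

print(cpus_missing_flag(SAMPLE))
```

On a real host the same function could be fed the contents of /proc/cpuinfo; an empty list would mean every processor advertises the flag.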

Comment 27 Dor Laor 2009-09-30 09:34:16 UTC
Release note updated.

Diffed Contents:
@@ -1 +1 @@
-The KVM pvclock code provides a stable source of timing for KVM guests. However, on x86_64 based machines that have the constant_tsc flag, the Time Stamp Counter (TSC) may not be stable, causing the kvmclock to skew. To work around this issue, view the Red Hat Knowledgebase article located at <URL to be advised>
+The KVM pvclock code provides a stable source of timing for KVM guests that support it. However, there are rare occasions where the kvmclock does not function well when the physical CPU does not support constant_tsc and the guest is i386 (32 bit), causing the kvmclock to skew. To work around this issue, view the Red Hat Knowledgebase article located at <URL to be advised>

Comment 28 Susan Burgess 2009-11-18 06:07:36 UTC
Please advise the KB article

Comment 29 Dor Laor 2009-11-23 10:16:07 UTC
(In reply to comment #28)
> Please advise the KB article  

It should be based on the above wiki and VMware's recommendations.
Simon Grinberg is the one to write it.