Bug 655990 - clock drift when migrating a guest between mis-matched CPU clock speed
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.5.z
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Glauber Costa
QA Contact: Virtualization Bugs
Keywords: ZStream
Duplicates: 633693 645308
Blocks: Rhel5KvmTier2 660239
 
Reported: 2010-11-22 14:58 EST by Douglas Schilling Landgraf
Modified: 2013-01-21 03:25 EST
Fixed In Version: kvm-83-220.el5
Doc Type: Bug Fix
Last Closed: 2011-01-13 18:38:26 EST

Attachments:
dmesg output from guest that experiences time drift (15.84 KB, application/octet-stream)
2010-11-24 11:55 EST, Heath Petty
Description Douglas Schilling Landgraf 2010-11-22 14:58:41 EST
Description of problem:

In RHEL 5.5 64-bit (latest errata kernel), when a guest that uses the kvm clock is live migrated to another machine with a different CPU clock speed, the guest clock consistently loses or gains time. ntpd is not able to keep the system time correct.

How reproducible:

In RHEVM define a RHEL 5.5 virtual machine using the latest 5.5 errata kernel. 

Sanity check to see if the clock is stable:

# ntpdate -qb clock.redhat.com

Live migrate the virtual machine to another host that has a faster clock speed.
After migration, observe that the clock is consistently losing time; that is, the offset reported by the ntpdate command above decreases steadily.

Additional info:

A reproducer is set up on rhev3-m.gsslab.rdu.redhat.com (RHEVM machine) for investigation.

On guests that do not use the kvm clock, and that have the recommended kernel boot options configured [1], ntpd can easily correct any small drift caused by the migration. With the pvclock, however, the time quickly and consistently changes after migration to another host. Once the guest is migrated back to the original host, the clock stops drifting. The drift seen is approximately half a second for every second the host is running.

[1]
http://documentation-stage.bne.redhat.com/docs/en-US/Red_Hat_Enterprise_Virtualization_for_Desktops/2.2/html/Administration_Guide/chap-Virtualization-KVM_guest_timing_management.html
Comment 2 Glauber Costa 2010-11-23 11:24:14 EST
Zach, can you take a first look at this to see if there are any TSC-related fixes that might help here?

Meanwhile, can the reporter please provide cpuinfo for both hosts?
Comment 4 Zachary Amsden 2010-11-23 11:52:54 EST
Please provide evidence of the clocksource being kvm clock in the guest.  Also, please provide the EXACT kernel version being used, all kernel command line parameters and the qemu command line being used.

In my estimation, kvm clock should not cause drift like this on migration; the clock should only drift if TSC is being used as a time source.  It is possible that lost tick compensation is being done based on TSC, even with kvm clock, and this causes drift.  This is currently my only plausible theory for the drift.

If this turns out to be true, that kvm clock is drifting after migration, we have a serious problem which we have not yet addressed with any upstream patches.  In that case, there is no way this can be fixed in 5.5 or 5.6 - the development to solve the problem has not even been upstreamed yet.
Comment 6 Zachary Amsden 2010-11-23 21:04:53 EST
You are not using kvm clock.  Therefore this is the expected behavior.  Fixes to make TSC timesource stable across migration with different host CPU speeds will not be available until RHEL 6.1.
Comment 7 Glauber Costa 2010-11-24 08:12:35 EST
Zach, I think he is.

Note the time.c: Using 1.193182 MHz WALL KVM GTOD KVM timer.

For x86_64 RHEL6, this is usually an indication that kvmclock is on.

Can we see your cpuinfo for the guests?
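The check Glauber describes (reading the timer line from the guest's dmesg) can be done mechanically. The following is a minimal Python sketch, assuming the line always has the layout of the single example quoted in this thread; the regular expression, the function name `guest_timer`, and the non-KVM sample line are illustrative assumptions, not taken from the kernel source.

```python
import re

# Layout inferred from the one line quoted in this bug:
#   time.c: Using 1.193182 MHz WALL KVM GTOD KVM timer.
TIMER_LINE = re.compile(
    r"time\.c: Using (?P<mhz>[\d.]+) MHz WALL (?P<wall>\S+) GTOD (?P<gtod>\S+) timer"
)

def guest_timer(dmesg_text):
    """Return the WALL/GTOD sources named in dmesg, or None if no line matches."""
    match = TIMER_LINE.search(dmesg_text)
    if match is None:
        return None
    return {
        "wall": match["wall"],
        "gtod": match["gtod"],
        # Heuristic only: "KVM" as the GTOD source suggests kvmclock is in use.
        "kvmclock": match["gtod"] == "KVM",
    }

quoted = "time.c: Using 1.193182 MHz WALL KVM GTOD KVM timer."
hypothetical = "time.c: Using 1.193182 MHz WALL PIT GTOD PIT/TSC timer."  # assumed sample

print(guest_timer(quoted))
print(guest_timer(hypothetical))
```

In practice the input would be the saved output of `dmesg` from the guest, such as the attachment on this bug.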
Comment 9 Glauber Costa 2010-11-24 11:47:39 EST
dmesg would also be helpful, btw.

sorry for not requesting before.
Comment 10 Zachary Amsden 2010-11-24 11:48:10 EST
The cpuinfo is never updated upon migration.  It is read once, at boot, and never changed.
Comment 12 Heath Petty 2010-11-24 11:55:58 EST
Created attachment 462697 [details]
dmesg output from guest that experiences time drift
Comment 14 Glauber Costa 2010-11-25 09:07:00 EST
Guys,

Is this reproducible with a -smp 1 VM?

If it is related to TSC frequencies alone, it should happen on UP as well.
It would also heavily simplify the assumptions we have to make here.
Comment 22 Douglas Schilling Landgraf 2010-12-01 11:24:30 EST
Hello Dor,

   Let us know if you need additional tests or data.

Thanks
Douglas
Comment 23 Dor Laor 2010-12-01 13:29:45 EST
Is that RHEV or RHEL host?
What happens if you turn on the cpuspeed service on the host and force a frequency change on the host (without migrating)? Does the guest time drift?
Is it possible to get access to the machines for debug?
Comment 25 Zachary Amsden 2010-12-01 15:15:14 EST
(In reply to comment #23)
> Is that RHEV or RHEL host?
> What happens if you turn on the cpuspeed service on the host and force a
> frequency change on the host (without migrating)? Does the guest time drift?
> Is it possible to get access to the machines for debug?

cpuspeed changes are compensated for by kvmclock, as we receive notification before and after the CPU frequency change.

This same compensation should take place after migration, however, it is possible that we don't reset the kvmclock data.

Is the hardware Intel?  Yes, I see it is.  This would only happen on Intel hardware...

What is the host kernel / kvm version?  I don't see it in the data here.
Comment 28 Zachary Amsden 2010-12-01 18:35:34 EST
How does one log in to these machines?  I have no idea how to access the gss servers.
Comment 30 Zachary Amsden 2010-12-01 19:24:23 EST
Note that the migration problem Dor was referring to is long since fixed in the hypervisor versions they are using.

https://bugzilla.redhat.com/show_bug.cgi?id=531701

As such, we have no way to explain the bug.  The kvmclock should be set on migration to the new host, and should compute a new time scale if tsc_khz is different there.
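To make the "new time scale" point concrete: the pvclock interface converts TSC ticks to nanoseconds with a 32-bit multiplier and a shift. The sketch below, in Python, shows how a scale computed for one TSC frequency mis-measures time when the TSC actually ticks at another. The helper `time_scale` is a simplified stand-in for what the hypervisor computes, and both frequencies are illustrative assumptions (the "cpu MHz" values reported later in this thread, which need not equal the real TSC rates).

```python
NSEC_PER_SEC = 1_000_000_000

def time_scale(tsc_hz):
    """Return (shift, mul) so that ns = ((delta shifted by shift) * mul) >> 32."""
    shift, scaled = 0, tsc_hz
    while scaled > 2 * NSEC_PER_SEC:   # halve so that mul uses the full 32 bits
        scaled //= 2
        shift -= 1
    while scaled <= NSEC_PER_SEC:      # double so that mul still fits in 32 bits
        scaled *= 2
        shift += 1
    mul = (NSEC_PER_SEC << 32) // scaled
    return shift, mul

def tsc_to_ns(delta, shift, mul):
    """Apply the fixed-point scale to a TSC delta."""
    delta = delta << shift if shift >= 0 else delta >> -shift
    return (delta * mul) >> 32

SRC_HZ = 3_059_095_000   # assumed TSC rate of the source host
DST_HZ = 3_325_151_000   # assumed TSC rate of the destination host

stale = time_scale(SRC_HZ)   # scale left over from before the migration
fresh = time_scale(DST_HZ)   # scale recomputed on the destination

# One real second on the destination is DST_HZ ticks.
ns_fresh = tsc_to_ns(DST_HZ, *fresh)
ns_stale = tsc_to_ns(DST_HZ, *stale)
print("recomputed scale:", ns_fresh, "ns per real second")
print("stale scale:     ", ns_stale, "ns per real second")
```

With these numbers the stale scale makes the guest clock run roughly 8.7% fast, which is far beyond what ntpd can slew away; a recomputed scale stays within a few nanoseconds per second.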
Comment 31 Zachary Amsden 2010-12-01 19:48:56 EST
These documents don't help me very much at all.  I've never used remote lab machines before.  So far everything I try fails.

I can't log into PDU, connection to IBM Blade center management page times out, and I don't know the root password.

Can someone PM these to me over irc?
Comment 32 Zachary Amsden 2010-12-01 20:27:51 EST
I will need to do process traces on the hosts and maybe also run custom qemu-kvm binaries to debug, bypassing rhevm interfaces.  Is there any way to log into these hypervisors as root to do this?
Comment 35 Dan Kenigsberg 2010-12-02 06:06:43 EST
List the VMs known to vdsm:

vdsClient -s 0 list table

Send a monitor command:

vdsClient -s 0 list monitorCommand <vmId> <command>

There is no need to take vdsm down. The VM would just show up as "nonresponsive" while it is being debugged with gdb.
Comment 37 Zachary Amsden 2010-12-02 15:36:44 EST
Unfortunately, given how recent their kvm package is, there was no relevant difference between that and current RHEL 5 code.  If there is a bug, this means it is probably still in RHEL 5.

I have had difficulty formulating a good theory for this bug.  Brainstorming has come up with these possibilities:

1) Failure to register KVM clock page after migration (qemu-kvm)
2) Missing KVM clock data in migration protocol (qemu-kvm)
3) Erroneous use of tsc_khz variable in guest kernel (kernel, guest)
4) Broken lost tick accounting (kernel, guest)
5) Failure to update KVM clock page (kvm module)

None of these seem spectacularly likely, as 1, 2, and 5 should throw the clock completely off (TSC of remote migration host has different base), and 3, 4 would represent a bug which still exists in RHEL5, AFAICT.

I'm still curious and very puzzled about this line in the original report; perhaps it can shed some light.

>time.c: can't update CMOS clock from 0 to 58
Comment 38 Zachary Amsden 2010-12-02 15:53:40 EST
Um, what is this?  This looks 100% wrong.

        if (use_kvm_time) {
                timename = "KVM";
                /* no need to get frequency here, since we'll skip the calibrate loop anyway */
                timekeeping_use_tsc = 1;
Comment 41 Zachary Amsden 2010-12-02 16:42:20 EST
I considered that, but I don't have a RHEL5 host which can change TSC frequency, my RHEL5 host is fixed frequency.  I also had enough headaches creating a dual boot RHEL5 / 6 configuration (this is where SELinux loses, badly) that I don't really want to lose an entire day reformatting, partitioning, and reconfiguring my laptop.

Based on Glauber's and Douglas' results, it looks like this is a qemu-kvm bug (and one already known and fixed).

My earlier avenues of exploration have turned out to be dead ends; it looks like cycles_per_tick and account_tsc_cycles are just badly named, and actually do their accounting in nanoseconds.

                if (use_kvm_time) /* KVM time is already in nanoseconds units */
                        cycles_per_tick = 1000000000 / REAL_HZ;
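The point of the snippet is that, with kvm time, the tick length is expressed in nanoseconds and so does not depend on the TSC frequency. A minimal Python sketch of that arithmetic follows; REAL_HZ = 1000 is an assumption (the thread does not state its value), and `ticks_elapsed` is a hypothetical name, not a kernel function.

```python
NSEC_PER_SEC = 1_000_000_000

def ticks_elapsed(elapsed_ns, real_hz=1000):
    """Split an elapsed time in ns into whole ticks and a leftover in ns.

    real_hz=1000 is an assumed value for REAL_HZ.
    """
    ns_per_tick = NSEC_PER_SEC // real_hz   # the "cycles_per_tick" of the snippet
    return divmod(elapsed_ns, ns_per_tick)

# 2.5 ms of elapsed time is two whole ticks plus 0.5 ms carried forward,
# regardless of how fast the host TSC runs.
print(ticks_elapsed(2_500_000))
```

Because nothing here is measured in TSC cycles, this path would not by itself explain frequency-dependent drift, which matches the conclusion in the comment.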
Comment 42 Dor Laor 2010-12-02 17:29:16 EST
(In reply to comment #41)
> I considered that, but I don't have a RHEL5 host which can change TSC
> frequency, my RHEL5 host is fixed frequency.  I also had enough headaches

I was under the impression that cpuspeed service can change it.
Comment 43 Glauber Costa 2010-12-02 18:42:29 EST
Dor, frequency changes have nothing to do with it.
The problem here is related to save/load functions.

I am expecting to post a patch soon.
Comment 45 Zachary Amsden 2010-12-02 20:54:50 EST
(In reply to comment #42)
> (In reply to comment #41)
> > I considered that, but I don't have a RHEL5 host which can change TSC
> > frequency, my RHEL5 host is fixed frequency.  I also had enough headaches
> 
> I was under the impression that cpuspeed service can change it.

Only if the host CPU supports that.  AMD PowerNow! CPUs change TSC frequency with CPU frequency, but most Intel SpeedStep capable CPUs and all Pentium-4 based CPUs do not change TSC frequency with CPU frequency.

As usual, TSC complicates everything by always adding exceptions to every rule.
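One rough way to tell which case a host falls into is to look at the CPU flags. The Python sketch below parses text in /proc/cpuinfo format and reports whether the `constant_tsc` flag is present, which Linux kernels use to mark a TSC that ticks at a fixed rate regardless of CPU frequency. Whether a given RHEL 5 kernel exposes this flag is not confirmed in this thread, and the sample text is hypothetical.

```python
def tsc_flags(cpuinfo_text):
    """Return TSC-related flags from the first 'flags' line, or None if absent."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            return {
                "tsc": "tsc" in flags,
                "constant_tsc": "constant_tsc" in flags,
            }
    return None

# Hypothetical excerpt; on a real host, read the text from /proc/cpuinfo.
sample = """\
model name      : Example CPU
flags           : fpu vme de pse tsc msr pae constant_tsc pni
"""

print(tsc_flags(sample))
```

A host without `constant_tsc` is a candidate for the "TSC changes with CPU frequency" behaviour described above, though the flag alone does not prove it.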
Comment 52 Mike Cao 2010-12-06 05:27:32 EST
Reproduced on kvm-83-219.el5

Steps:
1. Start the VM on the source host:
   /usr/libexec/qemu-kvm -no-hpet -no-kvm-pit-reinjection -usbdevice tablet -rtc-td-hack -startdate now -name RHEL5_5_64bit -smp 1,cores=1 -k en-us -m 512 -boot cd -net nic,vlan=1,macaddr=00:1a:4a:0a:39:12,model=virtio -net tap,vlan=1,ifname=virtio_11_1,script=/etc/qemu-ifup -drive file=/mnt/RHEL5u5_32.qcow,media=disk,if=virtio,cache=off,serial=cb-b009-0b8e478a4c9d,boot=on,format=qcow2,werror=stop -soundhw ac97 -vnc :10 -cpu qemu64,-nx,+sse2 -M rhel5.5.0 -notify all -balloon none -smbios type=1,manufacturer=RedHat,version=5.5-2.2-7.3,serial=2665D320-17FF-35C3-B99C-6242C3C98CF8_00:1a:64:21:75:2a,uuid=e8e261ba-96c2-4393-962c-97b84625be6d -monitor stdio
2. Start the destination listening for the incoming migration: <commandLine> -incoming tcp:0:6888
3. Perform the live migration.

Actual Results:
# ntpdate -q clock.redhat.com
server 66.187.233.4, stratum 1, offset -58.286086, delay 0.28876
 6 Dec 18:27:21 ntpdate[3406]: step time server 66.187.233.4 offset -58.286086 sec
# ntpdate -q clock.redhat.com
server 66.187.233.4, stratum 1, offset -58.414945, delay 0.29927
 6 Dec 18:27:24 ntpdate[3407]: step time server 66.187.233.4 offset -58.414945 sec
# ntpdate -q clock.redhat.com
server 66.187.233.4, stratum 1, offset -58.566882, delay 0.29752
 6 Dec 18:27:26 ntpdate[3408]: step time server 66.187.233.4 offset -58.566882 sec
# ntpdate -q clock.redhat.com
server 66.187.233.4, stratum 1, offset -58.920349, delay 0.29062
 6 Dec 18:27:32 ntpdate[3409]: step time server 66.187.233.4 offset -58.920349 sec
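The four ntpdate samples above are enough to estimate the drift rate. The following Python sketch parses the "step time server" lines and divides the change in offset by the elapsed guest time. The regular expression and the function name `drift_rate` are illustrative assumptions, the samples are assumed to fall within one day, and the timestamps have only one-second resolution, so the result is a coarse estimate.

```python
import re
from datetime import datetime

STEP_LINE = re.compile(
    r"(?P<clock>\d{2}:\d{2}:\d{2}) ntpdate\[\d+\]: "
    r"step time server \S+ offset (?P<offset>-?\d+\.\d+)\s+sec"
)

def drift_rate(log_text):
    """Return the offset change per second between the first and last sample."""
    samples = [
        (datetime.strptime(m["clock"], "%H:%M:%S"), float(m["offset"]))
        for m in STEP_LINE.finditer(log_text)
    ]
    if len(samples) < 2:
        raise ValueError("need at least two ntpdate samples")
    (t0, off0), (t1, off1) = samples[0], samples[-1]
    return (off1 - off0) / (t1 - t0).total_seconds()

log = """\
 6 Dec 18:27:21 ntpdate[3406]: step time server 66.187.233.4 offset -58.286086 sec
 6 Dec 18:27:24 ntpdate[3407]: step time server 66.187.233.4 offset -58.414945 sec
 6 Dec 18:27:26 ntpdate[3408]: step time server 66.187.233.4 offset -58.566882 sec
 6 Dec 18:27:32 ntpdate[3409]: step time server 66.187.233.4 offset -58.920349 sec
"""

rate = drift_rate(log)
print(f"offset changes by {rate:.3f} s per second")  # about -0.058 s per second
```

For this reproduction the offset grows more negative by roughly 58 ms for every second of guest time, several orders of magnitude more than ntpd can correct by slewing.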



Waiting for new patch to verify it.
Comment 65 Mike Cao 2010-12-07 23:59:34 EST
Verified this issue on kvm-83-219.el5 on Intel hosts.

Repeated steps in comment #52.

Actual Results:
No clock drift occurs.
This issue has been fixed on Intel hosts.
Comment 66 Mike Cao 2010-12-09 01:10:17 EST
Verified this issue on kvm-83-221.el5 on AMD hosts.

Repeated steps in comment #52.

Actual Results:
No clock drift occurs.

Based on the above, this issue has been fixed.
Comment 69 errata-xmlrpc 2011-01-13 18:38:26 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0028.html
Comment 70 Glauber Costa 2011-01-18 04:56:30 EST
*** Bug 645308 has been marked as a duplicate of this bug. ***
Comment 71 Glauber Costa 2011-01-18 05:04:53 EST
*** Bug 633693 has been marked as a duplicate of this bug. ***
Comment 72 dmartin 2011-07-05 18:26:20 EDT
I am still seeing this behavior on RHEL 5.6, kernel 2.6.18-238.9.1 and KVM 224. 

time.c: Using 1.193182 MHz WALL KVM GTOD KVM timer.

model name      : Intel(R) Xeon(R) CPU           X5675  @ 3.07GHz
stepping        : 2
cpu MHz         : 3059.095

model name      : Intel(R) Xeon(R) CPU           X5680  @ 3.33GHz
stepping        : 2
cpu MHz         : 3325.151

/usr/libexec/qemu-kvm -S -M rhel5.4.0 -m 8192 -smp 4,sockets=4,cores=1,threads=1 -name vcorp200l -uuid 97c7c55f-9c0a-ed45-6aea-cfbe2fb41099 -monitor unix:/var/lib/libvirt/qemu/vcorp200l.monitor,server,nowait -no-kvm-pit-reinjection -boot c -drive file=/dev/iscsi_linux/vcorp200l-0,if=virtio,boot=on,format=raw,cache=none -net nic,macaddr=54:52:00:7a:12:00,vlan=0,model=virtio -net tap,fd=17,vlan=0 -net nic,macaddr=52:54:00:8d:73:76,vlan=1,model=virtio -net tap,fd=21,vlan=1 -serial pty -parallel none -usb -vnc 127.0.0.1:3 -k en-us -vga cirrus -incoming tcp:0.0.0.0:49157 -balloon virtio
Comment 73 Mike Cao 2011-07-05 22:25:00 EDT
(In reply to comment #72)
> I am still seeing this behavior on RHEL 5.6, kernel 2.6.18-238.9.1 and KVM 224. 
> 
> time.c: Using 1.193182 MHz WALL KVM GTOD KVM timer.
> 
Hello dmartin

From your comment, the clocksource of the guest is kvm_clock, while this bug fixes the issue when the guest's clocksource is TSC.

Could you try again with -M rhel5.6.0 instead of -M rhel5.4.0? If you can still reproduce it, I would prefer that you report a new bug.

Best Regards,
Mike
Comment 74 dmartin 2011-07-05 23:25:13 EDT
(In reply to comment #73)
> Could you try again with -M rhel5.6.0 instead of -M rhel5.4.0? If you can still
> reproduce it, I would prefer that you report a new bug.

-M 5.6.0 seemed to fix the problem, thank you.  Is there any documentation on the differences between the options (rhel5.4.0, 5.4.4, 5.5.0, 5.6.0, etc.)?  I cannot find anything on this subject.
Comment 75 Dor Laor 2011-07-07 08:33:23 EDT
For documentation of the machine type (virtual hardware compatibility) we use the release notes. For some live migration fixes, such as the one for kvmclock, we can't fix the general-purpose live migration protocol in place and must bump the version using the machine type.
