Bug 821988

Summary: Saved VMs have wrong time on restore
Product: Red Hat Enterprise Linux 6
Component: libvirt
Version: 6.2
Hardware: x86_64
OS: All
Status: CLOSED CANTFIX
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Reporter: German Pulido <g-pulido>
Assignee: Jiri Denemark <jdenemar>
QA Contact: Virtualization Bugs <virt-bugs>
CC: acathrow, dallan, dyasny, dyuan, eblake, g-pulido, michen, mzhan, rabbit+bugs, rwu, toracat, wnefal+redhatbugzilla
Doc Type: Bug Fix
Type: Bug
Last Closed: 2012-06-01 12:33:38 UTC

Description German Pulido 2012-05-16 02:51:28 UTC
If a virtual machine running under KVM on a CentOS 6.2 host is saved and then restored a few hours later, the time in the VM does not sync with real time; it continues from the moment the VM was saved. The VM uses kvm-clock as its time source (recommended according to the docs I've found). Even NTP does not help, because the time shift is too big for NTP to correct.

Reported to CentOS bug tracker: http://bugs.centos.org/view.php?id=5726

Comment 3 Jiri Denemark 2012-06-01 12:32:51 UTC
Unfortunately, we can't do much about it because nothing happened from the guest OS' point of view. It has no idea it was suspended and resumed and we have no way of telling this to the guest OS. Therefore I'm closing the bug as CANTFIX.

However, it might be something that qemu guest agent could help with.

Comment 4 Peter Rabbitson 2012-06-19 12:23:26 UTC
This cannot be right. If there were "no way of telling" things to the guest OS, then a mere pause would desync the clock as well. However, that does not happen. How come the same mechanism that resets the clock after a suspend cannot be employed to reset the clock after a managed save? It seems like a genuine bug to me.

Consider:

rabbit@Dungeon:~$ clockdiff 192.168.58.165
.....
host=192.168.58.165 rtt=237(295)ms/0ms delta=-1ms/-1ms Tue Jun 19 08:16:36 2012

rabbit@Dungeon:~$ si virsh suspend ws_lin_dev
Domain ws_lin_dev suspended

rabbit@Dungeon:~$ clockdiff 192.168.58.165
192.168.58.165 is down

rabbit@Dungeon:~$ si virsh resume ws_lin_dev
Domain ws_lin_dev resumed

rabbit@Dungeon:~$ clockdiff 192.168.58.165
..
host=192.168.58.165 rtt=562(280)ms/0ms delta=-1ms/-1ms Tue Jun 19 08:17:52 2012

rabbit@Dungeon:~$ si virsh managedsave ws_lin_dev
Domain ws_lin_dev state saved by libvirt

rabbit@Dungeon:~$ clockdiff 192.168.58.165
192.168.58.165 is down

rabbit@Dungeon:~$ si virsh start ws_lin_dev
Domain ws_lin_dev started

rabbit@Dungeon:~$ clockdiff 192.168.58.165
..
host=192.168.58.165 rtt=562(280)ms/0ms delta=-57417ms/-57417ms Tue Jun 19 08:19:16 2012
# ^^ OUCH - 57 secs

rabbit@Dungeon:~$ si virsh suspend ws_lin_dev
Domain ws_lin_dev suspended

rabbit@Dungeon:~$ sleep 20
# sleep a while while paused

rabbit@Dungeon:~$ si virsh resume ws_lin_dev
Domain ws_lin_dev resumed

rabbit@Dungeon:~$ clockdiff 192.168.58.165
.
host=192.168.58.165 rtt=750(187)ms/0ms delta=-57417ms/-57417ms Tue Jun 19 08:20:00 2012
# difference exact down to the millisecond after 20 secs of paused sleep
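
The comparison above can also be scripted rather than read off by eye. The following is only a minimal sketch, assuming clockdiff prints lines in the format shown in this transcript; the function name clockdiff_delta_ms is hypothetical and not part of clockdiff or libvirt.

```shell
# Hypothetical helper: read clockdiff output on stdin and print the first
# delta value in milliseconds (the number right after "delta=").
# Progress lines consisting only of dots do not match and are skipped.
clockdiff_delta_ms() {
    sed -n 's/.*delta=\(-\{0,1\}[0-9]\{1,\}\)ms.*/\1/p'
}

# Example input: the line captured after "virsh start" in the transcript.
line='host=192.168.58.165 rtt=562(280)ms/0ms delta=-57417ms/-57417ms Tue Jun 19 08:19:16 2012'
printf '%s\n' "$line" | clockdiff_delta_ms
```

Run against a live guest, this would be used as `clockdiff 192.168.58.165 | clockdiff_delta_ms`, once before and once after each suspend/resume or managedsave/start cycle, to compare the two values.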

Comment 5 Peter Rabbitson 2012-06-19 13:27:25 UTC
Investigating further I found this warnocked patch, which looks like exactly the thing in question. Please reopen this ticket so that there is a placeholder of when this fix will land properly.

http://www.mail-archive.com/kvm@vger.kernel.org/msg67881.html

Comment 6 Akemi Yagi 2012-07-31 21:26:33 UTC
(In reply to comment #5)
> Investigating further I found this warnocked patch, which looks like exactly
> the thing in question. Please reopen this ticket so that there is a
> placeholder of when this fix will land properly.
> 
> http://www.mail-archive.com/kvm@vger.kernel.org/msg67881.html

(For the record)
Looks like that patch (by Marcelo Tosatti) appeared in bugzilla #694801 and has been applied to the RHEL 6.3 GA kernel.

Comment 7 Eric Blake 2014-06-18 12:51:33 UTC
This is still a frequently raised issue on upstream lists. qemu 2.0 made it possible for the guest agent to run guest-set-time without arguments, which makes the guest re-read the hardware clock and adjust its software time from that. qemu 2.1 is adding rtc-reset-reinjection to tell qemu that the agent has been used to force guest time, so qemu no longer needs to slew clock interrupts. Libvirt 1.2.5 added a virDomainSetTime API to trigger the guest agent command, and future libvirt versions may add a flag to that API to also trigger the rtc-reset-reinjection follow-up. There is also an idea of adding an <on_resume> action to libvirt's domain XML so that libvirt can automatically trigger a time reset on resume. It cannot be on by default, because it requires guest interaction, which is only safe if the guest is trusted, but management could explicitly enable it as needed. Of course, as this is still under active work upstream, there is no telling how soon it can be backported downstream, or even whether such a backport is feasible without a rebase.
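
For anyone who wants to try the guest-agent route by hand, here is a rough sketch of what a restore followed by a time resync might look like from the host. It assumes a libvirt new enough to ship the virsh domtime command (the command-line counterpart of virDomainSetTime), a guest with qemu-guest-agent installed and running, and a guest agent channel configured in the domain XML. The VIRSH variable and the start_and_sync_time function are hypothetical names made up for this example.

```shell
# VIRSH is a hypothetical override so the commands can be previewed with
# VIRSH=echo; by default the real virsh binary is used.
VIRSH="${VIRSH:-virsh}"

# Hypothetical helper: restore a managed-saved domain, then ask the guest
# agent to re-sync guest time from the guest's RTC ("domtime --sync").
# Assumes qemu-guest-agent is running in the guest; if the agent is not
# reachable, the domtime call fails and its exit status is returned.
start_and_sync_time() {
    domain="$1"
    "$VIRSH" start "$domain" || return 1
    "$VIRSH" domtime "$domain" --sync
}
```

With the domain from comment 4 this would be invoked as `start_and_sync_time ws_lin_dev`. The agent may need a moment to respond after a restore, so a real script would probably retry the domtime step a few times; that is left out here to keep the sketch short.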