Bug 1264258

Summary: Guest's time doesn't stop with option clock=vm when guest is paused
Product: Red Hat Enterprise Linux 7
Component: qemu-kvm-rhev
Version: 7.2
Hardware: ppc64le
OS: Unspecified
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Reporter: Xujun Ma <xuma>
Assignee: Laurent Vivier <lvivier>
QA Contact: Xujun Ma <xuma>
CC: dgibson, huding, juzhang, knoel, lvivier, michen, mrezanin, mtosatti, ngu, qzhang, virt-maint, xfu, xuma, zhengtli
Fixed In Version: qemu-kvm-rhev-2.8.0-4.el7
Last Closed: 2017-08-01 23:29:42 UTC
Type: Bug

Description Xujun Ma 2015-09-18 03:24:44 UTC
Description of problem:
Guest's time doesn't stop with option clock=vm when guest is paused

Version-Release number of selected component (if applicable):
host:
kernel-3.10.0-313.el7.ppc64le
qemu-kvm-rhev-2.3.0-23.el7.ppc64le
SLOF-20150313-4.gitc89b0df.el7.noarch
guest:
kernel-3.10.0-313.el7.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. Boot up a guest with option clock=vm:
drive_path=img
iso_path=iso/RHEL-7.2-20150904.0-Server-ppc64le-dvd1.iso
/usr/libexec/qemu-kvm \
 -name xuma-test \
 -smp 4 \
 -m 1024 \
 -monitor stdio \
 -rtc base=utc,clock=vm \
 -no-shutdown \
 -boot strict=on \
 -vnc 0:99 \
 -qmp tcp:0:9999,server,nowait \
 -usbdevice tablet \
\
 -device virtio-scsi-pci,bus=pci.0,addr=0x5 \
\
 -device scsi-hd,id=scsi-hd0,drive=scsi-hd0-dr,bootindex=0 \
 -drive file=minimal.qcow2,if=none,id=scsi-hd0-dr,format=qcow2,cache=none \
\
 -device scsi-hd,id=scsi-hd1,drive=scsi-hd1-dr,bootindex=2 \
 -drive file=share.qcow2,if=none,id=scsi-hd1-dr,format=qcow2,cache=none \
\
 -device scsi-cd,id=scsi-cd1,drive=scsi-cd1-dr,bootindex=1 \
 -drive file=$iso_path,if=none,id=scsi-cd1-dr,readonly=on,format=raw,cache=none \
\
 -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c4:e7:88 \
 -netdev tap,id=net0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on \
\
 -device virtio-serial,id=virtio-serial0 \
 -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 \
 -device virtserialport,bus=virtio-serial0.0,chardev=qga0,id=qemu-ga0,name=org.qemu.guest_agent.0 \

2. Check the time in the guest:
#date

3. Stop the guest immediately in HMP, then wait a minute:
(qemu)stop

4. Continue the guest in HMP, then show the time in the guest:
(qemu)c
#date

Actual results:
Guest's time doesn't stop when the guest is paused.

Expected results:
Guest's time should stop when the guest is paused.

Additional info:
The issue doesn't appear on the x86 platform.
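
The comparison made in steps 2 to 4 can be sketched as a small check. This is only an illustration: the function name, its parameters and the 2-second tolerance are assumptions, not part of any test suite. It takes the two guest `date` readings, the host time that elapsed between them, and the length of the pause, and reports whether the guest clock lost roughly the pause duration.

```python
from datetime import datetime


def clock_stopped_during_pause(guest_before, guest_after,
                               host_elapsed_s, pause_s, tolerance_s=2.0):
    """Return True if the guest clock lost about `pause_s` seconds.

    guest_before / guest_after: guest `date` readings (datetime objects)
    host_elapsed_s: host wall-clock seconds between the two readings
    pause_s: how long the guest stayed stopped in the monitor
    tolerance_s: hypothetical slack for typing and scheduling delays
    """
    guest_elapsed_s = (guest_after - guest_before).total_seconds()
    # Time that passed on the host but not in the guest.
    lost_s = host_elapsed_s - guest_elapsed_s
    return abs(lost_s - pause_s) <= tolerance_s


if __name__ == "__main__":
    before = datetime(2015, 9, 18, 3, 0, 0)
    # Expected behaviour: 65 s on the host, only 5 s in the guest.
    print(clock_stopped_during_pause(before, datetime(2015, 9, 18, 3, 0, 5), 65, 60))
    # Reported behaviour: the guest advanced by the full 65 s.
    print(clock_stopped_during_pause(before, datetime(2015, 9, 18, 3, 1, 5), 65, 60))
```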

Comment 2 David Gibson 2015-09-18 03:45:12 UTC
Deferring to 7.3

Comment 3 Gu Nini 2015-09-28 10:27:11 UTC
In my test with the following host/guest software versions, the guest hwclock really does stop, while the guest system time does not stop, as described in this bug.

Host kernel: 3.10.0-316.el7.ppc64le
Guest kernel: 3.10.0-316.el7.ppc64
Qemu-kvm-rhev: qemu-kvm-rhev-2.3.0-25.el7.ppc64le

Comment 4 Laurent Vivier 2015-09-28 11:04:17 UTC
Just a note:

This happens because date seems to use the CPU Time Base Register which is not stopped (it is copied directly from the host).

hwclock works correctly, as it uses the RTAS call "get-time-of-day". "date" on a TCG mode guest works too as the TBR is emulated.

Comment 5 David Gibson 2015-09-29 04:22:32 UTC
Ah, that's a good point.  Technically the qemu option only affects the RTC which is already behaving correctly.

So, the question is: is there a related option that will affect runtime time sources like the timebase, and if so, how do we implement it?  I think we already have some support for virtualizing the timebase in the kernel (to handle migration and some other edge cases).

Comment 6 David Gibson 2015-11-05 03:40:00 UTC
Xujun Ma,

Can you confirm that on x86 the guest time _does_ stop when the guest is paused?  Does this change when clock=host is supplied?

Comment 7 Xujun Ma 2015-11-06 07:37:54 UTC
(In reply to David Gibson from comment #6)
> Xujun Ma,
> 
> Can you confirm that on x86 the guest time _does_ stop when the guest is
> paused?  
The time of the x86 guest stops when the guest is paused with the option "-rtc base=utc,clock=vm".

> Does this change when clock=host is supplied?
The time of the x86 guest also stops when the guest is paused with the option "-rtc base=utc,clock=host".

Comment 8 David Gibson 2015-11-08 23:44:35 UTC
Xujun Ma,

Thanks for the clarifications.

 * The -rtc options don't make a difference here - that makes sense, since we're discussing the system time (derived from timebase) rather than the real time clock.

Our behaviour is different from x86, but TBH, I think the ppc64 behaviour is more correct - system time continues to track wall-clock time, even when the guest is paused.

Unless someone has a compelling reason that stopping the clock during pause is a good idea, I'm inclined to close this as NOTABUG.

Comment 9 Qunfang Zhang 2015-11-11 05:53:00 UTC
(In reply to David Gibson from comment #8)
> Xujun Ma,
> 
> Thanks for the clarifications.
> 
>  * The -rtc options don't make a difference here - that makes sense, since
> we're discussing the system time (derived from timebase) rather than the
> real time clock.
> 
> Our behaviour is different from x86, but TBH, I think the ppc64 behaviour is
> more correct - system time continues to track wall-clock time, even when the
> guest is paused.
> 
> Unless someone has a compelling reason that stopping the clock during pause
> is a good idea, I'm inclined to close this as NOTABUG.

Based on the above comment, I plan to change the hardware to x86_64 and ask a developer on the x86 side to have a look, since the current behaviour differs between the two platforms.

Comment 10 David Gibson 2015-11-12 05:41:48 UTC
Qunfang,

The system timekeeping mechanisms are different between x86 and ppc64, so it may not be feasible to have the same behaviour here.  Still, it's worth passing over to the x86 people for an assessment.  I'm reassigning to default to get their attention.

Comment 11 Qunfang Zhang 2015-11-12 07:57:19 UTC
(In reply to David Gibson from comment #10)
> Qunfang,
> 
> The system timekeeping mechanisms are different between x86 and ppc64, so it
> may not be feasible to have the same behaviour here.  Still, it's worth
> passing over to the x86 people for an assessment.  I'm reassigning to
> default to get their attention.

David,

I agree with you. We will wait for an x86 developer to have a look and confirm before we close it.

Comment 12 Marcelo Tosatti 2016-06-29 02:30:48 UTC
(In reply to David Gibson from comment #8)
> Xujun Ma,
> 
> Thanks for the clarifications.
> 
>  * The -rtc options don't make a difference here - that makes sense, since
> we're discussing the system time (derived from timebase) rather than the
> real time clock.
> 
> Our behaviour is different from x86, but TBH, I think the ppc64 behaviour is
> more correct - system time continues to track wall-clock time, even when the
> guest is paused.
> 
> Unless someone has a compelling reason that stopping the clock during pause
> is a good idea, I'm inclined to close this as NOTABUG.

https://patchwork.ozlabs.org/patch/252455/

kvmclock should not count while vm is paused, because:

1) if the vm is paused for long periods, timekeeping
math can overflow while converting the (large) clocksource
delta to nanoseconds.

2) Users rely on CLOCK_MONOTONIC to count run time, that is,
time which OS has been in a runnable state (see CLOCK_BOOTTIME).

Change kvmclock driver so as to save clock value when vm transitions
from runnable to stopped state, and to restore clock value from stopped
to runnable transition.

--------------


So if PowerPC timekeeping code can overflow after reading a large delta 
from a clocksource, then counting of the clock should stop.
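
To make point 1) concrete, here is a minimal sketch of how a multiply-and-shift conversion from clocksource cycles to nanoseconds can wrap when the delta is large. The 512 MHz frequency and the shift of 24 are illustrative assumptions, not values read from any kernel, and the sketch does not show that the PowerPC timekeeping code actually overflows.

```python
MASK64 = (1 << 64) - 1

# Illustrative assumptions only, not taken from a real kernel configuration.
TB_FREQ_HZ = 512_000_000                        # assumed clocksource frequency
SHIFT = 24                                      # assumed shift
MULT = (1_000_000_000 << SHIFT) // TB_FREQ_HZ   # ns per cycle, scaled by 2**SHIFT


def cyc2ns_u64(cycles):
    """Convert cycles to ns with the product truncated to 64 bits."""
    return ((cycles * MULT) & MASK64) >> SHIFT


def cyc2ns_exact(cycles):
    """Same conversion with unbounded integers, for comparison."""
    return (cycles * MULT) >> SHIFT


# Largest delta whose product still fits in 64 bits.
MAX_SAFE_CYCLES = MASK64 // MULT

if __name__ == "__main__":
    one_second = TB_FREQ_HZ
    one_hour = 3600 * TB_FREQ_HZ
    print(cyc2ns_u64(one_second), cyc2ns_exact(one_second))
    print(cyc2ns_u64(one_hour), cyc2ns_exact(one_hour))
    print(MAX_SAFE_CYCLES // TB_FREQ_HZ, "seconds before the product wraps")
```

With these assumed values a one-second delta converts correctly, while a one-hour delta (for example, a guest paused for an hour with the clock still counting) no longer fits in 64 bits and yields a wrong result.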

Comment 13 Laurent Vivier 2016-08-30 13:53:39 UTC
Deferring to 7.4

Comment 14 Laurent Vivier 2017-01-06 14:16:27 UTC
David, do you think we need to implement on POWER the solution described in comment #12?

Comment 15 David Gibson 2017-01-12 01:09:27 UTC
Laurent,

I discussed this with Paulus today.  We're not sure if (1) applies for powerpc, but (2) would seem to apply universally.

So, yes, I think we do need a fix similar to the kvmclock one for Power.  Specifically, after the pause, we'll need to recalculate the correct offset between guest and host timebase to result in ~0 time change in the guest.  That offset can then be set into KVM with the SET_ONE_REG interface.

We should be able to adapt the code used to set the TB offset on an incoming migration for this purpose.

What I'm less clear on is how we distinguish between a "real", user requested pause that could last indefinitely and a pause due to other activity (e.g. migration downtime, or something slow that needs to be processed in qemu).  The latter should not pause the clock, obviously, or the guest clock will drift unpredictably.  I'm hoping the kvmclock patch Marcelo points to will provide some clues for this.
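
The offset recalculation described in this comment can be sketched as follows. This is a simplified model under stated assumptions, not the QEMU implementation: the class and method names are hypothetical, the guest timebase is modelled as host timebase plus an offset, and the step that would hand the new offset to KVM (the SET_ONE_REG interface mentioned above) is reduced to a plain attribute assignment. It also does not address how to tell a user-requested pause from other kinds of stop.

```python
MASK64 = (1 << 64) - 1


class GuestTimebase:
    """Hypothetical model: guest timebase = host timebase + offset."""

    def __init__(self, offset=0):
        self.offset = offset
        self._saved_guest_tb = None

    def read(self, host_tb):
        """Guest-visible timebase for a given host timebase value."""
        return (host_tb + self.offset) & MASK64

    def on_stop(self, host_tb):
        """Remember the guest timebase at the moment the VM stops."""
        self._saved_guest_tb = self.read(host_tb)

    def on_resume(self, host_tb):
        """Pick a new offset so the guest sees ~0 change across the pause."""
        if self._saved_guest_tb is None:
            return
        # In a real VMM this value would be pushed to the hypervisor;
        # here it is only stored on the object.
        self.offset = self._saved_guest_tb - host_tb
        self._saved_guest_tb = None


if __name__ == "__main__":
    tb = GuestTimebase(offset=1000)
    tb.on_stop(5_000_000)
    resume_host_tb = 5_000_000 + 60 * 512_000_000  # assumed 512 MHz, 60 s pause
    tb.on_resume(resume_host_tb)
    print(tb.read(resume_host_tb))  # same guest value as at stop time
```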

Comment 16 David Gibson 2017-01-12 01:12:32 UTC
Btw, best be careful that the behaviour remains consistent between KVM and TCG guests.

For KVM we need to explicitly update a KVM pseudo-register which controls the delta between guest and host timebase (the delta is applied to the real CPU timebase register at guest entry/exit).  For TCG, IIRC, the timebase value is computed at mftb() time.  So for that we'd need to make sure it's computed according to the VM-relative clock rather than the host-relative clock.
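
For the TCG side, a minimal sketch of deriving the timebase from a VM-relative clock rather than the host clock. All names here are hypothetical and the 512 MHz frequency is an assumption; the only point is that a clock which accumulates time solely while the VM runs makes the computed timebase stand still across a pause.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


class VmClock:
    """Hypothetical clock that only advances while the VM is running."""

    def __init__(self):
        self._elapsed_ns = 0
        self._running_since = None  # host ns at last start, or None if stopped

    def start(self, host_ns):
        if self._running_since is None:
            self._running_since = host_ns

    def stop(self, host_ns):
        if self._running_since is not None:
            self._elapsed_ns += host_ns - self._running_since
            self._running_since = None

    def now(self, host_ns):
        if self._running_since is None:
            return self._elapsed_ns
        return self._elapsed_ns + (host_ns - self._running_since)


def mftb(vm_clock_ns, tb_freq_hz=512_000_000):
    """Timebase ticks computed from VM-relative ns (frequency is assumed)."""
    return vm_clock_ns * tb_freq_hz // NANOSECONDS_PER_SECOND


if __name__ == "__main__":
    clock = VmClock()
    clock.start(0)
    clock.stop(1 * NANOSECONDS_PER_SECOND)     # ran for 1 s
    clock.start(61 * NANOSECONDS_PER_SECOND)   # paused for 60 s
    print(mftb(clock.now(62 * NANOSECONDS_PER_SECOND)))  # 2 s worth of ticks
```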

Comment 17 Laurent Vivier 2017-01-31 13:53:12 UTC
I've sent a patch upstream, trying to port the i386 patch (from comment #12):
http://patchwork.ozlabs.org/patch/720656/

Comment 18 Miroslav Rezanina 2017-02-10 13:56:25 UTC
Fix included in qemu-kvm-rhev-2.8.0-4.el7

Comment 20 Xujun Ma 2017-04-24 02:15:05 UTC
Reproduced this issue with the old build:
qemu-kvm-rhev-2.3.0-23.el7.ppc64le
guest:3.10.0-655.el7.test.ppc64le
host:3.10.0-653.el7.ppc64le


Steps to Reproduce:
1. Boot up a guest with option clock=vm:
 -smp 8,sockets=1,cores=2,threads=4 \
 -m 8192 \
 -monitor stdio \
 -rtc base=utc,clock=vm \
 -serial unix:serial,server,nowait \
 -nodefaults \
 -boot menu=on \
 -vnc :0 \
 -vga virtio \
 -device virtio-scsi-pci,bus=pci.0,addr=0x5 \
 -device scsi-hd,id=scsi-hd0,drive=scsi-hd-dr0,bootindex=0 \
 -drive file=RHEL-7.4.qcow2,if=none,id=scsi-hd-dr0,format=qcow2,cache=none \
 -device scsi-cd,id=scsi-cd0,drive=scsi-cd-dr0,bootindex=1 \
 -drive file=RHEL-7.4-20170330.1-Server-ppc64le-dvd1.iso,if=none,id=scsi-cd-dr0,readonly=on,format=raw,cache=none \
 -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c4:e7:84 \
 -netdev tap,id=net0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on \
2. Check the time in the guest:
#date

3. Stop the guest immediately in HMP, then wait a minute:
(qemu)stop

4. Continue the guest in HMP, then show the time in the guest:
(qemu)c
#date

Actual results:
Guest's time doesn't stop when the guest is paused.

Verified this issue with the latest build:
qemu-kvm-rhev-2.9.0-1.el7.ppc64le
guest:3.10.0-655.el7.test.ppc64le
host:3.10.0-653.el7.ppc64le
Steps to verify:
The same steps as above.

Actual results:
Guest's time stops when the guest is paused.
Based on the results above, the bug has been fixed.

Comment 22 errata-xmlrpc 2017-08-01 23:29:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392
