Bug 1391942 - kvmclock: advance clock by time window between vm_stop and pre_save (backport patch)
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Assignee: Marcelo Tosatti
QA Contact: Sitong Liu
URL:
Whiteboard:
Keywords:
Depends On: 1415766
Blocks: 1230126
 
Reported: 2016-11-04 12:46 UTC by Marcelo Tosatti
Modified: 2017-08-02 03:32 UTC (History)
Clone Of:
: 1413599 (view as bug list)
Last Closed: 2017-08-01 23:37:14 UTC




External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2017:2392
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: qemu-kvm-rhev security, bug fix, and enhancement update
Last Updated: 2017-08-01 20:04:36 UTC

Description Marcelo Tosatti 2016-11-04 12:46:16 UTC
Backport $subject to qemu-kvm. Opening a new bug because the bug that the patches fix:

Bug 1230126 - qemu-kvm-rhev: guest time falls behind wall time due to latencies during migration

also generated another feature from OpenStack.
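
A toy model of the problem named in the subject may help readers who are new to it. This is only a sketch under stated assumptions, not the QEMU code: `ToyHost`, `saved_clock` and the 0.5 s window are hypothetical names and values chosen for illustration. It assumes the clock value is sampled when the VM is stopped, and shows that sampling it again at save time includes the window between vm_stop and pre_save.

```python
from dataclasses import dataclass


@dataclass
class ToyHost:
    """Stand-in for the host; now_ns is a monotonically increasing clock."""
    now_ns: int = 0

    def advance(self, ns: int) -> None:
        self.now_ns += ns


def saved_clock(host: ToyHost, stop_to_save_ns: int, resample_at_save: bool) -> int:
    """Return the clock value that would be written into the migration stream."""
    clock_at_stop = host.now_ns      # value sampled when the VM is stopped
    host.advance(stop_to_save_ns)    # time passes between vm_stop and pre_save
    if resample_at_save:
        return host.now_ns           # advanced by the stop-to-save window
    return clock_at_stop             # stale: the window is lost


WINDOW_NS = 500_000_000  # hypothetical 0.5 s between stop and save

stale = saved_clock(ToyHost(now_ns=10**9), WINDOW_NS, resample_at_save=False)
fresh = saved_clock(ToyHost(now_ns=10**9), WINDOW_NS, resample_at_save=True)

# Seconds the guest clock would fall behind if the window were not added.
print((fresh - stale) / 1e9)
```

In this model the lag equals the stop-to-save window, which is why longer migration downtime makes the drift easier to observe.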

Comment 7 Miroslav Rezanina 2017-02-20 10:06:23 UTC
Fix included in qemu-kvm-rhev-2.8.0-5.el7

Comment 9 FuXiangChun 2017-03-23 09:53:52 UTC
*****For RHEL7 guest*****.

Reproduced this bug with qemu-kvm-rhev-2.8.0-2.el7.

steps:
1. In guest
# chronyd -q 'server clock.redhat.com iburst'
.....
2017-03-23T02:11:25Z System clock wrong by 0.000888 seconds (step)
...

2. do migration
(qemu) migrate_set_downtime 5 
(qemu) migrate -d tcp:des-ip:5555

3.In guest
# chronyd -q 'server clock.redhat.com iburst'
.....
2017-03-23T02:13:33Z System clock wrong by 0.557039 seconds (step)
...

On the destination, the difference is greater than 0.3s.

4.Verified qemu-kvm-rhev-2.8.0-6.el7

result:
# chronyd -q 'server clock.redhat.com iburst'
2017-03-23T02:30:30Z System clock wrong by 0.029786 seconds (step)
...
On the destination, the difference is less than 0.3s.
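
The pass/fail check in the steps above can be scripted. The following is a minimal sketch, assuming the `chronyd -q` output format shown in this comment; `clock_error_seconds` and `THRESHOLD_S` are hypothetical names, and 0.3s is the threshold used in the comment, not an official limit.

```python
import re

# Matches the line chronyd prints when it steps the clock, as seen above.
STEP_RE = re.compile(r"System clock wrong by (-?\d+\.\d+) seconds")

THRESHOLD_S = 0.3  # threshold used in this comment (assumption, not a spec)


def clock_error_seconds(chronyd_output: str) -> float:
    """Extract the reported clock error, in seconds, from chronyd -q output."""
    match = STEP_RE.search(chronyd_output)
    if match is None:
        raise ValueError("no 'System clock wrong by' line found")
    return float(match.group(1))


# Lines copied from the runs above (after migration).
unfixed = clock_error_seconds(
    "2017-03-23T02:13:33Z System clock wrong by 0.557039 seconds (step)")
fixed = clock_error_seconds(
    "2017-03-23T02:30:30Z System clock wrong by 0.029786 seconds (step)")

for label, err in (("unfixed", unfixed), ("fixed", fixed)):
    verdict = "PASS" if abs(err) < THRESHOLD_S else "FAIL"
    print(f"{label}: {err:.6f}s {verdict}")
```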
--------------------------------------------------------------------------------

For windows guest:

1.qemu command line:

/usr/libexec/qemu-kvm -name 126NICW10S64SOB -enable-kvm -m 4G \
-cpu Opteron_G5,+hv-time -smp 4,cores=4,threads=1,sockets=1,maxcpus=4 .... 

-drive file=/mnt/win10-32-virtio.qcow2,if=none,id=drive-ide0-0-0,format=qcow2,serial=mike_cao,cache=none \

-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 ....

2.run w32tm inside guest.

3.do migration

Result:

***win10 32bit guest****:

with unfixed version qemu-kvm-rhev-2.8.0-2.el7. 

on source host(before migration):

C:\Users\Administrator>w32tm /stripchart /computer:clock.redhat.com
Tracking clock.redhat.com [10.5.26.10:123].
The current time is 3/23/2017 4:44:26 PM.
16:44:26 d:+00.2241008s o:+00.9174766s  [   |  *                        ]
16:44:29 d:+00.2267044s o:+00.9162563s  [   |  *                        ]
16:44:31 d:+00.2249251s o:+00.9169113s  [   |  *                        ]
16:44:33 d:+00.2240116s o:+00.9171215s  [   |  *                        ]
16:44:36 d:+00.2245517s o:+00.9171558s  [   |  *                        ]
.......

on destination host(after migration):

16:46:50 d:+00.2247671s o:+02.1162479s  [   |     *                     ]
16:46:52 d:+00.2263641s o:+02.1153835s  [   |     *                     ]
16:46:54 d:+00.2251170s o:+02.1167784s  [   |     *                     ]
16:46:56 d:+00.2247862s o:+02.1162832s  [   |     
.......

with fixed version qemu-kvm-rhev-2.8.0-6.el7.

on source host(before migration):

C:\Users\Administrator>w32tm /stripchart /computer:clock.redhat.com
Tracking clock.redhat.com [10.11.160.238:123].
The current time is 3/23/2017 4:10:21 PM.
16:10:21 d:+00.2677123s o:-00.0902134s  [ *                           ]
16:10:23 d:+00.2682947s o:-00.0901096s  [ *                           ]
16:10:26 d:+00.2693931s o:-00.0891145s  [ *                           ]
16:10:28 d:+00.2678956s o:-00.0900145s  [ *                           ]
......

on destination host(after migration):

16:13:26 d:+00.2667635s o:+01.2436078s  [          |  *                        ]
16:13:28 d:+00.2668133s o:+01.2435769s  [          |  *                        ]
16:13:30 d:+00.2673102s o:+01.2434970s  [          |  *                        ]
.....

--------------------------------------------------------------------------------

***win2016 64bit guest****

For unfixed version:qemu-kvm-rhev-2.8.0-2.el7

on source host(before migration):

C:\Users\Administrator>w32tm /stripchart /computer:clock.redhat.com
Tracking clock.redhat.com [10.5.26.10:123].
The current time is 3/23/2017 4:32:19 PM.
16:32:19, d:+00.2309509s o:+01.0647310s  [   |  *     ]
16:32:22, d:+00.2260079s o:+01.0668221s  [   |  *     ]
16:32:28, d:+00.2264426s o:+01.0664358s  [   |  *     ]
......

on destination host(after migration):
16:34:12, d:+00.2236191s o:+02.2148928s  [   |     *  ]
16:34:14, d:+00.2246099s o:+02.2144511s  [   |     *  ]
16:34:16, d:+00.2238134s o:+02.2149809s  [   |     *  ]
.....

with fixed version qemu-kvm-rhev-2.8.0-6.el7

on source host(before migration):

C:\Users\Administrator>w32tm /stripchart /computer:clock.redhat.com
Tracking clock.redhat.com [10.5.27.10:123].
The current time is 3/23/2017 4:23:34 PM.
16:23:34, d:+00.2244444s o:+00.4804321s  [  |*      ]
16:23:37, d:+00.2249752s o:+00.4796549s  [  |*      ]
16:23:39, d:+00.2259971s o:+00.4793132s  [  |*      ]

on destination host(after migration):

destination:
16:26:17, d:+00.2266263s o:+01.6384214s  [  |    *  ]
16:26:19, d:+00.2247433s o:+01.6370785s  [  |    *  ]
16:26:21, d:+00.2247800s o:+01.6371066s  [  |    *  ]

--------------------------------------------------------------------------------

****For win7 64 guest****

with unfixed version qemu-kvm-rhev-2.8.0-2.el7

on source host(before migration):

C:\Users\Administrator>w32tm /stripchart /computer:clock.redhat.com
14:32:24 d:+00.3902358s o:+00.3506544s  [                ]
14:32:26 d:+00.4683657s o:+00.2609516s  [                ]
14:32:29 d:+00.0777338s o:+00.2877955s  [                ]
14:32:31 d:+00.1089799s o:+00.2603336s  [                ]
14:32:58 d:+00.2808661s o:+00.1743808s  [                ]
14:33:00 d:+00.4371111s o:+00.2529732s  [                ]
14:33:03 d:+00.1245491s o:+00.2601335s  [                ]
14:33:05 d:+00.3433561s o:+00.2114293s  [                ]
14:33:08 d:+00.3746162s o:+00.3589341s  [                ]
14:33:10 d:+00.2652467s o:+00.3511051s  [                ]
14:33:23 d:+00.0621166s o:+00.2905380s  [                ]
14:33:25 d:+00.5464970s o:+00.3251897s  [


on destination host(after migration):

14:34:03 d:+00.2808629s o:+00.2201507s  [                ]
14:34:06 d:+00.2964919s o:+00.1780835s  [                ]
14:34:08 d:+00.3433504s o:+00.2550374s  [                ]
14:34:11 d:+00.2183614s o:+00.2878417s  [                ]
14:34:13 d:+00.4683550s o:+00.2666920s  [                ]
14:34:16 d:+00.3277345s o:+00.1960390s  [                ]
14:34:18 d:+00.2808771s o:+00.1771128s  [                ]
14:34:20 d:+00.6245454s o:+00.3982722s  [                ]
14:34:23 d:+00.2964774s o:+00.1755881s  [                ]
14:34:26 d:+00.2183604s o:+00.2056522s  [                ]
14:34:28 d:+00.6089874s o:+00.3381553s  [


fixed version with qemu-kvm-rhev-2.8.0-6.el7

on source host(before migration):

15:08:57 d:+00.2655363s o:-00.2389204s  [ *|                ]
15:09:00 d:+00.2655409s o:-00.2384922s  [ *|                ]
15:09:02 d:+00.2655334s o:-00.2390295s  [ *|                ]

on destination host(after migration):

15:09:04 d:+00.2967374s o:-00.2206632s  [ *|                ]
15:09:06 d:+00.3124113s o:-00.2078223s  [ *|                ]
15:09:09 d:+00.2968022s o:-00.2118912s  [ *|
-------------------------------------------------------------------------------
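
To compare the Windows runs, the change in the `o:` (offset) column across migration is what matters, not its absolute value. A rough sketch of that calculation follows, assuming the `w32tm /stripchart` line format shown above; `mean_offset` and `offset_jump` are hypothetical helper names, and the samples are the first three lines of each win10 32bit run.

```python
import re
from statistics import mean

# Captures the signed offset from a stripchart line, e.g. "o:+00.9174766s".
OFFSET_RE = re.compile(r"o:([+-]\d+\.\d+)s")


def mean_offset(stripchart: str) -> float:
    """Average the offset samples (seconds) found in w32tm stripchart output."""
    offsets = [float(x) for x in OFFSET_RE.findall(stripchart)]
    if not offsets:
        raise ValueError("no offset samples found")
    return mean(offsets)


def offset_jump(source: str, destination: str) -> float:
    """How much the offset changed across migration, in seconds."""
    return mean_offset(destination) - mean_offset(source)


# win10 32bit, qemu-kvm-rhev-2.8.0-2.el7 (unfixed), copied from above.
unfixed_src = """
16:44:26 d:+00.2241008s o:+00.9174766s
16:44:29 d:+00.2267044s o:+00.9162563s
16:44:31 d:+00.2249251s o:+00.9169113s
"""
unfixed_dst = """
16:46:50 d:+00.2247671s o:+02.1162479s
16:46:52 d:+00.2263641s o:+02.1153835s
16:46:54 d:+00.2251170s o:+02.1167784s
"""

# win10 32bit, qemu-kvm-rhev-2.8.0-6.el7 (fixed), copied from above.
fixed_src = """
16:10:21 d:+00.2677123s o:-00.0902134s
16:10:23 d:+00.2682947s o:-00.0901096s
16:10:26 d:+00.2693931s o:-00.0891145s
"""
fixed_dst = """
16:13:26 d:+00.2667635s o:+01.2436078s
16:13:28 d:+00.2668133s o:+01.2435769s
16:13:30 d:+00.2673102s o:+01.2434970s
"""

print(f"unfixed jump: {offset_jump(unfixed_src, unfixed_dst):.2f}s")
print(f"fixed jump:   {offset_jump(fixed_src, fixed_dst):.2f}s")
```

On these samples both builds show a jump of more than one second for the win10 32bit guest, which is why the Windows result is raised as a question below.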

Marcelo,

I have two questions about this test result.

Q1) Could you help confirm whether this test result verifies this bug? To my understanding, the bug is fixed for the RHEL7 guest, but I am not sure about the Windows guests.

Q2) For Windows guests, I only tested 3 guests (win10-32bit, win2016, win7-64bit).
Do I need to test all the guests you mentioned in comment 0?

Comment 10 Marcelo Tosatti 2017-03-23 17:48:52 UTC
(In reply to FuXiangChun from comment #9)
> Q1) Could you help confirm whether this test result verifies this bug? To my
> understanding, the bug is fixed for the RHEL7 guest, but I am not sure about
> the Windows guests.

Win 10 32-bit, Win2016 64-bit: since they should be reading time from the Hyper-V MSR, the problem should be fixed there. I am confirming whether they use the Hyper-V enlightenment by default, without additional options.

Win7 64-bit: the guest-stopped part of the VM migration should take longer than 5s, which does not seem to be the case. The guest should have a lot of dirty memory (say, copy several gigabyte-sized files before migration) so that the migration time in the unfixed case is longer.

> 
> Q2) For Windows guests, I only tested 3 guests (win10-32bit, win2016,
> win7-64bit).
> Do I need to test all the guests you mentioned in comment 0?

Since we are having problems with the few tested, no, no need to test them all.

Thanks.

Comment 11 Marcelo Tosatti 2017-03-24 18:55:02 UTC
(In reply to Marcelo Tosatti from comment #10)
> (In reply to FuXiangChun from comment #9)

FuXiangChun,


> Win 10 32-bits, Win2016 64-bits: since they should be reading time from
> hyper-V MSR, the problem should be fixed there. I am confirming they do use
> hyper-V enlightenment by default without additional options.
> 
> Win7 64-bits: guest stopped part of VM migration should take longer than 5s,
> which does not seem to be the case (the guest should have a lot of memory
> dirty, say copy several gigabyte sized files before migration), so migration
> time in the unfixed case should take longer. 
> 
> > 
> > Q2) For windows guest. I only tested 3 guests(win10-32bit, win2016,
> > win7-64bit)
> > Do I need to all guests you mentioned in comment0?
> 
> Since we are having problems with the few tested, no, no need to test them
> all.
> 
> Thanks.

Windows 10 32-bit is using the reference TSC page.
The testing I performed was with Windows 2012 IIRC,
which was using the REF_COUNT MSR, so there the time was migrated
with a small delta.

OK, I'll open a new BZ for RHEL 7.5 for more accurate
time sync with Windows guests.

So the testing achieves correct results.
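
For readers unfamiliar with the two Hyper-V time sources mentioned above: both express time in 100 ns units, but the reference TSC page lets the guest compute it from the TSC without reading an MSR. The sketch below shows only that arithmetic, as an illustration; the 2.5 GHz TSC frequency, the zero offset and the function names are hypothetical, and nothing here models what QEMU migrates.

```python
HV_TICK_NS = 100               # Hyper-V reference time counts 100 ns units
TICKS_PER_SECOND = 10_000_000  # 1 s / 100 ns


def ref_time_from_tsc_page(tsc: int, tsc_scale: int, tsc_offset: int) -> int:
    """Reference time in 100 ns units: ((tsc * scale) >> 64) + offset."""
    return ((tsc * tsc_scale) >> 64) + tsc_offset


def ticks_to_seconds(ticks: int) -> float:
    """Convert 100 ns reference-time ticks to seconds."""
    return ticks * HV_TICK_NS / 1e9


# Hypothetical guest: 2.5 GHz TSC, zero offset. The scale is a 64.64
# fixed-point factor that converts TSC cycles to 100 ns ticks.
TSC_HZ = 2_500_000_000
scale = (TICKS_PER_SECOND << 64) // TSC_HZ

# Three seconds' worth of TSC cycles should map to roughly 3 s.
ticks = ref_time_from_tsc_page(tsc=3 * TSC_HZ, tsc_scale=scale, tsc_offset=0)
print(ticks_to_seconds(ticks))
```

Because the guest derives time from the scale and offset on the page, a guest using this source depends on those values being correct after migration, whereas a guest reading the REF_COUNT MSR gets the counter value directly from the hypervisor.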

Comment 12 FuXiangChun 2017-03-27 01:48:13 UTC
Thanks Marcelo,

According to comment11, I will set this bug as verified.  If possible, please add new bug id to this bug or cc me(xfu@redhat.com).

Comment 13 Marcelo Tosatti 2017-03-29 16:51:11 UTC
(In reply to FuXiangChun from comment #12)
> Thanks Marcelo,
> 
> According to comment11, I will set this bug as verified.  If possible,
> please add new bug id to this bug or cc me(xfu@redhat.com).

New bug ID: https://bugzilla.redhat.com/show_bug.cgi?id=1437166

Comment 15 errata-xmlrpc 2017-08-01 23:37:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

