Bug 1170132

Summary: Guest time could change with host time even specify the guest clock as "-rtc base=utc,clock=vm,..."
Product: Red Hat Enterprise Linux 7 Reporter: Gu Nini <ngu>
Component: qemu-kvm-rhev Assignee: David Gibson <dgibson>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.1 CC: dgibson, fnovak, juzhang, knoel, michen, ngu, qzhang, sherold, virt-maint, xuhan, xuma, ypu, zhengtli
Target Milestone: rc   
Target Release: ---   
Hardware: ppc64   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.2.0-8.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-12-04 16:22:22 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1171700    
Attachments:
Description Flags
serial log for stop/continue none
qemu-kvm-rhev-2.1.2-16.el7.test.ppc64.rpm none
qemu-kvm-common-rhev-2.1.2-16.el7.test.ppc64.rpm none
qemu-img-rhev-2.1.2-16.el7.test.ppc64.rpm none
qemu-kvm-rhev-debuginfo-2.1.2-16.el7.test.ppc64.rpm none
hwclock lags or rolls back after stop&cont 2014-12-12 19:53:54.png none

Description Gu Nini 2014-12-03 11:09:10 UTC
Description of problem:
When the guest is defined with the clock as "-rtc base=utc,clock=vm,..." and the host system time is then changed, the guest time (both system time and hwclock time) still matches the host system time after the guest reboots.

Version-Release number of selected component (if applicable):
Host kernel: 3.10.0-201.el7.ppc64
Guest kernel: 3.10.0-195.el7.ppc64/3.10.0-196.ael7a.ppc64le
Qemu-kvm:
qemu-img-rhev-2.1.2-14.el7.ppc64
qemu-kvm-tools-rhev-2.1.2-14.el7.ppc64
qemu-kvm-common-rhev-2.1.2-14.el7.ppc64
qemu-kvm-rhev-2.1.2-14.el7.ppc64
qemu-kvm-rhev-debuginfo-2.1.2-14.el7.ppc64

How reproducible:
100%

Steps to Reproduce:
1. Define and start a guest (either BE or LE) with virsh, using an XML file that includes the following segment:
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup' track='guest'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
2. After the guest boots up, set the host system time one hour forward, then reboot the guest
3. After the guest has rebooted, check its system time and hwclock time

Actual results:
System time and hwclock time of the guest are still the same as host system time

Expected results:
The guest system time and hwclock time should not change with the host system time

Additional info:
When the 'tickpolicy' parameter is changed to 'delay' (<timer name='rtc' tickpolicy='delay' track='guest'/>), the test result is the same as above. Setting it to <... tickpolicy='discard' ...> or <... tickpolicy='merge' ...> is not supported.

Comment 1 Joy Pu 2014-12-03 11:15:03 UTC
Created attachment 964096 [details]
serial log for stop/continue

Comment 3 Joy Pu 2014-12-03 11:18:55 UTC
I found a similar problem when testing with a stop/continue operation.

Just change step 2 to stopping the guest for 5 minutes and then continuing it.

The attachment is the call trace taken from the serial log.

Comment 4 David Gibson 2014-12-04 01:31:41 UTC
Hi, thanks for the report.  Can I clarify a few things:

1) What method did you use to reboot the guest?  Was it shutdown then restarted using virsh, or just rebooted with qemu still running?

If the guest was shut down, can you show the output from virsh dumpxml between the first shutdown and the restart - in this case I'd expect the offset='utc' to be automatically changed to offset='variable'

2) What's the qemu command line which your libvirt XML produces?

3) To clarify what the incorrect behaviour is here, I believe you're expecting the RTC to only measure time spent within the guest because of the track='guest'.  Is that correct?



NOTE: The 'pit' and 'hpet' clauses should be omitted on Power - they control PC platform specific devices and will have no effect for a Power guest.
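
As a rough illustration of the note above, the following Python sketch strips the PC-specific timers from the <clock> fragment given in the description, using only the standard library. The helper name drop_pc_timers is hypothetical and is not part of libvirt; it only shows which clauses would remain for a Power guest.

```python
import xml.etree.ElementTree as ET

# <clock> fragment copied from the bug description.
CLOCK_XML = """
<clock offset='utc'>
  <timer name='rtc' tickpolicy='catchup' track='guest'/>
  <timer name='pit' tickpolicy='delay'/>
  <timer name='hpet' present='no'/>
</clock>
"""

# Timers that control PC platform specific devices (per the note above).
PC_ONLY_TIMERS = {"pit", "hpet"}


def drop_pc_timers(clock_xml: str) -> ET.Element:
    """Return the <clock> element without PC-only <timer> children.

    Hypothetical helper for illustration; not a libvirt API.
    """
    clock = ET.fromstring(clock_xml.strip())
    # Iterate over a copy because children are removed while looping.
    for timer in list(clock.findall("timer")):
        if timer.get("name") in PC_ONLY_TIMERS:
            clock.remove(timer)
    return clock


if __name__ == "__main__":
    cleaned = drop_pc_timers(CLOCK_XML)
    print(ET.tostring(cleaned, encoding="unicode"))
```

Only the 'rtc' timer is left after the call, which is the clause that matters for this bug.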

Comment 5 Gu Nini 2014-12-04 01:54:31 UTC
(In reply to David Gibson from comment #4)

1) I rebooted the guest from inside the guest with the command "reboot".

2) Following is the qemu command line that I used: For the BE guest named 'rhel7.1', the libvirt xml segment is '<timer name='rtc' tickpolicy='catchup' track='guest'/>'; for the LE guest named 'rhel7.1-le', the libvirt xml segment is '<timer name='rtc' tickpolicy='delay' track='guest'/>'

[root@ibm-p8-kvm-01-qe ngu]# ps aux|grep qemu
qemu      84738  0.2  0.9 20405312 1275776 ?    Sl   05:43   2:08 /usr/libexec/qemu-kvm -name rhel7.1 -S -machine pseries,accel=kvm,usb=off -m 16384 -realtime mlock=off -smp 32,sockets=1,cores=4,threads=8 -uuid 95346a10-1828-403a-a610-ac5a52a29479 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel7.1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,clock=vm,driftfix=slew -no-shutdown -boot strict=on -device usb-ehci,id=usb,bus=pci.0,addr=0x2 -device pci-ohci,id=usb1,bus=pci.0,addr=0x1 -device spapr-vscsi,id=scsi0,reg=0x1000 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x3 -drive file=/home/ngu/rhel7.1.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -netdev tap,fd=27,id=hostnet0 -device spapr-vlan,netdev=hostnet0,id=net0,mac=52:54:00:c4:e7:19,reg=0x2000 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/rhel7.1.org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev socket,id=charconsole0,path=/var/lib/libvirt/qemu/rhel7.1-serial.sock,server,nowait -device virtconsole,bus=virtio-serial0.0,nr=1,chardev=charconsole0,id=console0 -device usb-kbd,id=input0 -device usb-mouse,id=input1 -vnc 0:7 -device VGA,id=video0,bus=pci.0,addr=0x4 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -object rng-random,id=rng0,filename=/dev/random -device virtio-rng-pci,rng=rng0,max-bytes=1234,period=2000,bus=pci.0,addr=0x6 -msg timestamp=on

qemu     144527  7.8  0.6 20339008 883200 ?     Sl   20:40   0:32 /usr/libexec/qemu-kvm -name rhel7.1-le -S -machine pseries,accel=kvm,usb=off -m 16384 -realtime mlock=off -smp 32,sockets=1,cores=4,threads=8 -uuid 95346a10-1828-403a-a610-ac5a52a29472 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel7.1-le.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,clock=vm -no-shutdown -boot strict=on -device usb-ehci,id=usb,bus=pci.0,addr=0x2 -device pci-ohci,id=usb1,bus=pci.0,addr=0x1 -device spapr-vscsi,id=scsi0,reg=0x1000 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x3 -drive file=/home/ngu/rhel7.1-le.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -netdev tap,fd=25,id=hostnet0 -device spapr-vlan,netdev=hostnet0,id=net0,mac=52:54:00:c4:e7:12,reg=0x2000 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/rhel7.1-le.org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev socket,id=charconsole0,path=/var/lib/libvirt/qemu/rhel7.1-le-serial.sock,server,nowait -device virtconsole,bus=virtio-serial0.0,nr=1,chardev=charconsole0,id=console0 -device usb-kbd,id=input0 -device usb-mouse,id=input1 -vnc 0:6 -device VGA,id=video0,bus=pci.0,addr=0x4 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -object rng-random,id=rng0,filename=/dev/random -device virtio-rng-pci,rng=rng0,max-bytes=1234,period=2000,bus=pci.0,addr=0x6 -msg timestamp=on

3) That's right.
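
The relationship between the XML timer settings and the "-rtc" option visible in the two command lines above can be sketched as follows. This is only inferred from those command lines (track='guest' shows up as clock=vm, tickpolicy='catchup' as driftfix=slew, and tickpolicy='delay' adds no driftfix option); it is not libvirt's actual code, and the function name rtc_option is hypothetical.

```python
import xml.etree.ElementTree as ET


def rtc_option(clock_xml: str) -> str:
    """Build the value of a '-rtc' argument from a <clock> fragment.

    Hypothetical helper: covers only the combinations seen in this
    comment and raises ValueError for anything else.
    """
    clock = ET.fromstring(clock_xml.strip())
    if clock.get("offset") != "utc":
        raise ValueError("only offset='utc' is covered by this sketch")
    parts = ["base=utc"]
    for timer in clock.findall("timer"):
        if timer.get("name") != "rtc":
            continue  # other timers are ignored by this sketch
        if timer.get("track") == "guest":
            parts.append("clock=vm")
        policy = timer.get("tickpolicy")
        if policy == "catchup":
            parts.append("driftfix=slew")
        elif policy not in (None, "delay"):
            raise ValueError(f"tickpolicy {policy!r} is not covered")
    return ",".join(parts)


if __name__ == "__main__":
    be = "<clock offset='utc'><timer name='rtc' tickpolicy='catchup' track='guest'/></clock>"
    le = "<clock offset='utc'><timer name='rtc' tickpolicy='delay' track='guest'/></clock>"
    print("-rtc", rtc_option(be))  # -rtc base=utc,clock=vm,driftfix=slew
    print("-rtc", rtc_option(le))  # -rtc base=utc,clock=vm
```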

Comment 6 David Gibson 2014-12-04 03:49:03 UTC
Ah, yes, it looks like the pseries RTAS real time clock implementation doesn't honour the -rtc clock=vm option to qemu.

It will take me a little while to address this one.

Comment 7 David Gibson 2014-12-10 03:53:39 UTC
Created attachment 966607 [details]
qemu-kvm-rhev-2.1.2-16.el7.test.ppc64.rpm

Comment 8 David Gibson 2014-12-10 04:08:49 UTC
Created attachment 966622 [details]
qemu-kvm-common-rhev-2.1.2-16.el7.test.ppc64.rpm

Comment 9 David Gibson 2014-12-10 04:09:23 UTC
Created attachment 966623 [details]
qemu-img-rhev-2.1.2-16.el7.test.ppc64.rpm

Comment 10 David Gibson 2014-12-10 04:10:56 UTC
Created attachment 966624 [details]
qemu-kvm-rhev-debuginfo-2.1.2-16.el7.test.ppc64.rpm

Comment 11 David Gibson 2014-12-10 04:18:17 UTC
I'm having problems brew building again, so I've uploaded some manually built packages which have a draft fix for this bug.

I believe this should stop the guest rtc changing when the host time is changed if clock=vm is specified.  I'm less sure about the suspend case.

Can you please try these out.  Note that these patches are a work in progress and will need some upstream review before they're ready to merge.

Comment 12 IBM Bug Proxy 2014-12-10 13:41:44 UTC
------- Comment From fnovak.com 2014-12-10 13:30 EDT-------
reverse mirror of RHBZ Bug 1170132 - Guest time could change with host time even specify the guest clock as "-rtc base=utc,clock=vm,..."

Comment 13 David Gibson 2014-12-11 05:06:54 UTC
RFC patch series posted upstream.

See http://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg01452.html

Comment 14 David Gibson 2014-12-11 05:10:13 UTC
I have now successfully brewed the draft patches at:

http://brewweb.devel.redhat.com/brew/taskinfo?taskID=8363604

Comment 15 Gu Nini 2014-12-11 09:46:40 UTC
(In reply to David Gibson from comment #11/14)

I have upgraded to the patched packages and tested them; the bug no longer occurs, i.e. the guest time does not change with the host time when the guest clock is set as "-rtc clock=vm,...".

Comment 16 David Gibson 2014-12-12 00:05:47 UTC
Thanks for the feedback.

Have you tested stopping / continuing the guest as well as changing the host time?

I am currently working towards getting this fix upstream.

Comment 17 Gu Nini 2014-12-12 12:03:23 UTC
Created attachment 967603 [details]
hwclock lags or rolls back after stop&cont 2014-12-12 19:53:54.png

(In reply to David Gibson from comment #16)
> Thanks for the feedback.
> 
> Have you tested stopping / continuing the guest as well as changing the host
> time?
> 
> I am currently working towards getting this fix upstream.

I have done some stop/cont testing on the test package https://brewweb.devel.redhat.com/taskinfo?taskID=8368750 and hit a guest hwclock lagging or rolling back issue with the following steps when the guest is booted with "-rtc base=2006-06-06,clock=vm":

1. Check the guest system time and hwclock
2. Stop the guest with qemu cmd "stop"
3. Change the host system time 1 hour ahead
4. Wait for 3-5 mins, then resume the guest with qemu cmd "cont"
5. Check the guest system time and hwclock again, then it's found the hwclock ticked not as right frequency.

This happens on both BE and LE guest.

Comment 18 David Gibson 2014-12-14 22:44:20 UTC
Sorry, I'm not sure what you mean by "5. Check the guest system time and hwclock again, then it's found the hwclock ticked not as right frequency."

Comment 19 Gu Nini 2014-12-15 12:43:26 UTC
(In reply to David Gibson from comment #18)

I mean that I found the hwclock lagged or rolled back compared with its value in step 1.

Comment 20 David Gibson 2014-12-16 00:36:29 UTC
That's the expected behaviour then, isn't it?  With clock=vm, the guest RTC should not advance while the guest is stopped.
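
The expected clock=vm behaviour described here can be pictured with a toy model. This is a minimal sketch, not QEMU's implementation: the class VmClock and its methods are hypothetical, and host time is passed in by hand rather than read from a real clock.

```python
class VmClock:
    """Toy model of an RTC that only counts time the guest spends running."""

    def __init__(self, base: float) -> None:
        self.base = base        # RTC value at guest start, in seconds
        self.ran_for = 0.0      # total seconds the guest has been running
        self.running = True

    def tick(self, host_seconds: float) -> None:
        """Let host_seconds of host time pass."""
        if self.running:
            self.ran_for += host_seconds

    def stop(self) -> None:
        self.running = False

    def cont(self) -> None:
        self.running = True

    def read(self) -> float:
        return self.base + self.ran_for


if __name__ == "__main__":
    rtc = VmClock(base=0.0)
    rtc.tick(60)       # guest runs for one minute
    rtc.stop()
    rtc.tick(300)      # five minutes pass on the host while stopped
    rtc.cont()
    rtc.tick(10)
    print(rtc.read())  # 70.0: the 300 s stopped interval is not counted
```

Seen from the host, such a clock appears to lag after a stop/cont cycle, which matches the observation in comment 17.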

Comment 21 Gu Nini 2014-12-29 03:41:44 UTC
(In reply to David Gibson from comment #20)
> That's the expected behaviour then, isn't it?  With clock=vm, the guest RTC
> should not advance while the guest is stopped.

David,

You are right, it's my fault. Thanks.

Comment 22 IBM Bug Proxy 2015-01-12 21:22:02 UTC
------- Comment From seg.com 2015-01-12 21:13 EDT-------
So, is the conclusion that this is not a bug? If that's the case, we should close. If not, can someone please summarize where we are with this bug? It's become hard to follow.

Comment 23 David Gibson 2015-01-13 00:05:17 UTC
No, the conclusion is that my draft patches appear to correctly fix the problem.  As noted in bug 1171700 though, they're not upstream yet.

Comment 24 Gu Nini 2015-01-23 13:34:16 UTC
Also found the problem on an IBM Power system:

Host kernel: 3.10.53-2020.1.pkvm2_1_1.49.ppc64/3.10.42-2018.1.pkvm2_1_1.46.ppc64
Guest kernel: 3.10.0-223.el7.ppc64/3.10.0-223.ael7b.ppc64le
Qemu-kvm version:
qemu-img-2.0.0-2.1.pkvm2_1_1.20.40.ppc64
qemu-kvm-2.0.0-2.1.pkvm2_1_1.20.40.ppc64
qemu-common-2.0.0-2.1.pkvm2_1_1.20.40.ppc64
qemu-system-ppc-2.0.0-2.1.pkvm2_1_1.20.40.ppc64
qemu-kvm-tools-2.0.0-2.1.pkvm2_1_1.20.40.ppc64
qemu-system-x86-2.0.0-2.1.pkvm2_1_1.20.40.ppc64
qemu-2.0.0-2.1.pkvm2_1_1.20.40.ppc64

Comment 25 David Gibson 2015-01-28 04:00:49 UTC
Gu Nini,

Thanks for the confirmation.  That was expected, since it is an upstream bug.  I've been busy and haven't had a chance to resend my upstream patches; I'm hoping I'll be able to do that today.

Comment 26 David Gibson 2015-01-29 04:59:22 UTC
Draft patches posted upstream, see https://www.mail-archive.com/qemu-devel@nongnu.org/msg271868.html

Still under discussion.

Comment 28 Miroslav Rezanina 2015-03-19 09:08:11 UTC
Fix included in qemu-kvm-rhev-2.2.0-8.el7

Comment 30 Gu Nini 2015-07-30 10:57:10 UTC
Verified the bug on both ppc64be and ppc64le hosts with the following software versions:

PPC64BE HOST:
host kernel: 3.10.0-300.el7.ppc64
Guest kernel: 3.10.0-295.el7.ppc64/3.10.0-295.el7.ppc64le
Qemu-kvm-rhev: qemu-kvm-rhev-2.3.0-13.el7.ppc64

PPC64LE HOST:
host kernel: 3.10.0-300.el7.ppc64le
Guest kernel: 3.10.0-295.el7.ppc64/3.10.0-295.el7.ppc64le
Qemu-kvm-rhev: qemu-kvm-rhev-2.3.0-13.el7.ppc64le


Now, when the guest clock is set as "-rtc base=utc,clock=vm" or "-rtc base=2006-06-06,clock=vm", the guest time does not change with the host time when following the steps in the bug.

Comment 31 Gu Nini 2015-07-30 11:24:26 UTC
However, I should re-emphasize the problem in comments 17-21 with the following steps:

1) Start a guest with guest clock "-rtc base=utc,clock=vm" or "-rtc base=2006-06-06,clock=vm"
2) After the guest boots up, check its system time and hwclock
3) 'stop' the guest, then change the host system time with cmd 'date -s XX:XX:XX'
4) Wait for 5 mins, then 'cont' the guest, and check its system time and hwclock

I then found that the guest hwclock ticks at the same frequency as the host system time. That is, if I set the host system time one hour behind, the host clock ticks faster to catch up with the time server (maybe), and the guest hwclock ticks faster too; I can see this from the time difference between guest system time and hwclock. If I set the host system time one hour ahead, the host clock ticks slower to wait for the time server, and the guest hwclock ticks slower too, which is why I found the guest hwclock lagging or rolling back as shown in comment 17.
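
The check described above (comparing how far the guest system time and the hwclock each advanced) can be sketched as a small calculation. The function name rtc_drift_seconds is hypothetical and the sample values in the usage part are invented for illustration; they are not measurements from this bug.

```python
from datetime import datetime


def rtc_drift_seconds(sys_before: datetime, hw_before: datetime,
                      sys_after: datetime, hw_after: datetime) -> float:
    """Return how much further the hwclock advanced than the system clock.

    Positive: the hwclock ran faster over the interval.
    Negative: the hwclock lagged or rolled back.
    """
    sys_elapsed = (sys_after - sys_before).total_seconds()
    hw_elapsed = (hw_after - hw_before).total_seconds()
    return hw_elapsed - sys_elapsed


if __name__ == "__main__":
    # Invented sample values: both clocks agree before the stop/cont cycle,
    # and the hwclock is two seconds behind five minutes later.
    drift = rtc_drift_seconds(
        sys_before=datetime(2015, 7, 30, 10, 0, 0),
        hw_before=datetime(2015, 7, 30, 10, 0, 0),
        sys_after=datetime(2015, 7, 30, 10, 5, 0),
        hw_after=datetime(2015, 7, 30, 10, 4, 58),
    )
    print(drift)  # -2.0
```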

David,

Could you help check whether this is a problem and whether a new bug should be reported?

Thanks!

Nini

Comment 32 David Gibson 2015-07-31 02:22:58 UTC
Hi Nini,

Thanks for the description.  I think I didn't properly understand the problem the first time you mentioned it.  Basically, with clock=vm the guest is unaffected by sudden jumps in the host time, but *is* affected by gradual slew adjustments in the host.  This makes sense, because the guest hwclock is based on the host's CLOCK_MONOTONIC, which is unaffected by discontinuous changes in the host time, but is affected by slew adjustments.

Exactly what the semantics of clock=vm should be in this case is unclear, but since it is described as having the guest clock disconnected from the host, it makes some sense that it not be affected by host slew adjustments.

I think it would be possible to fix this using the kernel's CLOCK_MONOTONIC_RAW time source, however doing this would require substantial rework of qemu's core timing code: as far as I can tell the behaviour will be the same on x86.

I don't think this is a problem in practice; it only shows up in a rather contrived edge case.  If you like, you can file a new BZ to track it, but I don't expect it to be changed for RHEL 7.2, and may end up closed as WONTFIX.
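
For reference, the three host clocks discussed above can be sampled side by side on a Linux host with Python's standard time module. This is only a sketch for observing the clocks, not QEMU's timing code; the helper names sample and elapsed are hypothetical, and time.CLOCK_MONOTONIC_RAW is only available on Linux.

```python
import time

# Clock IDs exposed by the standard library (CLOCK_MONOTONIC_RAW: Linux only).
CLOCKS = {
    "CLOCK_REALTIME": time.CLOCK_REALTIME,            # jumps with 'date -s'
    "CLOCK_MONOTONIC": time.CLOCK_MONOTONIC,          # no jumps, but slewed
    "CLOCK_MONOTONIC_RAW": time.CLOCK_MONOTONIC_RAW,  # no jumps, no slew
}


def sample() -> dict:
    """Read all three clocks once, in seconds."""
    return {name: time.clock_gettime(cid) for name, cid in CLOCKS.items()}


def elapsed(interval: float) -> dict:
    """Return how far each clock advanced across a sleep of `interval` seconds."""
    before = sample()
    time.sleep(interval)
    after = sample()
    return {name: after[name] - before[name] for name in CLOCKS}


if __name__ == "__main__":
    for name, delta in elapsed(1.0).items():
        print(f"{name:20s} advanced {delta:.6f} s")
```

If the host time is changed with 'date -s' during the sleep, only CLOCK_REALTIME should show the jump; a difference between the two monotonic clocks would be expected only while the host clock is being slewed.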

Comment 33 Gu Nini 2015-07-31 03:38:46 UTC
(In reply to David Gibson from comment #32)

David,

Thanks for the detailed explanation. I reported bz 1248860 to track the issue in comments 17-21/31-32. I agree with you that the bug can be closed as WONTFIX; it is just meant as a problem tracker.

Comment 34 Qunfang Zhang 2015-08-03 05:40:39 UTC
Setting to VERIFIED according to comment 30.

Comment 35 IBM Bug Proxy 2015-08-28 04:42:48 UTC
------- Comment From seg.com 2015-08-28 04:34 EDT-------
Based on where we are in the product cycle and because of the potential for affecting existing customers, we are going to defer fixing this to the next product version.

Comment 36 Zhengtong 2015-09-16 02:52:38 UTC
Hi David, 

I found an interesting result in my test related to comment #20.

Test version:
Host:3.10.0-313.el7.ppc64le
Qemu:qemu-kvm-rhev-2.3.0-22.el7
Guest:3.10.0-229.14.1.ael7b.ppc64le

Steps:
1. Boot up guest with "-rtc base=utc,clock=vm"
2. Check the guest time 
Guest: #date
       Tue Sep 15 22:47:38 EDT 2015

3. Do this step immediately after step 2. Stop the guest in HMP
Host: (qemu)stop

4. After 2-3 minutes, continue the guest, and check the time in the guest
Host: (qemu)cont
Guest: #date
       Tue Sep 15 22:50:40 EDT 2015


It seems the time is still advancing while the guest is paused. Is this normal or a bug?
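
The size of the advance can be computed from the two `date` outputs above. This is a small sketch using only the Python standard library; the helper name parse_date_output is hypothetical.

```python
from datetime import datetime


def parse_date_output(line: str) -> datetime:
    """Parse `date` output such as 'Tue Sep 15 22:47:38 EDT 2015'.

    The time-zone token is dropped: both samples share one zone, and
    strptime's %Z does not reliably accept abbreviations like 'EDT'.
    """
    weekday, month, day, clock, _zone, year = line.split()
    return datetime.strptime(
        f"{weekday} {month} {day} {clock} {year}", "%a %b %d %H:%M:%S %Y"
    )


def guest_advance_seconds(before: str, after: str) -> float:
    """Seconds the guest system time advanced between two `date` outputs."""
    return (parse_date_output(after) - parse_date_output(before)).total_seconds()


if __name__ == "__main__":
    print(guest_advance_seconds(
        "Tue Sep 15 22:47:38 EDT 2015",
        "Tue Sep 15 22:50:40 EDT 2015",
    ))  # 182.0
```

An advance of 182 seconds is about as long as the 2-3 minute pause itself, so the guest time appears to have kept counting while the guest was stopped.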

Comment 37 David Gibson 2015-09-16 05:11:23 UTC
Zhengtong,

With clock=vm the clock should not advance while the guest is paused, so this is another bug.  Please file a new BZ for this - target it for RHEL7.3, this isn't important enough to fix for RHEL 7.2.

Comment 38 Xujun Ma 2015-09-18 03:31:00 UTC
(In reply to David Gibson from comment #37)
I have filed a new bug about this issue.

https://bugzilla.redhat.com/show_bug.cgi?id=1264258

Comment 40 errata-xmlrpc 2015-12-04 16:22:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html