624572 – time drift after guest running for more than 12 hours

Summary: time drift after guest running for more than 12 hours

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	qemu-kvm
Sub Component:
Version:	6.0
Hardware:	All
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	beta
Target Release:	6.1
Assignee:	Zachary Amsden
QA Contact:	Virtualization Bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	Rhel6KvmTier1

Reported:	2010-08-17 02:24 UTC by Shirley Zhou
Modified:	2015-03-05 00:52 UTC
CC List:	8 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2011-04-07 16:59:01 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2011:0534	0	normal	SHIPPED_LIVE	Important: qemu-kvm security, bug fix, and enhancement update	2011-05-19 11:20:36 UTC

Description Shirley Zhou 2010-08-17 02:24:57 UTC

Description of problem:
time drift after guest running for more than 12 hours

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.109.el6.x86_64
kernel-2.6.32-63.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Sync time on the host:
ntpdate -b clock.redhat.com
2. Run a RHEL 6 guest with the time option:
-rtc base=utc,clock=host,driftfix=slew
3. Sync time on the guest:
ntpdate -b clock.redhat.com
4. Query time on the guest:
ntpdate -q clock.redhat.com
offset is -0.299896
5. Run the guest for 14 hours, then query time again:
ntpdate -q clock.redhat.com
server 66.187.233.4, stratum 1, offset -9.975898, delay 0.34035
17 Aug 09:44:46 ntpdate[4619]: step time server 66.187.233.4 offset -9.975898 sec
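
To track the offset over a long run without reading it off by hand, the summary line printed by `ntpdate -q` can be parsed. The following is a minimal sketch, assuming the line format quoted in step 5; the helper name `parse_offset` is hypothetical and not part of any NTP tooling.

```python
import re
from typing import Optional

# Matches the "offset <seconds>" field in the summary line of `ntpdate -q`,
# in the format quoted in step 5 of this report (an assumption about the format).
_OFFSET_RE = re.compile(r"offset\s+(-?\d+(?:\.\d+)?)")


def parse_offset(line: str) -> Optional[float]:
    """Return the offset in seconds from one ntpdate output line, or None."""
    match = _OFFSET_RE.search(line)
    return float(match.group(1)) if match else None


# Sample line copied from step 5 of this report.
sample = "server 66.187.233.4, stratum 1, offset -9.975898, delay 0.34035"
print(parse_offset(sample))
```

Logging this value at regular intervals on both host and guest would show whether the guest offset grows steadily or in jumps.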

Actual results:
After the guest has run for a long time, the clock drift increases hugely.

Expected results:
There should be no huge increase in time drift after the guest has been running for a long time.

Additional info:

Comment 2 RHEL Program Management 2010-08-17 02:58:39 UTC

This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 3 Dor Laor 2010-11-21 22:34:26 UTC

Not sure that it is that bad, but it is worth investigating.

Comment 4 Zachary Amsden 2011-03-29 18:35:06 UTC

This bug is really old and fixes have gone into the kvmclock and TSC code since. Suggest we re-test to verify the bug still exists.

Comment 10 Zachary Amsden 2011-03-30 22:05:38 UTC

I don't have access to Bug 682613, was this last update posted in the wrong bug?

Comment 12 Dor Laor 2011-03-31 11:18:23 UTC

*** Bug 682613 has been marked as a duplicate of this bug. ***

Comment 13 Zachary Amsden 2011-03-31 19:26:03 UTC

So... as for the duplicate, same comment applies.


Host clock drifting is not a virt bug... we can't stop guest clocks drifting
when the host isn't even stable.  Re-assigning component to kernel.

If the host clock drift isn't a regression, it's possible this is just a
drifting or unstable host.  It's also possible the NTP server measurement is
being affected by network latency.  These are rather difficult things to rule
out, but the first step would be to see if the drift still exists with a 6.0
kernel on the same host.


This bug is kind of a mess.  I suggest we re-test to make sure it's really a bug.  Also, the attachment showing the drift results was attached to the other bug.

I'm going to close THIS bug as a duplicate and re-open the other one which is under the proper component.

*** This bug has been marked as a duplicate of bug 682613 ***

Comment 14 Zachary Amsden 2011-03-31 19:36:16 UTC

Okay, pending further investigation, I am re-opening this bug.

PLEASE DO NOT CLOSE EITHER THIS BUG OR 682613 as a duplicate.  There are two separate issues being reported.

One is a very small drift reported on a system which apparently has a drifting host clock (682613).  Not sure this is a real bug or can even be fixed.  That bug is not a virt issue, but a kernel issue.

This bug report (624572) concerns a virt guest running for over 14 hours having a "huge" drift, -9.9 seconds.  Quantitatively, that is a 200 parts per million error, which isn't actually huge, and is within the threshold of NTP correctable error.
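
The 200 ppm estimate can be checked from the offsets in the original description. This is a rough sketch, assuming the run lasted exactly 14 hours (the report gives only an approximate duration); the function name `drift_ppm` is illustrative.

```python
SECONDS_PER_HOUR = 3600


def drift_ppm(drift_seconds: float, elapsed_hours: float) -> float:
    """Clock error as parts per million of the elapsed time."""
    return abs(drift_seconds) / (elapsed_hours * SECONDS_PER_HOUR) * 1e6


# Offsets quoted in the description; the 14-hour interval is an assumption.
offset_after_sync = -0.299896  # step 4, right after syncing the guest
offset_after_run = -9.975898   # step 5, after the long run

gross = drift_ppm(offset_after_run, 14)
net = drift_ppm(offset_after_run - offset_after_sync, 14)
print(f"gross: {gross:.0f} ppm, net: {net:.0f} ppm")
```

Both figures come out just under 200 ppm, whether or not the initial offset measured after the sync is subtracted.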

Can we please also verify whether or not the host clock is drifting on the same machine for which this bug was reported?

If that does indeed turn out to be the case, then we can dismiss one of these as a duplicate, but for now, in the absence of any known drift on the host clock, we still cannot rule out a virt bug on this one.

Can we also double check which clocksource was being used in the guest here?  Kvmclock or something else?

Thanks,

Zach

Comment 15 Zachary Amsden 2011-04-05 21:55:48 UTC

Need to verify whether this is indeed a drifting host or something else.

Comment 16 Mike Cao 2011-04-07 03:06:48 UTC

Tried on kernel-2.6.32-128.el6.

Steps:
1. Run a guest on it
2. Load the host CPU
3. Check the time drift after 12 hours


Actual results:
On AMD host:
after 14 hours, time drifted 6.6 sec
On Intel host:
after 13 hours, time drifted 0.3 sec

Comment 17 Zachary Amsden 2011-04-07 16:59:01 UTC

6.6 seconds in 14 hours is less than 140 ppm error, which is within hardware expectation and within the NTP adjustable tolerance of 500 ppm.

Differing drift on AMD and Intel platforms confirms it is a platform clock stability issue and not a systemic kernel or virtualization problem, so I'm closing the bug.
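
For reference, the comparison against the 500 ppm NTP tolerance can be reproduced from the figures in comment 16. This is a sketch only: the durations are the rounded hour counts from that comment, and the names used here are illustrative.

```python
NTP_TOLERANCE_PPM = 500  # adjustable tolerance quoted in the comment above


def drift_ppm(drift_seconds: float, elapsed_hours: float) -> float:
    """Clock error as parts per million of the elapsed time."""
    return abs(drift_seconds) / (elapsed_hours * 3600) * 1e6


# (drift in seconds, elapsed hours) per host, as reported in comment 16.
measurements = {"AMD": (6.6, 14), "Intel": (0.3, 13)}

results = {host: drift_ppm(d, h) for host, (d, h) in measurements.items()}
for host, ppm in results.items():
    within = ppm < NTP_TOLERANCE_PPM
    print(f"{host}: {ppm:.1f} ppm, within NTP tolerance: {within}")
```

Under these assumptions the AMD host lands at roughly 131 ppm and the Intel host at a few ppm, both well inside the 500 ppm limit.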