Bug 828332 - Track recommended hypervisor timer settings
Summary: Track recommended hypervisor timer settings
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libosinfo
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Matthias Clasen
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-06-04 15:36 UTC by Cole Robinson
Modified: 2018-09-04 18:30 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-09-04 18:30:24 UTC
Embargoed:


Description Cole Robinson 2012-06-04 15:36:31 UTC
Various OSes require specific qemu command line parameters or libvirt XML to keep time optimally. Tracking this in libosinfo would be ideal.

Here's the matrix KVM team has on an internal wiki:

|| OS || qemu || guest's kernel cmdline || newer AMD host || old AMD host (!constant_tsc) || Intel ||
|| RHEL 5.4 64 bit with pv clock || -no-kvm-pit-reinjection -rtc-td-hack || none || || || ||
|| RHEL 5.4 64 bit without pv clock || -no-kvm-pit-reinjection -rtc-td-hack || divider=10 notsc lpj=n || || || ||
|| RHEL 5.4 32 bit with pv clock || -no-kvm-pit-reinjection -rtc-td-hack || none || || || ||
|| RHEL 5.4 32 bit without pv clock || -no-kvm-pit-reinjection -rtc-td-hack || divider=10 clocksource=acpi_pm lpj=n || || || ||
|| RHEL 5.3 64 bit || -no-kvm-pit-reinjection -rtc-td-hack || divider=10 notsc || || || ||
|| RHEL 5.3 32 bit || -no-kvm-pit-reinjection -rtc-td-hack || divider=10 clocksource=acpi_pm || || || ||
|| RHEL 4.8 64 bit || -no-kvm-pit-reinjection -rtc-td-hack || notsc divider=10 || || || ||
|| RHEL 4.8 32 bit || -no-kvm-pit-reinjection -rtc-td-hack || clock=pmtmr divider=10 || || || ||
|| RHEL 3.9 64 bit || -no-kvm-pit-reinjection -rtc-td-hack || none || || || ||
|| RHEL 3.9 32 bit || -no-kvm-pit-reinjection -rtc-td-hack || none || || || ||
|| win2k3 64 bit || -rtc-td-hack || || || /use pmtimer in boot.ini || ||
|| win2k3 32 bit || -rtc-td-hack || || || /use pmtimer in boot.ini || ||
|| win2k8 64 bit || -rtc-td-hack || || || NO NEED TO USE PMTIMER || ||
|| win2k8 32 bit || -rtc-td-hack || || || NO NEED TO USE PMTIMER || ||
|| winxp 32 bit || -rtc-td-hack || || || /use pmtimer in boot.ini || ||

Comment 1 Daniel Berrangé 2012-06-11 21:05:02 UTC
As a point of reference, this is how those two pit/rtc flags map to libvirt XML:

  <clock offset='utc'>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='rtc' tickpolicy='catchup'/>
  </clock>
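
For illustration only, that pairing can be sketched in Python. The lookup table and the function name `clock_xml_for_flags` are invented for this example and are not part of libvirt or libosinfo; only the two flag/timer pairs come from the XML above.

```python
import xml.etree.ElementTree as ET

# Pairing taken from the XML above. The table itself is an illustrative
# assumption, not a libvirt or libosinfo data structure.
FLAG_TO_TIMER = {
    "-no-kvm-pit-reinjection": ("pit", "delay"),
    "-rtc-td-hack": ("rtc", "catchup"),
}


def clock_xml_for_flags(flags, offset="utc"):
    """Build a <clock> element for whichever legacy qemu flags are present."""
    clock = ET.Element("clock", {"offset": offset})
    for flag in flags:
        if flag in FLAG_TO_TIMER:
            name, policy = FLAG_TO_TIMER[flag]
            ET.SubElement(clock, "timer", {"name": name, "tickpolicy": policy})
    return ET.tostring(clock, encoding="unicode")


print(clock_xml_for_flags(["-no-kvm-pit-reinjection", "-rtc-td-hack"]))
```

Flags that are not in the table are silently skipped, so the sketch says nothing about any other qemu option.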

Comment 2 starlight 2013-10-11 07:02:35 UTC
VM guest time-keeping is without a doubt
the closest thing to black magic found in
the realm of computers.

Much of the above has changed in just
the last year.

I've stumbled on the best case I've seen
so far for RHEL 4.9 so I'm documenting it
here.

First: "-rtc-td-hack" appears to be gone.

The 'libvirt' XML is

<clock offset='utc'>
  <timer name='rtc' tickpolicy='catchup' track='guest'/>
  <timer name='pit' tickpolicy='delay'/>
  <timer name='hpet' present='no'/>
</clock>

The resulting QEMU options are

  -rtc base=utc,clock=vm,driftfix=slew
  -no-kvm-pit-reinjection
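
A rough Python sketch of the XML-to-option pairing seen here, for readers who want to see it spelled out. It only reproduces the pairing observed with this one qemu-kvm build; it is not how libvirt generates its command line, and the function name `qemu_args_for_clock` is made up for the example. The hpet line is ignored because no matching option is listed above.

```python
import xml.etree.ElementTree as ET

CLOCK_XML = """
<clock offset='utc'>
  <timer name='rtc' tickpolicy='catchup' track='guest'/>
  <timer name='pit' tickpolicy='delay'/>
  <timer name='hpet' present='no'/>
</clock>
"""


def qemu_args_for_clock(xml_text):
    """Reproduce the XML -> option pairing observed in this comment only."""
    clock = ET.fromstring(xml_text)
    rtc_opts = ["base=" + clock.get("offset", "utc")]
    args = []
    for timer in clock.findall("timer"):
        name = timer.get("name")
        if name == "rtc":
            if timer.get("track") == "guest":
                rtc_opts.append("clock=vm")
            if timer.get("tickpolicy") == "catchup":
                rtc_opts.append("driftfix=slew")
        elif name == "pit" and timer.get("tickpolicy") == "delay":
            args.append("-no-kvm-pit-reinjection")
        # Other timers (e.g. hpet) are skipped: no option is listed for them here.
    return ["-rtc", ",".join(rtc_opts)] + args


print(" ".join(qemu_args_for_clock(CLOCK_XML)))
```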

The RHEL 4.9 guest boot line can

a) have no clock parameters
b) have "clock=pmtmr" or
c) have "clock=pmtmr divider=100"

a/b/c all seem equivalent and result in
about 10 to 30 milliseconds negative
drift every five minutes.
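
To put that figure in the units NTP tools usually report, here is a small Python calculation. It assumes the drift accumulates evenly over the five-minute window, which is a simplification; the helper name `drift_ppm` is made up for the example. Under that assumption, 10 to 30 ms per five minutes works out to roughly 33 to 100 parts per million.

```python
def drift_ppm(drift_seconds, interval_seconds):
    """Fractional clock error expressed in parts per million."""
    return drift_seconds / interval_seconds * 1e6


INTERVAL = 5 * 60  # five minutes, in seconds

for drift_ms in (10, 30):
    ppm = drift_ppm(drift_ms / 1000.0, INTERVAL)
    print(f"{drift_ms} ms per 5 min is about {ppm:.0f} ppm")
```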

"divider=10" is bad, causes rapid forward drift.

------

I have

  /usr/local/bin/ntpd -q -l /dev/null

set to run every five minutes in the
'crontab'.  /etc/ntpd.conf points to
a high quality CDMA time server that's
accurate to +/- 5 microseconds.

This works far better than running
'ntpd' continuously IMO.

The host server runs 'ntpd' against
the same CDMA time source and keeps time
to +/- 100 microseconds.

------

I have set the scheduler class and priorities
of the guest VM threads and the host 'ntpd':

# ps -Lce | fgrep -v ' TS '
  PID   LWP CLS PRI TTY     TIME CMD
 1647  1647 FF   70 ?   00:00:01 ntpd
  871   871 FF   60 ?   00:00:00 kvm-irqfd-clean
 4294  4294 FF   60 ?   00:00:04 qemu-kvm
 4294  4310 FF   60 ?   00:03:50 qemu-kvm
 4307  4307 FF   61 ?   00:00:20 kvm-pit/4294
 4308  4308 FF   60 ?   00:00:00 vhost-4294
 4309  4309 FF   60 ?   00:00:00 vhost-4294

versions
========
HOST
CentOS 6.4
kernel.org 3.10.15
qemu-img-0.12.1.2-2.355.0.1.el6_4.9.x86_64
qemu-kvm-0.12.1.2-2.355.0.1.el6_4.9.x86_64

GUEST
CentOS 4.9
CentOS 2.6.9-103.EL

CPU
===
AMD Athlon 4450B 2.3GHz

Comment 3 Cole Robinson 2013-10-11 12:29:31 UTC
Thanks for the info. There was just some discussion about ideal defaults internally, and what the qemu guys recommended for all OSes was:

<clock offset='utc'>
  <timer name='rtc' tickpolicy='catchup'/>
  <timer name='pit' tickpolicy='delay'/>
  <timer name='hpet' present='no'/>
</clock>

So basically what you have without the track='guest' bit. Though we didn't discuss guest kernel options at all.

FWIW -rtc-td-hack is the same thing as -rtc driftfix=slew

Comment 4 starlight 2013-10-11 17:30:08 UTC
I tried it without track='guest' and it does
not seem to work as well.

From what I've read, the idea with
the first config above is to have one
time source catch up (RTC), have another drop
ticks (PIT, since the 2.6.9 kernel expects
to compensate for lost hardware ticks),
and have the virtual TSC run relative
to the guest's time instead of the host's
time.  Some guessing here.

-----

Also I've found that in the guest

   a) have no clock parameters

which is probably equivalent to

   "clock=pmtmr divider=1000"

results in high idle CPU consumption
from interrupts, so I've settled on

   c) have "clock=pmtmr divider=100"

Of course "divider=10" will use even less
CPU but when a tick is missed the clock jumps
100ms and that's too much for me.  This
config runs about 10% of a core at idle.

-----

Also found that 'ntpd' runs pretty well
on the guest, whereas in the past the clock
was so unstable that it would get hopelessly
lost and stay that way.  Config is


# Running in a VM.
tinker panic 0
tinker step 0.500
#disable ntp

# CDMA time servers.
server 10.29.78.3  minpoll  4 maxpoll  4 prefer
server 10.29.78.1  minpoll  4 maxpoll  4

# Resort to physical host clock if CDMA unreachable.
server 10.29.78.23 minpoll  4 maxpoll  4

# Access control.
restrict 0.0.0.0     mask 0.0.0.0         ignore
restrict 127.0.0.1   mask 255.255.255.255
#
restrict 10.29.78.3  mask 255.255.255.255 nopeer nomodify notrap
restrict 10.29.78.1  mask 255.255.255.255 nopeer nomodify notrap
restrict 10.29.78.23 mask 255.255.255.255 nopeer nomodify notrap

# Miscellaneous.
disable auth
driftfile /etc/ntp.drift
statsdir /etc/ntpstats/
#filegen loopstats file loopstats type day nolink enable
#filegen peerstats file peerstats type day nolink enable
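
For reference, ntpd takes minpoll/maxpoll as power-of-two exponents in seconds, so "minpoll 4 maxpoll 4" pins polling at 16 seconds. A tiny Python sketch of that arithmetic follows; the helper names are made up for the example and the parser only handles the simple 'server' lines used above.

```python
def poll_interval_seconds(exponent):
    """ntpd expresses minpoll/maxpoll as a power-of-two exponent in seconds."""
    return 2 ** exponent


def parse_server_line(line):
    """Extract the address plus minpoll/maxpoll from one 'server' line."""
    fields = line.split()
    opts = {"address": fields[1]}
    for key in ("minpoll", "maxpoll"):
        if key in fields:
            opts[key] = int(fields[fields.index(key) + 1])
    return opts


line = "server 10.29.78.3  minpoll  4 maxpoll  4 prefer"
opts = parse_server_line(line)
print(opts["address"], poll_interval_seconds(opts["minpoll"]), "s")
```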

Comment 5 starlight 2013-10-11 17:31:54 UTC
Forgot to say this is for 32-bit RHEL 4.9.

Apparently 64-bit RHEL 4.9 is quite different.

Comment 7 Cole Robinson 2018-09-04 18:30:24 UTC
Since these timer defaults are not specific to the OS, this isn't really libosinfo's area to cover; it belongs in some hypothetical higher-level library like libvirt-designer or virtuned.

