Bug 1641702 - check tsc scaling feature of destination host on migration
Summary: check tsc scaling feature of destination host on migration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: rc
: ---
Assignee: Jiri Denemark
QA Contact: jiyan
URL:
Whiteboard:
Depends On:
Blocks: 1600168 1648273
Reported: 2018-10-22 14:40 UTC by Marcelo Tosatti
Modified: 2023-03-16 09:59 UTC (History)

Fixed In Version: libvirt-4.5.0-21.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1648273 (view as bug list)
Environment:
Last Closed: 2019-08-06 13:14:02 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2019:2294 0 None None None 2019-08-06 13:14:45 UTC

Description Marcelo Tosatti 2018-10-22 14:40:09 UTC
To enable the TSC clocksource in guests
(see BZ 1617998), it is necessary to expose
invariant TSC to guests.

Now, for migration of guests with invariant TSC to work, it
is necessary that the destination host:

1) Has the same TSC frequency as the source host.

or

2) Supports the TSC scaling feature (which can be verified with the
vmxcap script found in the qemu source code).

So, if the invariant TSC bit is exposed to the guest, 
libvirt must check for 1) or 2) above, failing migration
otherwise.
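
The two conditions above reduce to a small predicate. The following is only a sketch of that rule, not libvirt code; the function name and the sample frequencies are made up for illustration.

```python
def destination_can_host_invtsc_guest(src_tsc_hz: int,
                                      dst_tsc_hz: int,
                                      dst_tsc_scaling: bool) -> bool:
    """Return True if condition 1) or condition 2) from the description holds."""
    same_frequency = src_tsc_hz == dst_tsc_hz   # condition 1)
    return same_frequency or dst_tsc_scaling    # condition 2)


# Illustrative values only (Hz); not measured on real hosts.
print(destination_can_host_invtsc_guest(2_000_000_000, 2_000_000_000, False))
print(destination_can_host_invtsc_guest(2_000_000_000, 2_400_000_000, True))
print(destination_can_host_invtsc_guest(2_000_000_000, 2_400_000_000, False))
```

Only the last call describes a destination that should be rejected: the frequencies differ and the destination cannot scale.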

Comment 5 Jiri Denemark 2019-05-31 12:47:46 UTC
Patches sent upstream for review:
https://www.redhat.com/archives/libvir-list/2019-May/msg00912.html

Comment 6 Jiri Denemark 2019-06-03 16:25:16 UTC
The patches are pushed upstream now:

commit dd3fc650de8ef8b05b491c9f362b660e07a857fd
Refs: v5.4.0-33-gdd3fc650de
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Jun 3 13:13:38 2019 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Jun 3 18:07:16 2019 +0200

    qemu: Make virQEMUCapsProbeHostCPUForEmulator more generic

    The function is renamed as virQEMUCapsProbeHostCPU and it does not get
    the list of allowed CPU models from qemuCaps anymore. This
    responsibility is moved to the caller. The result is just a very thin
    wrapper around virCPUGetHost, mostly required for mocking in tests.

    The generic function is used in place of a direct call to virCPUGetHost
    in virQEMUCapsInitHostCPUModel to make sure tests don't accidentally
    probe host CPU.

    Signed-off-by: Jiri Denemark <jdenemar>

commit 02c1d3a6e1d24a777254f4dceeaf54942db7f871
Refs: v5.4.0-34-g02c1d3a6e1
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Jun 3 13:15:19 2019 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Jun 3 18:07:16 2019 +0200

    qemuargv2xmltest: Use mocked virQEMUCapsProbeHostCPU

    The qemuTestParseCapabilitiesArch call would eventually lead to the host
    CPU being probed via virCPUGetHost. Let's divert this to a mocked
    version already used by the qemuxml2argvtest.

    Signed-off-by: Jiri Denemark <jdenemar>

commit f0f6faba63becfab38c928905ac6ed79f9a318b8
Refs: v5.4.0-35-gf0f6faba63
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu May 30 16:34:59 2019 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Jun 3 18:07:16 2019 +0200

    util: Add virHostCPUGetTscInfo

    On a KVM x86_64 host which supports invariant TSC this function can be
    used to detect the TSC frequency and the availability of TSC scaling.

    The magic MSR numbers required to check if VMX scaling is supported on
    the host are documented in Volume 3 of the Intel® 64 and IA-32
    Architectures Software Developer’s Manual.

    Signed-off-by: Jiri Denemark <jdenemar>

commit c277b9ad5c740bb4c4b915754ae74621f93f9d37
Refs: v5.4.0-36-gc277b9ad5c
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu May 30 21:47:49 2019 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Jun 3 18:07:16 2019 +0200

    conf: Report TSC frequency in host CPU capabilities

    This patch adds a new

        <counter name='tsc' frequency='N' scaling='on|off'/>

    element into the host CPU capabilities XML.

    Signed-off-by: Jiri Denemark <jdenemar>

commit 32f577ab10aefda6c4666abd07814c5c39f57788
Refs: v5.4.0-37-g32f577ab10
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue Apr 16 13:24:45 2019 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Jun 3 18:07:16 2019 +0200

    cpu_x86: Fix placement of *CheckFeature functions

    Commit 0a97486e09 moved them outside #ifdef, but after virCPUx86GetHost,
    which will start calling them in the following patch.

    Signed-off-by: Jiri Denemark <jdenemar>

commit ceb04d15e671b4fea1d674ee43c91410da9fe57d
Refs: v5.4.0-38-gceb04d15e6
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu May 30 21:47:38 2019 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Jun 3 18:07:16 2019 +0200

    cpu_x86: Probe TSC frequency and scaling support

    When the host CPU supports invariant TSC the host CPU definition created
    by virCPUx86GetHost will contain (unless probing fails for some reason)
    additional TSC-related data.

    Signed-off-by: Jiri Denemark <jdenemar>

commit 7da62c91f043209e3d40c2dc7655c5e35a4309bf
Refs: v5.4.0-39-g7da62c91f0
Author:     Jiri Denemark <jdenemar>
AuthorDate: Fri May 31 00:03:59 2019 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Jun 3 18:07:16 2019 +0200

    qemu: Check TSC frequency before starting QEMU

    When migrating a domain with invtsc CPU feature enabled, the TSC
    frequency of the destination host must match the frequency used when the
    domain was started on the source host or the destination host has to
    support TSC scaling.

    If the frequencies do not match and the destination host does not
    support TSC scaling, QEMU will fail to set the right TSC frequency when
    starting vCPUs on the destination and thus migration will fail. However,
    this is quite late since both hosts might have spent significant time
    transferring memory and perhaps even storage data.

    By adding the check to libvirt we can let migration fail before any data
    starts to be sent over. If for some reason libvirt is unable to detect
    the host's TSC frequency or scaling support, we'll just let QEMU try and
    the migration will either succeed or fail later.

    Luckily, we mandate TSC frequency to be explicitly set in the domain XML
    to even allow migration of domains with invtsc. We can just check
    whether the requested frequency is compatible with the current host
    before starting QEMU.

    https://bugzilla.redhat.com/show_bug.cgi?id=1641702

    Signed-off-by: Jiri Denemark <jdenemar>
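
The last commit message describes three outcomes: the requested frequency is compatible, it is not, or the host data could not be probed and the decision is left to QEMU. A minimal sketch of that three-way decision follows. It is not the libvirt implementation; the enum, the function name and the use of None for "could not be detected" are assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class TscCheck(Enum):
    OK = "ok"              # frequency matches or the host can scale
    MISMATCH = "mismatch"  # fail early, before any data is transferred
    UNKNOWN = "unknown"    # probing failed; let QEMU try


def check_requested_tsc(requested_hz: int,
                        host_hz: Optional[int],
                        host_scaling: Optional[bool]) -> TscCheck:
    """Compare the frequency requested in the domain XML with the host.

    None means the value could not be detected on this host (assumption).
    """
    if host_scaling:
        return TscCheck.OK
    if host_hz is None:
        return TscCheck.UNKNOWN
    if requested_hz == host_hz:
        return TscCheck.OK
    if host_scaling is None:
        return TscCheck.UNKNOWN
    return TscCheck.MISMATCH


print(check_requested_tsc(2_000_000_000, 2_400_000_000, False))
print(check_requested_tsc(2_000_000_000, None, None))
```

Only MISMATCH would abort the start of QEMU on the destination; UNKNOWN keeps the behaviour from before the check was added.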

Comment 9 jiyan 2019-06-06 06:55:53 UTC
Hi Jiri, it seems that 'virsh hypervisor-cpu-baseline' and 'virsh cpu-baseline' cannot compute the CPU baseline from the output of 'virsh capabilities' because of the TSC frequency.


Version:
libvirt-4.5.0-20.el7.x86_64
kernel-3.10.0-1053.el7.x86_64
qemu-kvm-rhev-2.12.0-31.el7.x86_64


Steps:
1. Obtain the output of 'virsh capabilities'
# virsh capabilities >> new

# virsh capabilities
<capabilities>

  <host>
    <uuid>30333735-3938-4e43-4732-323053424b53</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>Opteron_G3</model>
      <vendor>AMD</vendor>
      <microcode version='16777433'/>
      <counter name='tsc' frequency='2000038000'/>

2. Use 'virsh hypervisor-cpu-baseline'/'virsh cpu-baseline' to get the CPU baseline based on the new file
# virsh hypervisor-cpu-baseline 66
error: unsupported configuration: Invalid TSC frequency

# virsh cpu-baseline 66
error: unsupported configuration: Invalid TSC frequency

Actual result:
"virsh hypervisor-cpu-baseline"/"virsh cpu-baseline" failed because of TSC frequency.

Additional info:
Both commands accept the output of "virsh capabilities" as the input file for computing the CPU baseline.

       hypervisor-cpu-baseline FILE [virttype] [emulator] [arch] [machine] [--features] [--migratable]
           Compute a baseline CPU which will be compatible with all CPUs defined in an XML file and with the CPU the hypervisor is able to provide on the host. (This is different from
           cpu-baseline which does not consider any hypervisor abilities when computing the baseline CPU.)

           The XML FILE may contain either host or guest CPU definitions describing the host CPU model. The host CPU definition is the <cpu> element and its contents as printed by
           capabilities command. The guest CPU definition may be created from the host CPU model found in domain capabilities XML (printed by domcapabilities command). In addition to the
           <cpu> elements, this command accepts full capabilities XMLs, or domain capabilities XMLs containing the CPU definitions. For best results, use only the CPU definitions from domain
           capabilities.


       cpu-baseline FILE [--features] [--migratable]
           Compute baseline CPU which will be supported by all host CPUs given in <file>.  (See hypervisor-cpu-baseline command to get a CPU which can be provided by a specific hypervisor.)
           The list of host CPUs is built by extracting all <cpu> elements from the <file>. Thus, the <file> can contain either a set of <cpu> elements separated by new lines or even a set of
           complete <capabilities> elements printed by capabilities command.  If --features is specified, then the resulting XML description will explicitly include all features that make up
           the CPU, without this option features that are part of the CPU model will not be listed in the XML description.   If --migratable is specified, features that block migration will
           not be included in the resulting CPU.

Comment 10 Jiri Denemark 2019-06-06 07:35:30 UTC
Oops, this is a bug in the code which parses CPU definition from capabilities
XML. I just sent the fix upstream for review:
https://www.redhat.com/archives/libvir-list/2019-June/msg00152.html

Comment 11 Jiri Denemark 2019-06-06 08:36:35 UTC
Fixed upstream by

commit 4d21d4acf2eac961b8c25f1ec49a9c25f3951fdb
Refs: v5.4.0-51-g4d21d4acf2
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu Jun 6 09:29:38 2019 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Thu Jun 6 09:40:40 2019 +0200

    cpu_conf: Fix XPath for parsing TSC frequency

    Due to this bug the following command would fail on any host where TSC
    frequency can be probed:

        $ virsh capabilities | virsh cpu-baseline /dev/stdin
        error: unsupported configuration: Invalid TSC frequency

    https://bugzilla.redhat.com/show_bug.cgi?id=1641702

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Ján Tomko <jtomko>
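
For anyone scripting around this, the TSC counter can be read both from a bare <cpu> element and from a full capabilities document. The sketch below uses Python's standard xml.etree module and is unrelated to libvirt's own parser or to the XPath fix above. The helper names are made up, the mapping of scaling='yes' to True follows the outputs pasted in this bug, and only a single well-formed XML document is handled.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple, Optional


class HostTsc(NamedTuple):
    frequency_hz: int
    scaling: Optional[bool]  # None when the attribute is absent


def parse_host_tsc(xml_text: str) -> Optional[HostTsc]:
    """Extract <counter name='tsc' .../> from host CPU XML, if present."""
    root = ET.fromstring(xml_text)
    # Accept a bare <cpu> element as well as a full <capabilities> document.
    cpu = root if root.tag == "cpu" else root.find("./host/cpu")
    if cpu is None:
        return None
    counter = cpu.find("./counter[@name='tsc']")
    if counter is None or counter.get("frequency") is None:
        return None
    scaling_attr = counter.get("scaling")
    # Assumption: 'yes' means supported, any other value means unsupported.
    scaling = None if scaling_attr is None else scaling_attr == "yes"
    return HostTsc(int(counter.get("frequency")), scaling)


full = ("<capabilities><host><cpu><arch>x86_64</arch>"
        "<counter name='tsc' frequency='2000038000'/>"
        "</cpu></host></capabilities>")
print(parse_host_tsc(full))
```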

Comment 13 jiyan 2019-06-14 03:06:05 UTC
Verified this bug in libvirt-4.5.0-22.el7.x86_64

Version:
libvirt-4.5.0-22.el7.x86_64
kernel-3.10.0-1053.el7.x86_64
qemu-kvm-rhev-2.12.0-32.el7.x86_64

Steps:
Scenario-1: Check the output of "virsh capabilities" and compare/baseline cpu
S1. Check the output of "virsh capabilities" when scaling=yes and compare/baseline cpu
# virsh capabilities
      <counter name='tsc' frequency='2095078000' scaling='yes'/>

# virsh capabilities > cap.xml

# virsh hypervisor-cpu-baseline cap.xml 
# virsh hypervisor-cpu-compare cap.xml 
# virsh cpu-compare cap.xml
# virsh cpu-baseline cap.xml
==> No abnormal errors from the commands above

S2. Check the output of "virsh capabilities" when scaling=no and compare/baseline cpu
# virsh capabilities
      <counter name='tsc' frequency='2397223000' scaling='no'/>

# virsh capabilities >> cap.xml

# virsh hypervisor-cpu-baseline cap.xml 
# virsh hypervisor-cpu-compare cap.xml 
# virsh cpu-compare cap.xml
# virsh cpu-baseline cap.xml
==> No abnormal errors from the commands above


Scenario-2: Configure tsc related XML for VM with different value of frequency
S1. Configure a value lower than the frequency in the output of "virsh capabilities" when scaling=yes
# virsh capabilities |more
      <counter name='tsc' frequency='2095078000' scaling='yes'/>

# virsh domstate vm
shut off

# virsh dumpxml vm --inactive |grep "<clock" -A10
  <clock offset='utc'>
    ...
    <timer name='tsc' frequency='1000000000'/>
  </clock>

# virsh start vm
Domain vm started

# ps -ef |grep vm
qemu      70339      1 99 22:32 ?        00:02:15 /usr/libexec/qemu-kvm -name guest=vm
...
-cpu Skylake-Server-IBRS,ds=off,acpi=off,ss=on,ht=off,tm=off,pbe=off,dtes64=off,ds_cpl=off,vmx=off,smx=off,est=off,tm2=off,xtpr=off,pdcm=off,dca=off,osxsave=off,tsc_adjust=on,clflushopt=on,intel-pt=off,pku=on,ospke=off,md-clear=on,stibp=on,ssbd=on,xsaves=off,invtsc=on,hypervisor=on,tsc-frequency=1000000000

S2. Configure a value higher than the frequency in the output of "virsh capabilities" when scaling=yes
# virsh capabilities |more
      <counter name='tsc' frequency='2095078000' scaling='yes'/>

# virsh domstate vm
shut off

# virsh dumpxml vm --inactive |grep "<clock" -A10
  <clock offset='utc'>
    ...
    <timer name='tsc' frequency='3000000000'/>
  </clock>

# virsh start vm
Domain vm started

# ps -ef |grep vm
qemu      70339      1 99 22:32 ?        00:02:15 /usr/libexec/qemu-kvm -name guest=vm
...
-cpu Skylake-Server-IBRS,ds=off,acpi=off,ss=on,ht=off,tm=off,pbe=off,dtes64=off,ds_cpl=off,vmx=off,smx=off,est=off,tm2=off,xtpr=off,pdcm=off,dca=off,osxsave=off,tsc_adjust=on,clflushopt=on,intel-pt=off,pku=on,ospke=off,md-clear=on,stibp=on,ssbd=on,xsaves=off,invtsc=on,hypervisor=on,tsc-frequency=3000000000

S3. Configure a value equal to the frequency in the output of "virsh capabilities" when scaling=yes
# virsh capabilities |more
      <counter name='tsc' frequency='2095078000' scaling='yes'/>

# virsh domstate q35771
shut off

# virsh dumpxml q35771 --inactive
  <clock offset='utc'>
    ...
    <timer name='tsc' frequency='2095078000'/>
  </clock>

# virsh start q35771
Domain q35771 started

# ps -ef |grep q35771
-cpu Skylake-Server-IBRS,ds=on,acpi=on,ss=on,ht=on,tm=on,pbe=on,dtes64=on,monitor=on,ds_cpl=on,vmx=on,smx=on,est=on,tm2=on,xtpr=on,pdcm=on,dca=on,osxsave=on,tsc_adjust=on,clflushopt=on,intel-pt=on,pku=on,ospke=on,md-clear=on,stibp=on,ssbd=on,xsaves=on, ** invtsc=on,tsc-frequency=2095078000 **

S4. Configure a value lower or higher than the frequency in the output of "virsh capabilities" when scaling=no
# virsh capabilities
      <counter name='tsc' frequency='2397223000' scaling='no'/>

# virsh domstate vmq35_771
shut off

# virsh dumpxml vmq35_771 --inactive |grep "<clock" -A10
  <clock offset='utc'>
    ...
    <timer name='tsc' frequency='1000000000'/>
  </clock>

# virsh start vmq35_771
error: Failed to start domain vmq35_771
error: unsupported configuration: Requested TSC frequency 1000000000 Hz does not match host (2397223000 Hz) and TSC scaling is not supported by the host CPU

# virsh dumpxml vmq35_771 --inactive |grep "<clock" -A10
  <clock offset='utc'>
    ...
    <timer name='tsc' frequency='2397223111'/>
  </clock>

# virsh start vmq35_771
error: Failed to start domain vmq35_771
error: unsupported configuration: Requested TSC frequency 2397223111 Hz does not match host (2397223000 Hz) and TSC scaling is not supported by the host CPU

S5. Configure a value equal to the frequency in the output of "virsh capabilities" when scaling=no
# virsh capabilities
      <counter name='tsc' frequency='2397223000' scaling='no'/>

# virsh domstate vmq35_771
shut off

# virsh dumpxml vmq35_771 --inactive |grep "<clock" -A5
  <clock offset='utc'>
    ...
    <timer name='tsc' frequency='2397223000'/>
  </clock>

# virsh start vmq35_771
Domain vmq35_771 started

# ps -ef |grep vmq35_771
qemu      8781     1 99 22:51 ?        00:00:10 /usr/libexec/qemu-kvm -name guest=vmq35_771
...
-cpu Penryn,vme=on,ss=on,x2apic=on,tsc-deadline=on,xsave=on,hypervisor=on,arat=on,tsc_adjust=on,tsc-frequency=2397223000

Comment 14 jiyan 2019-06-14 03:15:46 UTC
Scenario-3: Migrate a VM from a RHEL-7.7 host to a RHEL-7.7 host with scaling=yes/no (the TSC frequencies of the src and dst hosts differ.)
S1: Migrate a VM from a RHEL-7.7 host to a RHEL-7.7 host with scaling=yes (the TSC frequencies of the src and dst hosts differ.)
1. Start the VM in src host and migrate the vm to dst host
# virsh capabilities |grep counter
      <counter name='tsc' frequency='2095078000' scaling='yes'/>

# virsh domstate vm
shut off

# virsh dumpxml vm --inactive |grep "<clock" -A5
  <clock offset='utc'>
    ...
    <timer name='tsc' frequency='2095078000'/>
  </clock>

# virsh start vm
Domain vm started

# virsh migrate vm qemu+ssh://dsthost/system --live --postcopy --postcopy-after-precopy --p2p --verbose --copy-storage-all
Migration: [100 %]

2. Check the vm status in dst host
# virsh capabilities |grep counter
      <counter name='tsc' frequency='1696014000' scaling='yes'/>

# virsh domstate vm
running

# virsh dumpxml vm |grep "<clock" -A5
  <clock offset='utc'>
    ...
    <timer name='tsc' frequency='2095078000'/>
  </clock>

# ps -ef |grep vm
qemu       3770      1 99 23:03 ?        00:25:21 /usr/libexec/qemu-kvm -name guest=vm ... -cpu Skylake-Server-IBRS,ds=off,acpi=off,ss=on,ht=off,tm=off,pbe=off,dtes64=off,ds_cpl=off,vmx=off,smx=off,est=off,tm2=off,xtpr=off,pdcm=off,dca=off,osxsave=off,tsc_adjust=on,clflushopt=on,intel-pt=off,pku=on,ospke=off,md-clear=on,stibp=on,ssbd=on,xsaves=off,invtsc=on,hypervisor=on,tsc-frequency=2095078000

S2: Migrate a VM from a RHEL-7.7 host to a RHEL-7.7 host with scaling=no (the TSC frequencies of the src and dst hosts differ.)
1. Start the VM in src host and migrate the vm to dst host
(src host info) # virsh capabilities |grep counter
      <counter name='tsc' frequency='2095078000' scaling='yes'/>

(dst host info) # virsh capabilities |grep "<counter"
      <counter name='tsc' frequency='2397223000' scaling='no'/>

# virsh domstate vm
shut off

# virsh dumpxml vm --inactive |grep "<clock" -A5
  <clock offset='utc'>
    ...
    <timer name='tsc' frequency='2095078000'/>
  </clock>

# virsh start vm
Domain vm started

# virsh migrate vm qemu+ssh://hp-dl380g9-02.lab.eng.pek2.redhat.com/system --live --postcopy --postcopy-after-precopy --p2p --verbose --copy-storage-all
error: unsupported configuration: Requested TSC frequency 2095078000 Hz does not match host (2397223000 Hz) and TSC scaling is not supported by the host CPU
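
The outcome of S1 and S2 can be predicted offline from the domain XML and the destination's counter values. The sketch below is a hypothetical helper, not a virsh or libvirt API. It reads the tsc timer frequency from a domain XML string and builds a message in the same wording as the error above; the sample domain XML is trimmed down to the elements shown in this comment.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def requested_tsc_hz(domain_xml: str) -> Optional[int]:
    """Return the frequency of <timer name='tsc'> in the domain XML, if set."""
    timer = ET.fromstring(domain_xml).find("./clock/timer[@name='tsc']")
    if timer is None or timer.get("frequency") is None:
        return None
    return int(timer.get("frequency"))


def precheck(domain_xml: str, dst_hz: int, dst_scaling: bool) -> Optional[str]:
    """Return an error string worded like the one above, or None if compatible."""
    wanted = requested_tsc_hz(domain_xml)
    if wanted is None or wanted == dst_hz or dst_scaling:
        return None
    return (f"unsupported configuration: Requested TSC frequency {wanted} Hz "
            f"does not match host ({dst_hz} Hz) and TSC scaling is not "
            f"supported by the host CPU")


DOMAIN = """
<domain type='kvm'>
  <name>vm</name>
  <clock offset='utc'>
    <timer name='tsc' frequency='2095078000'/>
  </clock>
</domain>
"""

print(precheck(DOMAIN, 1696014000, True))    # S1 destination
print(precheck(DOMAIN, 2397223000, False))   # S2 destination
```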

Comment 15 jiyan 2019-06-14 03:52:10 UTC
In Scenario-3, the vm is configured with cpu feature "invtsc".

Comment 16 jiyan 2019-06-14 04:04:17 UTC
Additional scenario: migrate a VM with the CPU feature "invtsc" (which is supported in RHEL-7.4) and a tsc timer from RHEL-7.6.z to RHEL-7.7 (with scaling=yes and scaling=no).
The result is the same as in https://bugzilla.redhat.com/show_bug.cgi?id=1641702#c14.

All the results are as expected, so moving this bug to VERIFIED.

Comment 18 errata-xmlrpc 2019-08-06 13:14:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2294

