+++ This bug was initially created as a clone of Bug #1821199 +++

Trying to migrate a domain between two identical hosts with slightly different TSC frequency fails with:

    unsupported configuration: Requested TSC frequency 2133408000 Hz does not match host (2133406000 Hz) and TSC scaling is not supported by the host CPU

Apparently it is possible to get the exact frequency on modern CPUs, but otherwise it is just an output of some calibration code, which means the frequency may differ slightly even on identical hosts.

Version-Release number of selected component (if applicable):
libvirt-6.0.0-16.el8

How reproducible:
100%

Steps to Reproduce:
1. Find two identical hosts without TSC scaling support and slightly different TSC frequency. Both can be checked in virsh capabilities:

    virsh -r capabilities | grep "counter name='tsc'"
    <counter name='tsc' frequency='2300026000' scaling='no'/>

2. Start a domain with

    <cpu mode='host-passthrough' check='none'>
      <feature policy='require' name='invtsc'/>
    </cpu>
    <clock offset='utc'>
      <timer name='tsc' frequency='$HOST_TSC_FREQUENCY'/>
    </clock>

3. Try to migrate it to the other host.

It is possible to reproduce this issue even with a single host and without involving migration. Just try to start a domain configured as shown above, but use a TSC frequency which slightly differs from the host (e.g., $HOST_TSC_FREQUENCY - 10000) in the <timer> element. The domain will fail to start with the error from the bug description.
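The capabilities check in step 1 can also be scripted. The following is only a minimal sketch: the helper name `tsc_counter` and the trimmed-down `SAMPLE` document are illustrative assumptions, not real `virsh capabilities` output, and the only thing taken from this report is the `<counter name='tsc' .../>` element itself.

```python
import xml.etree.ElementTree as ET


def tsc_counter(capabilities_xml):
    """Return (frequency_hz, scaling_supported) for the TSC counter, or None.

    Hypothetical helper: it only looks for the <counter name='tsc'> element
    shown in the bug description and ignores everything else.
    """
    root = ET.fromstring(capabilities_xml)
    for counter in root.iter("counter"):
        if counter.get("name") == "tsc":
            freq = counter.get("frequency")
            frequency_hz = int(freq) if freq is not None else None
            return frequency_hz, counter.get("scaling") == "yes"
    return None


# Assumed, heavily trimmed stand-in for the XML printed by `virsh capabilities`.
SAMPLE = """
<capabilities>
  <host>
    <cpu>
      <counter name='tsc' frequency='2300026000' scaling='no'/>
    </cpu>
  </host>
</capabilities>
"""

print(tsc_counter(SAMPLE))
```

Running the same helper against the output of both hosts would show whether they report different frequencies and whether either of them supports TSC scaling.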
--- Additional comment from Milan Zamazal on 2020-04-09 14:37:06 UTC ---

On one host:

    kernel 4.18.0-147.0.3.el8_1 -> 4.18.0-147.8.1.el8_1
    systemd 239-18.el8_1.4 -> 239-29.el8
    qemu 4.2.0-16 -> 4.2.0-17
    libvirt 6.0.0-15 -> 6.0.0-16

On the other host:

    kernel 4.18.0-147.0.3.el8_1 -> 4.18.0-147.8.1.el8_1
    systemd 239-18.el8_1.4 -> 239-28.el8
    qemu 4.1.0-23 -> 4.2.0-17
    libvirt 5.6.0-10 -> 6.0.0-16

After inspecting my environment and old logs I can see the problem in my environment is that on one of the hosts the reported TSC frequency has changed from 2133408000 Hz to 2133406000 Hz (and there is no TSC frequency scaling now or before). I tried to reboot the host and now the reported TSC frequency is 2133407000 Hz. So the reported frequency is nondeterministic and slightly variable, which causes the migration problem.

--- Additional comment from Milan Zamazal on 2020-04-15 17:10:33 UTC ---

Looking into linux/arch/x86/kernel/tsc.c and dmesg on my host, the TSC frequency value comes from calibration and the kernel tries to keep it within certain bounds of accuracy. It's possible to get an exact value from hardware info on modern Intel CPUs but in other cases it's only measurement.

If my observations are correct, we can't expect to have exactly the same TSC frequency values in many cases, even on the same machine across reboots (as my host demonstrates). Now the question is how to deal with that fact in migrations. Do I assume correctly that libvirt would reject the VM on the destination if the TSC frequency specified in the domain XML wasn't the same as on the host? And can slightly different TSC frequencies cause any harm to HP VMs?

One very rude and ugly way to deal with that would be to tolerate slight differences in Engine (assuming it's OK for HP VMs) and replace the TSC frequency in libvirt hook. Another way would be to restrict migrations to hardware providing exact TSC frequency info, which is perhaps too restrictive and quite confusing.
Maybe libvirt could provide some assistance, which might be the best solution. What do you all think?

--- Additional comment from Jiri Denemark on 2020-04-16 08:48:03 UTC ---

Yes, libvirt compares the TSC frequency in domain XML with the frequency probed from the host and refuses to start the domain if they don't exactly match. Unfortunately, changing the TSC frequency during migration is forbidden too, libvirt explicitly checks the frequencies match in both the original and updated domain definition (either supplied by a parameter to the migration API or via a pre-migration hook).

We need to check whether the strict match is really necessary by trying to create a domain with a TSC frequency which slightly differs from the host's frequency which does not support scaling.

--- Additional comment from Jiri Denemark on 2020-04-29 15:06:22 UTC ---

Marcelo, how do you think we should handle migration between two identical hosts which do not support TSC scaling, but the TSC frequency probed by the kernel differs a bit. Currently libvirt refuses to migrate a domain between these two hosts because of TSC frequency mismatch.

--- Additional comment from Marcelo Tosatti on 2020-05-19 20:13:41 UTC ---

(In reply to Milan Zamazal from comment #12)
> After inspecting my environment and old logs I can see the problem in my
> environment is that on one of the hosts the reported TSC frequency has
> changed from 2133408000 Hz to 2133406000 Hz (and there is no TSC frequency
> scaling now or before). I tried to reboot the host and now the reported TSC
> frequency is 2133407000 Hz. So the reported frequency is nondeterministic
> and slightly variable, which causes the migration problem.

Jiri,

KVM_SET_TSC_KHZ supports an error of 250 ppm (see tsc_tolerance_ppm and adjust_tsc_khz in arch/x86/kvm/x86.c in the kernel source).

Can that code be added to libvirt as well?

--- Additional comment from Jiri Denemark on 2020-05-21 20:18:51 UTC ---

The domain XML shows invtsc is exposed to the guest.
Thanks Marcelo for the reference to the kernel code:

    /* tsc tolerance in parts per million - default to 1/2 of the NTP threshold */
    static u32 __read_mostly tsc_tolerance_ppm = 250;
    module_param(tsc_tolerance_ppm, uint, S_IRUGO | S_IWUSR);

    static u32 adjust_tsc_khz(u32 khz, s32 ppm)
    {
        u64 v = (u64)khz * (1000000 + ppm);
        do_div(v, 1000000);
        return v;
    }

    static int kvm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz)
    {
        ...
        thresh_lo = adjust_tsc_khz(tsc_khz, -tsc_tolerance_ppm);
        thresh_hi = adjust_tsc_khz(tsc_khz, tsc_tolerance_ppm);
        if (user_tsc_khz < thresh_lo || user_tsc_khz > thresh_hi) {
            pr_debug("kvm: requested TSC rate %u falls outside tolerance [%u,%u]\n",
                     user_tsc_khz, thresh_lo, thresh_hi);
            use_scaling = 1;
        }
        ...
    }

So for the host TSC frequency 2133408000 Hz a domain can request anything within +/- 533 kHz of the host frequency. The only problem is that tsc_tolerance_ppm is a parameter of the kvm module. We have two options, either use the default value in libvirt and hope nobody changed it on the host or we can try setting the frequency via KVM and check the result. But looking at the kernel code I don't see how we could detect that setting TSC failed because of unsupported TSC scaling rather than some other reason.

Anyway it seems the kernel is less strict and allows setting TSC frequency which is greater than host TSC frequency even without TSC scaling:

    static int set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz, bool scale)
    {
        ...
        /* TSC scaling supported? */
        if (!kvm_has_tsc_control) {
            if (user_tsc_khz > tsc_khz) {
                vcpu->arch.tsc_catchup = 1;
                vcpu->arch.tsc_always_catchup = 1;
                return 0;
            } else {
                pr_warn_ratelimited("user requested TSC rate below hardware speed\n");
                return -1;
            }
        }
        ...
    }

I checked the code in QEMU and it calls KVM_SET_TSC_KHZ first and checks TSC frequencies only if the call fails. In other words, libvirt is the only part which needs fixing here because it is too strict.
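For reference, the +/- 533 kHz figure follows directly from the adjust_tsc_khz() arithmetic quoted above. The snippet below is just a Python transcription of that integer math under the default tsc_tolerance_ppm of 250; the name `tolerance_window_khz` is made up for illustration and is not part of the kernel, QEMU or libvirt.

```python
TSC_TOLERANCE_PPM = 250  # default value of the kvm module parameter quoted above


def adjust_tsc_khz(khz, ppm):
    """Transcription of the quoted kernel helper: integer math, truncating division."""
    return khz * (1_000_000 + ppm) // 1_000_000


def tolerance_window_khz(host_khz, ppm=TSC_TOLERANCE_PPM):
    """Hypothetical convenience wrapper returning (thresh_lo, thresh_hi) in kHz."""
    return adjust_tsc_khz(host_khz, -ppm), adjust_tsc_khz(host_khz, ppm)


# Host frequency from this report, converted from Hz to kHz.
host_khz = 2133408000 // 1000
thresh_lo, thresh_hi = tolerance_window_khz(host_khz)
print(thresh_lo, thresh_hi, thresh_hi - host_khz)
```

Because the kernel works in kHz and truncates, the window is not perfectly symmetric around the host value, and it shifts if somebody changes tsc_tolerance_ppm on the host.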
Copying my comment from the original bug (https://bugzilla.redhat.com/show_bug.cgi?id=1821199#c28):

Marcelo, it seems the behavior does not match how I would understand the code in QEMU and the kernel (in comment 26). On a host without TSC scaling QEMU fails to set the frequency unless it is exactly the same as reported by the kernel:

From virsh capabilities:

    <counter name='tsc' frequency='2903993000' scaling='no'/>

TSC requested 1 kHz below the host:

    $ /usr/bin/qemu-system-x86_64 -machine pc,accel=kvm -cpu host,invtsc=on,tsc-frequency=2903992000
    qemu-system-x86_64: warning: TSC frequency mismatch between VM (2903992 kHz) and host (2903993 kHz), and TSC scaling unavailable
    qemu-system-x86_64: kvm_init_vcpu failed: Operation not supported

TSC requested 1 kHz above the host:

    $ /usr/bin/qemu-system-x86_64 -machine pc,accel=kvm -cpu host,invtsc=on,tsc-frequency=2903994000
    qemu-system-x86_64: warning: TSC frequency mismatch between VM (2903994 kHz) and host (2903993 kHz), and TSC scaling unavailable
    qemu-system-x86_64: kvm_init_vcpu failed: Operation not supported

Exact TSC:

    $ /usr/bin/qemu-system-x86_64 -machine pc,accel=kvm -cpu host,invtsc=on,tsc-frequency=2903993000
    # QEMU runs happily here

The TSC tolerance is the default value (which corresponds to +/- 726 kHz interval around the host frequency on this particular host):

    # cat /sys/module/kvm/parameters/tsc_tolerance_ppm
    250

I tried this on several hosts without TSC scaling and the behavior is the same everywhere.

Did I misunderstand anything? Or am I just doing it all wrong?
Patch sent upstream for review: https://www.redhat.com/archives/libvir-list/2020-November/msg00519.html
This is fixed upstream by

commit d8e5b4560006590668d4669f54a46b08ec14c1a2
Refs: v6.9.0-204-gd8e5b45600
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon May 25 11:35:12 2020 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Thu Nov 12 17:29:16 2020 +0100

    qemu: Do not require TSC frequency to strictly match host

    Some CPUs provide a way to read exact TSC frequency, while measuring it
    is required on other CPUs. However, measuring is never exact and the
    result may slightly differ across reboots. For this reason both Linux
    kernel and QEMU recently started allowing for guests TSC frequency to
    fall into +/- 250 ppm tolerance interval around the host TSC frequency.

    Let's do the same to avoid unnecessary failures (esp. during migration)
    in case the host frequency does not exactly match the frequency
    configured in a domain XML.

    https://bugzilla.redhat.com/show_bug.cgi?id=1839095

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Daniel Henrique Barboza <danielhb413>
Testing this feature with:
libvirt-client-6.10.0-1.module+el8.4.0+8898+a84e86e1.x86_64
qemu-kvm-5.2.0-2.module+el8.4.0+9186+ec44380f.x86_64

1. check host capabilities

    # virsh capabilities |grep tsc
    <counter name='tsc' frequency='2399996000' scaling='no'/>
    <feature name='tsc_adjust'/>
    <feature name='invtsc'/>

2. prepare a guest with the following xml definition

    ...
    <cpu mode='host-model' check='partial'>
      <feature policy='require' name='invtsc'/>
    </cpu>
    ...
    <clock offset='utc'>
      <timer name='tsc' frequency='2399396001'/>
    </clock>

3. start the guest

    # virsh start rhel8.4
    error: Failed to start domain rhel8.4
    error: unsupported configuration: Requested TSC frequency 2399396001 Hz is outside tolerance range ([2399396001, 2400595999] Hz) around host frequency 2399996000 Hz and TSC scaling is not supported by the host CPU

4. prepare a guest with the tsc frequency 2400595999 Hz

    # virsh start rhel8.4
    error: Failed to start domain rhel8.4
    error: unsupported configuration: Requested TSC frequency 2400595999 Hz is outside tolerance range ([2399396001, 2400595999] Hz) around host frequency 2399996000 Hz and TSC scaling is not supported by the host CPU

Hi, Jiri

I am confused about the frequency boundaries. AFAIK, square brackets mean the end point is included. Please help to check.
Also tested the frequencies between 2399396002 and 2399396999

    # virsh dumpxml rhel8.4 |grep tsc
    <feature policy='require' name='invtsc'/>
    <timer name='tsc' frequency='2399396999'/>

    # virsh start rhel8.4
    error: Failed to start domain rhel8.4
    error: internal error: qemu unexpectedly closed the monitor: 2020-12-21T12:04:25.865140Z qemu-kvm: warning: TSC frequency mismatch between VM (2399396 kHz) and host (2399996 kHz), and TSC scaling unavailable
    2020-12-21T12:04:25.865236Z qemu-kvm: kvm_init_vcpu: kvm_arch_init_vcpu failed (0): Operation not supported

Did not hit the issue for the case near the upper boundary.
(In reply to Lili Zhu from comment #10)
> I am confused about the frequency boundaries. AFAIK, square brackets mean
> the end point is included.

Oops, yes, the kernel code fails if freq < min or freq > max while I didn't include the boundaries when converting the code to succeed when within bounds. I'll fix this.
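To make the boundary problem concrete, here is a small sketch of the two ways of inverting the kernel condition. It is not the libvirt code: the function names are hypothetical, and computing the range in Hz as host * 250 / 1000000 is an assumption chosen only because it reproduces the [2399396001, 2400595999] interval printed in the error message from comment #10.

```python
TOLERANCE_PPM = 250  # same default tolerance as the kvm module parameter


def tolerance_range_hz(host_hz, ppm=TOLERANCE_PPM):
    """Assumed Hz-based range; it matches the interval shown in the error message."""
    tolerance = host_hz * ppm // 1_000_000
    return host_hz - tolerance, host_hz + tolerance


def within_open(freq_hz, host_hz):
    """Inverted check that forgets the end points (the behaviour reported above)."""
    lo, hi = tolerance_range_hz(host_hz)
    return lo < freq_hz < hi


def within_closed(freq_hz, host_hz):
    """Exact negation of 'freq < min or freq > max': end points are accepted."""
    lo, hi = tolerance_range_hz(host_hz)
    return lo <= freq_hz <= hi


host_hz = 2399996000
for freq_hz in tolerance_range_hz(host_hz):
    print(freq_hz, within_open(freq_hz, host_hz), within_closed(freq_hz, host_hz))
```

With the open comparison both end points are rejected even though the message prints them inside square brackets, while the closed comparison accepts them.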
(In reply to Lili Zhu from comment #11)
> Also tested the frequencies between 2399396002 and 2399396999
> # virsh dumpxml rhel8.4 |grep tsc
> <feature policy='require' name='invtsc'/>
> <timer name='tsc' frequency='2399396999'/>
>
> # virsh start rhel8.4
> error: Failed to start domain rhel8.4
> error: internal error: qemu unexpectedly closed the monitor:
> 2020-12-21T12:04:25.865140Z qemu-kvm: warning: TSC frequency mismatch
> between VM (2399396 kHz) and host (2399996 kHz), and TSC scaling unavailable
> 2020-12-21T12:04:25.865236Z qemu-kvm: kvm_init_vcpu: kvm_arch_init_vcpu
> failed (0): Operation not supported
>
> Not hit the issue for the case near the upper boundary.

Unfortunately, asking for a TSC frequency within the tolerance range around the host frequency may not be enough to make it work. And there's no way to ask the kernel (other than trying to set the frequency) or even get a sensible error from the kernel about why it does not work. So QEMU just asks the kernel to set the guest frequency and reports the generic error when this fails.

The important thing here is that it's not libvirt which is refusing to start the domain because of too strict requirements. Thus libvirt checks the interval to report a reasonable error when the TSC frequency is significantly off and leaves the rest for QEMU/KVM to deal with.
(In reply to Lili Zhu from comment #10)
> error: unsupported configuration: Requested TSC frequency 2399396001 Hz is
> outside tolerance range ([2399396001, 2400595999] Hz) around host frequency
> 2399996000 Hz and TSC scaling is not supported by the host CPU
>
> error: unsupported configuration: Requested TSC frequency 2400595999 Hz is
> outside tolerance range ([2399396001, 2400595999] Hz) around host frequency
> 2399996000 Hz and TSC scaling is not supported by the host CPU

This is now fixed upstream by

commit f7c40b5c716fea5d2a4179569146307ebebc76ba
Refs: v6.10.0-309-gf7c40b5c71
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue Jan 5 23:53:25 2021 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Wed Jan 6 11:24:37 2021 +0100

    qemu: The TSC tolerance interval should be closed

    The kernel refuses to set guest TSC frequency less than a minimum
    frequency or greater than maximum frequency (both computed based on the
    host TSC frequency). When writing the libvirt code with a reversed logic
    (return success when the requested frequency falls within the tolerance
    interval) I forgot to include the boundaries.

    Fixes: d8e5b4560006590668d4669f54a46b08ec14c1a2
    https://bugzilla.redhat.com/show_bug.cgi?id=1839095

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Peter Krempa <pkrempa>
1. Prepare 2 hosts that have the CPU feature "invtsc" and do not support TSC scaling. The TSC frequencies of the 2 hosts are slightly different.

host A:

    # virsh capabilities |grep counter
    <counter name='tsc' frequency='2399996000' scaling='no'/>

host B:

    # virsh capabilities |grep counter
    <counter name='tsc' frequency='2399997000' scaling='no'/>

2. Prepare a guest with the following xml definition on host A

    # virsh dumpxml rhel8.4
    ...
    <cpu mode='host-model' check='partial'>
      <feature policy='require' name='invtsc'/>
    </cpu>
    <clock offset='utc'>
      <timer name='tsc' frequency='2399996000'/>
    </clock>
    ...

3. Start the guest, then migrate the guest to host B

    # virsh migrate rhel8.4 qemu+ssh://dell-per730-36.lab.eng.pek2.redhat.com/system --live --p2p --undefinesource --persistent --verbose
    Migration: [100 %]

4. Check the guest state

    # virsh list --all
     Id   Name      State
    -------------------------
     2    rhel8.4   running

5. Check the guest xml

    # virsh dumpxml rhel8.4
    ....
    <cpu mode='custom' match='exact' check='full'>
      <model fallback='forbid'>Haswell-noTSX-IBRS</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='vme'/>
      <feature policy='require' name='ss'/>
      <feature policy='require' name='vmx'/>
      <feature policy='require' name='pdcm'/>
      <feature policy='require' name='f16c'/>
      <feature policy='require' name='rdrand'/>
      <feature policy='require' name='hypervisor'/>
      <feature policy='require' name='arat'/>
      <feature policy='require' name='tsc_adjust'/>
      <feature policy='require' name='umip'/>
      <feature policy='require' name='md-clear'/>
      <feature policy='require' name='stibp'/>
      <feature policy='require' name='arch-capabilities'/>
      <feature policy='require' name='ssbd'/>
      <feature policy='require' name='xsaveopt'/>
      <feature policy='require' name='pdpe1gb'/>
      <feature policy='require' name='abm'/>
      <feature policy='require' name='ibpb'/>
      <feature policy='require' name='amd-stibp'/>
      <feature policy='require' name='amd-ssbd'/>
      <feature policy='require' name='skip-l1dfl-vmentry'/>
      <feature policy='require' name='pschange-mc-no'/>
      <feature policy='require' name='invtsc'/>
    </cpu>
    <clock offset='utc'>
      <timer name='rtc' tickpolicy='catchup'/>
      <timer name='pit' tickpolicy='delay'/>
      <timer name='hpet' present='no'/>
      <timer name='tsc' frequency='2399996000'/>
    </clock>
    ...

Migration succeeded.
As guest migration between two identical hosts without TSC scaling support and with slightly different TSC frequencies succeeds now, marking the bug as verified. Will track the issue in Comment 10 in another bug.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:2098