Bug 2189960

Summary: NodeSelector for tsc frequency does not tolerate small TSC variations
Product: Container Native Virtualization (CNV)
Component: Virtualization
Version: 4.13.0
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: urgent
Target Milestone: ---
Target Release: 4.13.1
Fixed In Version: v4.13.1.rhel9-79
Reporter: Jed Lejosne <jlejosne>
Assignee: Jed Lejosne <jlejosne>
QA Contact: Denys Shchedrivyi <dshchedr>
CC: acardace, dshchedr, fdeutsch, gveitmic, kbidarka, pelauter, sgott
Clone Of: 2184860
Bug Depends On: 2184860, 2189961
Last Closed: 2023-06-20 13:41:05 UTC

Description Jed Lejosne 2023-04-26 15:10:29 UTC
+++ This bug was initially created as a clone of Bug #2184860 +++

Description of problem:

The node labeller marks the nodes with: 
* their exact TSC frequency
* the lowest TSC frequency in the cluster IF they support tsc-scalable

Hypothetical example, on a cluster where the lowest frequency is X

NAME   TSC-FREQUENCY  TSC-SCALABLE  TSC-FREQUENCY-X  TSC-FREQUENCY-Y
node1  X              true          true              
node2  Y              true          true             true
node3  Y              false                          true
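
The labelling rule described above can be sketched roughly as follows. This is only an illustration of the behaviour as reported in this bug, not the actual node labeller code; the function name, the input shape, and the Hz values chosen for X and Y are hypothetical.

```python
# Illustrative sketch only: names and data layout are hypothetical,
# not taken from the node labeller implementation.

def tsc_frequency_labels(nodes):
    """Return, per node, the set of TSC frequencies it is labelled with.

    nodes maps a node name to a (tsc_frequency_hz, tsc_scalable) tuple.
    """
    # Lowest TSC frequency found anywhere in the cluster.
    lowest = min(freq for freq, _scalable in nodes.values())

    labels = {}
    for name, (freq, scalable) in nodes.items():
        # Every node is labelled with its own exact frequency.
        freqs = {freq}
        # Only tsc-scalable nodes also get the cluster-wide lowest frequency.
        if scalable:
            freqs.add(lowest)
        labels[name] = freqs
    return labels


# Hypothetical values for X and Y, in Hz.
X = 2_297_338_000
Y = 2_297_339_000

cluster = {
    "node1": (X, True),
    "node2": (Y, True),
    "node3": (Y, False),
}
labels = tsc_frequency_labels(cluster)
```

With exact matching, node3 ends up with only the Y label, so a workload that requires X cannot be placed there.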

However, on many CPU models the calibrated TSC frequency is not an exact number, which leaves room for small variations.

For example, the same CPU on 2 different systems (or even the same system across 2 reboots) can report slightly different frequencies. In the logs below, the refined calibrations differ by 0.001 MHz (1 kHz):

[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 2297.345 MHz processor
[    4.127014] tsc: Refined TSC clocksource calibration: 2297.339 MHz

[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 2297.449 MHz processor
[    4.063010] tsc: Refined TSC clocksource calibration: 2297.338 MHz

If X = 2297.338 MHz
   Y = 2297.339 MHz
Then a Windows VM with re-enlightenment will never run on node3, because node3 is missing the TSC-FREQUENCY-X label, even though its frequency is only 0.001 MHz off.
The logic will consider this a heterogeneous cluster, but it's not.

The system should be able to schedule VMs on any of those 3 nodes, whether they are TSC-SCALABLE or not, because these are essentially the same frequency.
Lower layers already accept this variance; see BZ1839095.
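
One possible way to express the expected behaviour is a tolerance-based comparison instead of an exact match. The sketch below is an assumption for illustration: the 250 ppm tolerance and both function names are hypothetical and are not taken from this report or from the shipped fix.

```python
# Illustrative sketch only. TOLERANCE_PPM is an assumed value chosen for
# this example; the report does not state what tolerance should be used.
TOLERANCE_PPM = 250


def tolerance_hz(freq_hz):
    """Allowed deviation, in Hz, around freq_hz (integer arithmetic)."""
    return freq_hz * TOLERANCE_PPM // 1_000_000


def frequencies_match(required_hz, node_hz):
    """True if node_hz is within the tolerance of required_hz."""
    return abs(required_hz - node_hz) <= tolerance_hz(required_hz)


# Refined calibration values from the logs in this report, in Hz.
X = 2_297_338_000
Y = 2_297_339_000

same_cpu = frequencies_match(X, Y)                  # 1 kHz apart
different_cpu = frequencies_match(X, 2_400_000_000)  # roughly 100 MHz apart
```

Under this assumption, a VM that requires frequency X would be accepted on a node reporting Y, while a node with a genuinely different frequency would still be rejected.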

Version-Release number of selected component (if applicable):
4.12.10

How reproducible:
Always

Steps to Reproduce:
1. Use systems with TSC-SCALABLE = false and the same CPUs; reboot until they report different frequencies.

Actual results:
* VMs fail to schedule on nodes with same CPU

Expected results:
* VMs scheduled

--- Additional comment from Germano Veit Michel on 2023-04-06 01:57:56 UTC ---



--- Additional comment from Fabian Deutsch on 2023-04-11 12:51:27 UTC ---

Adjusting priority because it relates to a customer case.
Urgent, because it will impact GS.

--- Additional comment from  on 2023-04-12 12:03:59 UTC ---

Comment 1 Kedar Bidarkar 2023-05-29 12:25:13 UTC
*** Bug 2186213 has been marked as a duplicate of this bug. ***

Comment 2 Denys Shchedrivyi 2023-06-01 21:33:19 UTC
Verified on CNV-v4.13.1.rhel9-79

Comment 8 errata-xmlrpc 2023-06-20 13:41:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Virtualization 4.13.1 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:3686