The Linux kernel was updated (https://lkml.org/lkml/2020/3/20/1030) to include steal{time,clock} accounting. This would greatly assist in troubleshooting vSphere performance issues caused by over-provisioned ESXi hosts.
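For anyone who wants to look at the steal counter directly from inside a guest, here is a minimal sketch. It assumes the standard /proc/stat layout, where the aggregate "cpu" line lists user, nice, system, idle, iowait, irq, softirq and steal, followed by guest and guest_nice. The function name steal_percent and the two sample lines are hypothetical and only for illustration; they are not taken from a real node.

```python
def steal_percent(before: str, after: str) -> float:
    """Return steal time as a percentage of total CPU time elapsed between
    two samples of the aggregate 'cpu' line from /proc/stat."""
    def counters(line: str) -> list:
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        return [int(value) for value in parts[1:]]

    deltas = [b - a for a, b in zip(counters(before), counters(after))]
    if len(deltas) < 8:
        raise ValueError("no steal field present in this /proc/stat line")
    # Only the first eight fields are summed: guest and guest_nice are
    # already counted inside user and nice.
    total = sum(deltas[:8])
    return 100.0 * deltas[7] / total if total else 0.0


# Hypothetical samples (made-up numbers, not from a real node).
sample_before = "cpu 100 0 50 800 10 5 5 30 0 0"
sample_after = "cpu 200 0 100 1600 20 10 10 60 0 0"
print(steal_percent(sample_before, sample_after))
```

On a real node the two lines would come from reading the first line of /proc/stat twice, a few seconds apart.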
This bug has been verified pre-merge and passed.

Steps:
1. Install a cluster on vSphere with the above 2 PRs.
2. Once the control nodes and worker nodes are ready, check that steal time is correctly enabled.

In the vSphere web console, check the machine's 'Edit Settings' -> 'VM Options' -> 'Advanced' -> 'Configuration Parameters':

stealclock.enable TRUE

On the control/worker nodes, check the st value by running the top command.

$ oc debug node/sgao-v-b7hkt-master-0
...
sh-4.4# top
top - 22:30:42 up 1:19, 0 users, load average: 1.38, 0.91, 0.87
Tasks: 364 total, 1 running, 363 sleeping, 0 stopped, 0 zombie
%Cpu(s): 7.1 us, 7.1 sy, 0.0 ni, 81.4 id, 1.4 wa, 1.4 hi, 0.0 si, 1.4 st
MiB Mem : 16019.3 total, 1903.0 free, 6177.2 used, 7939.1 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 9414.4 avail Mem

$ oc debug node/sgao-v-b7hkt-worker-fhrg8
...
sh-4.4# top
top - 22:28:44 up 1:07, 0 users, load average: 0.70, 0.99, 1.00
Tasks: 336 total, 1 running, 335 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.2 us, 1.4 sy, 0.0 ni, 94.4 id, 0.0 wa, 0.2 hi, 0.3 si, 0.5 st
MiB Mem : 15869.0 total, 4076.7 free, 4080.9 used, 7711.4 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 11361.9 avail Mem
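If the same check needs to be repeated across many nodes, the st column can be pulled out of the top summary line with a small script. This is only a sketch: the helper name parse_top_cpu_line is hypothetical, and it assumes the "%Cpu(s):" line uses the same format as the output in this comment (a number followed by a two-letter label, with "." as the decimal separator).

```python
import re
from typing import Dict


def parse_top_cpu_line(line: str) -> Dict[str, float]:
    """Parse a top '%Cpu(s):' summary line into a {label: percent} mapping."""
    if not line.lstrip().startswith("%Cpu"):
        raise ValueError("not a %Cpu(s) summary line")
    _, _, fields = line.partition(":")
    # Each field looks like '7.1 us': a number followed by a two-letter label.
    pairs = re.findall(r"([\d.]+)\s+([a-z]{2})", fields)
    return {label: float(value) for value, label in pairs}


# Summary lines copied from the top output in this comment.
master = "%Cpu(s): 7.1 us, 7.1 sy, 0.0 ni, 81.4 id, 1.4 wa, 1.4 hi, 0.0 si, 1.4 st"
worker = "%Cpu(s): 3.2 us, 1.4 sy, 0.0 ni, 94.4 id, 0.0 wa, 0.2 hi, 0.3 si, 0.5 st"

for name, line in (("master", master), ("worker", worker)):
    print(name, "st =", parse_top_cpu_line(line)["st"])
```

Note that an st field appears in top output whether or not steal accounting is active, so its presence alone does not confirm that stealclock.enable took effect.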
Since the PR is already merged, setting the BZ status to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7399