Bug 1574657

Summary: CPU on VM never reaches the limit defined in QoS
Product: [oVirt] ovirt-engine Reporter: Polina <pagranat>
Component: BLL.Virt Assignee: Michal Skrivanek <michal.skrivanek>
Status: CLOSED NOTABUG QA Contact: Polina <pagranat>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.2.2 CC: bugs, msivak
Target Milestone: --- Keywords: Automation
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-05-29 10:51:41 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: SLA RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                                  Flags
logs                                         none
dumpxml of vm with cpu profile limitation    none
VM stats from VDSM                           none

Description Polina 2018-05-03 19:02:13 UTC
Created attachment 1430865 [details]
logs

Description of problem: When the CPU is loaded to the maximum on a VM with an attached CPU profile (limited by QoS), CPU usage never reaches the expected limit. Without the CPU profile limitation, it works correctly.

Version-Release number of selected component (if applicable): rhv-release-4.2.3-4-001.noarch

How reproducible: Happens on some hosts.
Example host topology (from the compute-ge-he-2.qa.lab.tlv.redhat.com environment, host alma05.qa.lab.tlv.redhat.com):

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             2
NUMA node(s):          4
Vendor ID:             AuthenticAMD
CPU family:            21
Model:                 2
Model name:            AMD Opteron(tm) Processor 6308
Stepping:              0
CPU MHz:               3500.000
CPU max MHz:           3500.0000
CPU min MHz:           1400.0000
BogoMIPS:              6982.16
Virtualization:        AMD-V
L1d cache:             16K
L1i cache:             64K
L2 cache:              2048K
L3 cache:              6144K
NUMA node0 CPU(s):     0,2
NUMA node1 CPU(s):     4,6
NUMA node2 CPU(s):     1,3
NUMA node3 CPU(s):     5,7
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate retpoline_amd vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold


Steps to Reproduce:
1. Create a QoS entry with CPU limit = 40% (Data center/QoS/CPU tab)
2. Create a CPU profile with this QoS (Cluster/CPU profile)
3. Configure a VM with total vCPUs = CPUs on host = 8 (VM/System tab). Core per virtual socket = 1, Thread per socket = 1.
   Add the created CPU profile to the VM
   Run the VM
4. Load the CPU on the VM (either with 'dd if=/dev/zero of=/dev/null' or with an infinite while loop)

Actual results: Never reaches the limit (not even close)

Expected results: CPU on the VM must reach 40%.

Additional info: logs attached

Comment 1 Polina 2018-05-15 20:26:00 UTC
Hi Martin, here is some additional description with top screens of Host and VM for two situations - without cpu profile (to see that maximum load occurs as expected) and with cpu profile where we don't reach the expected load of 40%.
 
The script for load: fulload() { dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null& }; fulload;

Load on the VM without CPU profile limitation: CPU reaches ~100% as expected (both host and VM):
HOST:
top - 23:05:28 up  6:28,  1 user,  load average: 8.21, 4.73, 2.78
Tasks: 223 total,   2 running, 221 sleeping,   0 stopped,   0 zombie
%Cpu(s): 98.4 us,  1.3 sy,  0.0 ni,  0.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 49243268 total, 42561428 free,  5585276 used,  1096564 buff/cache
KiB Swap: 24708092 total, 24708092 free,        0 used. 43099532 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                        
12072 qemu      20   0 1939792 537652  14608 S 740.7  1.1  30:33.46 qemu-kvm                                                                                                                                       

VM:
 Tasks: 147 total,   9 running, 138 sleeping,   0 stopped,   0 zombie
%Cpu(s): 36.3 us, 62.1 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  1.7 st
KiB Mem :   941884 total,   645276 free,   123004 used,   173604 buff/cache
KiB Swap:  1048572 total,  1048572 free,        0 used.   634680 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                        
 1351 root      20   0  107948    608    512 R 100.0  0.1   2:10.95 dd                                                                                                                                             
 1353 root      20   0  107948    608    512 R 100.0  0.1   2:10.85 dd                                                                                                                                             
 1356 root      20   0  107948    608    512 R  99.7  0.1   2:10.97 dd                                                                                                                                             
 1354 root      20   0  107948    548    456 R  99.3  0.1   2:12.44 dd                                                                                                                                             
 1357 root      20   0  107948    612    512 R  98.3  0.1   2:11.12 dd                                                                                                                                             
 1350 root      20   0  107948    572    480 R  97.7  0.1   2:10.91 dd                                                                                                                                             
 1352 root      20   0  107948    612    512 R  96.3  0.1   2:12.17 dd                                                                                                                                             
 1355 root      20   0  107948    596    504 R  95.3  0.1   2:11.70 dd                                                                                                                                             
   
With the CPU profile limited to 40%:
HOST:
top - 22:50:09 up  6:13,  1 user,  load average: 1.05, 0.77, 2.10
Tasks: 222 total,   2 running, 220 sleeping,   0 stopped,   0 zombie
%Cpu(s): 29.4 us,  1.4 sy,  0.0 ni, 69.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 49243268 total, 42554860 free,  5605572 used,  1082836 buff/cache
KiB Swap: 24708092 total, 24708092 free,        0 used. 43079324 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                        
 7534 qemu      20   0 1937752 552436  14612 S 193.6  1.1   5:45.25 qemu-kvm                                                                                                                                       
13038 vdsm      20   0  464556  37120  10476 S   2.7  0.1   2:13.34 ovirt-ha-agent                                                                                                                                 
11259 vdsm       0 -20 4961016  98920  13656 S   2.1  0.2   5:13.54 vdsmd                                                                                                                                          
  787 vdsm      20   0 1127084  36756  10036 S   0.6  0.1   2:09.84 ovirt-ha-broker                                                                                                                                
  
VM:
top - 22:48:50 up 5 min,  1 user,  load average: 7.37, 2.73, 1.00
Tasks: 147 total,  10 running, 137 sleeping,   0 stopped,   0 zombie
%Cpu(s):  5.4 us,  9.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si, 84.8 st
KiB Mem :   941884 total,   643032 free,   126248 used,   172604 buff/cache
KiB Swap:  1048572 total,  1048572 free,        0 used.   632180 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                        
 1309 root      20   0  107948    612    512 R  15.9  0.1   0:24.42 dd                                                                                                                                             
 1310 root      20   0  107948    604    508 R  15.9  0.1   0:24.28 dd                                                                                                                                             
 1315 root      20   0  107948    608    512 R  15.9  0.1   0:24.99 dd                                                                                                                                             
 1311 root      20   0  107948    612    512 R  15.6  0.1   0:24.63 dd     
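
For comparing the two host screens above, it may help to convert the qemu-kvm %CPU column into a share of the whole host. The following is only a rough sketch: it assumes top is in its default mode, where a process's %CPU is relative to a single CPU, and it takes the CPU count (8) from the lscpu output in the description. The function name is made up for illustration.

```python
def host_cpu_share(process_cpu_percent, host_cpus):
    """Convert a per-process %CPU value (relative to one CPU)
    into a percentage of the whole host's CPU capacity."""
    if host_cpus <= 0:
        raise ValueError("host_cpus must be positive")
    return process_cpu_percent / host_cpus


HOST_CPUS = 8  # "CPU(s): 8" in the lscpu output

# qemu-kvm %CPU values copied from the two host top screens
without_profile = host_cpu_share(740.7, HOST_CPUS)
with_profile = host_cpu_share(193.6, HOST_CPUS)

print(f"without CPU profile: {without_profile:.1f}% of host")
print(f"with 40% CPU profile: {with_profile:.1f}% of host")
```

Under these assumptions the qemu-kvm process uses roughly 93% of the host without the profile and roughly 24% with it, which is in the same range as the 29.4% user time in the host summary line.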

##############
Please note that this environment is now available and the scenario is reproduced on host alma05.qa.lab.tlv.redhat.com with VM = 'golden_env_mixed_virtio_0'. The load script is placed in /root/load_cpu_vm1.sh
##############

Comment 4 Polina 2018-05-16 11:41:05 UTC
Created attachment 1437275 [details]
dumpxml of vm with cpu profile limitation

Comment 5 Martin Sivák 2018-05-16 11:54:31 UTC
Created attachment 1437299 [details]
VM stats from VDSM

I could not find the metadata section in the new dump either, but:

- I see the fields in VM stats (vcpuPeriod and vcpuQuota). Those clearly translate to 40%
- I also see those values in the libvirt xml

  <cputune>
    <period>100000</period>
    <quota>40000</quota>
  </cputune>


So all the pieces did what they were supposed to and libvirt is configured to use the proper limits. The actual enforcing is out of our hands.
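
As a quick check of the "translates to 40%" statement, the ratio can be read straight from the cputune fragment quoted above. This is a minimal sketch, not part of VDSM or the engine: the helper name is hypothetical, and it only computes quota/period. It does not model how the hypervisor enforces the limit or how that shows up in top inside the guest.

```python
import xml.etree.ElementTree as ET

# The <cputune> fragment as quoted from the libvirt domain XML
CPUTUNE_XML = """
<cputune>
  <period>100000</period>
  <quota>40000</quota>
</cputune>
"""


def cpu_limit_percent(cputune_xml):
    """Return quota/period from a <cputune> element as a percentage."""
    root = ET.fromstring(cputune_xml.strip())
    period = int(root.findtext("period"))
    quota = int(root.findtext("quota"))
    if period <= 0:
        raise ValueError("period must be positive")
    return 100.0 * quota / period


print(f"configured limit: {cpu_limit_percent(CPUTUNE_XML):.0f}%")
```

The same ratio should come out of the vcpuPeriod and vcpuQuota fields in the VM stats, since they carry the same two numbers.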