Bug 1278324

Summary: CPU limitation for VM does not work when balloon device is not present
Product: [oVirt] mom
Reporter: Artyom <alukiano>
Component: Core
Assignee: Martin Sivák <msivak>
Status: CLOSED DUPLICATE
QA Contact: Ilanit Stein <istein>
Severity: high
Docs Contact:
Priority: high
Version: 0.5.1
CC: alukiano, bugs, dfediuck, rgolan
Target Milestone: ovirt-4.0.1
Flags: rgolan: ovirt-4.0.z?
       rule-engine: planning_ack?
       rule-engine: devel_ack?
       rule-engine: testing_ack?
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-06-15 08:51:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1191119
Attachments:
  host logs (flags: none)
  mom debug (flags: none)

Description Artyom 2015-11-05 09:30:14 UTC
Created attachment 1089995 [details]
host logs

Description of problem:
The CPU limit does not work on the VM, regardless of which CPU profile the VM uses.

Version-Release number of selected component (if applicable):
mom-0.5.1-1.el7ev.noarch
vdsm-4.17.10.1-0.el7ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a CPU QoS with a 10% limit
2. Create a CPU profile that uses the CPU QoS above
3. Attach the CPU profile above to a VM whose number of CPUs equals half the number of host CPUs
4. Load the VM CPU to 100%

Actual results:
Host CPU is loaded to 50%

Expected results:
Host CPU should be loaded to only 10%

Additional info:
I am not sure whether the problem is in MoM or in VDSM.

Also, I do not see the period and quota parameters at all under the <cputune> element:
  <metadata xmlns:ovirt="http://ovirt.org/vm/tune/1.0">
    <ovirt:qos xmlns:ovirt="http://ovirt.org/vm/tune/1.0">
      <ovirt:vcpuLimit>10</ovirt:vcpuLimit>
    </ovirt:qos>
  </metadata>
  <maxMemory slots='16' unit='KiB'>4294967296</maxMemory>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static' current='12'>16</vcpu>
  <cputune>
    <shares>1020</shares>
  </cputune>
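
A quick way to confirm what is missing from a dump like the one above is to parse it and look for the <period> and <quota> children of <cputune>. The following is only a minimal sketch: the helper name `cputune_values` and the trimmed-down `DOMAIN_XML` wrapper are illustrative assumptions, not part of VDSM or MoM.

```python
import xml.etree.ElementTree as ET


def cputune_values(domain_xml, names=("shares", "period", "quota")):
    """Return the integer <cputune> settings found in a libvirt domain XML.

    Hypothetical helper for illustration; only the listed child
    elements are looked up, anything else under <cputune> is ignored.
    """
    root = ET.fromstring(domain_xml)
    values = {}
    for name in names:
        text = root.findtext("cputune/" + name)
        if text is not None:
            values[name] = int(text)
    return values


# Trimmed-down domain XML modelled on the dump above; the <domain>
# wrapper is an assumption, only the <cputune> part comes from the report.
DOMAIN_XML = """
<domain type='kvm'>
  <vcpu placement='static' current='12'>16</vcpu>
  <cputune>
    <shares>1020</shares>
  </cputune>
</domain>
"""

settings = cputune_values(DOMAIN_XML)
missing = [name for name in ("period", "quota") if name not in settings]
print("found:", settings)
print("missing from <cputune>:", missing)
```

In practice the XML would come from `virsh dumpxml` for the VM rather than from a string literal.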

Comment 1 Martin Sivák 2015-11-06 10:32:43 UTC
I do not see the ready status for the VM in mom.log. That usually means some data is missing for mom to start the policy evaluation.

Comment 2 Martin Sivák 2015-11-06 10:33:44 UTC
Can you run the test with DEBUG logging enabled please?

Comment 3 Artyom 2015-11-08 08:04:52 UTC
Created attachment 1091225 [details]
mom debug

Comment 4 Martin Sivák 2015-11-10 14:21:03 UTC
So the issue is that balloon info is missing:

test_cpu_profile: Incomplete data: missing set(['balloon_max', 'balloon_cur', 'balloon_min'])

The balloonInfo should always be reported by VDSM. Can you attach the getVmStats output for the VM please?
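
The "Incomplete data" message reads like the result of a set difference between the fields a policy needs and the fields actually collected. The sketch below only illustrates that idea; `REQUIRED_FIELDS`, `missing_fields`, and the sample statistics are hypothetical and are not taken from the MoM source.

```python
# Fields assumed to be required before a guest counts as "ready".
# The names mirror the log message above; the set itself is an assumption.
REQUIRED_FIELDS = {"balloon_cur", "balloon_max", "balloon_min"}


def missing_fields(stats, required=REQUIRED_FIELDS):
    """Return the required field names that are absent from stats."""
    return set(required) - set(stats)


# Hypothetical statistics for a VM without a balloon device.
stats_without_balloon = {"vcpu_count": 12, "vcpu_user_limit": 10}

# The same statistics once the balloon device reports its values.
stats_with_balloon = dict(
    stats_without_balloon,
    balloon_cur=1048576,
    balloon_max=1048576,
    balloon_min=0,
)

for name, stats in (("no balloon", stats_without_balloon),
                    ("balloon", stats_with_balloon)):
    missing = missing_fields(stats)
    if missing:
        print("%s: Incomplete data: missing %s" % (name, sorted(missing)))
    else:
        print("%s: ready" % name)
```

Under this model, a guest with any required field missing never reaches the ready state, so no policy is evaluated for it.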

Comment 5 Martin Sivák 2015-11-10 15:01:55 UTC
Ok so we found out what is wrong. This whole issue is caused by a missing memory balloon device which is currently required for any QoS to work properly.

Can you please check whether the memory balloon device enabled checkbox was set for your VM in the VM edit / Resource allocation subtab?

Comment 6 Artyom 2015-11-10 15:43:14 UTC
You are right, the balloon device was disabled for the VM. When I enable it, period and quota appear in the dumpxml output:
<cputune>
    <shares>1020</shares>
    <period>100000</period>
    <quota>20000</quota>
</cputune> 

But this is the first time I have heard that QoS needs the balloon device to work. We just add the VM to a cgroup with specific parameters on the host, so how is that connected to the balloon device?
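
For reference, the period and quota values above are consistent with the expected 10% host load, because libvirt applies <period> and <quota> per vCPU. The sketch below works through that arithmetic; the function name is hypothetical, and the 12 vCPU / 24 host CPU figures are assumptions derived from the `current='12'` value in the description and the "half of the host's CPUs" reproduction step.

```python
def host_cpu_fraction(quota_us, period_us, vcpus, host_cpus):
    """Fraction of total host CPU a VM may use under a per-vCPU quota.

    quota_us / period_us is the share of one physical CPU that each
    vCPU may consume; multiplying by the vCPU count and dividing by
    the host CPU count gives the share of the whole host.
    """
    per_vcpu = quota_us / period_us
    return per_vcpu * vcpus / host_cpus


# Values from the <cputune> dump above.
QUOTA_US = 20000
PERIOD_US = 100000

# Assumed topology: VM with half as many vCPUs as the host has CPUs.
VCPUS = 12
HOST_CPUS = 24

fraction = host_cpu_fraction(QUOTA_US, PERIOD_US, VCPUS, HOST_CPUS)
print("per-vCPU limit: {:.0%}".format(QUOTA_US / PERIOD_US))
print("host-wide limit: {:.0%}".format(fraction))
```

Each vCPU is capped at 20% of a physical CPU, and with half as many vCPUs as host CPUs that comes to 10% of the host, matching the configured vcpuLimit.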

Comment 7 Martin Sivák 2015-11-10 16:18:35 UTC
MOM requires the balloon device in order to do the load computations based on memory. It might not be necessary in this case, though, so the bug might still be valid with a different title.

And btw, balloon was always required. I am lowering the severity and removing the regression keyword.

Comment 8 Sandro Bonazzola 2016-05-02 09:47:41 UTC
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.

Comment 9 Yaniv Lavi 2016-05-23 13:12:44 UTC
oVirt 4.0 beta has been released, moving to RC milestone.

Comment 10 Martin Sivák 2016-06-15 08:51:33 UTC

*** This bug has been marked as a duplicate of bug 1337834 ***