Description of problem:
I created two CPU profiles, one with a QoS limit value of 50 and a second with a limit value of 25, and attached the profiles to a VM one after the other, but the values for quota and period shown by dumpxml stay the same. I am filing this bug under mom because I know we use the mom policy to apply quota and period to the VM.

Version-Release number of selected component (if applicable):
rhevm-3.5.0-0.27.el6ev.noarch
vdsm-4.16.8.1-4.el7ev.x86_64
mom-0.4.1-4.el7ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create two CPU QoS entries under the same data center, one with limit value 25 and the second with 50:

<qoss>
  <qos type="cpu" href="/ovirt-engine/api/datacenters/f8f5eaee-8fd0-4b45-87db-62d61b03a916/qoss/2571eeea-25a7-4b09-9c37-d82591733f26" id="2571eeea-25a7-4b09-9c37-d82591733f26">
    <name>test_1</name>
    <data_center href="/ovirt-engine/api/datacenters/f8f5eaee-8fd0-4b45-87db-62d61b03a916" id="f8f5eaee-8fd0-4b45-87db-62d61b03a916"/>
    <cpu_limit>50</cpu_limit>
  </qos>
  <qos type="cpu" href="/ovirt-engine/api/datacenters/f8f5eaee-8fd0-4b45-87db-62d61b03a916/qoss/883d876e-2038-4cd4-8c35-e9b52f2f4380" id="883d876e-2038-4cd4-8c35-e9b52f2f4380">
    <name>test_2</name>
    <data_center href="/ovirt-engine/api/datacenters/f8f5eaee-8fd0-4b45-87db-62d61b03a916" id="f8f5eaee-8fd0-4b45-87db-62d61b03a916"/>
    <cpu_limit>25</cpu_limit>
  </qos>
</qoss>

2.
Create two CPU profiles with different QoS in the same cluster:

<cpu_profiles>
  <cpu_profile href="/ovirt-engine/api/cpuprofiles/5be5c0b7-5b91-4ac4-9d53-ef6f987bff05" id="5be5c0b7-5b91-4ac4-9d53-ef6f987bff05">
    <name>test_1</name>
    <qos href="/ovirt-engine/api/datacenters/f8f5eaee-8fd0-4b45-87db-62d61b03a916/qoss/2571eeea-25a7-4b09-9c37-d82591733f26" id="2571eeea-25a7-4b09-9c37-d82591733f26"/>
    <cluster href="/ovirt-engine/api/clusters/67866b36-fd68-4106-8758-34cf31b0c3d4" id="67866b36-fd68-4106-8758-34cf31b0c3d4"/>
  </cpu_profile>
  <cpu_profile href="/ovirt-engine/api/cpuprofiles/b015da68-b7a5-4a4b-8389-5cbc8ce58f73" id="b015da68-b7a5-4a4b-8389-5cbc8ce58f73">
    <name>test_2</name>
    <qos href="/ovirt-engine/api/datacenters/f8f5eaee-8fd0-4b45-87db-62d61b03a916/qoss/883d876e-2038-4cd4-8c35-e9b52f2f4380" id="883d876e-2038-4cd4-8c35-e9b52f2f4380"/>
    <cluster href="/ovirt-engine/api/clusters/67866b36-fd68-4106-8758-34cf31b0c3d4" id="67866b36-fd68-4106-8758-34cf31b0c3d4"/>
  </cpu_profile>
</cpu_profiles>

3. Create a VM and run it first with the first QoS, then with the second.
First run:
<cpu_profile href="/ovirt-engine/api/cpuprofiles/5be5c0b7-5b91-4ac4-9d53-ef6f987bff05" id="5be5c0b7-5b91-4ac4-9d53-ef6f987bff05"/>

dumpxml:
<vcpu placement='static' current='4'>32</vcpu>
<cputune>
  <shares>1020</shares>
  <period>12500</period>
  <quota>25000</quota>
</cputune>

Second run:
<cpu_profile href="/ovirt-engine/api/cpuprofiles/b015da68-b7a5-4a4b-8389-5cbc8ce58f73" id="b015da68-b7a5-4a4b-8389-5cbc8ce58f73"/>

dumpxml:
<vcpu placement='static' current='4'>32</vcpu>
<cputune>
  <shares>1020</shares>
  <period>12500</period>
  <quota>25000</quota>
</cputune>

Actual results:
<quota> and <period> under <cputune> have the same values in dumpxml for different limit values.

Expected results:
<quota> and <period> have different values for different limits.

Additional info:
To start with, I do not really understand why we use this kind of formula:

period = anchor / #NumOfCpuInHost
quota = (anchor * (#userSelection / 100)) / #numOfVcpusInVm

Why do we need this anchor, and why do we change the period (default 1000000)? I played a little with the period and quota values and virsh create, and the limit works pretty well with this formula:

period = default_value
quota = period * (pcpu / vcpu) * (limitation / 100)

I checked it on a VM with 4 CPUs on a host with 8 CPUs, with limits of 10, 25 and 50. For some reason, the same proportion with a small period works imprecisely or does not work at all (it seems to me like a bug in cgroups).
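For illustration, the two formulas above can be compared with a small Python sketch. The anchor value of 100000 is an assumption inferred from the numbers in this report (a period of 12500 on an 8-pCPU host would give 100000 / 8); it is not confirmed anywhere in the thread.

```python
# Sketch comparing the two quota/period formulas discussed above.
# Assumptions (not confirmed by this report): anchor = 100000, an
# 8-pCPU host and a 4-vCPU guest, matching the values seen in dumpxml.

ANCHOR = 100000           # assumed mom anchor value
DEFAULT_PERIOD = 1000000  # kernel default cfs_period_us

def mom_style(limit_pct, host_cpus, vm_vcpus):
    """The formula the mom policy appears to use."""
    period = ANCHOR // host_cpus
    quota = (ANCHOR * limit_pct // 100) // vm_vcpus
    return period, quota

def proposed(limit_pct, host_cpus, vm_vcpus):
    """The alternative formula suggested in this comment."""
    period = DEFAULT_PERIOD
    quota = period * host_cpus // vm_vcpus * limit_pct // 100
    return period, quota

for limit in (10, 25, 50):
    print(limit, mom_style(limit, 8, 4), proposed(limit, 8, 4))
```

With limit 10 this yields (12500, 2500) for the mom-style formula, which matches the values reported later in this thread, versus (1000000, 200000) for the proposed one, i.e. the same ratio expressed against the default period.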
RHEL 7 uses libvirt-1.1.1, and the metadata XML feature needed for this to work seems to be missing from that version. I was told it was originally included in libvirt-1.1.3, which did not make it into RHEL 7. So currently the quota is always treated as 100% on RHEL 7, and the computed numbers (quota 25000, period 12500) cause no CPU usage throttling at all. RHEL 6.6 should have the necessary libvirt feature backported and should therefore work properly. Artyom: can you please retest with RHEL 6.6 hosts?
I should add that I saw the proper values bubble through the VDSM APIs, so it is really only an issue with the libvirt call:

Thread-4352::DEBUG::2015-01-07 16:47:01,450::__init__::469::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.updateVmPolicy' in bridge with {u'params': {u'vmId': u'4d7aa507-1b32-4618-a5a2-884500dbbbc1', u'vcpuLimit': u'2'}, u'vmID': u'4d7aa507-1b32-4618-a5a2-884500dbbbc1'}
Thread-4352::DEBUG::2015-01-07 16:47:01,454::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 74 edom: 10 level: 2 message: argument unsupported: QEMU driver does not support <metadata> element
Thread-4352::ERROR::2015-01-07 16:47:01,454::vm::3821::vm.Vm::(_getVmPolicy) vmId=`4d7aa507-1b32-4618-a5a2-884500dbbbc1`::getVmPolicy failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 3818, in _getVmPolicy
    METADATA_VM_TUNE_URI, 0)
  File "/usr/share/vdsm/virt/vm.py", line 689, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 942, in metadata
    if ret is None: raise libvirtError ('virDomainGetMetadata() failed', dom=self)
libvirtError: argument unsupported: QEMU driver does not support <metadata> element
No reason to block RC on a wrong libvirt version.
what's the libvirt dependency? do we expect 7.0.z update? if so, when?
On RHEL 6.6 it also does not work:

Thread-131338::DEBUG::2015-01-11 12:35:18,673::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 80 edom: 20 level: 2 message: metadata not found: Requested metadata element is not present
Thread-131338::ERROR::2015-01-11 12:35:18,675::__init__::493::jsonrpc.JsonRpcServer::(_serveRequest) Internal server error
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/yajsonrpc/__init__.py", line 488, in _serveRequest
    res = method(**params)
  File "/usr/share/vdsm/rpc/Bridge.py", line 284, in _dynamicMethod
    return self._fixupRet(className, methodName, ret)
  File "/usr/share/vdsm/rpc/Bridge.py", line 234, in _fixupRet
    self._typeFixup('return', retType, result)
  File "/usr/share/vdsm/rpc/Bridge.py", line 214, in _typeFixup
    if k in item:
TypeError: argument of type 'NoneType' is not iterable

So it also does not receive the limit from the engine.

vdsm-4.16.8.1-5.el6ev.x86_64
libvirt-0.10.2-46.el6_6.2.x86_64
This applies only to RHEL 6.6. After one minute I see that the parameter is updated to the correct value, so the error above is not related to the QoS. I also see that the metadata is passed correctly:

<metadata>
  <ovirt:qos xmlns:ovirt="http://ovirt.org/vm/tune/1.0">
    <ovirt:vcpuLimit>10</ovirt:vcpuLimit>
  </ovirt:qos>
</metadata>

And period and quota have the correct values:

<period>12500</period>
<quota>2500</quota>

Tested with limits of 5, 10, 25 and 50 percent.
I see that for error above we already have bug: https://bugzilla.redhat.com/show_bug.cgi?id=1142851
*** Bug 1179591 has been marked as a duplicate of this bug. ***
Moving to MODIFIED to wait for a relevant RHEL version with a new libvirt version.
Moving to POST on eedri's request. It should be moved to MODIFIED once the libvirt version is available.
Works for me with these components:
mom-0.4.1-4.el7ev.noarch
libvirt-client-1.1.1-29.el7_0.7.x86_64
sanlock-3.1.0-2.el7.x86_64
qemu-kvm-rhev-1.5.3-60.el7_0.11.x86_64
vdsm-4.16.8.1-6.el7ev.x86_64
rhevm-3.5.0-0.31.el6ev.noarch

RHEV-H 7.0 with these components is not working:
qemu-kvm-rhev-1.5.3-60.el7_0.11.x86_64
sanlock-3.1.0-2.el7.x86_64
mom-0.4.1-4.el7ev.noarch
vdsm-4.16.8.1-6.el7ev.x86_64
libvirt-client-1.1.1-29.el7_0.4.x86_64

Please align the RHEV-H builds to libvirt-client-1.1.1-29.el7_0.7.x86_64 or above.
Doron, the libvirt errata has shipped; can this bug move to ON_QA?
On RHEL 7.1 the CPU SLA QoS is not working:

vdsClient -s 0 list table
virsh -r dumpxml StressVM1_CPU_RHEL7_1

<domain type='kvm' id='6'>
  <name>StressVM1_CPU_RHEL7_1</name>
  <uuid>12b8466c-491b-49ff-a063-fe40a180ff4a</uuid>
  <metadata>
    <ovirt:qos xmlns:ovirt="http://ovirt.org/vm/tune/1.0">
      <ovirt:vcpuLimit>2</ovirt:vcpuLimit>
    </ovirt:qos>
  </metadata>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static' current='4'>16</vcpu>
  <cputune>
    <shares>1020</shares>
    <period>25000</period>
    <quota>25000</quota>
  </cputune>
  <resource>

Components used:
sanlock-3.2.2-2.el7.x86_64
qemu-kvm-rhev-2.1.2-23.el7.x86_64
libvirt-client-1.2.8-16.el7.x86_64
mom-0.4.1-4.el7ev.noarch
vdsm-4.16.8.1-6.el7ev.x86_64
Linux version 3.10.0-227.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-7) (GCC) ) #1 SMP Tue Jan 27 11:55:32 EST 2015
Linux alma03.qa.lab.tlv.redhat.com 3.10.0-227.el7.x86_64 #1 SMP Tue Jan 27 11:55:32 EST 2015 x86_64 x86_64 x86_64 GNU/Linux
rhevm-guest-agent-common-1.0.10-2.el6ev.noarch
rhevm-3.5.0-0.31.el6ev.noarch
Previously I ran the guest VM with 4 virtual CPUs and the feature did not limit the CPU usage to 2%, although the policy was set. I retested the same guest VM with 1 virtual CPU and the feature works fine:

<domain type='kvm' id='7'>
  <name>StressVM1_CPU_RHEL7_1</name>
  <uuid>12b8466c-491b-49ff-a063-fe40a180ff4a</uuid>
  <metadata>
    <ovirt:qos xmlns:ovirt="http://ovirt.org/vm/tune/1.0">
      <ovirt:vcpuLimit>2</ovirt:vcpuLimit>
    </ovirt:qos>
  </metadata>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static' current='1'>16</vcpu>
  <cputune>
    <shares>1020</shares>
    <period>25000</period>
    <quota>2000</quota>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>Red Hat</entry>
      <entry name='product'>RHEV Hypervisor</entry>
      <entry name='version'>7.1-0.3.el7</entry>
      <entry name='serial'>4C4C4544-0059-4410-8053-B7C04F573032</entry>
      <entry name='uuid'>12b8466c-491b-49ff-a063-fe40a180ff4a</entry>
    </system>
  </sysinfo>

Please note the different quota values between the two scenarios: with 4 virtual CPUs a quota value of 25000 is received, whereas with 1 virtual CPU it is 2000.
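As a sanity check (not part of the original comment): assuming the formula quoted in the initial report, quota = (anchor * limit% / 100) / nVcpus, and an anchor of 100000 (inferred, not confirmed: a period of 25000 would correspond to a 4-pCPU host), the observed quota values line up with the 2% limit being honored for 1 vCPU but effectively treated as 100% for 4 vCPUs:

```python
# Hypothetical check of the observed quota values against the mom-style
# formula from the initial report. ANCHOR = 100000 is an assumption.
ANCHOR = 100000

def expected_quota(limit_pct, vcpus):
    # quota = (anchor * (limit / 100)) / nVcpus, using integer arithmetic
    return ANCHOR * limit_pct // 100 // vcpus

print(expected_quota(2, 1))    # matches the <quota>2000</quota> seen with 1 vCPU
print(expected_quota(2, 4))    # what the 4-vCPU run should have produced (500)
print(expected_quota(100, 4))  # 25000, the value actually observed, i.e. 100%
```

This supports the earlier explanation that the limit falls back to 100% when the metadata element is not honored.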
Verified on libvirt-client-1.1.1-29.el7_0.7.x86_64, with two limits (25 and 50) and with different numbers of CPUs.
RHEV 3.5.0 was released. Closing.