Bug 1143992 - QOS CPU profile not working when guest agent is not functioning
Summary: QOS CPU profile not working when guest agent is not functioning
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: mom
Version: 3.5.0
Hardware: Unspecified
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.5.0
Assignee: Roy Golan
QA Contact: Nikolai Sednev
URL:
Whiteboard: sla
Duplicates: 1144280
Depends On:
Blocks: 906927 1084930 1162774 rhev35rcblocker rhev35gablocker 1174669
Reported: 2014-09-18 12:10 UTC by Nikolai Sednev
Modified: 2016-02-10 20:13 UTC (History)

Fixed In Version: vdsm-4.16.8.1-3.el6ev
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1174669
Environment:
Last Closed: 2015-02-11 20:27:37 UTC
oVirt Team: SLA


Attachments
logs (1.29 MB, application/octet-stream)
2014-09-18 12:10 UTC, Nikolai Sednev
screenshots (138.03 KB, application/x-gzip)
2014-10-21 08:17 UTC, Nikolai Sednev
mom.log (16.98 KB, text/plain)
2014-10-22 17:07 UTC, Nikolai Sednev
dump xmls from 2 hosts.tar.gz (2.93 KB, application/x-gzip)
2014-12-15 11:55 UTC, Nikolai Sednev
logs from both hosts and engine (1.95 MB, application/x-gzip)
2014-12-15 12:40 UTC, Nikolai Sednev


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2015:0186 normal SHIPPED_LIVE mom bug fix and enhancement update 2015-02-12 01:18:26 UTC
oVirt gerrit 34213 master MERGED vdsm: fix typo: vcpuLimit - 'limit' was in lower-case Never
oVirt gerrit 34528 master NEW GuestMemory fields should be optional Never
oVirt gerrit 35011 master MERGED Documentation: document the mandatory field behavior Never
oVirt gerrit 35329 master MERGED CpuTune - use previous value if quota or period is None Never
oVirt gerrit 35330 master MERGED mom.d: make CpuTuneEnabled True by default Never
oVirt gerrit 35407 None None None Never
oVirt gerrit 35653 ovirt-3.5 MERGED Instruct MOM to ignore ballooning when guest agent is not running Never
oVirt gerrit 35673 ovirt-3.5 MERGED Introduce new rhev_build spec file variable and use it for mom Never
oVirt gerrit 36026 ovirt-3.5.0 ABANDONED mom.d: make CpuTuneEnabled True by default Never
oVirt gerrit 36027 ovirt-3.5 MERGED mom.d: make CpuTuneEnabled True by default Never
oVirt gerrit 36205 ovirt-3.5 MERGED vdsm: fix typo: vcpuLimit - 'limit' was in lower-case Never

Description Nikolai Sednev 2014-09-18 12:10:06 UTC
Created attachment 938862 [details]
logs

Description of problem:
QOS CPU profile not working.
The CPU policy of 80% assigned to VM10_stress isn't working, while VM3_stress and VM4_stress run in the background with the default resource allocation (unlimited profile).

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Create 3 VMs to run on a single host.
2. Create an 80% CPU profile and assign it to VM10_stress.
3. Run VM3_stress and VM4_stress with the default CPU profile.
4. Load all 3 VMs to 100% CPU using the stresslinux distro from http://www.stresslinux.org/.

Actual results:
All 3 VMs are loaded to 100% CPU, although VM10_stress should be loaded up to 80% only.

Expected results:
VM3_stress and VM4_stress CPU load=100%
VM10_stress CPU load=80%

Additional info:
logs from host and engine.

Comment 3 Roy Golan 2014-10-08 07:06:08 UTC
I see this error at vdsm.log 

GuestMonitor-VM10_stress::DEBUG::2014-09-18 15:08:16,500::vm::486::vm.Vm::(_getUserCpuTuneInfo) vmId=`7154e0ff-a1c6-4fe0-a4e8-9756d83e1529`::Domain Metadata is not set


I assume that without the metadata support in libvirt this would never work. 

btw I'm getting the same thing with libvirt version 1.1.3.6 on F20

please output your components versions.

Comment 4 Nikolai Sednev 2014-10-13 09:16:25 UTC
I used the latest components at the time, but it also happens on these:
rhevm-3.5.0-0.14.beta.el6ev.noarch
libvirt-0.10.2-46.el6.x86_64
vdsm-4.16.6-1.el6ev.x86_64
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
sanlock-2.8-1.el6.x86_64

Comment 5 Martin Sivák 2014-10-15 07:43:51 UTC
CentOS 6.5 does not support metadata elements at all (libvirt limitation).

You will get the metadata error on F20 for VMs that have no metadata. We handle it gracefully, but it is still logged (there is a fix for that that was not merged to 3.5 branch + one related logging issue that is not ours).

Nikolai: if you test this on F19 or F20 it should work. If it doesn't, give us the following info:

virsh dumpxml VM10_stress
mom.log

Comment 6 Roy Golan 2014-10-20 10:58:37 UTC
(In reply to Martin Sivák from comment #5)
> CentOS 6.5 does not support metadata elements at all (libvirt limitation).
> 
> You will get the metadata error on F20 for VMs that have no metadata. We
> handle it gracefully, but it is still logged (there is a fix for that that
> was not merged to 3.5 branch + one related logging issue that is not ours).
> 
> Nikolai: if you test this on F19 or F20 it should work. If it doesn't, give
> us the following info:
> 
> virsh dumpxml VM10_stress
> mom.log


I think it's RHEL 6.6, and the libvirt version is 0.10.2, as stated in comment #4.

The vdsm log from alma shows:

Thread-38::DEBUG::2014-09-18 14:51:45,878::BindingXMLRPC::1132::vds::(wrapper) client [10.35.163.77]::call vmUpdateVmPolicy with ({'vmId': '7154e0ff-a1c6-4fe0-a4e8-9756d83e1529', 'vcpuLimit': '80'},) {}
Thread-38::DEBUG::2014-09-18 14:51:45,883::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 74 edom: 10 level: 2 message: argument unsupported: QEmu driver does not support modifying <metadata> element
Thread-38::ERROR::2014-09-18 14:51:45,883::vm::3795::vm.Vm::(updateVmPolicy) vmId=`7154e0ff-a1c6-4fe0-a4e8-9756d83e1529`::updateVmPolicy failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 3793, in updateVmPolicy
    METADATA_VM_TUNE_URI, 0)
  File "/usr/share/vdsm/virt/vm.py", line 670, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1597, in setMetadata
    if ret == -1: raise libvirtError ('virDomainSetMetadata() failed', dom=self)
libvirtError: argument unsupported: QEmu driver does not support modifying <metadata> element
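
A minimal sketch of how metadata support could be probed from Python, assuming `dom` behaves like a `libvirt.virDomain`. The call shape (type, XML, key, URI, flags) follows the `setMetadata` frame in the traceback above; the probe URI, key and element are hypothetical and are not the values vdsm uses. On a real domain the call changes the domain XML, so this is only meant for a test VM.

```python
# Numeric value of libvirt.VIR_DOMAIN_METADATA_ELEMENT in the Python bindings.
VIR_DOMAIN_METADATA_ELEMENT = 2

# Hypothetical namespace and prefix, used only for this probe.
PROBE_URI = "http://example.org/metadata-probe/1.0"
PROBE_KEY = "probe"


def probe_metadata_support(dom, uri=PROBE_URI, key=PROBE_KEY):
    """Try to write a tiny <metadata> element to `dom`.

    Returns (True, "") on success, or (False, message) if the call
    is rejected, as it is in the traceback above.
    """
    try:
        # Argument order: type, metadata XML, key (prefix), URI, flags.
        dom.setMetadata(VIR_DOMAIN_METADATA_ELEMENT, "<probe/>", key, uri, 0)
    except Exception as err:  # libvirt.libvirtError on a real domain
        return False, str(err)
    return True, ""
```

With a real connection, `dom` would come from something like `conn.lookupByName("VM10_stress")`; a `False` result with the "argument unsupported" message would match the failure logged here.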

Comment 7 Roy Golan 2014-10-20 14:40:34 UTC
Seems like version problems in the test setup and the api is working with libvirt 0.10.2-46.

Nikolai was able to get it running and now he's checking the actual cpu limit.

A separate bug should be opened for jsonrpc, which wasn't working with UpdateVmPolicy
because internally it had a missing argument.

Nikolai please fill-in what's missing

Comment 8 Nikolai Sednev 2014-10-21 08:15:59 UTC
(In reply to Roy Golan from comment #7)
> Seems like version problems in the test setup and the api is working with
> libvirt 0.10.2-46.
> 
> Nikolai was able to get it running and now he's checking the actual cpu
> limit.
> 
> A separate bug should be opened for jsonrpc, which wasn't working with
> UpdateVmPolicy because internally it had a missing argument.
> 
> Nikolai please fill-in what's missing

Even with JSON disabled, the feature is not working.
Screenshots attached.

Comment 9 Nikolai Sednev 2014-10-21 08:17:39 UTC
Created attachment 948840 [details]
screenshots

Comment 10 Roy Golan 2014-10-21 08:51:50 UTC
we need logs from that machine, libvirt mom vdsm

Comment 11 Roy Golan 2014-10-22 11:33:01 UTC
BTW please make sure you hit Sync MoM policies at Cluster->Hosts subtab otherwise
I think MoM wouldn't be in sync. msivak correct me if I'm wrong

Comment 12 Nikolai Sednev 2014-10-22 17:07:01 UTC
I synced MoM policies at Cluster->Hosts sub-tab and attached logs as requested.
I didn't touch the JSON configs; reproduced on an HE 3.5 environment.

Comment 13 Nikolai Sednev 2014-10-22 17:07:31 UTC
Created attachment 949501 [details]
mom.log

Comment 14 Roy Golan 2014-10-26 08:09:03 UTC
Why are the libvirt and vdsm logs missing? Please add them.

Comment 15 Doron Fediuck 2014-10-26 08:28:22 UTC
Nikolai,
Regardless of the mom errors which may be unrelated, we need to know
this is not a libvirt bug.

Please create limitations using virsh both in rhel 6.6 and rhel 7
and open a libvirt bug if needed for the relevant rhel, which will make
this bz a test only bug.

Comment 16 Roy Golan 2014-10-28 07:24:30 UTC
what I see is:

cause: GuestMonitor isn't reporting back the VM entity because it has the
       boolean "ready = False".
       This means that none of the controllers will work, so CpuTune will
       never set the vcpu quota.

root cause: the GuestMemory collector fails the validation phase of the fields
       because no guest-agent is installed and none of the expected fields
       (swap_in, swap_out, etc.) are reported. When one of the collectors
       fails, it marks the monitor as "ready = False".

The problem here is that a single validation failure of one collector fails the whole monitor cycle. Maybe in the early days, when the guest agent was a must, that was OK, but CpuTune works with the VM metadata.

I suggest that the field check at Monitor.py:108 is wrong:

if not set(data).issuperset(self.fields):
            self._set_not_ready("Incomplete data: missing %s" % \
                                (self.fields - set(data)))

Since set(data) holds only the fields that were actually collected, it may be a subset, and not a superset, of all the possible fields (self.fields).
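
To illustrate the point, here is a small sketch of a readiness check that tolerates missing guest-agent fields. It is not the MOM code: `missing_fields`, `is_ready` and the `optional` parameter are hypothetical names, and `vcpu_quota` is an assumed field name used only for the example.

```python
def missing_fields(data, required):
    """Return the required field names that `data` does not contain."""
    return set(required) - set(data)


def is_ready(data, required, optional=()):
    """A sample is usable when every non-optional field was collected.

    Fields listed in `optional` (for example those that need a guest
    agent) may be absent without marking the monitor as not ready.
    """
    return not missing_fields(data, set(required) - set(optional))


# Example: a guest without a guest agent reports no swap statistics.
all_fields = {"vcpu_quota", "vcpu_period", "swap_in", "swap_out"}
agent_fields = {"swap_in", "swap_out"}
sample = {"vcpu_quota": -1, "vcpu_period": 1000}

strict = set(sample).issuperset(all_fields)                    # False
relaxed = is_ready(sample, all_fields, optional=agent_fields)  # True
```

The strict check rejects the whole sample, which is the behaviour described above; the relaxed variant only requires the fields that can be collected without an agent.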

Comment 18 Roy Golan 2014-11-10 07:59:56 UTC
Since this bug can easily be worked around to let you test the feature and
unblock the RFE, I suggest the following:

1. test the scenario with guest-agent installed -

   that is aligned with our RHEV customers which are expected to install
   guest-agent.

2. test with the guest memory collector excluded from mom.conf
   
   excluding the GuestMemory in collectors key at /etc/vdsm/mom.conf will prevent
   the "failure" inside mom and continue to the CpuTune policy execution. 
   that is aligned with oVirt users who aren't expected to install guest-agent.
  
   #to exclude the GuestMemory
     sed -i '/collectors/ s/GuestMemory,// ' /etc/vdsm/mom.conf


3. write notes on this bug and remove the blocking flags

4. approve the RFE and put release-notes (if this bug isn't closed meanwhile)
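
For reference, the sed edit from step 2 can be sketched in Python as below. `drop_collector` is a hypothetical helper, and the input line in the example is an assumption about what the `collectors` entry looks like before the edit. Unlike the sed expression, it only touches lines whose key is exactly `collectors` and it normalizes the spacing between entries.

```python
def drop_collector(conf_text, name="GuestMemory"):
    """Return conf_text with `name` removed from every 'collectors:' line."""
    out = []
    for line in conf_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "collectors":
            # Split the comma-separated plugin list and drop the unwanted one.
            kept = [c.strip() for c in value.split(",")]
            kept = [c for c in kept if c and c != name]
            line = key + sep + " " + ", ".join(kept)
        out.append(line)
    return "\n".join(out) + "\n"


# Assumed pre-edit content of the [guest] section.
before = "[guest]\ncollectors: GuestQemuProc, GuestMemory, GuestBalloon, GuestCpuTune\n"
after = drop_collector(before)
```

Comment lines and other keys pass through unchanged, so the rest of mom.conf is preserved.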

Comment 19 Nikolai Sednev 2014-11-11 19:53:36 UTC
(In reply to Roy Golan from comment #18)
> Since this bug can easily be worked around to let you test the feature and
> unblock the RFE, I suggest the following:
> 
> 1. test the scenario with guest-agent installed -
> 
>    that is aligned with our RHEV customers which are expected to install
>    guest-agent.
> 
> 2. test with the guest memory collector excluded from mom.conf
>    
>    excluding the GuestMemory in collectors key at /etc/vdsm/mom.conf will
> prevent
>    the "failure" inside mom and continue to the CpuTune policy execution. 
>    that is aligned with oVirt users who aren't expected to install
> guest-agent.
>   
>    #to exclude the GuestMemory
>      sed -i '/collectors/ s/GuestMemory,// ' /etc/vdsm/mom.conf
> 
> 
> 3. write notes on this bug and remove the blocking flags
> 
> 4. approve the RFE and put release-notes (if this bug isn't closed meanwhile)

1. My setup uses a LiveCD, from which the VM boots each time it runs, so the guest-agent can't be installed. Some customers might also be using live CDs, and you can't force them to work with the guest-agent. I ran on the components shown below after altering the mom config file on each host with "sed -i '/collectors/ s/GuestMemory,// ' /etc/vdsm/mom.conf". The limitation has now started to work, but the CPU SLA profile is inaccurate: it did not lower the VM CPUs to 10% as the policy requires. The load was distributed much higher, 54%/35%/44% for the 3 running VMs, and the host was at 99% CPU load.

[root@blue-vdsc ~]# cat /etc/vdsm/mom.conf
### DO NOT REMOVE THIS COMMENT -- MOM Configuration for VDSM ###

[main]
# The wake up frequency of the main daemon (in seconds)
main-loop-interval: 5                                  

# The data collection interval for host statistics (in seconds)
host-monitor-interval: 5                                       

# The data collection interval for guest statistics (in seconds)
guest-monitor-interval: 5                                       

# The wake up frequency of the guest manager (in seconds).  The guest manager
# sets up monitoring and control for newly-created guests and cleans up after
# deleted guests.                                                            
guest-manager-interval: 5                                                    

# The interface MOM using to discover active guests and collect guest memory
# statistics. There're two choices for it: libvirt or vdsm.                 
hypervisor-interface: VDSM                                                  

# The wake up frequency of the policy engine (in seconds).  During each
# interval the policy engine evaluates the policy and passes the results
# to each enabled controller plugin.                                    
policy-engine-interval: 10                                              

# A comma-separated list of Controller plugins to enable
controllers: Balloon, KSM, CpuTune                      

# Sets the maximum number of statistic samples to keep for the purpose of
# calculating moving averages.
sample-history-length: 10

# Set this to an existing, writable directory to enable plotting.  For each
# invocation of the program a subdirectory momplot-NNN will be created where NNN
# is a sequence number.  Within that directory, tab-delimited data files will be
# created and updated with all data generated by the configured Collectors.
plot-dir:

# Activate the RPC server on the designated port (-1 to disable).  RPC is
# disabled by default until authentication is added to the protocol.
rpc-port: -1

# At startup, load a policy from the given directory.  If empty, no policy is loaded
policy-dir: /etc/vdsm/mom.d

[logging]
# Set the destination for program log messages.  This can be either 'stdio' or
# a filename.  When the log goes to a file, log rotation will be done
# automatically.
log: /var/log/vdsm/mom.log

# Set the logging verbosity level.  The following levels are supported:
# 5 or debug:     Debugging messages
# 4 or info:      Detailed messages concerning normal program operation
# 3 or warn:      Warning messages (program operation may be impacted)
# 2 or error:     Errors that severely impact program operation
# 1 or critical:  Emergency conditions
# This option can be specified by number or name.
verbosity: info

## The following two variables are used only when logging is directed to a file.
# Set the maximum size of a log file (in bytes) before it is rotated.
max-bytes: 2097152
# Set the maximum number of rotated logs to retain.
backup-count: 5

[host]
# A comma-separated list of Collector plugins to use for Host data collection.
collectors: HostMemory, HostKSM, HostCpu

[guest]
# A comma-separated list of Collector plugins to use for Guest data collection.
collectors: GuestQemuProc,  GuestBalloon, GuestCpuTune


2. I also installed a RHEL 6.5 VM with the stress tool and the guest-agent on it. That did not work for me: it was running on the host together with the HE and it wasn't limited. I even got thrown out of the engine's web UI session with error 501 or 503, I'm not sure which.

Components as they appear on my setup:
rhevm-3.5.0-0.19.beta.el6ev.noarch
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
libvirt-0.10.2-46.el6_6.1.x86_64
ovirt-hosted-engine-ha-1.2.4-1.el6ev.noarch
vdsm-4.16.7.3-1.el6ev.x86_64
ovirt-hosted-engine-setup-1.2.1-3.el6ev.noarch
sanlock-2.8-1.el6.x86_64
ovirt-host-deploy-1.3.0-1.el6ev.noarch

Regarding topics 3&4, please decide with PM. 

5. Please add the "Fixed In Version" component version number as soon as you have it.

Comment 20 Roy Golan 2014-11-18 15:18:09 UTC
I verified with a Python script that libvirt on RHEL 6.6 does support metadata,
i.e. libvirt version libvirt-0.10.2-46.el6_6.1.x86_64 as noted above.

So there are 3 issues currently:
1. JSON-RPC verb updateVmPolicy - msivak has already posted patches for it in Bug 1120246

2. The policy variable cpuTuneEnabled is False by default,
so the cputune policy is never calculated. It needs to be set to True in 00-defines.policy.

3. The policy skips recalculation when the period didn't change.
If the period didn't change, it has the value None, which means the controller will not call setVmCpuTune.

04-cputune.policy:21
(if (!= guest.vcpu_period calcPeriod)
        (guest.Control "vcpu_period" calcPeriod) 0)

CpuTune.py:32
if quota is not None and period is not None:
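
A sketch of the idea behind issue 3, in the spirit of the gerrit change titled "CpuTune - use previous value if quota or period is None". This is not the merged patch: `resolve_cputune` and its parameters are hypothetical, and the numbers in the example are arbitrary.

```python
def resolve_cputune(new_quota, new_period, cur_quota, cur_period):
    """Fill in an unchanged (None) quota or period from the current value.

    Returns (quota, period) to apply, or None when either value is
    still unknown and nothing should be sent to the hypervisor.
    """
    quota = cur_quota if new_quota is None else new_quota
    period = cur_period if new_period is None else new_period
    if quota is None or period is None:
        return None
    return quota, period


# The policy changed only the quota; the period stayed the same (None).
result = resolve_cputune(8000, None, cur_quota=-1, cur_period=100000)
# result == (8000, 100000)
```

With the check quoted above, the same input would be skipped because the period is None, so the new quota would never be applied.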

Comment 21 Roy Golan 2014-11-23 14:54:18 UTC
Patch 34528 is merged (its commit message is missing the Bug-Url, so it appears as NEW).

Added Martin's VDSM patch to the trackers.

Comment 22 Doron Fediuck 2014-11-23 15:10:28 UTC
*** Bug 1144280 has been marked as a duplicate of this bug. ***

Comment 24 Nikolai Sednev 2014-12-09 14:28:07 UTC
Not working on vt13.1.

Comment 25 Roy Golan 2014-12-09 14:59:34 UTC
The cpuTuneEnabled patch wasn't sent to ovirt-engine-3.5, which is my mistake.
It's still 0 by default.

Comment 26 Roy Golan 2014-12-09 15:34:06 UTC
Niko, please say whether you got things working by setting cpuTuneEnabled to 1 in /etc/vdsm/mom.d/00-defines.policy and restarting vdsmd.

Comment 27 Nikolai Sednev 2014-12-09 15:59:29 UTC
(In reply to Roy Golan from comment #26)
> Niko, please elaborate if you got things working by changing
> /etc/vdsm/mom.d/00-defines cpuTuneEnabled 1  and restart vdsmd

Looks better on the host where, following your tip, I modified /etc/vdsm/mom.d/00-defines.policy and set cpuTuneEnabled to 1:
# cat /etc/vdsm/mom.d/00-defines.policy
# This file defines python constans that make it easier to convert data
# received by setMOMPolicyParameters
(defvar False 0)
(defvar True 1)

# Define variables for configurable options here
(defvar ksmEnabled 1)
(defvar balloonEnabled 0)
(defvar cpuTuneEnabled 1)


It now looks much better, and the overall feature seems to have started working on one of my hosts.

virt-top 17:41:29 - x86_64 2/2CPU 1999MHz 15948MB 93.4% 33.1% 96.6%
7 domains, 7 active, 7 running, 0 sleeping, 0 paused, 0 inactive D:0 O:0 X:0
CPU: 84.0%  Mem: 7168 MB (7168 MB by guests)

   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM    TIME   NAME
    5 R    0    0  612    0 22.2  6.0   4:02.59 StressVM3
    2 R    0    1  674    0 14.1  6.0  12:41.95 RHEL6_5VM1
    7 R    0    0  612    0 13.4  6.0   4:08.65 StressVM2
    4 R    0    0  612    0  8.7  6.0   7:38.18 StressVM1
    6 R    0    0  612    0  8.7  6.0   3:52.52 StressVM4
    1 R    0    0  612    0  8.7  6.0   7:39.51 StressVM5
    3 R    0    0  612    0  8.1  6.0   7:52.37 StressVM6

CPU peaks can still be seen: although a 10% CPU limit is enforced, some of the VMs go over it and sometimes even reach 43%+ load, which means that the policy isn't strict.

Comment 28 Doron Fediuck 2014-12-10 13:36:31 UTC
(In reply to Nikolai Sednev from comment #27)

> 
> Still CPU peaks can be seen, although 10% CPU limitation enforced, some of
> the VMs getting over it and even sometimes getting to 43%+ load, which means
> that policy isn't strict.

CPU QoS is not host CPU load. It's a compute unit. Remember that CPU load can
be more than 100%, and we ask cgroup to limit to a partial quota. The policy
is strict but depends on quota and period as defined in-
http://libvirt.org/formatdomain.html#elementsCPUTuning
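
As a worked illustration of the quota/period relationship from the libvirt cputune documentation: within each period, every vCPU may run for at most quota microseconds, and a negative quota means no limit. The helpers below are a generic sketch with hypothetical names; how MOM derives the quota from the engine's percentage is not shown in this bug, so the mapping here is an assumption.

```python
def vcpu_quota(period_us, limit_percent):
    """Quota (microseconds per period) capping one vCPU at limit_percent
    of one physical CPU."""
    if not 0 < limit_percent <= 100:
        raise ValueError("limit_percent must be in (0, 100]")
    return period_us * limit_percent // 100


def max_vm_cpu_percent(period_us, quota_us, vcpus):
    """Upper bound on a VM's total CPU use, as a percentage of one
    physical CPU, or None when the quota is negative (unlimited)."""
    if quota_us < 0:
        return None
    return 100.0 * quota_us / period_us * vcpus


quota = vcpu_quota(100000, 10)                   # 10000
ceiling = max_vm_cpu_percent(100000, quota, 4)   # 40.0
```

Because the cap applies per vCPU, a 4-vCPU guest with a 10% per-vCPU quota can still consume up to 40% of one physical CPU in total.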

Comment 29 Nikolai Sednev 2014-12-14 18:12:07 UTC
Still not fixed in 
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
libvirt-0.10.2-46.el6_6.2.x86_64
vdsm-4.16.8.1-3.el6ev.x86_64
ovirt-hosted-engine-setup-1.2.1-8.el6ev.noarch
sanlock-2.8-1.el6.x86_64
ovirt-host-deploy-1.3.0-2.el6ev.noarch
ovirt-hosted-engine-ha-1.2.4-3.el6ev.noarch
ovirt-hosted-engine-setup-1.2.1-8.el6ev.noarch
ovirt-host-deploy-1.3.0-2.el6ev.noarch
ovirt-hosted-engine-ha-1.2.4-3.el6ev.noarch

rhevm-3.5.0-0.25.el6ev.noarch

I see that for both hosts the UI shows CPU usage of 75%-89%, while on the hosts themselves it is at 96%+.
 
virt-top 20:07:05 - x86_64 4/4CPU 1600MHz 7872MB 85.8% 75.9%
5 domains, 5 active, 5 running, 0 sleeping, 0 paused, 0 inactive D:0 O:0 X:0
CPU: 68.7%  Mem: 8192 MB (8192 MB by guests)

   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM    TIME   NAME
   28 R    0    0 1664    0 20.6 13.0   6:07.59 StressVM2
   29 R    0    0 1724    0 19.4 13.0   6:15.15 StressVM3
   30 R    0    0 1786    0 15.1 13.0   6:00.36 StressVM4
   31 R    0    0 1786    0 12.2 13.0   6:00.66 StressVM1
   24 R    0    6 6210 2332  1.4 52.0   3:56.84 HostedEngine

virt-top 20:07:21 - x86_64 2/2CPU 1999MHz 15948MB 82.6% 75.2% 92.2% 65.6% 97.4% 99.1% 45.8% 82.2% 96.2% 86.2% 45.4%
2 domains, 2 active, 2 running, 0 sleeping, 0 paused, 0 inactive D:0 O:0 X:0
CPU: 98.8%  Mem: 2048 MB (2048 MB by guests)

   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM    TIME   NAME
   14 R    0    0  84K    0 49.4  6.0   9:03.48 StressVM5
   15 R    0    0  84K    0 49.4  6.0   9:21.57 StressVM6


After the components were updated, the config file appears correct on both hosts:
cat /etc/vdsm/mom.d/00-defines.policy
# This file defines python constans that make it easier to convert data
# received by setMOMPolicyParameters
(defvar False 0)
(defvar True 1)

# Define variables for configurable options here
(defvar ksmEnabled 1)
(defvar balloonEnabled 0)
(defvar cpuTuneEnabled 1)
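
A small sketch for checking these flags on several hosts without reading the file by eye. `read_defvars` is a hypothetical helper that only understands single-line `(defvar NAME INTEGER)` entries like the ones shown above; it is not a parser for the MOM policy language.

```python
import re

# Matches lines such as "(defvar cpuTuneEnabled 1)".
_DEFVAR = re.compile(r"^\(defvar\s+(\w+)\s+(-?\d+)\)$")


def read_defvars(policy_text):
    """Return {name: int_value} for each integer defvar line in the text."""
    values = {}
    for line in policy_text.splitlines():
        match = _DEFVAR.match(line.strip())
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


policy = """\
(defvar False 0)
(defvar True 1)

# Define variables for configurable options here
(defvar ksmEnabled 1)
(defvar balloonEnabled 0)
(defvar cpuTuneEnabled 1)
"""
flags = read_defvars(policy)
cpu_tune_on = flags.get("cpuTuneEnabled") == 1   # True
```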


None of the guests have guest-agents.
Sometimes virt-top shows 42%+ against the 10% limit on the first host, far exceeding the limit.

Comment 30 Doron Fediuck 2014-12-15 10:16:54 UTC
Which mom version was used?
Can you please provide the relevant log files?

Comment 31 Nikolai Sednev 2014-12-15 11:34:02 UTC
mom-0.4.1-4.el6ev.noarch

Comment 32 Nikolai Sednev 2014-12-15 11:55:05 UTC
Created attachment 968921 [details]
dump xmls from 2 hosts.tar.gz

Comment 33 Nikolai Sednev 2014-12-15 11:59:13 UTC
Attached dump XMLs from both hosts, taken while JSON-RPC was active on both hosts and a CPU SLA policy of 2% was applied on both; it doesn't seem to be working:
mom-0.4.1-4.el6ev.noarch
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
sanlock-2.8-1.el6.x86_64
vdsm-4.16.8.1-3.el6ev.x86_64
libvirt-0.10.2-46.el6_6.2.x86_64
ovirt-host-deploy-1.3.0-2.el6ev.noarch
ovirt-hosted-engine-setup-1.2.1-8.el6ev.noarch
ovirt-hosted-engine-ha-1.2.4-3.el6ev.noarch


brown-vdsd:
virt-top 13:56:44 - x86_64 2/2CPU 1999MHz 15948MB
2 domains, 2 active, 2 running, 0 sleeping, 0 paused, 0 inactive D:0 O:0 X:0
CPU: 80.7%  Mem: 2048 MB (2048 MB by guests)

   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM    TIME   NAME
   25 R    0    0 1754    0 41.2  6.0  16:06.18 StressVM6
   24 R    0    0 1754    0 39.5  6.0  15:41.39 StressVM5

blue-vdsc:
virt-top 13:56:56 - x86_64 4/4CPU 1600MHz 7872MB
6 domains, 6 active, 6 running, 0 sleeping, 0 paused, 0 inactive D:0 O:0 X:0
CPU: 74.4%  Mem: 9216 MB (9216 MB by guests)

   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM    TIME   NAME
   38 R    0    0 1930    0 24.1 13.0  15:20.22 StressVM2
   41 R    0    0 1930    0 23.7 13.0  16:24.16 StressVM3
   39 R    0    0 1992    0 12.5 13.0  15:34.40 StressVM4
   42 R    0    0    0    0 12.2 13.0  15:23.57 StressVM1
   32 R    0   13  23K 4771  1.7 52.0  10:48.93 HostedEngine
   40 R    0    0 1930    0  0.2 13.0   7:04.11 RHEL6_5VM1

Comment 34 Nikolai Sednev 2014-12-15 12:40:32 UTC
Created attachment 968937 [details]
logs from both hosts and engine

Comment 35 Martin Sivák 2014-12-16 12:27:10 UTC
There were two issues:

Nikolai's setup had a broken configuration (manually modified config files that RPM refused to update).

VDSM bug that is now tracked in #1174669.

Comment 36 Nikolai Sednev 2014-12-24 15:11:11 UTC
works for me on these components:
vdsm-4.16.8.1-4.el7ev.x86_64
qemu-kvm-rhev-1.5.3-60.el7_0.11.x86_64
mom-0.4.1-4.el7ev.noarch
libvirt-client-1.2.8-10.el7.x86_64
sanlock-3.2.2-2.el7.x86_64
rhevm-3.5.0-0.26.el6ev.noarch

Comment 38 errata-xmlrpc 2015-02-11 20:27:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-0186.html

