Bug 947821

Summary: [scalability] [vdsm] Get error AdvancedStatsFunction _sampleCpu, via power off mass VMs
Product: Red Hat Enterprise Virtualization Manager
Reporter: vvyazmin <vvyazmin>
Component: vdsm
Assignee: Nobody's working on this, feel free to take it <nobody>
Status: CLOSED CANTFIX
QA Contact: meital avital <mavital>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 3.2.0
CC: acathrow, bazulay, iheim, jkt, lpeer, michal.skrivanek, yeylon
Target Milestone: ---   
Target Release: 3.4.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: virt
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-31 10:38:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments (Description: Flags):
Logs vdsm, rhevm, libvirt: none
Log libvirt: none

Description vvyazmin@redhat.com 2013-04-03 11:20:53 UTC
Created attachment 731117: Logs vdsm, rhevm, libvirt

Description of problem:
VDSM logs "ERROR: Stats function failed: <AdvancedStatsFunction _sampleCpu at 0x2531960>" when a large number of VMs are powered off at once.

Version-Release number of selected component (if applicable):
RHEVM 3.2 - SF11 environment:

RHEVM: rhevm-3.2.0-10.14.beta1.el6ev.noarch	
VDSM: vdsm-4.10.2-11.0.el6ev.x86_64
LIBVIRT: libvirt-0.10.2-18.el6.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.355.el6_4.2.x86_64
SANLOCK: sanlock-2.6-2.el6.x86_64

Run scenarios:
1. Power off 200 VMs in parallel via the Python SDK.

How reproducible:
100%

Steps to Reproduce:
1. Power off 200 VMs in parallel via the Python SDK.
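
For reference, the parallel power-off step can be sketched roughly as below. This is only an illustration of the dispatch pattern, not the script used for this report: `power_off_all` and `fake_power_off` are hypothetical names, the real SDK call is replaced by a stand-in callable, and the sketch assumes a current Python 3 interpreter rather than the Python 2.6 shown in the traceback.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def power_off_all(vm_names, power_off, max_workers=20):
    """Call power_off(name) for every VM concurrently.

    Returns a dict mapping each VM name to None on success, or to the
    exception that power_off raised for that VM.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(power_off, name): name for name in vm_names}
        for future in as_completed(futures):
            # exception() returns None when the call completed normally.
            results[futures[future]] = future.exception()
    return results


def fake_power_off(name):
    # Hypothetical stand-in for the real SDK power-off call; does nothing.
    return None


if __name__ == "__main__":
    names = ["vm%03d" % i for i in range(200)]
    failures = {n: e for n, e in power_off_all(names, fake_power_off).items() if e}
    print("failed:", len(failures))
```

Replacing `fake_power_off` with a function that issues the actual stop request would send all 200 requests close together, which is the load pattern that triggers the error described here.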
  
Actual results:
vdsm.log shows libvirtError: internal error No such domain

Expected results:
No exception should be logged.

Additional info:
/var/log/ovirt-engine/engine.log

/var/log/vdsm/vdsm.log

Thread-130728::ERROR::2013-04-03 11:25:05,585::utils::416::vm.Vm::(collect) vmId=`c1ba38e5-84e2-4ed7-a744-5aca025c9d7f`::Stats function failed: <AdvancedStatsFunction _sampleCpu at 0x2531960>
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 412, in collect
    statsFunction()
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 287, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/libvirtvm.py", line 162, in _sampleCpu
    cpuStats = self._vm._dom.getCPUStats(True, 0)
  File "/usr/share/vdsm/libvirtvm.py", line 529, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 104, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1849, in getCPUStats
    if ret is None: raise libvirtError ('virDomainGetCPUStats() failed', dom=self)
libvirtError: internal error No such domain <C1><BA>8<E5><84><E2>NקDZ<CA>^B\<9D>^?

Comment 1 vvyazmin@redhat.com 2013-04-03 15:20:11 UTC
Created attachment 731220: Log libvirt

Comment 2 vvyazmin@redhat.com 2013-04-04 14:03:58 UTC
Impact on user:
None; only an ERROR entry in the log.

Comment 3 Itamar Heim 2013-05-02 10:32:03 UTC
Also seen in the closed duplicate:
Bug 947835 - [scalability] [vdsm] Get error AdvancedStatsFunction _highWrite, via power off mass VMs

Comment 4 Itamar Heim 2013-05-02 10:32:19 UTC
*** Bug 947835 has been marked as a duplicate of this bug. ***

Comment 5 Michal Skrivanek 2013-05-03 11:37:01 UTC
*** Bug 947826 has been marked as a duplicate of this bug. ***

Comment 9 Michal Skrivanek 2014-01-31 10:38:08 UTC
This is a common race between the stats thread and the VM threads. Fixing it needs more radical changes in statistics gathering; nothing concrete yet, maybe in 4.0.
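
To illustrate the race in the traceback above: the sampling thread calls getCPUStats on a domain that the VM thread has already destroyed. The sketch below is not VDSM code and not a proposed fix; `NoSuchDomain`, `FakeDomain`, `Vm`, and `sample_cpu` are hypothetical stand-ins so the example runs without libvirt. It shows one way a sampler could tolerate the missing domain when the VM is known to be shutting down, while still raising in any other case. With the real libvirt Python bindings the check would presumably compare `libvirtError.get_error_code()` against `libvirt.VIR_ERR_NO_DOMAIN` instead of catching a custom exception.

```python
class NoSuchDomain(Exception):
    """Hypothetical stand-in for a libvirtError reporting 'No such domain'."""


class FakeDomain(object):
    """Hypothetical stand-in for a libvirt domain handle."""

    def __init__(self):
        self.alive = True

    def getCPUStats(self, total, flags):
        if not self.alive:
            raise NoSuchDomain("internal error No such domain")
        return [{"cpu_time": 1000}]


class Vm(object):
    """Minimal VM object: a domain handle plus a shutdown flag."""

    def __init__(self, dom):
        self._dom = dom
        self.shutting_down = False

    def destroy(self):
        # Set the flag first so a concurrent sampler can tell the
        # disappearance of the domain is expected.
        self.shutting_down = True
        self._dom.alive = False


def sample_cpu(vm):
    """Return CPU stats, or None if the domain vanished during shutdown."""
    try:
        return vm._dom.getCPUStats(True, 0)
    except NoSuchDomain:
        if vm.shutting_down:
            return None  # expected: the VM was powered off mid-sample
        raise  # unexpected: the domain is gone but nobody destroyed it
```

Swallowing the error only when the shutdown flag is set keeps a genuinely unexpected "No such domain" visible in the log.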