Bug 1304734

Summary: MOM - Unexpected collection error: KeyError: 'memUsage' / KeyError: 'balloonInfo' when migration failed
Product: [oVirt] mom Reporter: Shira Maximov <mshira>
Component: General Assignee: Martin Sivák <msivak>
Status: CLOSED WONTFIX QA Contact: Shira Maximov <mshira>
Severity: medium Docs Contact:
Priority: low    
Version: 0.4.4 CC: bugs, dfediuck, gklein, mshira, rgolan
Target Milestone: --- Keywords: Regression
Target Release: --- Flags: mshira: planning_ack?
mshira: devel_ack?
mshira: testing_ack?
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-02-17 11:36:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: SLA RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
mom log none

Description Shira Maximov 2016-02-04 13:38:42 UTC
Description of problem:

The following exceptions are logged after a live migration fails,
probably because of a race condition:

2016-02-04 14:31:09,970 - mom.Monitor - ERROR - Unexpected collection error
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/mom/Monitor.py", line 95, in collect
    collected = c.collect()
  File "/usr/lib/python2.6/site-packages/mom/Collectors/GuestMemory.py", line 50, in collect
    stat = self.hypervisor_iface.getVmMemoryStats(self.uuid)
  File "/usr/lib/python2.6/site-packages/mom/HypervisorInterfaces/vdsmInterface.py", line 93, in getVmMemoryStats
    usage = int(response['statsList'][0]['memUsage'])
KeyError: 'memUsage'
2016-02-04 14:31:09,971 - mom.Monitor - ERROR - Unexpected collection error
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/mom/Monitor.py", line 95, in collect
    collected = c.collect()
  File "/usr/lib/python2.6/site-packages/mom/Collectors/GuestBalloon.py", line 41, in collect
    stat = self.hypervisor_iface.getVmBalloonInfo(self.uuid)
  File "/usr/lib/python2.6/site-packages/mom/HypervisorInterfaces/vdsmInterface.py", line 147, in getVmBalloonInfo
    balloon_info = response['statsList'][0]['balloonInfo']
KeyError: 'balloonInfo'
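
For illustration only, the failing lookup can be reproduced outside MOM with a stats entry that lacks the expected key. This is a minimal sketch, not MOM's or VDSM's code: the helper name get_vm_memory_usage, the sample values, and the shape of the incomplete response (including the 'status' value) are assumptions inferred from the traceback above.

```python
# Illustrative sketch only; not taken from MOM or VDSM.
# The response layout is inferred from the traceback, and the helper
# name and sample values are hypothetical.


def get_vm_memory_usage(response):
    """Return memUsage as an int, or None when the stats are incomplete."""
    stats_list = response.get('statsList') or [{}]
    usage = stats_list[0].get('memUsage')
    return int(usage) if usage is not None else None


# Assumed shape of the stats for a fully running VM.
running = {'statsList': [{'memUsage': '42',
                          'balloonInfo': {'balloon_cur': 1024}}]}

# Assumed shape of the stats for a VM whose incoming migration failed
# before memory statistics were available.
incomplete = {'statsList': [{'status': 'Migration Destination'}]}

print(get_vm_memory_usage(running))
print(get_vm_memory_usage(incomplete))

# The direct lookup from the traceback raises on the incomplete entry.
try:
    int(incomplete['statsList'][0]['memUsage'])
except KeyError as err:
    print('KeyError: %s' % err)
```

The tolerant variant returns None instead of raising, so a caller would need to treat None as "no sample for this interval". Whether that is preferable to the logged traceback is a design decision for the maintainers.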


Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Manager Version: 3.5.7-0.1.el6ev 
mom-0.4.1-4.el6ev.noarch

How reproducible:
Randomly

Steps to Reproduce:
1. Create a large VM that cannot be migrated because its memory exceeds the 'Max free Memory for scheduling new VMs' value on the host.
2. Migrate the VM.
3. Check the mom logs for the exception.

Actual results:
the exception

Expected results:


Additional info:

Comment 1 Shira Maximov 2016-02-04 13:39:12 UTC
Created attachment 1121101 [details]
mom log

Comment 2 Martin Sivák 2016-02-04 15:21:03 UTC
1) What is the VDSM version?
2) Can you test that with the latest 3.5 mom? (0.4.1-5)
3) The log appears on the destination host, right?

Comment 3 Shira Maximov 2016-02-04 15:32:17 UTC
1) vdsm-4.16.33-1.el6ev.x86_64
2) mom-0.4.1-4 is the latest, as it appears at: http://bob.eng.lab.tlv.redhat.com/builds/latest_vt/el6/
3) yes

Comment 4 Red Hat Bugzilla Rules Engine 2016-02-04 16:19:58 UTC
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.

Comment 5 Yaniv Kaul 2016-02-05 23:04:59 UTC
Why is it a high severity bug? What is the user consequence? I guess he's more worried about the failed migration than the traceback.

Comment 6 Doron Fediuck 2016-02-07 08:27:46 UTC
(In reply to Shira Maximov from comment #3)

> 3) yes

If this error is on the destination server, as you specify,
it will have no functional impact on the failed VM, as the process is being
killed.

Did you notice any impact on mom's behavior?

Comment 7 Shira Maximov 2016-02-11 07:41:56 UTC
(In reply to Doron Fediuck from comment #6)
> (In reply to Shira Maximov from comment #3)
> 
> > 3) yes
> 
> If this error is on the destination server, as you specify,
> it will have no functional impact on the failed VM, as the process is being
> killed.
> 
> Did you notice any impact on mom's behavior?

I didn't notice any impact.

Comment 8 Roy Golan 2016-02-17 11:38:08 UTC
This issue has no impact on the flow and no user impact. For now there is no justification for trying to suppress this error.
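
One reading of "no impact on the flow", based only on the log in the description: the two consecutive "Unexpected collection error" entries suggest that the monitor catches a failing collector, logs it, and moves on to the next one. The sketch below shows that pattern in isolation. It is not MOM's Monitor code; the class names, the collect_all function, and the sample data are hypothetical.

```python
import logging

# Illustrative sketch only; not taken from MOM. All names are hypothetical.
logger = logging.getLogger('monitor_sketch')


class FailingCollector(object):
    """Stands in for a collector whose stats lack an expected key."""

    def collect(self):
        stats = {}
        return {'mem_usage': stats['memUsage']}  # raises KeyError


class WorkingCollector(object):
    """Stands in for a collector that is unaffected by the failure."""

    def collect(self):
        return {'cpu': 5}


def collect_all(collectors):
    """Run every collector; log a failure and continue with the rest."""
    data = {}
    for collector in collectors:
        try:
            data.update(collector.collect())
        except Exception:
            # Logs the message at ERROR level together with the traceback.
            logger.exception('Unexpected collection error')
    return data


result = collect_all([FailingCollector(), WorkingCollector()])
print(result)
```

In this pattern the failure costs one sample from one collector and a traceback in the log, while the remaining collectors still report.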