Bug 1349817

Summary: iotune policy queries raising exceptions in vdsm
Product: [oVirt] vdsm
Reporter: Michal Skrivanek <michal.skrivanek>
Component: General
Assignee: Steven Rosenberg <srosenbe>
Status: CLOSED WORKSFORME
QA Contact: Polina <pagranat>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.18.4
CC: bugs, dfediuck, msivak
Target Milestone: ---
Flags: michal.skrivanek: planning_ack?, michal.skrivanek: devel_ack?, michal.skrivanek: testing_ack?
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-06-19 13:48:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1363728
Bug Blocks:

Description Michal Skrivanek 2016-06-24 10:51:02 UTC
During migration (and I suppose shutdown too), the queries coming from MOM raise ugly exceptions in the log when the VM disappears in the meantime.
At least the latter error (the AttributeError) should not happen; ideally the first one (the libvirtError) should be handled cleanly as well.
Logs:

VM migrated and notification has been sent:

Thread-1310::DEBUG::2016-06-24 13:42:30,446::__init__::207::jsonrpc.Notification::(emit) Sending event {"params": {"notify_time": 4371609240, "f6cbe8d2-2c72-4371-a360-149690c93406": {"status": "Down", "timeOffset": "0", "exitReason": 4, "exitMessage": "Migration succeeded", "exitCode": 0}}, "jsonrpc": "2.0", "method": "|virt|VM_status|f6cbe8d2-2c72-4371-a360-149690c93406"}

yet the call comes in asking about the VM:

Thread-1313::DEBUG::2016-06-24 13:42:37,704::bindingxmlrpc::1235::vds::(wrapper) client [::1]::call vmGetIoTunePolicy with ('f6cbe8d2-2c72-4371-a360-149690c93406',) {}
Thread-1313::ERROR::2016-06-24 13:42:37,705::vm::2719::virt.vm::(_getVmPolicy) vmId=`f6cbe8d2-2c72-4371-a360-149690c93406`::getVmPolicy failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 2716, in _getVmPolicy
    METADATA_VM_TUNE_URI, 0)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 916, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1464, in metadata
    if ret is None: raise libvirtError ('virDomainGetMetadata() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid 'f6cbe8d2-2c72-4371-a360-149690c93406' (rhel_67)
Thread-1313::ERROR::2016-06-24 13:42:37,706::bindingxmlrpc::1254::vds::(wrapper) unexpected error
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/rpc/bindingxmlrpc.py", line 1238, in wrapper
    res = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/rpc/bindingxmlrpc.py", line 563, in vmGetIoTunePolicy
    return vm.getIoTunePolicy()
  File "/usr/share/vdsm/API.py", line 788, in getIoTunePolicy
    return v.getIoTunePolicy()
  File "/usr/share/vdsm/virt/vm.py", line 2737, in getIoTunePolicy
    ioTuneList = qos.getElementsByTagName("ioTune")
AttributeError: 'NoneType' object has no attribute 'getElementsByTagName'

Comment 1 Martin Sivák 2016-09-27 15:54:29 UTC
We can only solve this by monitoring the event stream between vdsm and the engine, because we have no way of detecting a finished migration or a stopped VM soon enough (the engine calls destroy almost immediately and the status in vdsm is then lost).
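To illustrate the event-monitoring idea, here is a minimal sketch that consumes notifications shaped like the VM_status event in the log above and remembers which VMs have gone Down, so a poller could skip them. The class `DownVmTracker` and its methods are hypothetical names for illustration only; they are not part of vdsm or MOM, and how MOM would actually subscribe to the events is not covered.

```python
import json


class DownVmTracker:
    """Hypothetical helper: remembers VMs that reported status "Down"."""

    def __init__(self):
        self._down = set()

    def on_event(self, raw_event):
        """Process one JSON-RPC notification given as a JSON string."""
        event = json.loads(raw_event)
        # The method in the log looks like "|virt|VM_status|<vm uuid>",
        # which splits into ['', 'virt', 'VM_status', '<vm uuid>'].
        parts = event.get("method", "").split("|")
        if len(parts) != 4 or parts[2] != "VM_status":
            return
        vm_id = parts[3]
        status = event.get("params", {}).get(vm_id, {}).get("status")
        if status == "Down":
            self._down.add(vm_id)
        elif status is not None:
            # Any other reported status means the VM is (again) known.
            self._down.discard(vm_id)

    def should_query(self, vm_id):
        """Return False for VMs whose last reported status was "Down"."""
        return vm_id not in self._down
```

A caller would check `tracker.should_query(vm_id)` before issuing vmGetIoTunePolicy. This narrows the race window but does not close it, since an event can still arrive after a query has been sent.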

Comment 4 Michal Skrivanek 2017-07-26 09:12:35 UTC
The point of the bug is to eliminate the traceback, not necessarily all invocations of the API; those are fine and expected to happen. We just need to return a regular response with an error code instead of raising exceptions in vdsm code. Events would help, but would not resolve it 100%, I guess.
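A minimal sketch of that behaviour, under stated assumptions: `DomainNotFound` stands in for the libvirtError seen in the traceback so the example runs without libvirt, `fetch_metadata` is a hypothetical callable that returns the QoS metadata XML, and the status codes and messages are illustrative values rather than vdsm's real error table. It is not the actual vdsm fix.

```python
from xml.dom import minidom


class DomainNotFound(Exception):
    """Stand-in for libvirtError "Domain not found" (assumption)."""


# Illustrative status values, not taken from vdsm.
STATUS_OK = {"code": 0, "message": "Done"}
STATUS_NO_VM = {"code": 1, "message": "Virtual machine does not exist"}


def get_io_tune_policy(fetch_metadata):
    """Return a response dict; never let the lookup failure escape.

    fetch_metadata is a hypothetical callable returning the metadata XML
    as a string, returning None if there is no metadata, or raising
    DomainNotFound if the VM has already gone away.
    """
    try:
        xml = fetch_metadata()
    except DomainNotFound:
        # The VM disappeared between the query and the lookup:
        # answer with an error status instead of a traceback.
        return {"status": STATUS_NO_VM}

    if xml is None:
        # No metadata: avoid calling getElementsByTagName on None.
        return {"status": STATUS_OK, "ioTunePolicyList": []}

    qos = minidom.parseString(xml)
    policies = [node.toxml() for node in qos.getElementsByTagName("ioTune")]
    return {"status": STATUS_OK, "ioTunePolicyList": policies}
```

Both failure paths from the log are covered here: the missing domain becomes an error status, and the `None` result no longer reaches `getElementsByTagName`.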

Comment 5 Martin Sivák 2017-11-27 12:52:07 UTC
I wonder if this still happens when done over jsonrpc.

Comment 6 Steven Rosenberg 2018-05-16 12:30:58 UTC
I attempted to simulate this by deploying two hosts and one VM from ovirt-engine. One host was running vdsm version 4.20.17-1 and the other 4.20.27.1-1. The mom versions were 0.5.11-1 for the older version and 0.5.12-1 for the current version.

I tested migration in both directions between the two hosts, as well as shutting down the VM on each host. The error reported here did not occur. On the host running mom version 0.5.11-1, errors were logged both when migrating away from that host and when shutting down the VM while it was running there; in both cases they were the same errors, related to the code handling GuestCpuTune.

However, since no errors occurred when migrating from, migrating to, or shutting down the VM on the host running the current version, which includes changes to the code handling GuestCpuTune, we may assume that both issues are no longer relevant.