Bug 1299480 - Unhandled exception in <NumaInfoMonitor Error in vdsm log while migrating VM/s
Status: CLOSED CURRENTRELEASE
Product: vdsm
Classification: oVirt
Component: Core
x86_64 Linux
Priority: medium, Severity: medium
Target Milestone: ovirt-3.6.5
Assigned To: Francesco Romani
QA Contact: Michael Burman
Depends On:
Blocks:
Reported: 2016-01-18 08:34 EST by Michael Burman
Modified: 2016-04-21 10:38 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-04-21 10:38:14 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
michal.skrivanek: ovirt-3.6.z?
mburman: planning_ack?
michal.skrivanek: devel_ack+
mavital: testing_ack+




External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
oVirt gerrit | 52884 | master | MERGED | periodic: ignore known-benign libvirt errors | 2016-02-11 08:09 EST
oVirt gerrit | 52894 | master | ABANDONED | vmstats: remove useless short stacktrace | 2016-02-19 04:32 EST
oVirt gerrit | 52895 | master | ABANDONED | periodic: silence useless log | 2016-02-16 11:51 EST
oVirt gerrit | 53056 | master | ABANDONED | periodic: ignore VIR_ERROR_OPERATION_UNSUPPORTED | 2016-07-08 07:32 EDT
oVirt gerrit | 53539 | master | MERGED | virt: periodic: disable on migration destination | 2016-03-02 13:40 EST
oVirt gerrit | 53612 | master | MERGED | vmstats: remove _diff helper | 2016-02-23 06:41 EST
oVirt gerrit | 53613 | master | MERGED | vmstats: from EAFP to LBYL | 2016-02-23 07:59 EST
oVirt gerrit | 53991 | ovirt-3.6 | MERGED | vmstats: make nic_traffic private | 2016-03-02 04:44 EST
oVirt gerrit | 53992 | ovirt-3.6 | MERGED | vmstats: handle known-missing stats | 2016-03-02 06:04 EST
oVirt gerrit | 53993 | ovirt-3.6 | MERGED | tests: extend coverage for vmstats.disks() | 2016-02-25 10:50 EST
oVirt gerrit | 53994 | ovirt-3.6 | MERGED | tests: improve vmstats.disks coverage | 2016-02-25 10:50 EST
oVirt gerrit | 53995 | ovirt-3.6 | MERGED | virt: stats: make disk_rate more robust | 2016-03-02 06:04 EST
oVirt gerrit | 53996 | ovirt-3.6 | MERGED | virt: stats: make compute_latency more robust | 2016-03-02 06:05 EST
oVirt gerrit | 53997 | ovirt-3.6 | MERGED | virt: stats: make _disk_iops_bytes more robust | 2016-03-02 06:05 EST
oVirt gerrit | 53999 | ovirt-3.6 | MERGED | periodic: ignore known-benign libvirt errors | 2016-03-02 06:05 EST
oVirt gerrit | 54000 | ovirt-3.6 | MERGED | vmstats: remove _diff helper | 2016-03-07 09:11 EST
oVirt gerrit | 54001 | ovirt-3.6 | MERGED | vmstats: from EAFP to LBYL | 2016-03-07 09:11 EST
oVirt gerrit | 54400 | ovirt-3.6 | MERGED | virt: periodic: disable on migration destination | 2016-03-08 06:22 EST
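
Two of the patches listed above are titled "vmstats: from EAFP to LBYL". As a rough illustration of what that change of style can look like (this is not the vdsm code; the function names and the stat key are hypothetical), a stats helper can either attempt a lookup and handle the failure, or check for the key first and skip it quietly:

```python
import logging

log = logging.getLogger("vmstats_sketch")  # hypothetical logger name


def tx_bytes_eafp(sample, nic):
    """EAFP: attempt the lookup, deal with the failure afterwards."""
    try:
        return sample["%s.tx_bytes" % nic]  # hypothetical key layout
    except KeyError:
        # Logging a traceback for an expected gap is what creates log noise.
        log.exception("missing stat for %s", nic)
        return None


def tx_bytes_lbyl(sample, nic):
    """LBYL: check first, so a known-missing stat is skipped silently."""
    key = "%s.tx_bytes" % nic
    if key not in sample:
        return None
    return sample[key]


sample = {"vnet0.tx_bytes": 1024}
present = tx_bytes_lbyl(sample, "vnet0")   # key exists
missing = tx_bytes_lbyl(sample, "vnet1")   # key absent, no log output
print(present, missing)
```

Both helpers return the same values; the difference is only whether a missing sample produces a traceback in the log.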

Description Michael Burman 2016-01-18 08:34:05 EST
Description of problem:
The vdsm log shows an "Unhandled exception in <NumaInfoMonitor" error while migrating VMs in a 3.6 cluster.

The error appears in vdsm.log every few migration attempts.
The migration itself finishes successfully.

periodic/1::ERROR::2016-01-18 15:21:13,290::executor::188::Executor::(_execute_task) Unhandled exception in <NumaInfoMonitor vm=7ebb5925-e76f-4d2f-a148-7671c313fe84 at 0x3312610>
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 186, in _execute_task
    callable()
  File "/usr/share/vdsm/virt/periodic.py", line 279, in __call__
    self._execute()
  File "/usr/share/vdsm/virt/periodic.py", line 324, in _execute
    self._vm.updateNumaInfo()
  File "/usr/share/vdsm/virt/vm.py", line 5071, in updateNumaInfo
    self._numaInfo = numaUtils.getVmNumaNodeRuntimeInfo(self)
  File "/usr/share/vdsm/numaUtils.py", line 106, in getVmNumaNodeRuntimeInfo
    _get_vcpu_positioning(vm))
  File "/usr/share/vdsm/numaUtils.py", line 129, in _get_vcpu_positioning
    return vm._dom.vcpus()[0]
  File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 124, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2751, in vcpus
    if ret == -1: raise libvirtError ('virDomainGetVcpus() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid '7ebb5925-e76f-4d2f-a148-7671c313fe84' (v2)


periodic/2::ERROR::2016-01-18 15:21:45,927::executor::188::Executor::(_execute_task) Unhandled exception in <NumaInfoMonitor vm=404f96db-b224-4163-a21e-eeb8eb084d7b at 0x7f6298353310>
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 186, in _execute_task
    callable()
  File "/usr/share/vdsm/virt/periodic.py", line 279, in __call__
    self._execute()
  File "/usr/share/vdsm/virt/periodic.py", line 324, in _execute
    self._vm.updateNumaInfo()
  File "/usr/share/vdsm/virt/vm.py", line 5071, in updateNumaInfo
    self._numaInfo = numaUtils.getVmNumaNodeRuntimeInfo(self)
  File "/usr/share/vdsm/numaUtils.py", line 106, in getVmNumaNodeRuntimeInfo
    _get_vcpu_positioning(vm))
  File "/usr/share/vdsm/numaUtils.py", line 129, in _get_vcpu_positioning
    return vm._dom.vcpus()[0]
  File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 124, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2751, in vcpus
    if ret == -1: raise libvirtError ('virDomainGetVcpus() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid '404f96db-b224-4163-a21e-eeb8eb084d7b' (v4)

Version-Release number of selected component (if applicable):
3.6.2.5-0.1.el6
vdsm-4.17.17-0.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.5.x86_64
libvirt-1.2.17-13.el7_2.2.x86_64

How reproducible:
60-85

Steps to Reproduce:
1. Migrate a VM between 2 servers in a 3.6 cluster

Actual results:
Every few migration attempts, an error appears in the vdsm log

Expected results:
Errors shouldn't spam the vdsm log
Comment 1 Francesco Romani 2016-01-29 09:32:56 EST
This is caused by a benign race between the migration thread and the periodic monitoring thread. It is just noise, but I am working on a patch.
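
The race can be pictured with a small, deterministic sketch. It uses no libvirt at all: `FakeDomain`, `DomainNotFound` and `numa_monitor` are hypothetical stand-ins, and the steps run sequentially rather than in real threads, so this only illustrates the ordering that produces the traceback in the description.

```python
class DomainNotFound(Exception):
    """Hypothetical stand-in for the libvirtError 'Domain not found'."""


class FakeDomain:
    """Hypothetical stand-in for the libvirt domain on the source host."""

    def __init__(self, uuid):
        self.uuid = uuid
        self.present = True

    def vcpus(self):
        if not self.present:
            raise DomainNotFound(
                "no domain with matching uuid '%s'" % self.uuid)
        # Placeholder payload; only the two-element shape matters here.
        return (["vcpu info"], ["cpu maps"])


def numa_monitor(dom):
    # Same shape as the failing call in the traceback: vcpus()[0].
    return dom.vcpus()[0]


dom = FakeDomain("7ebb5925-e76f-4d2f-a148-7671c313fe84")

# 1. The periodic monitor runs while the VM is still on the source host.
first = numa_monitor(dom)

# 2. The migration completes and the domain disappears from the source.
dom.present = False

# 3. A monitor run that was already scheduled fires afterwards and fails.
try:
    numa_monitor(dom)
    outcome = "ok"
except DomainNotFound as exc:
    outcome = "benign: %s" % exc

print(outcome)
```

Nothing is wrong with the VM in step 3; the monitor simply asked about a domain that had legitimately moved away.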
Comment 2 Francesco Romani 2016-01-29 10:56:58 EST
The patch is supposed not only to fix the specific error described here but also to swallow other benign errors like this one.
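
A minimal sketch of the "swallow known-benign errors" idea, under stated assumptions: `FakeLibvirtError` and `run_quietly` are hypothetical names, and the exception class is a stand-in so the example runs without libvirt installed. With the real bindings, the equivalent check would compare `libvirtError.get_error_code()` against `libvirt.VIR_ERR_NO_DOMAIN`. This is not the merged vdsm patch.

```python
import logging

log = logging.getLogger("periodic_sketch")  # hypothetical logger name

# Numeric value of libvirt's VIR_ERR_NO_DOMAIN ("Domain not found").
VIR_ERR_NO_DOMAIN = 42

# Error codes treated as expected during migration (assumption for this sketch).
BENIGN_CODES = frozenset([VIR_ERR_NO_DOMAIN])


class FakeLibvirtError(Exception):
    """Stand-in for libvirt.libvirtError; mimics get_error_code()."""

    def __init__(self, message, code):
        super().__init__(message)
        self._code = code

    def get_error_code(self):
        return self._code


def run_quietly(operation):
    """Run a periodic operation, swallowing only known-benign errors.

    Returns True if the operation completed, False if it was skipped.
    """
    try:
        operation()
    except FakeLibvirtError as exc:
        if exc.get_error_code() in BENIGN_CODES:
            log.debug("ignoring benign error: %s", exc)
            return False
        raise  # anything unexpected must still surface
    return True


def vanished():
    raise FakeLibvirtError("Domain not found", VIR_ERR_NO_DOMAIN)


print(run_quietly(vanished))
```

Keeping the list of ignored codes short and explicit matters: every code added to it is an error the log will no longer report.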
Comment 3 Red Hat Bugzilla Rules Engine 2016-01-31 17:47:34 EST
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.
Comment 4 Red Hat Bugzilla Rules Engine 2016-01-31 17:47:34 EST
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
Comment 5 Francesco Romani 2016-02-11 08:10:56 EST
Not yet MODIFIED: we need more patches, and we need backports to 3.6.
Comment 6 Francesco Romani 2016-03-08 07:48:19 EST
http://gerrit.ovirt.org/53056 is not critical for fixing this, and it can actually hide some bugs. Everything else is merged and backported, hence moving to MODIFIED.
Comment 7 Francesco Romani 2016-03-08 07:48:57 EST
I don't think this BZ requires doc_text. The user should just see less noise in the logs.
Comment 8 Michael Burman 2016-03-31 02:30:37 EDT
Verified on 3.6.5-0.1.el6 and vdsm-4.17.25-0.el7ev.noarch
