Bug 1098763 - Mass migration of 1500 VM's caused VDSErrorException: Failed to GetAllVmStatsVDS
Summary: Mass migration of 1500 VM's caused VDSErrorException: Failed to GetAllVmStatsVDS
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.4.0
Hardware: x86_64
OS: Linux
unspecified
medium
Target Milestone: ---
: ---
Assignee: Francesco Romani
QA Contact:
URL:
Whiteboard: virt
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2014-05-18 08:00 UTC by Yuri Obshansky
Modified: 2014-05-21 12:12 UTC (History)
9 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-05-21 12:12:49 UTC
oVirt Team: ---
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Log files (1.40 MB, application/x-gzip)
2014-05-20 14:04 UTC, Yuri Obshansky
no flags Details
libvirtd.log.gz (205.47 KB, application/x-gzip)
2014-05-20 14:05 UTC, Yuri Obshansky
no flags Details
vdsm.log (459.93 KB, application/x-gzip)
2014-05-20 14:06 UTC, Yuri Obshansky
no flags Details

Description Yuri Obshansky 2014-05-18 08:00:25 UTC
Description of problem:
Many VDSErrorException occurs during mass VMs migration which are running on fake hosts. 

Version-Release number of selected component (if applicable):
RHEVM 3.4.0-0.16.rc.el6ev
OS Version: RHEL - 6Server - 6.5.0.1.el6
Kernel Version: 2.6.32 - 431.5.1.el6.x86_64
KVM Version: 0.12.1.2 - 2.415.el6_5.6
LIBVIRT Version: libvirt-0.10.2-29.el6_5.5
VDSM Version: vdsm-4.14.7-0.2.rc.el6ev

How reproducible:
Environment: 2 Data Center (Real and Fake),
2 NFS Storage Domains (Real and Fake),
4 real hosts (24 CPU, 64 G),
200 real VMs,
200 fake hosts,
1500 fake VMs.
Perform JMeter script with 100 threads for migrate fake VMS
VDSErrorException occurs during mass VMs migration.

Steps to Reproduce:
1.
2.
3.

Actual results:
1.
2014-05-14 07:48:26,966 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (org.ovirt.thread.pool-4-thread-11) [390924d3] Failed in MigrateStatusVDS method
2014-05-14 07:48:26,967 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (org.ovirt.thread.pool-4-thread-11) [390924d3] Command org.ovirt.engine.core.vdsbroker.vdsbro
ker.MigrateStatusVDSCommand return value
 StatusOnlyReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=12, mMessage=Fatal error during migration]]
2014-05-14 07:48:26,967 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (org.ovirt.thread.pool-4-thread-11) [390924d3] HostName = fake_host_69
2014-05-14 07:48:26,967 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (org.ovirt.thread.pool-4-thread-11) [390924d3] Command MigrateStatusVDSCommand(HostName = fake_host_69, HostId = c2d01b2e-e1bd-4233-af26-379a186a7e71, vmId=098bdb18-cc21-4e7a-9838-df512a5609a6) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to MigrateStatusVDS, error = Fatal error during migration, code = 12
2014-05-14 07:48:26,967 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (org.ovirt.thread.pool-4-thread-11) [390924d3] FINISH, MigrateStatusVDSCommand, log id: 7811c0df
2014-05-14 07:48:26,972 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-4-thread-11) [390924d3] Correlation ID: 7bbf5060, Job ID: 874a688b-9f0e-409f-bb1c-c0794745596e, Call Stack: null, Custom Event ID: -1, Message: Migration failed due to Error: Fatal error during migration. Trying to migrate to another Host (VM: fake_490, Source: fake_host_69, Destination: fake_host_105).

2.
2014-05-14 08:02:05,865 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (DefaultQuartzScheduler_Worker-28) [34065ecb] START, DestroyVDSCommand(HostName = fake_host_32, HostId = 78b0e5a2-08df-43f5-8004-67285ce4a2c2, vmId=209701d0-6e97-4cf6-ab92-ba4dfe410686, force=false, secondsToWait=0, gracefully=false), log id: 316fcb34
2014-05-14 08:02:05,866 WARN [org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil] (org.ovirt.thread.pool-4-thread-13) Executing a command: java.util.concurrent.FutureTask , but note that there are 16 tasks in the queue.
2014-05-14 08:02:05,873 WARN [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-94) [1d49a2f0] Failed to refresh VDS , vds = 03d530a7-fb23-4131-a85d-54346028ce4b : fake_host_3, error = org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetAllVmStatsVDS, error = Unexpected exception, code = 16, continuing.: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetAllVmStatsVDS, error = Unexpected exception, code = 16
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.createDefaultConcreteException(VdsBrokerCommand.java:61) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.createException(BrokerCommandBase.java:199) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.proceedProxyReturnValue(BrokerCommandBase.java:186) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand.executeVdsBrokerCommand(GetAllVmStatsVDSCommand.java:21) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:96) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:56) [vdsbroker.jar:]
        at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:31) [dal.jar:]
        at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:413) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.fetchRunningVms(VdsUpdateRunTimeInfo.java:954) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.refreshVmStats(VdsUpdateRunTimeInfo.java:910) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.refreshVdsRunTimeInfo(VdsUpdateRunTimeInfo.java:507) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.refresh(VdsUpdateRunTimeInfo.java:344) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsManager.onTimer(VdsManager.java:231) [vdsbroker.jar:]
        at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source) [:1.7.0_55]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_55]
        at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_55]
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:60) [scheduler.jar:]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) [quartz.jar:] 

Expected results:
no exception

Additional info:

Comment 1 Omer Frenkel 2014-05-19 07:37:43 UTC
please attach full engine.log
please attach vdsm.log and libvirtd.log for source and destination hosts for one of the vms that failed migration
please also attach vdsm.log of host fake_host_3 where GetAllVmStats failed

Comment 2 Yuri Obshansky 2014-05-20 14:04:32 UTC
Created attachment 897615 [details]
Log files

I reproduced that issue again and got:
2014-05-20 12:53:40,178 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (DefaultQuartzScheduler_Worker-51) [2ce2411c] Failed in GetAllVmStatsVDS method
2014-05-20 12:53:40,179 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (DefaultQuartzScheduler_Worker-51) [2ce2411c] Command GetAllVmStatsVDSCommand(HostName = fake_host_192, HostId = 8e59d93d-f3fe-4cf5-8231-156d98af0e1a, vds=Host[fake_host_192]) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetAllVmStatsVDS, error = Unexpected exception, code = 16
2014-05-20 12:53:40,179 ERROR [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-51) [2ce2411c] Failed vds listing,  vds = 8e59d93d-f3fe-4cf5-8231-156d98af0e1a : fake_host_192, error = org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetAllVmStatsVDS, error = Unexpected exception, code = 16
2014-05-20 12:53:40,199 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-51) [2ce2411c] Failed to refresh VDS , vds = 8e59d93d-f3fe-4cf5-8231-156d98af0e1a : fake_host_192, error = org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetAllVmStatsVDS, error = Unexpected exception, code = 16, continuing.: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetAllVmStatsVDS, error = Unexpected exception, code = 16
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.createDefaultConcreteException(VdsBrokerCommand.java:61) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.createException(BrokerCommandBase.java:199) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.proceedProxyReturnValue(BrokerCommandBase.java:186) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand.executeVdsBrokerCommand(GetAllVmStatsVDSCommand.java:21) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:96) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:56) [vdsbroker.jar:]
        at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:31) [dal.jar:]
        at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:413) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.fetchRunningVms(VdsUpdateRunTimeInfo.java:954) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.refreshVmStats(VdsUpdateRunTimeInfo.java:910) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.refreshVdsRunTimeInfo(VdsUpdateRunTimeInfo.java:507) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.refresh(VdsUpdateRunTimeInfo.java:344) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsManager.onTimer(VdsManager.java:231) [vdsbroker.jar:]
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) [:1.7.0_55]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_55]
        at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_55]
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:60) [scheduler.jar:]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) [quartz.jar:]

Please, find attached engine.log and vdsm.log and libvirtd.log (from host fake_host_192 which appears in exception)

Comment 3 Yuri Obshansky 2014-05-20 14:05:50 UTC
Created attachment 897617 [details]
libvirtd.log.gz

Comment 4 Yuri Obshansky 2014-05-20 14:06:24 UTC
Created attachment 897618 [details]
vdsm.log

Comment 5 Francesco Romani 2014-05-21 09:57:20 UTC
Looks like you've been hit by
https://bugzilla.redhat.com/show_bug.cgi?id=1098148

which will be included in version 3.4.1

Still looking into the logs to see if there is something more

Comment 6 Francesco Romani 2014-05-21 12:12:49 UTC
Di not find anything else alarming. Feel free to reopen and to point me to anything I may have missed.


Note You need to log in before you can comment on or make changes to this bug.