Bug 1266579 - DestroyVDSCommand times out when hypervisor is under load
Status: CLOSED DUPLICATE of bug 1270220
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.4.5
Hardware: x86_64 Linux
Priority: medium
Severity: medium
Assigned To: Dan Kenigsberg
QA Contact: Aharon Canan
Whiteboard: virt
Reported: 2015-09-25 13:08 EDT by Tim Speetjens
Modified: 2015-11-05 06:24 EST
Doc Type: Bug Fix
Last Closed: 2015-11-05 06:24:37 EST
Type: Bug
Attachments: None

Description Tim Speetjens 2015-09-25 13:08:27 EDT
Description of problem:
Under load, DestroyVDSCommand takes significantly longer and may exceed
vdsTimeout; as a result, the HV/SPM is set to non-responsive.

Version-Release number of selected component (if applicable):
vdsm-4.14.18-4.el6ev.x86_64

How reproducible:
Difficult, happens in an environment that is always busy.

Steps to Reproduce:
Not clear yet

Actual results:
TimeoutException in engine.log
Hypervisor / SPM is marked as non-operational

Expected results:
DestroyVDSCommand must return in a timely fashion, even under load
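
For illustration only, a minimal Python sketch of the failure mode described above: a call that takes longer than the caller's timeout surfaces as a timeout exception on the caller's side. This is not engine or Vdsm code. The names `call_with_timeout` and `fake_destroy` and the durations are assumptions made up for the example, and the short timeout merely stands in for vdsTimeout.

```python
import concurrent.futures
import time


def call_with_timeout(func, timeout_s, *args):
    """Run func(*args) in a worker thread.

    Raises concurrent.futures.TimeoutError if no result arrives
    within timeout_s seconds.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args)
    try:
        return future.result(timeout=timeout_s)
    finally:
        # Do not block the caller on a worker that is still running.
        pool.shutdown(wait=False)


def fake_destroy(duration_s):
    """Hypothetical stand-in for a destroy call that takes duration_s seconds."""
    time.sleep(duration_s)
    return "Done"


if __name__ == "__main__":
    # Fast call: finishes well inside the timeout.
    print(call_with_timeout(fake_destroy, 1.0, 0.01))

    # Slow call (as on a loaded host): the caller gives up first.
    try:
        call_with_timeout(fake_destroy, 0.05, 0.3)
    except concurrent.futures.TimeoutError:
        print("timed out")
```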
Comment 2 Dan Kenigsberg 2015-09-26 09:54:52 EDT
Can you specify the nature of the load? How many VMs? What is the CPU consumption? How many host CPUs? How loaded is the management network and its underlying NIC?

Would the customer be willing to test the fix for bug 1247075? Task-setting Vdsm to a single CPU is reported to improve Vdsm responsiveness.
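
For reference, a minimal Python sketch of what restricting a process to a single CPU looks like on Linux, which is the effect the `taskset` utility has from the shell. It uses `os.sched_getaffinity` and `os.sched_setaffinity` from the standard library. The helper name `pin_to_single_cpu` and the default choice of CPU are assumptions for the example; this is not the code from the fix for bug 1247075.

```python
import os


def pin_to_single_cpu(pid=0, cpu=None):
    """Restrict pid (0 means the calling process) to a single CPU.

    If cpu is None, the lowest-numbered CPU the process is currently
    allowed to run on is used. Returns the resulting affinity set.
    Linux only.
    """
    allowed = os.sched_getaffinity(pid)
    if cpu is None:
        cpu = min(allowed)
    if cpu not in allowed:
        raise ValueError("CPU %d is not in the allowed set %s" % (cpu, sorted(allowed)))
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    print("before:", sorted(os.sched_getaffinity(0)))
    print("after: ", sorted(pin_to_single_cpu()))
```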
Comment 3 Tim Speetjens 2015-09-29 08:14:07 EDT
To have an idea about the load:

This environment is API-driven, with many templates/VMs created. Between the start and finish of the job in VDSM, many vmGetStats calls were seen, along with multiple disk creation activities.

The load on the hypervisors is not alarming for a dual-CPU, 8-core/16-thread system. Networks are used, but the timeouts, to my knowledge, are not network related.

Unsure if they can test the patch easily.
Comment 4 Tim Speetjens 2015-11-05 06:24:37 EST

*** This bug has been marked as a duplicate of bug 1270220 ***