Red Hat Bugzilla – Bug 1266579
DestroyVDSCommand times out when hypervisor is under load
Last modified: 2015-11-05 06:24:37 EST
Description of problem:
Under load, DestroyVDSCommand takes significantly longer and may exceed
vdsTimeout; as a result, the HV/SPM is set to non-responsive.
Version-Release number of selected component (if applicable):
How reproducible:
Difficult; it happens in an environment that is always busy.
Steps to Reproduce:
Not clear yet
Actual results:
TimeoutException in engine.log
Hypervisor / SPM is marked as non-operational
Expected results:
DestroyVDSCommand must return in a timely fashion, even under load
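To make the failure mode in the description concrete, here is a minimal Python sketch of a caller that gives up on a slow call after a fixed timeout and then treats the host as unreachable. It is not the engine's actual code: `VDS_TIMEOUT`, `destroy_vm`, `call_with_timeout` and the returned status strings are hypothetical stand-ins chosen for illustration only.

```python
import concurrent.futures
import time

VDS_TIMEOUT = 0.2  # seconds; hypothetical stand-in for vdsTimeout


def destroy_vm(duration):
    """Hypothetical stand-in for the host-side destroy call."""
    time.sleep(duration)  # simulates work that slows down under load
    return "Done"


def call_with_timeout(duration, timeout=VDS_TIMEOUT):
    """Return the host status a caller would infer from a single call."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(destroy_vm, duration)
        try:
            future.result(timeout=timeout)
            return "Up"
        except concurrent.futures.TimeoutError:
            # The call is still running on the "host"; the caller only
            # knows that no reply arrived within the timeout.
            return "NonResponsive"
```

In this sketch `call_with_timeout(0.01)` reports the host as up, while `call_with_timeout(0.5)` reports it as non-responsive even though the slow call eventually completes, which is the situation the expected result asks to avoid.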
Can you specify the nature of the load? How many VMs? What is the CPU consumption? How many host CPUs? How loaded is the management network and its underlying NIC?
Would the customer be willing to test the fix for bug 1247075? Its task-setting (pinning) of Vdsm to a single CPU is reported to improve Vdsm responsiveness.
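For readers unfamiliar with pinning a process to a single CPU, the idea can be sketched with Python's `os.sched_setaffinity` (available on Linux only). This is not the patch from bug 1247075, just an illustration of the mechanism; `pin_to_single_cpu` is a hypothetical helper name.

```python
import os


def pin_to_single_cpu(pid=0):
    """Pin a process (0 means the caller) to one CPU it may already use.

    Returns the CPU number that was chosen.
    """
    allowed = os.sched_getaffinity(pid)  # set of CPUs currently permitted
    target = min(allowed)                # pick the lowest-numbered one
    os.sched_setaffinity(pid, {target})  # restrict the process to it
    return target
```

The same effect can be applied to an already running process from the shell with the `taskset` utility.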
To give an idea of the load:
This environment is API-driven, with many templates and VMs being created. Between the start and finish of the job in VDSM, many vmGetStats calls were seen, along with multiple disk creation activities.
The load on the hypervisors is not alarming for a dual-CPU, 8-core/16-thread system. The networks are in use, but to my knowledge the timeouts are not network related.
Unsure if they can test the patch easily.
*** This bug has been marked as a duplicate of bug 1270220 ***