+++ This bug was initially created as a clone of Bug #1099068 +++
Description of problem:
Stalling calls to VDSM from within a monitoring cycle might delay other important monitoring work, such as storage domain monitoring.
Version-Release number of selected component (if applicable):
always - e.g. when a VM was shut down and VURTI needs to send a destroy to VDSM
and the call stalls, the whole VURTI thread is stuck
Steps to Reproduce:
1. create a timeout in the destroy call and see that domain monitoring isn't being called in the meantime
other calls to VDSM can't be made while the VdsManager lock is held and 1 out of 2 connections to VDSM is not available
The VURTI thread shouldn't stall on calls to VDSM for VM-related work.
VURTI shall contain only VDS-related logic and thus won't need to call VDSM for other VM-related calls.
The VdsManager lock should be free while VDSM calls are in progress and not yet complete (i.e. throughout the lifetime of the network use).
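To illustrate the expected behavior, here is a minimal Java sketch, not the actual ovirt-engine code. All names (NonBlockingMonitorSketch, hostLock, slowDestroy, monitoringCycle, domainMonitoringCanRun) are hypothetical, and the stalled VDSM call is simulated with a sleep. The per-host lock is held only for local bookkeeping and released before the network call starts, so another monitoring task can still take the lock while the destroy is in flight.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: none of these names are real ovirt-engine classes.
public class NonBlockingMonitorSketch {
    // Stand-in for the per-host VdsManager lock.
    private final ReentrantLock hostLock = new ReentrantLock();
    // Separate pool so a stalled call never occupies the monitoring thread.
    private final ExecutorService vdsmCalls = Executors.newCachedThreadPool();
    private String lastDestroyResult = "pending";

    // Simulates a slow or stalled VDSM destroy call (assumption: sleep = network stall).
    private String slowDestroy(String vmId, long delayMs) {
        try {
            Thread.sleep(delayMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "destroyed " + vmId;
    }

    // One monitoring cycle: take the lock only for local state, never across the network call.
    public CompletableFuture<Void> monitoringCycle(String vmId, long delayMs) {
        hostLock.lock();
        try {
            // Host-only bookkeeping would go here.
        } finally {
            hostLock.unlock(); // released before the VDSM call starts
        }
        return CompletableFuture
                .supplyAsync(() -> slowDestroy(vmId, delayMs), vdsmCalls)
                .thenAccept(result -> {
                    // Re-acquire the lock briefly to record the outcome.
                    hostLock.lock();
                    try {
                        lastDestroyResult = result;
                    } finally {
                        hostLock.unlock();
                    }
                });
    }

    // Stand-in for storage domain monitoring: it only needs the lock to be obtainable.
    public boolean domainMonitoringCanRun(long timeoutMs) throws InterruptedException {
        if (hostLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
            try {
                return true;
            } finally {
                hostLock.unlock();
            }
        }
        return false;
    }

    public String lastDestroyResult() {
        hostLock.lock();
        try {
            return lastDestroyResult;
        } finally {
            hostLock.unlock();
        }
    }

    public void shutdown() {
        vdsmCalls.shutdown();
    }
}
```

In this sketch, calling monitoringCycle with a long delay and then domainMonitoringCanRun right away should succeed, because the lock is not held during the simulated stall. The real fix may be structured differently; this only shows the locking pattern described above.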
this is definitely not recommended to push to 3.5.z
(In reply to Michal Skrivanek from comment #5)
> done upstream.
> this is definitely not recommended to push to 3.5.z
Michal - I don't see any patch references so I can't tell myself, but is this included in the last build delivered to QE?
if you mean 3.6 then yes.
I nack 3.5.z backport
Based on this comment:
(In reply to Michal Skrivanek from comment #7)
> if you mean 3.6 then yes.
Moving to ON_QA
> I nack 3.5.z backport
Removing 3.5.z flag
I accidentally removed the needinfo flag from yobshans.
New flag created. Please look at comment 12.
Unfortunately, we cannot reproduce this bug using the regular RHEV-M scale setup and load test. We ran load tests with 100 concurrent threads performing the REST API calls ShutdownVM and StartVM. No storage-related errors were detected during test execution.
RHEV-M setup: 1 Data Center, 1 Cluster, 1/2 Hosts, 10 Storage Domains, 100 VMs.
Storage Domain is NFS.
You need to provide a clearer scenario for how to reproduce it
(possibly from customer experience).
any updates? see comment 14
(In reply to Eldad Marciano from comment #15)
> any updates? see comment 14
I don't have a clearer scenario.
If we can't reproduce, I suggest closing based on the work done in 3.6.0.
documented in bug 1099068
This bz is verified based on the verification results of bz #1099068
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.