Bug 988339 - [scale] race - sometimes VM and VDS statuses is not being updated (host stuck in unassigned)
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64 Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.4.0
Assigned To: Roy Golan
QA Contact: Yuri Obshansky
Whiteboard: infra
Keywords: ZStream
Depends On:
Blocks: 1008634 1060700 rhev3.4beta 1142926
Reported: 2013-07-25 07:15 EDT by Pavel Zhukov
Modified: 2017-07-03 10:04 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, when a host was stuck in an unassigned state, it could also cause virtual machines on other hosts to stop updating their status. This update adds a concurrent hash map for the internal event queue, which fixes this issue.
Story Points: ---
Clone Of:
Clones: 1060700
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
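
The Doc Text field above says the fix adds a concurrent hash map for the internal event queue. The following is only a minimal sketch of that general technique, not the actual ovirt-engine patch: the class name EventQueueRegistry, its methods, and the "host-1" key are hypothetical names chosen for illustration. It keeps one queue per key in a ConcurrentHashMap so that threads submitting events at the same time cannot corrupt the map or lose a queue created by another thread.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical registry: one event queue per key (for example a host ID).
final class EventQueueRegistry {
    // ConcurrentHashMap makes lookups and inserts safe without external locking.
    private final Map<String, Queue<String>> queues = new ConcurrentHashMap<>();

    // Atomically create the queue for this key if it is missing, then enqueue.
    void submit(String key, String event) {
        queues.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(event);
    }

    // Number of events waiting for this key; 0 if the key was never used.
    int pending(String key) {
        Queue<String> q = queues.get(key);
        return q == null ? 0 : q.size();
    }

    // Submit events from several threads at once and return how many arrived.
    static int demo(int threads, int eventsPerThread) {
        EventQueueRegistry registry = new EventQueueRegistry();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await(); // release all workers together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < eventsPerThread; i++) {
                    registry.submit("host-1", "status-refresh");
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return registry.pending("host-1");
    }
}
```

With a plain HashMap in place of the ConcurrentHashMap, two threads that see a missing key at the same moment could each install their own queue, and events placed on the queue that loses would never be processed. That is one way a status update could be lost; whether it matches the exact failure in this bug is an assumption.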


Attachments
eventq.btm (606 bytes, text/plain)
2013-07-28 08:12 EDT, Roy Golan


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 438093 None None None Never

Description Pavel Zhukov 2013-07-25 07:15:43 EDT
Description of problem:
After some manipulation of the hosts, one of them went into the "Unassigned" state and stayed there for a long time (more than 20 hours). The statuses of the VMs on all _other_ hosts are not being updated: a VM can be launched without errors reported by the host or the engine, but its status stays at 9 ("waiting for launch"). VMs can be powered off and launched again (the status changes from 0 to 9 and back, and run_on_vds changes as well). The free memory of the host is not being updated either.

Version-Release number of selected component (if applicable):
rhevm-3.2.1-0.39.el6ev.noarch

How reproducible:
Unknown. 2 systems are affected


Actual results:
One host is in Unassigned mode. 
Newly started VMs are shown in "Waiting for launch" status but are actually up and running.
Comment 10 Roy Golan 2013-07-28 08:12:09 EDT
Created attachment 779325
eventq.btm
Comment 23 Yair Zaslavsky 2013-08-20 08:05:13 EDT
Still needs to be investigated, postponing to 3.2.4
Comment 24 Barak 2013-09-16 07:41:43 EDT
This bug is about patch
Comment 26 Barak 2013-09-17 08:12:16 EDT
The patch was accepted upstream a long time ago and is already in 3.3.
I would like to test this scenario as part of the scale testing for 3.3,
hence moving to ON_QA.
Comment 27 Charlie 2013-11-27 19:13:59 EST
This bug is currently attached to errata RHEA-2013:15231. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.
Comment 28 Shai Revivo 2014-01-15 07:14:22 EST
QE is unable to verify this scale bug for 3.3.
It will be verified in 3.4.
Comment 30 Barak 2014-02-03 06:58:28 EST
Added a 3.3.z flag to test it for 3.3.zstream
Comment 33 Eldad Marciano 2014-05-13 10:19:00 EDT
How can the bug be reproduced?
Comment 34 Eldad Marciano 2014-06-05 09:18:27 EDT
Tested on 3.4 (latest), 3.4.0-0.21.el6ev

- Created 37 hosts.
- Ran deactivate and activate at high frequency.
- Hosts were unassigned for 2-3 minutes and then returned to status OK.
Comment 35 Itamar Heim 2014-06-12 10:06:42 EDT
Closing as part of 3.4.0
