Bug 1369418
Summary: | [z-stream clone - 4.0.3] [InClusterUpgrade] Possible race condition with large amount of VMs in cluster | ||
---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | rhev-integ |
Component: | ovirt-engine | Assignee: | Arik <ahadas> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | sefi litmanovich <slitmano> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 3.6.8 | CC: | achareka, ahadas, gklein, jcoscia, kshukla, lsurette, mavital, mgoldboi, michal.skrivanek, mlibra, rbalakri, rgolan, Rhev-m-bugs, sbonazzo, srevivo, tjelinek, ykaul |
Target Milestone: | ovirt-4.0.3 | Keywords: | ZStream |
Target Release: | 4.0.3 | ||
Hardware: | Unspecified | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Cause:
Updating the compatibility version of a cluster with many running VMs that have the guest agent installed could lead to a deadlock that caused the update to fail.
Consequence:
In some cases, such clusters could not be upgraded to a newer compatibility version.
Fix:
The deadlock in the database is now prevented.
Result:
Clusters with many running VMs that have the guest agent installed can be upgraded to a newer compatibility version.
|
Story Points: | --- |
Clone Of: | 1366786 | Environment: | |
Last Closed: | 2016-09-15 08:06:42 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1366786 | ||
Bug Blocks: |
Comment 2
Sandro Bonazzola
2016-08-25 11:55:27 UTC
Verified with rhevm-4.0.3-0.1.el7ev.noarch.
Had a cluster with 144 VMs running. Set the cluster to InClusterUpgrade; this change didn't invoke an update of the VMs' configuration. Then changed the cluster compatibility version from 3.6 to 4.0 (hosts were 4.0 the whole time) and monitored the updateVm calls with tail on the engine log. After it finished, checked in the DB that all the VMs were indeed updated.
Repeated this several times, each time setting the cluster compatibility back to 3.6 via the DB and the VMs' custom compatibility version in vm_static from '3.6' to null (a bit of "cheating" there). Ran the upgrade 5 times; no race occurred.
Please advise if this test isn't sufficient.

Didn't mention in comment 3, but all the VMs had rhevm-guest-agent running.

(In reply to sefi litmanovich from comment #3)
This test is good.
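
For anyone repeating the verification, the "check in DB" and reset steps described above could be sketched roughly as follows. This is only an illustration under assumptions: it uses an in-memory SQLite table as a stand-in for the engine database, the column names `vm_guid`, `cluster_id` and `custom_compatibility_version` are assumed rather than taken from this report, and it assumes that an "updated" running VM is one whose custom compatibility version in vm_static was set to '3.6' (inferred from the reset step in the comment).

```python
import sqlite3


def make_demo_db(vm_count, cluster_id="cluster-1"):
    """Build an in-memory stand-in for vm_static (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vm_static ("
        " vm_guid TEXT PRIMARY KEY,"
        " cluster_id TEXT NOT NULL,"
        " custom_compatibility_version TEXT)"  # assumed column name
    )
    conn.executemany(
        "INSERT INTO vm_static VALUES (?, ?, NULL)",
        [(f"vm-{i}", cluster_id) for i in range(vm_count)],
    )
    conn.commit()
    return conn


def count_not_updated(conn, cluster_id, expected_version):
    """Count VMs in the cluster whose custom version is not the expected one."""
    row = conn.execute(
        "SELECT COUNT(*) FROM vm_static"
        " WHERE cluster_id = ?"
        " AND (custom_compatibility_version IS NULL"
        "      OR custom_compatibility_version <> ?)",
        (cluster_id, expected_version),
    ).fetchone()
    return row[0]


def reset_custom_version(conn, cluster_id):
    """Set the custom version back to NULL, as done between test runs."""
    conn.execute(
        "UPDATE vm_static SET custom_compatibility_version = NULL"
        " WHERE cluster_id = ?",
        (cluster_id,),
    )
    conn.commit()


if __name__ == "__main__":
    conn = make_demo_db(144)
    # Simulate an upgrade that updated every VM in the cluster.
    conn.execute("UPDATE vm_static SET custom_compatibility_version = '3.6'")
    print("VMs not updated:", count_not_updated(conn, "cluster-1", "3.6"))
    reset_custom_version(conn, "cluster-1")
```

A count of zero after each upgrade run would correspond to the "all the VMs were updated" observation; against a real engine database the same queries would need to be adapted to the actual schema.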