|Summary:||removing several vm pools together may fail|
|Product:||[oVirt] ovirt-engine||Reporter:||sefi litmanovich <slitmano>|
|Component:||BLL.Virt||Assignee:||Shmuel Melamud <smelamud>|
|Status:||CLOSED WORKSFORME||QA Contact:||meital avital <mavital>|
|Version:||4.1.0||CC:||bugs, shavivi, tjelinek|
|Last Closed:||2017-02-01 12:36:21 UTC||Type:||Bug|
|oVirt Team:||Virt|
Description sefi litmanovich 2017-01-26 09:30:39 UTC
Created attachment 1244653: engine log, vdsm logs from both hosts, engine-backup file

Description of problem:
When several VM pools exist in the system and we attempt to remove them at the same time, at least some of the removals fail, leaving some VMs in the system, sometimes detached from the pool and sometimes still attached. We hit this in our automation, and when I tried to reproduce it I was able to do so 3 times with the scenario below.

Steps to Reproduce:
1. Have a pool (auto, stateless) with 5 VMs, 3 of them prestarted and running.
2. Have a second pool with 3 VMs, not running.
3. Invoke removal of both pools asynchronously.
4. Immediately create a new pool.

Actual results:
At least one of the pools (in all attempts, the first pool for sure) fails to complete the remove VM pool action, leaving a VM or two detached. In one attempt it left the remove VM pool task stuck in the job table in STARTED status (attaching a DB dump of the system with this task, created with the engine-backup tool).

Expected results:
Both pools are removed successfully.

I'm not sure step 4 is required; the failure might also occur when the system is loaded with other pool-related tasks. If needed I can try to create more scenarios, but this one has reproduced the issue so far.

Version-Release number of selected component (if applicable):
rhevm-22.214.171.124-0.2.el7.noarch

How reproducible:
Not 100%, but most of the time.

Additional info:
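For anyone trying to script steps 3 and 4 of the description, the timing can be sketched roughly as follows. This is only an illustration of the ordering (concurrent removals, then an immediate create), not the automation that originally hit the bug. `remove_pool` and `create_pool` are hypothetical callables that the caller must supply, for example thin wrappers around whatever API client is in use; the function name and pool names are likewise made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def remove_pools_concurrently(remove_pool, create_pool, pools_to_remove, new_pool_name):
    """Submit all pool removals at once, then create a new pool right away.

    remove_pool and create_pool are hypothetical caller-supplied callables;
    they are assumptions for this sketch and not part of any real library.
    Returns a dict mapping each pool name to None if its removal call
    returned normally, or to the exception that the call raised.
    """
    results = {}
    workers = max(1, len(pools_to_remove))
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # Step 3: submit every removal without waiting for the previous one.
        futures = {name: executor.submit(remove_pool, name) for name in pools_to_remove}
        # Step 4: create a new pool while the removals may still be running.
        create_pool(new_pool_name)
        # Collect the outcome of each removal; exception() waits for completion.
        for name, future in futures.items():
            results[name] = future.exception()
    return results
```

A removal call returning without an exception only means the request was accepted by the wrapper; whether the pool and its VMs are actually gone still has to be checked separately, since the failure described above shows up after the action has started.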
Comment 1 Red Hat Bugzilla Rules Engine 2017-02-01 10:32:28 UTC
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
Comment 2 Shahar Havivi 2017-02-01 12:36:21 UTC
We cannot reproduce the error on 4.1. If you encounter a new flow that causes the race, open a new bug with the appropriate steps to reproduce.