Bug 1169879
| Summary: | Can't run VM with error: CanDoAction of action RunVm failed. Reasons:VAR__ACTION__RUN,VAR__TYPE__VM,ACTION_TYPE_FAILED_OBJECT_LOCKED | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Michael Burman <mburman> | ||||||
| Component: | ovirt-engine | Assignee: | Arik <ahadas> | ||||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | sefi litmanovich <slitmano> | ||||||
| Severity: | urgent | Docs Contact: | |||||||
| Priority: | medium | ||||||||
| Version: | 3.5.0 | CC: | gklein, lpeer, lsurette, mavital, mburman, michal.skrivanek, rbalakri, Rhev-m-bugs, sherold, srevivo, ykaul | ||||||
| Target Milestone: | ovirt-3.6.0-rc | Keywords: | ZStream | ||||||
| Target Release: | 3.6.0 | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | |||||||||
| : | 1174815 (view as bug list) | Environment: | |||||||
| Last Closed: | 2016-04-20 01:10:12 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Bug Depends On: | 1172699 | ||||||||
| Bug Blocks: | 1174815 | ||||||||
| Attachments: |
Created attachment 963806 [details]
screenshot
Indeed the VM is locked with an engine lock, but there is no clue in the logs as to how this happened. Restarting the engine will solve this, and the lock can also be cleared using the JMX console, so I'm not sure it's urgent unless we know how this is reproduced. Can you please recall the steps to reproduce this? Also, did this happen to more than one VM?

I have to say that I can't reproduce this issue; so far it has happened with one VM. This VM failed to migrate as part of BZ 1169831. I shut it down and tried to run it again, and it failed to run.

After further analysis, it is now clear why we get into this situation once the client sends multiple requests to run the VM at the same time. We don't understand why that happened, though (it once used to happen because of a bug in the UI).

- We get 5 requests to run the VM.
- The first command locks the VM and the others fail to acquire the lock.
- The can-do-action of the first command succeeds.
- Before running the run commands we sort them (including the ones that didn't acquire the lock or failed in the can-do-action).
- In the sort process we map vm-id to command. Since there are 5 commands with the same vm-id and we iterate over them in the order they were received, each later command replaces the earlier one in the map, so we end up with a single command that failed to acquire the lock, and thus its can-do-action returns false.
- Since the command that acquired the lock does not run, the lock is not released (actually no command will run).

I'll filter the duplicated requests in the input. That solves it, and I believe it will be enough for now, as we really don't know how the duplicated requests were created.

Bug pushed to 3.5.1, removing from 3.5.0 trackers.

Verified. The issue doesn't appear on ovirt 3.6 and isn't reproducible.
Tested with ovirt-engine-3.6.0-0.0.master.20150519172219.git9a2e2b3.el6.noarch
host: rhel 7.1
vdsm-4.17.0-912.git25a063d.el7.noarch
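The failure sequence from the analysis above can be modelled in a few lines. This is only a simplified Python sketch, not the ovirt-engine code: `RunCommand`, `validate`, `pick_runnable`, `run_and_release`, `dedupe` and `simulate` are hypothetical names, and the lock table is a plain dict. It assumes, as the analysis describes, that commands are keyed by vm-id in arrival order so a later duplicate replaces the earlier one.

```python
from dataclasses import dataclass


@dataclass
class RunCommand:
    request_no: int
    vm_id: str
    holds_lock: bool = False
    can_do: bool = False


def validate(commands, locks):
    """Each command tries to lock its VM; only the lock holder passes can-do-action."""
    for cmd in commands:
        if cmd.vm_id not in locks:
            locks[cmd.vm_id] = cmd.request_no
            cmd.holds_lock = True
            cmd.can_do = True


def pick_runnable(commands):
    """Map vm-id -> command in arrival order; a later duplicate overwrites an earlier one."""
    by_vm = {}
    for cmd in commands:
        by_vm[cmd.vm_id] = cmd
    return [cmd for cmd in by_vm.values() if cmd.can_do]


def run_and_release(commands, locks):
    """Assumption: a lock is released only by the command that holds it, once it runs."""
    for cmd in commands:
        if cmd.holds_lock:
            locks.pop(cmd.vm_id, None)


def dedupe(vm_ids):
    """The described fix: drop duplicated vm-ids from the input, keeping the first."""
    return list(dict.fromkeys(vm_ids))


def simulate(vm_ids):
    """Return (number of commands that ran, vm-ids still locked afterwards)."""
    locks = {}
    commands = [RunCommand(i, vm_id) for i, vm_id in enumerate(vm_ids)]
    validate(commands, locks)
    runnable = pick_runnable(commands)
    run_and_release(runnable, locks)
    return len(runnable), sorted(locks)


requests = ["d4e7dcb2-1bc7-4df2-be72-9dbfbce24b0f"] * 5
print(simulate(requests))          # duplicated requests: nothing runs, lock stays
print(simulate(dedupe(requests)))  # filtered input: one command runs, lock released
```

With five identical requests the map ends up holding the last command, which never acquired the lock, so nothing runs and the lock taken by the first command is never released. Filtering the duplicates first leaves a single command that both holds the lock and passes validation.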
Created attachment 963805 [details]
engine log

Description of problem:
Can't run VM with error: CanDoAction of action RunVm failed. Reasons:VAR__ACTION__RUN,VAR__TYPE__VM,ACTION_TYPE_FAILED_OBJECT_LOCKED.

After shutting down the VM and then trying to run it again:
Operation canceled
Error while executing action:
Students_:
Cannot run VM. Related operation is currently in progress. Please try again later.

Looks like the VM is stuck in some task and stays in this state.

engine-10.35.161.37
VM name- Students_
VM id- d4e7dcb2-1bc7-4df2-be72-9dbfbce24b0f
relevant time in logs- 15:42

- Is there a time-out for a VM that fails to run?
- I believe restarting the engine will solve this issue, but I will leave it as it is for investigation.

Version-Release number of selected component (if applicable):
3.5.0-0.22.el6ev

How reproducible:
Not sure this is reproducible.