Bug 1293644 - commands with mixed children types (CoCo/AsyncTasks) don't converge
Status: CLOSED CURRENTRELEASE
Product: ovirt-engine
Classification: oVirt
Component: BLL.Infra
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-3.6.5
Target Release: ---
Assigned To: Daniel Erez
QA Contact: Aharon Canan
Depends On:
Blocks:
Reported: 2015-12-22 09:25 EST by Daniel Erez
Modified: 2016-04-21 10:42 EDT
CC List: 14 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-04-21 10:42:39 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
rule-engine: ovirt-3.6.z+
rule-engine: exception+
ylavi: planning_ack+
tnisan: devel_ack+
pstehlik: testing_ack+


Attachments
engine.log (32.81 KB, text/plain)
2016-03-21 06:24 EDT, Ondra Machacek
no flags


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
oVirt gerrit | 50910 | ovirt-engine-3.6 | MERGED | engine : Fix scenario where CoCo child executes Async Tasks | 2015-12-23 12:59 EST
oVirt gerrit | 50911 | ovirt-engine-3.6 | MERGED | engine : Can't remove vm template - the disks are removed and the template stays locked | 2015-12-23 15:08 EST
oVirt gerrit | 50987 | ovirt-engine-3.6 | MERGED | engine : After an command is finished tasks are not cleared and stay in executing status | 2015-12-24 04:15 EST
oVirt gerrit | 50988 | ovirt-engine-3.6 | MERGED | engine : Importing an image from glance never finishes | 2015-12-24 04:17 EST
oVirt gerrit | 50989 | ovirt-engine-3.6.2 | MERGED | engine : Fix scenario where CoCo child executes Async Tasks | 2015-12-27 08:58 EST
oVirt gerrit | 50990 | ovirt-engine-3.6.2 | MERGED | engine : Can't remove vm template - the disks are removed and the template stays locked | 2015-12-27 08:58 EST
oVirt gerrit | 50991 | ovirt-engine-3.6.2 | MERGED | engine : After an command is finished tasks are not cleared and stay in executing status | 2015-12-27 08:59 EST
oVirt gerrit | 50992 | ovirt-engine-3.6.2 | MERGED | engine : Importing an image from glance never finishes | 2015-12-27 08:59 EST

Description Daniel Erez 2015-12-22 09:25:43 EST
Description of problem:
Commands that have both CoCo and async task children don't converge, i.e. they neither complete successfully nor fail.

Version-Release number of selected component (if applicable):
3.6

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with an Image disk and a Cinder disk.
2. Create a snapshot.
3.

Actual results:
The action hangs indefinitely.

Expected results:
Action should complete.

Additional info:
The issue is already solved on master by:
* https://gerrit.ovirt.org/#/c/47489/
* https://gerrit.ovirt.org/#/c/43971/
Comment 2 Ondra Machacek 2016-02-22 12:42:37 EST
After creating a snapshot, I can see the following in the log:

2016-02-22 19:33:52,906 INFO  [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler_Worker-42) [31c311e0] Waiting on child command id: '836f8d38-39cd-449a-ad18-311f576b20f0' type:'CreateCinderSnapshot' of 'CreateAllCinderSnapshots' (id: '8f5e6783-cac4-497e-8ca2-68ad9e182fc3') to complete
2016-02-22 19:34:03,017 INFO  [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler_Worker-81) [31c311e0] Waiting on child command id: '836f8d38-39cd-449a-ad18-311f576b20f0' type:'CreateCinderSnapshot' of 'CreateAllCinderSnapshots' (id: '8f5e6783-cac4-497e-8ca2-68ad9e182fc3') to complete
2016-02-22 19:34:13,110 INFO  [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler_Worker-22) [31c311e0] Waiting on child command id: '836f8d38-39cd-449a-ad18-311f576b20f0' type:'CreateCinderSnapshot' of 'CreateAllCinderSnapshots' (id: '8f5e6783-cac4-497e-8ca2-68ad9e182fc3') to complete
2016-02-22 19:34:23,282 INFO  [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler_Worker-69) [31c311e0] Waiting on child command id: '836f8d38-39cd-449a-ad18-311f576b20f0' type:'CreateCinderSnapshot' of 'CreateAllCinderSnapshots' (id: '8f5e6783-cac4-497e-8ca2-68ad9e182fc3') to complete
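
The repeated polling messages above can be summarized with a short script when triaging a longer engine.log. This is only a minimal sketch, assuming the "Waiting on child command id" line format shown in this comment; the function names and the threshold of 3 are hypothetical choices for illustration, not part of ovirt-engine.

```python
import re
from collections import Counter

# Matches the "Waiting on child command" lines as they appear in this comment.
# The pattern is an assumption based only on the log excerpt above.
WAITING_RE = re.compile(
    r"Waiting on child command id: '(?P<child_id>[0-9a-f-]+)' "
    r"type:'(?P<child_type>\w+)' of '(?P<parent_type>\w+)'"
)


def count_waiting_children(lines):
    """Count how often each (child id, child type) is reported as still pending."""
    counts = Counter()
    for line in lines:
        match = WAITING_RE.search(line)
        if match:
            counts[(match.group("child_id"), match.group("child_type"))] += 1
    return counts


def find_stuck_children(lines, threshold=3):
    """Return children polled at least `threshold` times (arbitrary cut-off)."""
    counts = count_waiting_children(lines)
    return {key: n for key, n in counts.items() if n >= threshold}
```

Feeding the lines of engine.log to find_stuck_children gives a quick list of child commands that keep being polled without ever completing.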
Comment 3 Red Hat Bugzilla Rules Engine 2016-02-22 12:42:44 EST
Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.
Comment 4 Moti Asayag 2016-02-24 02:30:43 EST
Moving back to Daniel Erez who proposed the solution for this bug.
Comment 5 Daniel Erez 2016-02-24 02:45:44 EST
Hi Ondra,

* In which build did you reproduce the issue?
* Is it the exact same issue?
* Can you attach the full logs?

Thanks
Comment 6 Daniel Erez 2016-03-20 07:51:08 EDT
(In reply to Daniel Erez from comment #5)
> Hi Ondra,
> 
> * In which build did you reproduce the issue?
> * Is it the exact same issue?
> * Can you attach the full logs?
> 
> Thanks

Hi Pavel,

Can you please provide information regarding the aforementioned questions?

Thanks!
Comment 7 Ondra Machacek 2016-03-21 06:18:24 EDT
Hi Daniel,

I am very sorry for the late reply; I missed the needinfo.
Today I retested with the latest '3.6.4-1' and the snapshot creation completed successfully, so I am moving this to verified. Sorry once again.
Comment 8 Ondra Machacek 2016-03-21 06:24 EDT
Created attachment 1138527
engine.log

Taking that back: it looks like everything is fine, but it's not. I can't run the VM.
I am unsure if it's the exact same issue. The snapshots/disks status is OK, but the task didn't finish correctly. In the log I see:

2016-03-21 12:17:27,826 ERROR [org.ovirt.engine.core.bll.RunVmCommand] (org.ovirt.thread.pool-6-thread-48) [5c47a1af] Command 'org.ovirt.engine.core.bll.RunVmCommand' failed: null
2016-03-21 12:17:27,826 ERROR [org.ovirt.engine.core.bll.RunVmCommand] (org.ovirt.thread.pool-6-thread-48) [5c47a1af] Exception: java.lang.NullPointerException
        at org.ovirt.engine.core.bll.storage.CinderBroker.updateConnectionInfoForDisk(CinderBroker.java:230) [bll.jar:]
        at org.ovirt.engine.core.bll.RunVmCommandBase.updateCinderDisksConnections(RunVmCommandBase.java:286) [bll.jar:]
        at org.ovirt.engine.core.bll.RunVmCommand.runVm(RunVmCommand.java:258) [bll.jar:]
        at org.ovirt.engine.core.bll.RunVmCommand.perform(RunVmCommand.java:435) [bll.jar:]
        at org.ovirt.engine.core.bll.RunVmCommand.executeVmCommand(RunVmCommand.java:362) [bll.jar:]
        at org.ovirt.engine.core.bll.VmCommand.executeCommand(VmCommand.java:104) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1215) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1359) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1982) [bll.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:174) [utils.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:116) [utils.jar:]
        at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1396) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:378) [bll.jar:]
        at org.ovirt.engine.core.bll.MultipleActionsRunner.executeValidatedCommand(MultipleActionsRunner.java:202) [bll.jar:]
        at org.ovirt.engine.core.bll.MultipleActionsRunner.runCommands(MultipleActionsRunner.java:170) [bll.jar:]
        at org.ovirt.engine.core.bll.SortedMultipleActionsRunnerBase.runCommands(SortedMultipleActionsRunnerBase.java:20) [bll.jar:]
        at org.ovirt.engine.core.bll.MultipleActionsRunner$2.run(MultipleActionsRunner.java:179) [bll.jar:]
        at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:89) [utils.jar:]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_85]
        at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_85]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_85]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_85]
        at java.lang.Thread.run(Thread.java:745) [rt.jar:1.7.0_85]
2016-03-21 12:17:27,872 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-48) [5c47a1af] Correlation ID: 5c47a1af, Job ID: ad82dc56-889b-409e-abcd-62fb86857275, Call Stack: null, Custom Event ID: -1, Message: Failed to run VM vm (User: admin@internal).
2016-03-21 12:17:28,944 INFO  [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler_Worker-76) [5f5ccbc6] Waiting on child command id: 'edf0e7eb-1982-4ba4-9c3f-120306519f44' type:'CreateCinderSnapshot' of 'CreateAllCinderSnapshots' (id: '22173ea0-254a-4b94-ac87-d27ccca85243') to complete
Comment 9 Daniel Erez 2016-03-21 08:58:19 EDT
Hi Ondra,

* Did you reproduce the exact same scenario?
* Can you please attach vdsm logs as well?

Thanks!
Comment 10 Ondra Machacek 2016-03-21 10:49:20 EDT
Yes, I reproduced exactly the same scenario, but it turns out it's an issue with my OpenStack.
I tried with a different Cinder and it worked. Sorry for the confusion; I am closing this BZ.
