Bug 1287083 - Engine Adding Disk task shows up as Executing while VDSM shows this task as finished
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Infra
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.0.0-alpha
Assignee: Ravi Nori
QA Contact: Pavel Stehlik
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-12-01 13:27 UTC by Gilad Lazarovich
Modified: 2016-06-23 04:52 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-13 07:36:12 UTC
oVirt Team: Infra
Embargoed:
oourfali: ovirt-4.0.0?
glazarov: planning_ack?
glazarov: devel_ack?
glazarov: testing_ack?


Attachments:
Engine and vdsm logs plus engine DB dump (3.08 MB, application/octet-stream)
2015-12-02 13:01 UTC, Gilad Lazarovich

Description Gilad Lazarovich 2015-12-01 13:27:07 UTC
Description of problem:
The Adding Disk task is stuck on Executing (Creating Volume) in the engine, while the vdsm host (SPM) shows the task as finished.

Version-Release number of selected component (if applicable):
3.6.1-0.2.el6

How reproducible:
2 such tasks showed up in one run

Steps to Reproduce:
In my case, I executed all the storage API Tier 2 cases (for iSCSI, NFS and GlusterFS)

Actual results:
Two tasks show as in progress on the engine while the same tasks show as finished on the vdsm host.

Expected results:
The task state should be in sync between the engine and the vdsm hosts

Additional info:
Here's the output from the SPM host:
[root@lynx09 ~]# vdsClient -s 0 getAllTasks
7425d4d5-4727-4d36-ae39-3eda492472eb :
	 verb = createVolume
	 code = 0
	 state = finished
	 tag = spm
	 result = {'uuid': '2e5d8baa-8347-43c0-9038-5fee759ccbc8'}
	 message = 1 jobs completed successfully
	 id = 7425d4d5-4727-4d36-ae39-3eda492472eb
8e09d7f8-08b3-4977-aab2-f4b54b969575 :
	 verb = createVolume
	 code = 0
	 state = finished
	 tag = spm
	 result = {'uuid': '9cf56973-5821-4ac2-849e-830718db4085'}
	 message = 1 jobs completed successfully
	 id = 8e09d7f8-08b3-4977-aab2-f4b54b969575
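
To make the mismatch easier to check, here is a minimal Python sketch that parses text in the format printed above and lists the tasks that VDSM reports as finished while the engine still treats them as executing. The function names (`parse_all_tasks`, `stale_on_engine`) are hypothetical, and the set of engine-side task IDs is assumed to be collected separately (for example from the engine database); this report does not say how to obtain it.

```python
def parse_all_tasks(text):
    """Parse getAllTasks-style text output into {task_id: {field: value}}.

    Assumes the layout shown in this report: a "<task id> :" header line
    followed by indented "key = value" lines.
    """
    tasks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and " = " not in line:
            # A header line such as "7425d4d5-... :" starts a new task.
            current = line[:-1].strip()
            tasks[current] = {}
        elif " = " in line and current is not None:
            # Split on the first " = " only, so values may contain "=".
            key, value = line.split(" = ", 1)
            tasks[current][key.strip()] = value.strip()
    return tasks


def stale_on_engine(vdsm_tasks, engine_executing_ids):
    """Return sorted IDs the engine lists as executing but VDSM shows as finished.

    engine_executing_ids is a hypothetical input: any iterable of task IDs
    that the engine still displays as Executing.
    """
    return sorted(
        task_id
        for task_id in engine_executing_ids
        if vdsm_tasks.get(task_id, {}).get("state") == "finished"
    )
```

Feeding the output above into `parse_all_tasks` and passing the two task IDs as `engine_executing_ids` would flag both tasks as out of sync.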

See attached logs including the engine's db dump

Comment 1 Oved Ourfali 2015-12-02 06:02:21 UTC
Please provide logs. 
Also, for how long does it remain that way?

Comment 2 Oved Ourfali 2015-12-02 06:10:19 UTC
In addition, under "How reproducible" you need to specify whether it happens in EVERY run. If it happened once and you can't reproduce it, then it doesn't reproduce much.

Comment 3 Gilad Lazarovich 2015-12-02 13:01:15 UTC
Created attachment 1101468
Engine and vdsm logs plus engine DB dump

Comment 4 Gilad Lazarovich 2015-12-02 13:07:22 UTC
Oved, please find the attachment containing the logs and DB dump. It stayed this way for over 24 hours; I had to clean it up manually so I could use the environment for further tests. I ran into 2 such zombie tasks (within an hour) in one 23-hour run. The last time we hit this was a few months back.

Comment 5 Oved Ourfali 2015-12-02 13:13:44 UTC
(In reply to Gilad Lazarovich from comment #4)
> Oved, please find the attachment containing the logs and DB dump.  It stayed
> this way for over 24 hours, I had to manually clean it up so I can use the
> environment for further tests.  I ran into 2 such zombie tasks (within an
> hour) in one 23 hour run. The last time we hit this was a few months back.

So if it reproduces only once every few months, I'm removing the automation blocker and the severity.

Ravi - can you look at the logs and see what you can find?

Comment 6 Red Hat Bugzilla Rules Engine 2015-12-02 13:13:48 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 7 Oved Ourfali 2015-12-02 13:14:41 UTC
Also removing the regression flag.
If it happens this rarely, it might be a race, and it may also have happened before.

Comment 8 Oved Ourfali 2016-01-04 13:54:21 UTC
Gilad - please contact Ravi directly in case this reproduces.
We didn't see anything suspicious in the logs.

Targeting 4.0 for now, since without a live reproduction we can't do much here.

Comment 9 Oved Ourfali 2016-03-13 07:36:12 UTC
Please re-open if this reproduces.

