Bug 1287083 - Engine Adding Disk task shows up as Executing while VDSM shows this task as finished
Status: CLOSED INSUFFICIENT_DATA
Product: ovirt-engine
Classification: oVirt
Component: BLL.Infra
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.0.0-alpha
Target Release: ---
Assigned To: Ravi Nori
QA Contact: Pavel Stehlik
Keywords: Automation
Depends On:
Blocks:
 
Reported: 2015-12-01 08:27 EST by Gilad Lazarovich
Modified: 2016-06-23 00:52 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-03-13 03:36:12 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
oourfali: ovirt-4.0.0?
glazarov: planning_ack?
glazarov: devel_ack?
glazarov: testing_ack?


Attachments
Engine and vdsm logs plus engine DB dump (3.08 MB, application/octet-stream)
2015-12-02 08:01 EST, Gilad Lazarovich

Description Gilad Lazarovich 2015-12-01 08:27:07 EST
Description of problem:
The Adding Disk task is stuck on Executing (Creating Volume) in the engine, while the vdsm host (SPM) shows the task as finished.

Version-Release number of selected component (if applicable):
3.6.1-0.2.el6

How reproducible:
2 such tasks showed up in one run

Steps to Reproduce:
In my case, I executed all the storage API Tier 2 cases (for iSCSI, NFS and GlusterFS)

Actual results:
2 tasks show up as in progress on the engine while they show up as finished on the vdsm host

Expected results:
The task state should be in sync between the engine and the vdsm hosts

Additional info:
Here's the output from the SPM host:
[root@lynx09 ~]# vdsClient -s 0 getAllTasks
7425d4d5-4727-4d36-ae39-3eda492472eb :
	 verb = createVolume
	 code = 0
	 state = finished
	 tag = spm
	 result = {'uuid': '2e5d8baa-8347-43c0-9038-5fee759ccbc8'}
	 message = 1 jobs completed successfully
	 id = 7425d4d5-4727-4d36-ae39-3eda492472eb
8e09d7f8-08b3-4977-aab2-f4b54b969575 :
	 verb = createVolume
	 code = 0
	 state = finished
	 tag = spm
	 result = {'uuid': '9cf56973-5821-4ac2-849e-830718db4085'}
	 message = 1 jobs completed successfully
	 id = 8e09d7f8-08b3-4977-aab2-f4b54b969575

See attached logs including the engine's db dump
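
For anyone who wants to check for this mismatch on their own setup, here is a minimal sketch (not part of the original report) that parses text shaped like the getAllTasks output above and lists the tasks that VDSM reports as finished while the engine still considers them executing. The function names are hypothetical, and the set of engine-side task IDs is an assumption: the sketch expects the caller to supply the VDSM task IDs the engine still shows as Executing, and does not query the engine itself.

```python
import re

# A task header in the output above is a UUID followed by " :".
TASK_HEADER = re.compile(
    r"^([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s*:\s*$"
)


def parse_all_tasks(text):
    """Parse getAllTasks-style text into {task_id: {field: value}}."""
    tasks = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = TASK_HEADER.match(line)
        if header:
            current = tasks.setdefault(header.group(1), {})
        elif current is not None and " = " in line:
            # Split only on the first " = " so dict-like values stay intact.
            key, value = line.split(" = ", 1)
            current[key] = value
    return tasks


def stale_engine_tasks(vdsm_tasks, engine_executing_ids):
    """Return IDs the engine lists as executing but VDSM lists as finished.

    engine_executing_ids is assumed to be supplied by the caller.
    """
    return sorted(
        task_id
        for task_id in engine_executing_ids
        if vdsm_tasks.get(task_id, {}).get("state") == "finished"
    )


# Sample copied from the SPM host output in this report.
SAMPLE = """\
7425d4d5-4727-4d36-ae39-3eda492472eb :
\t verb = createVolume
\t code = 0
\t state = finished
\t tag = spm
\t result = {'uuid': '2e5d8baa-8347-43c0-9038-5fee759ccbc8'}
\t message = 1 jobs completed successfully
\t id = 7425d4d5-4727-4d36-ae39-3eda492472eb
8e09d7f8-08b3-4977-aab2-f4b54b969575 :
\t verb = createVolume
\t code = 0
\t state = finished
\t tag = spm
\t result = {'uuid': '9cf56973-5821-4ac2-849e-830718db4085'}
\t message = 1 jobs completed successfully
\t id = 8e09d7f8-08b3-4977-aab2-f4b54b969575
"""

if __name__ == "__main__":
    vdsm_tasks = parse_all_tasks(SAMPLE)
    # Assumption: both tasks are still shown as Executing on the engine side.
    for task_id in stale_engine_tasks(vdsm_tasks, set(vdsm_tasks)):
        print(task_id, vdsm_tasks[task_id]["verb"])
```

Passing the engine-side IDs in as a plain set keeps the comparison independent of how they are collected (UI, REST API, or DB).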
Comment 1 Oved Ourfali 2015-12-02 01:02:21 EST
Please provide logs. 
Also, for how long does it remain that way?
Comment 2 Oved Ourfali 2015-12-02 01:10:19 EST
In addition, under "How reproducible" you need to specify whether it happens in EVERY run. If it happened once and you can't reproduce it, then it doesn't reproduce much.
Comment 3 Gilad Lazarovich 2015-12-02 08:01 EST
Created attachment 1101468
Engine and vdsm logs plus engine DB dump
Comment 4 Gilad Lazarovich 2015-12-02 08:07:22 EST
Oved, please find the attachment containing the logs and DB dump. It stayed this way for over 24 hours. I had to manually clean it up so I could use the environment for further tests. I ran into 2 such zombie tasks (within an hour) in one 23-hour run. The last time we hit this was a few months back.
Comment 5 Oved Ourfali 2015-12-02 08:13:44 EST
(In reply to Gilad Lazarovich from comment #4)
> Oved, please find the attachment containing the logs and DB dump.  It stayed
> this way for over 24 hours. I had to manually clean it up so I could use the
> environment for further tests.  I ran into 2 such zombie tasks (within an
> hour) in one 23-hour run. The last time we hit this was a few months back.

So if it reproduces only once every few months, I'm removing the automation blocker and the severity.

Ravi - can you look at the logs and see what you can find?
Comment 6 Red Hat Bugzilla Rules Engine 2015-12-02 08:13:48 EST
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
Comment 7 Oved Ourfali 2015-12-02 08:14:41 EST
Also removing the regression flag.
If it happens so rarely, it might be a race that also happened before.
Comment 8 Oved Ourfali 2016-01-04 08:54:21 EST
Gilad - please contact Ravi directly in case this reproduces.
We didn't see anything suspicious in the logs.

Targeting 4.0 for now, since without a live reproduction we can't do much here.
Comment 9 Oved Ourfali 2016-03-13 03:36:12 EDT
Please re-open if it reproduces.
