Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 881803

Summary: engine: loop in event log report on task that failed with NPE
Product: Red Hat Enterprise Virtualization Manager Reporter: Dafna Ron <dron>
Component: ovirt-engine Assignee: Mooli Tayer <mtayer>
Status: CLOSED CURRENTRELEASE QA Contact: Leonid Natapov <lnatapov>
Severity: high Docs Contact:
Priority: high    
Version: 3.1.0 CC: aberezin, acathrow, bazulay, dron, emesika, gklein, iheim, jkt, lpeer, lsvaty, mavital, pstehlik, Rhev-m-bugs, yeylon, yzaslavs
Target Milestone: ---   
Target Release: 3.4.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: infra
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-06-12 14:04:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description          Flags
logs                 none
lsvaty-engine.log    none

Description Dafna Ron 2012-11-29 15:40:16 UTC
Created attachment 654373 [details]
logs

Description of problem:

I have a task that failed with an NPE, and the failure is reported in an endless loop in the event log:

	
2012-Nov-29, 17:31  Failed to complete creation of Template BUG from VM <UNKNOWN>.
2012-Nov-29, 17:31  Failed to complete creation of Template BUG from VM <UNKNOWN>.
2012-Nov-29, 17:31  Failed to complete creation of Template BUG from VM <UNKNOWN>.
2012-Nov-29, 17:31  Failed to complete creation of Template BUG from VM <UNKNOWN>.

Version-Release number of selected component (if applicable):

si24.5

How reproducible:

unknown. 

Steps to Reproduce:
1. I created a template from a vm 
2. the command was hanging so I reloaded the UI
3.
  
Actual results:

The command ended with an NPE and the event log started reporting the event in a loop.

Expected results:

We should not report the event in a loop.

Additional info: log


please note that the task is not cleaned:

[root@gold-vdsc ~]# vdsClient -s 0 getAllTasksInfo
1e17cd6e-5e64-440e-a1d3-9c4e58ecf576 :
	verb = copyImage
	id = 1e17cd6e-5e64-440e-a1d3-9c4e58ecf576

Comment 1 Yair Zaslavsky 2012-12-30 10:44:54 UTC
This is a problem with the async task mechanism.
The async task mechanism will try to perform endSuccessfully or endWithFailure.
If there is an NPE, it can be stuck forever.
This requires an infra change for async_tasks (such situations are currently handled by flow-related bugs;
Dafna, I suggest you open a bug related to the specific flow with the NPE).
Not sure we will be able to fix the "loop forever" issue at AsyncTaskManager for 3.2.
Suggesting this for future.
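
To illustrate the failure mode described in this comment, here is a minimal sketch of a polling loop that bounds how often it retries a failing end action and logs the failure only once. It is not the ovirt-engine AsyncTaskManager code: the class AsyncTaskPollSketch, the EndAction interface, the Task type and the MAX_END_ATTEMPTS cap are all hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; not the real ovirt-engine AsyncTaskManager.
final class AsyncTaskPollSketch {

    /** Hypothetical stand-in for a command's end-action callback. */
    interface EndAction {
        void endWithFailure() throws Exception;
    }

    /** Hypothetical task record tracked by the poller. */
    static final class Task {
        final String id;
        final EndAction action;
        int failedAttempts = 0;

        Task(String id, EndAction action) {
            this.id = id;
            this.action = action;
        }
    }

    // Assumed cap; a real system would make this configurable.
    static final int MAX_END_ATTEMPTS = 3;

    final List<Task> pending = new ArrayList<>();
    final List<String> eventLog = new ArrayList<>();

    /** One polling cycle: try to end every pending task. */
    void pollOnce() {
        Iterator<Task> it = pending.iterator();
        while (it.hasNext()) {
            Task task = it.next();
            try {
                task.action.endWithFailure();
                it.remove(); // ended cleanly, stop polling it
            } catch (Exception e) {
                // Without this catch, an NPE escapes the cycle, the task
                // stays pending, and the same failure is reported again
                // on every following cycle.
                task.failedAttempts++;
                if (task.failedAttempts == 1) {
                    eventLog.add("Failed to end task " + task.id + ": " + e);
                }
                if (task.failedAttempts >= MAX_END_ATTEMPTS) {
                    eventLog.add("Giving up on task " + task.id);
                    it.remove(); // break the loop instead of retrying forever
                }
            }
        }
    }
}
```

With this shape, a task whose end action always throws produces one failure event and one "giving up" event, then leaves the pending list. Whether dropping the task, or marking it failed for manual cleanup, is the right policy is a design decision this sketch does not settle.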

Comment 2 Yair Zaslavsky 2013-10-07 15:36:37 UTC
Mooli, please check if this bug is still relevant.

Comment 3 Mooli Tayer 2013-10-15 12:42:04 UTC
Following Yair's comment: as for this flow, the NPE seems to be thrown from a null vm in VmHandler.UnLockVm. Investigating AddVmTemplateCommand to see how this is possible.
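
A minimal sketch of the kind of null guard that would keep an unlock call from throwing when the VM lookup returns nothing. This is not the actual VmHandler code: UnlockVmSketch, Vm, VmStatus, vmTable and unlockIfPresent are hypothetical names, and the in-memory map only stands in for the database lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch; not the real VmHandler from ovirt-engine.
final class UnlockVmSketch {

    /** Hypothetical status values for the example. */
    enum VmStatus { DOWN, LOCKED }

    /** Hypothetical VM record. */
    static final class Vm {
        VmStatus status = VmStatus.LOCKED;
    }

    // Stand-in for the database lookup; an assumption for this example.
    final Map<UUID, Vm> vmTable = new HashMap<>();

    /**
     * Unlocks the VM if it can still be found.
     * Returns false instead of throwing when the lookup yields null.
     */
    boolean unlockIfPresent(UUID vmId) {
        Vm vm = vmTable.get(vmId);
        if (vm == null) {
            // The VM record is gone; report it to the caller rather
            // than dereferencing null.
            return false;
        }
        vm.status = VmStatus.DOWN;
        return true;
    }
}
```

The caller can then decide how to end the command when the method returns false, instead of failing with an NPE in the middle of the end action.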

Comment 4 Barak 2013-10-20 11:23:04 UTC
NPE in such flows usually happens due to compensation mechanism + restart of Engine while the task is being created.

However this is not the case described above.
We have investigated this bug and could not reproduce it according to the flow above.

Need to investigate whether it happens on engine restart (and the object disappears from the DB), and then see whether the phenomenon still happens.

Comment 5 Eli Mesika 2014-02-14 09:51:08 UTC
Dafna, we could not reproduce that. Do you have a clear reproducer, or is it OK to close with WORKSFORME?

Comment 8 Gil Klein 2014-02-23 10:27:55 UTC
Meital, Please try to reproduce this issue.

Comment 9 Lukas Svaty 2014-02-26 16:52:21 UTC
re-adding comment (Gil pointed out I might have added it incorrectly :)

Came across the same issue in an older rhevm33 version, reproducible 5% of the time (could not reproduce it for the log collection and reproduction steps).

steps to reproduce:
1. create template from VM
2. wait for VM status locked
3. in database change VM status in vm_dynamic table from 15 (locked) to 0 (down)
4. see engine log

At the moment there is no NPE; in my opinion this was fixed in a previous version. Some errors still appear, but the engine recovered successfully. I suggest closing this as it is currently working and reopening if the NPE appears in a similar workflow.

Attaching engine.log of my workflow and the stuck action of recovering from error.

Comment 10 Lukas Svaty 2014-02-26 16:52:54 UTC
Created attachment 868105 [details]
lsvaty-engine.log

Comment 11 Eli Mesika 2014-02-26 20:54:38 UTC
Moving to ON-QA per comment 9

Comment 12 Lukas Svaty 2014-02-27 13:52:14 UTC
I suggest closing it as it is fixed in current release

Comment 13 Leonid Natapov 2014-03-11 13:37:27 UTC
3.4.0-0.3.master.el6ev.

Comment 14 Itamar Heim 2014-06-12 14:04:55 UTC
Closing as part of 3.4.0