Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1139692

Summary: Restarting the oVirt-engine service while a certain number (between 15 and 35) of tasks are active causes internal server error 500
Product: Red Hat Enterprise Virtualization Manager
Reporter: Ori Gofen <ogofen>
Component: ovirt-engine
Assignee: Ravi Nori <rnori>
Status: CLOSED NOTABUG
QA Contact: Ori Gofen <ogofen>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.5.0
CC: acanan, ecohen, gklein, iheim, lpeer, ogofen, oourfali, pstehlik, rbalakri, Rhev-m-bugs, yeylon
Target Milestone: ---
Keywords: Triaged
Target Release: 3.5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-09-15 13:24:51 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                  Flags
vdsm+engine logs             none
vdsm+engine logs + image     none

Description Ori Gofen 2014-09-09 13:05:35 UTC
Created attachment 935687
vdsm+engine logs

Description of problem:
Restarting the oVirt-engine service is not handled well when active tasks exist and their number exceeds a certain threshold (somewhere between 15 and 35).
By "not handled well" I mean an internal server error in the engine with a NullPointer message (see image).

After a while the engine comes back, but it responds so slowly that I had to wipe the setup to regain the usual speed (restarting the engine again, restarting vdsm, and rebooting did not help).

Version-Release number of selected component (if applicable):
rhev 3.5 vt2.2

How reproducible:
100%

Steps to Reproduce && Actual results:
To reproduce this bug, the engine must somehow be made to accumulate tasks (mine had 35 before I restarted the engine).

Step 1: accumulate tasks
1. Initiate an unsuccessful template creation (as in BZ #1139678),
2. or delete a lot of disks with the "wipe after delete" flag on.

Step 2: restart the engine
1. After accumulating more than 25 tasks, restart the engine service.
2. Error 500 appears; wait for the engine to come back (this takes a while).
3. Log in again; the engine responds extremely slowly.
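
For anyone retesting, the "wait for the engine to come back" part of step 2 could be scripted roughly as follows. This is only a sketch and not something taken from the attached logs: the helper names `wait_for_engine` and `http_probe` are hypothetical, the host in the usage comment is a placeholder, and the health path shown there is an assumption that should be checked against the installed engine. The probe is passed in as a function so the polling loop can be exercised without a live engine.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, List, Optional, Tuple


def http_probe(url: str, timeout: float = 10.0) -> Callable[[], Optional[int]]:
    """Build a probe that returns the HTTP status of `url`, or None if unreachable."""
    def probe() -> Optional[int]:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.status
        except urllib.error.HTTPError as err:
            # 4xx/5xx responses (including the error 500 from this report) land here.
            return err.code
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, etc.: the engine is not answering yet.
            return None
    return probe


def wait_for_engine(
    probe: Callable[[], Optional[int]],
    timeout: float = 600.0,
    interval: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, List[Optional[int]]]:
    """Poll `probe` until it returns a status below 500 or `timeout` seconds pass.

    Returns (came_back, observed_statuses) so the statuses seen while the
    engine was restarting can be attached to the bug.
    """
    seen: List[Optional[int]] = []
    deadline = clock() + timeout
    while True:
        status = probe()
        seen.append(status)
        if status is not None and status < 500:
            return True, seen
        if clock() >= deadline:
            return False, seen
        sleep(interval)


# Hypothetical usage after restarting the engine service
# (placeholder host, assumed health path):
# ok, seen = wait_for_engine(
#     http_probe("http://engine.example.com/ovirt-engine/services/health"))
```

Recording the observed statuses alongside the time it took for the engine to answer again would make the "it will take a while" part of the report measurable.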


Expected results:
Restarting oVirt-engine should "kill" its tasks safely, without causing such failures.

Additional info:

Comment 1 Ravi Nori 2014-09-10 16:34:32 UTC
I am unable to reproduce this. From the logs it looks like the 500 error is related to a database connection issue. All the other exceptions are related to BZ 1105211.

Please retest

Comment 2 Ori Gofen 2014-09-14 14:38:05 UTC
Created attachment 937340
vdsm+engine logs + image

(In reply to Ravi Nori from comment #1)
> I am unable to reproduce this. From the logs it looks like the 500 error is
> related to a database connection issue. All the other exceptions are related
> to BZ 1105211.
> 
> Please retest

1) I have split this bug into two bugs: BZ #1141540 deals with the database connection problems caused by multiple diskRemove operations, while this bug deals with restarting oVirt-engine after several failed template creations caused by a broken volume chain.

2) When retesting this bug on rhev vt3.1 (accumulate template failures, then restart the engine), no NPE was thrown in the engine logs, though a NullPointer message did appear in the browser (see image).

Comment 3 Oved Ourfali 2014-09-15 13:24:51 UTC
If you're logged in to webadmin while the engine is restarted, you might experience various exceptions.
Closing as NOTABUG.