Bug 975445

Summary: engine: after power outage could not connect to the webadmin until an engine host reboot
Product: Red Hat Enterprise Virtualization Manager
Reporter: Dafna Ron <dron>
Component: ovirt-engine
Assignee: Nobody's working on this, feel free to take it <nobody>
Status: CLOSED DUPLICATE
QA Contact:
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.2.0
CC: acathrow, dyasny, iheim, juan.hernandez, lpeer, Rhev-m-bugs, yeylon, ykaul, yzaslavs
Target Milestone: ---
Keywords: Triaged
Target Release: 3.3.0
Flags: acathrow: Triaged+
Hardware: x86_64
OS: Linux
Whiteboard: infra
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-06-30 11:48:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
logs none

Description Dafna Ron 2013-06-18 13:35:18 UTC
Created attachment 762476
logs

Description of problem:

After a power outage on rhevm-3, we could not connect to the rhevm-3 webadmin.
Looking at server.log, we saw errors such as the following:

2013-06-18 16:15:27,849 INFO  [org.apache.commons.httpclient.HttpMethodDirector] (pool-3-thread-20) I/O exception (java.net.ConnectException) caught when processing request: Connection refused

Yair took a look, and it seemed that the engine crashed because of connection issues to the DB.

Attaching full logs.

Setting this bug as high and not urgent, since a reboot of the engine host solved the issue.

Version-Release number of selected component (if applicable):

sf17.2

How reproducible:

Unknown; this happened on a production system.

Steps to Reproduce:
1. Install a rhevm with a local DB, running VMs, and reports
2. Hard shut down the engine host
3. Start the host

Actual results:

We fail to log in to the webadmin.

Expected results:

The engine should recover, and we should be able to log in to the webadmin.

Additional info: logs

Comment 1 Juan Hernández 2013-06-18 15:49:32 UTC
The error message in the description comes from HttpMethodDirector, which is part of the communication with VDSM and is not related to the database. What makes you think that this problem is related to the database?
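
The logger name in square brackets is what identifies the component that emitted a server.log line. A minimal sketch of pulling it out, assuming the line layout seen in this report (timestamp, level, [logger], (thread), message); LogLine is a hypothetical helper and not part of the engine:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not ovirt-engine code. */
final class LogLine {
    // Layout assumed from the lines quoted in this report:
    // <date> <time>,<millis> <LEVEL> [<logger>] (<thread>) <message>
    private static final Pattern LAYOUT = Pattern.compile(
        "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\s+(\\w+)\\s+"
        + "\\[([^\\]]+)\\]\\s+\\(([^)]*)\\)\\s*(.*)$");

    final String timestamp;
    final String level;
    final String logger;
    final String thread;
    final String message;

    private LogLine(Matcher m) {
        this.timestamp = m.group(1);
        this.level = m.group(2);
        this.logger = m.group(3);
        this.thread = m.group(4);
        this.message = m.group(5);
    }

    /** Returns the parsed line, or null if the line does not follow the assumed layout. */
    static LogLine parse(String line) {
        Matcher m = LAYOUT.matcher(line);
        return m.matches() ? new LogLine(m) : null;
    }

    public static void main(String[] args) {
        LogLine l = parse("2013-06-18 16:15:27,849 INFO  "
            + "[org.apache.commons.httpclient.HttpMethodDirector] (pool-3-thread-20) "
            + "I/O exception (java.net.ConnectException) caught when processing request: "
            + "Connection refused");
        // The logger is the HTTP client class, not a database-related class.
        System.out.println(l.level + " from " + l.logger);
    }
}
```

Continuation lines of a stack trace do not follow this layout, so parse returns null for them.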

Comment 2 Yair Zaslavsky 2013-06-26 14:07:45 UTC
Juan, there were DB issues in the log. I suspect that, because of them, the Backend bean was not deployed, so Dafna could not log in.
The bug is about connecting from the UI to the engine, and not about the communication with VDSM (which is another issue in the environment).

Attaching the part that raised my suspicion:

VdsCpuVdsLoadBalancingAlgorithm] (QuartzScheduler_Worker-13) [10d9037c] VdsLoadBalancer: number of over utilized vdss found: 0.
2013-06-18 11:04:03,859 INFO  [org.ovirt.engine.core.bll.VdsCpuVdsLoadBalancingAlgorithm]
2013-06-18 12:28:08,213 ERROR [org.ovirt.engine.core.utils.ejb.EJBUtilsStrategy] (ServerService Thread Pool -- 41) Error looking up resource DATA_SOURCE
2013-06-18 12:28:08,216 ERROR [org.ovirt.engine.core.dal.dbbroker.DbFacadeLocator] (ServerService Thread Pool -- 41) Unable to locate DbFacade instance: java.lang.RuntimeException: Datasource is not defined 
	at org.ovirt.engine.core.dal.dbbroker.DbFacadeLocator.<clinit>(DbFacadeLocator.java:34) [engine-dal.jar:]
	at org.ovirt.engine.core.dal.dbbroker.DbFacade.getInstance(DbFacade.java:212) [engine-dal.jar:]
	at org.ovirt.engine.core.bll.Backend.checkDBConnectivity(Backend.java:126) [engine-bll.jar:]
	at org.ovirt.engine.core.bll.Backend.create(Backend.java:120) [engine-bll.jar:]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.7.0_09-icedtea]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) [rt.jar:1.7.0_09-icedtea]
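
In the trace, Backend.create calls checkDBConnectivity, which fails with "Datasource is not defined". As a rough sketch of the general idea of retrying such a startup check instead of giving up on the first failure (StartupDbCheck and waitUntilAvailable are hypothetical names, this is not ovirt-engine code, and it is not necessarily how bug 879904 was addressed):

```java
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration only; not ovirt-engine code. */
final class StartupDbCheck {

    private StartupDbCheck() {
    }

    /**
     * Calls the probe until it returns true or maxAttempts is reached,
     * sleeping delayMillis between attempts.
     *
     * @return the 1-based number of the attempt that succeeded, or -1 if
     *         no attempt succeeded or the thread was interrupted
     */
    static int waitUntilAvailable(Callable<Boolean> probe, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (Boolean.TRUE.equals(probe.call())) {
                    return attempt;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return -1;
            } catch (Exception e) {
                // For example a failed datasource lookup: treat as "not ready yet".
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Simulated probe: throws on the first two calls, then succeeds.
        int[] calls = {0};
        int attempt = waitUntilAvailable(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("Datasource is not defined");
            }
            return true;
        }, 5, 10);
        System.out.println("available after attempt " + attempt);
    }
}
```

A real probe would attempt the datasource lookup or a trivial query; the simulated one here only stands in for that.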

Comment 3 Juan Hernández 2013-06-26 16:38:19 UTC
According to the stack trace in comment 2, I would say that this is a duplicate of bug 879904.

Comment 4 Yair Zaslavsky 2013-06-30 11:48:10 UTC

*** This bug has been marked as a duplicate of bug 879904 ***