Bug 1310590

Summary: [scale] - rhevm unreachable due to 'The connection attempt failed.'
Product: [oVirt] ovirt-engine Reporter: Eldad Marciano <emarcian>
Component: Backend.Core Assignee: Nobody <nobody>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Eldad Marciano <emarcian>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.6.2 CC: bugs, emarcian, oourfali
Target Milestone: --- Flags: oourfali: needinfo?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-03-04 17:44:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
engine & server logs none

Description Eldad Marciano 2016-02-22 10:15:39 UTC
Created attachment 1129210
engine & server logs

Description of problem:
A load test was running overnight.
After ~12 hours, the engine web interface was found to be unreachable, failing with 'The connection attempt failed.'

The load test itself keeps running and reports no errors.

When browsing to RHEVM, all the JS and CSS files are unreachable (HTTP 500 response), and the following message is displayed: '???obrand.welcome.welcome.text??? ???obrand.welcome.version???'

When browsing to RHEVM/api, the following message is displayed:
JBWEB000065: HTTP Status 500 - The connection attempt failed.

JBWEB000309: type JBWEB000066: Exception report

JBWEB000068: message The connection attempt failed.

JBWEB000069: description JBWEB000145: The server encountered an internal error that prevented it from fulfilling this request.

JBWEB000070: exception

Class: class org.ovirt.engine.core.extensions.mgr.ExtensionInvokeCommandFailedException
Input:
{Extkey[name=AAA_AUTHN_CREDENTIALS;type=class java.lang.String;uuid=AAA_AUTHN_CREDENTIALS[03b96485-4bb5-4592-8167-810a5c909706];]=***, Extkey[name=EXTENSION_INVOKE_CONTEXT;type=class org.ovirt.engine.api.extensions.ExtMap;uuid=EXTENSION_INVOKE_CONTEXT[886d2ebb-312a-49ae-9cc3-e1f849834b7d];]={Extkey[name=EXTENSION_INTERFACE_VERSION_MAX;type=class java.lang.Integer;uuid=EXTENSION_INTERFACE_VERSION_MAX[f4cff49f-2717-4901-8ee9-df362446e3e7];]=0, Extkey[name=EXTENSION_LICENSE;type=class java.lang.String;uuid=EXTENSION_LICENSE[8a61ad65-054c-4e31-9c6d-1ca4d60a4c18];]=ASL 2.0, Extkey[name=EXTENSION_NOTES;type=class java.lang.String;uuid=EXTENSION_NOTES[2da5ad7e-185a-4584-aaff-97f66978e4ea];]=Display name: "ovirt-engine-extension-aaa-jdbc", Extkey[name=EXTENSION_HOME_URL;type=class java.lang.String;uuid=EXTENSION_HOME_URL[4ad7a2f4-f969-42d4-b399-72d192e18304];]=http://www.ovirt.org, Extkey[name=EXTENSION_LOCALE;type=class java.lang.String;uuid=EXTENSION_LOCALE[0780b112-0ce0-404a-b85e-8765d778bb29];]=en_US, Extkey[name=EXTENSION_NAME;type=class java.lang.String;uuid=EXTENSION_NAME[651381d3-f54f-4547-bf28-b0b01a103184];]="ovirt-engine-extension-aaa-jdbc".authn, Extkey[name=EXTENSION_INTERFACE_VERSION_MIN;type=class java.lang.Integer;uuid=EXTENSION_INTERFACE_VERSION_MIN[2b84fc91-305b-497b-a1d7-d961b9d2ce0b];]=0, Extkey[name=EXTENSION_CONFIGURATION;type=class java.util.Properties;uuid=EXTENSION_CONFIGURATION[2d48ab72-f0a1-4312-b4ae-5068a226b0fc];]=***, Extkey[name=EXTENSION_AUTHOR;type=class java.lang.String;uuid=EXTENSION_AUTHOR[ef242f7a-2dad-4bc5-9aad-e07018b7fbcc];]=The oVirt Project, Extkey[name=EXTENSION_INSTANCE_NAME;type=class java.lang.String;uuid=EXTENSION_INSTANCE_NAME[65c67ff6-aeca-4bd5-a245-8674327f011b];]=internal-authn, Extkey[name=EXTENSION_BUILD_INTERFACE_VERSION;type=class java.lang.Integer;uuid=EXTENSION_BUILD_INTERFACE_VERSION[cb479e5a-4b23-46f8-aed3-56a4747a8ab7];]=0, Extkey[name=EXTENSION_CONFIGURATION_SENSITIVE_KEYS;type=interface 
java.util.Collection;uuid=EXTENSION_CONFIGURATION_SENSITIVE_KEYS[a456efa1-73ff-4204-9f9b-ebff01e35263];]=[], Extkey[name=AAA_AUTHN_CAPABILITIES;type=class java.lang.Long;uuid=AAA_AUTHN_CAPABILITIES[9d16bee3-10fd-46f2-83f9-3d3c54cf258d];]=44, Extkey[name=EXTENSION_GLOBAL_CONTEXT;type=class org.ovirt.engine.api.extensions.ExtMap;uuid=EXTENSION_GLOBAL_CONTEXT[9799e72f-7af6-4cf1-bf08-297bc8903676];]=*skip*, Extkey[name=EXTENSION_VERSION;type=class java.lang.String;uuid=EXTENSION_VERSION[fe35f6a8-8239-4bdb-ab1a-af9f779ce68c];]="1.0.4", Extkey[name=EXTENSION_MANAGER_TRACE_LOG;type=interface org.slf4j.Logger;uuid=EXTENSION_MANAGER_TRACE_LOG[863db666-3ea7-4751-9695-918a3197ad83];]=org.slf4j.impl.Slf4jLogger(org.ovirt.engine.core.extensions.mgr.ExtensionsManager.trace."ovirt-engine-extension-aaa-jdbc".authn.internal-authn), Extkey[name=EXTENSION_PROVIDES;type=interface java.util.Collection;uuid=EXTENSION_PROVIDES[8cf373a6-65b5-4594-b828-0e275087de91];]=[org.ovirt.engine.api.extensions.aaa.Authn], Extkey[name=EXTENSION_CONFIGURATION_FILE;type=class java.lang.String;uuid=EXTENSION_CONFIGURATION_FILE[4fb0ffd3-983c-4f3f-98ff-9660bd67af6a];]=/etc/ovirt-engine/extensions.d/internal-authn.properties}, Extkey[name=AAA_AUTHN_USER;type=class java.lang.String;uuid=AAA_AUTHN_USER[1ceaba26-1bdc-4663-a3c6-5d926f9dd8f0];]=admin, Extkey[name=EXTENSION_INVOKE_COMMAND;type=class org.ovirt.engine.api.extensions.ExtUUID;uuid=EXTENSION_INVOKE_COMMAND[485778ab-bede-4f1a-b823-77b262a2f28d];]=AAA_AUTHN_AUTHENTICATE_CREDENTIALS[d9605c75-6b43-4b00-b32c-06bdfa80244c]}
Output:
{Extkey[name=EXTENSION_INVOKE_RESULT;type=class java.lang.Integer;uuid=EXTENSION_INVOKE_RESULT[0909d91d-8bde-40fb-b6c0-099c772ddd4e];]=2, Extkey[name=EXTENSION_INVOKE_MESSAGE;type=class java.lang.String;uuid=EXTENSION_INVOKE_MESSAGE[b7b053de-dc73-4bf7-9d26-b8bdb72f5893];]=The connection attempt failed.}

	org.ovirt.engine.core.extensions.mgr.ExtensionProxy.invoke(ExtensionProxy.java:91)
	org.ovirt.engine.core.extensions.mgr.ExtensionProxy.invoke(ExtensionProxy.java:109)
	org.ovirt.engine.core.aaa.filters.BasicAuthenticationFilter.handleCredentials(BasicAuthenticationFilter.java:134)
	org.ovirt.engine.core.aaa.filters.BasicAuthenticationFilter.doFilter(BasicAuthenticationFilter.java:84)
	org.ovirt.engine.core.aaa.filters.SessionValidationFilter.doFilter(SessionValidationFilter.java:77)
	org.ovirt.engine.core.aaa.filters.EngineSessionTokenAuthenticationFilter.doFilter(EngineSessionTokenAuthenticationFilter.java:31)
	org.ovirt.engine.core.aaa.filters.RestApiSessionValidationFilter.doFilter(RestApiSessionValidationFilter.java:35)
	org.ovirt.engine.api.common.security.CSRFProtectionFilter.doFilter(CSRFProtectionFilter.java:111)
	org.ovirt.engine.api.common.security.CSRFProtectionFilter.doFilter(CSRFProtectionFilter.java:102)
	org.ovirt.engine.api.common.security.CORSSupportFilter.doFilter(CORSSupportFilter.java:183)


* No abnormal resource usage was observed. *


There are many exceptions like this in the engine log:
ERROR [org.ovirt.engine.extension.aaa.jdbc.binding.api.AuthnExtension] (ajp-/127.0.0.1:8702-29) [] Unexpected Exception invoking: AAA_AUTHN_AUTHENTICATE_CREDENTIALS[d9605c75-6b43-4b00-b32c-06bdfa80244c]

And from the server log:
2016-02-22 09:54:09,712 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[default-host].[/ovirt-engine/api].[org.ovirt.engine.api.restapi.BackendApplication]] (ajp-/127.0.0.1:8702-10) JBWEB000236: Servlet.service() for servlet org.ovirt.engine.api.restapi.BackendApplication threw exception: Class: class org.ovirt.engine.core.extensions.mgr.ExtensionInvokeCommandFailedException

*logs attached*
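
To quantify how often these errors occur in the attached logs, a rough sketch like the one below could tally ERROR lines per logger. The regular expression and the function name `count_errors_by_logger` are assumptions for illustration, derived only from the two log excerpts quoted above; this is not part of ovirt-engine.

```python
import re
from collections import Counter

# Assumed line shape, based on the two excerpts quoted in this report:
#   [optional timestamp] ERROR [logger.name] (thread-name) message
# The logger name may itself contain brackets, so match lazily up to "] (".
ERROR_RE = re.compile(r"\bERROR\s+\[(?P<logger>.+?)\]\s+\(")


def count_errors_by_logger(lines):
    """Return a Counter mapping logger name to the number of ERROR lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            counts[match.group("logger")] += 1
    return counts
```

Passing an open log file to the function (the file object iterates line by line) and calling `most_common()` on the result would list the noisiest loggers first.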



Version-Release number of selected component (if applicable):
rhevm-3.6.2-0.1.el6.noarch

How reproducible:
Unknown. We have never hit this before, even on higher-scale environments running the same version.

Steps to Reproduce:
1. Not clear yet. We just ran a REST API load test that creates and removes VMs; that scenario was tested many times before on the same version.


Actual results:


Expected results:


Additional info:

Comment 1 Eldad Marciano 2016-02-22 10:50:18 UTC
After restarting httpd, the engine is back to normal.

Comment 2 Eldad Marciano 2016-02-22 10:59:22 UTC
(In reply to Eldad Marciano from comment #1)
> After restarting httpd, the engine is back to normal.

My bad: it is back to normal only for a very short time, and afterwards it becomes unreachable again.

The HotSpot VM is unreachable as well, and thread dumps cannot be taken.

Comment 3 Oved Ourfali 2016-02-23 06:56:06 UTC
Eldad - Please try to reproduce again with the latest 3.6.3, as this might have been caused by network glitches or something else in the environment. Reducing severity until we see a valid reproducer.

Also, this keeps repeating:
Caused by: java.net.UnknownHostException: intel-brickland-02.lab.eng.rdu.redhat.com

Is your DNS functioning properly? Is that just a host with a wrong FQDN?
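
One quick way to answer the DNS question from the engine machine is to ask the local resolver directly. The following is a minimal sketch using only the Python standard library; the helper name `resolves` is made up for illustration.

```python
import socket


def resolves(hostname):
    """Return True if the local resolver can map hostname to an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Raised when the name cannot be resolved (unknown host, DNS failure).
        return False
```

Calling `resolves()` with the FQDN from the `UnknownHostException` above would show whether the name is resolvable from that machine at the time of the check. An intermittent DNS problem would only show up if the check is repeated over time.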
I suspect you're experiencing Bug 1296930 - ovirt-engine fails with "too many open files".
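
If file descriptor exhaustion is the suspicion, the number of open descriptors of the engine process can be compared against its limit. The sketch below assumes a Linux host with `/proc` mounted (the report lists OS: Linux); the function names are hypothetical, and the PID of the engine JVM has to be supplied by whoever runs the check.

```python
import os


def open_fd_count(pid="self"):
    """Count open file descriptors of a process via /proc (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def max_open_files(pid="self"):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits.

    Returns None if the limit is reported as 'unlimited'.
    """
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                # Fields after the three-word name: soft limit, hard limit, units.
                soft = line.split()[3]
                return None if soft == "unlimited" else int(soft)
    raise ValueError("'Max open files' entry not found")
```

A count that climbs toward the limit while the load test runs would be consistent with the "too many open files" suspicion. Reading another user's `/proc/<pid>/fd` normally requires root or the same user as the process.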

Comment 4 Oved Ourfali 2016-03-04 17:44:08 UTC
Please reopen if this reproduces in the latest 3.6 build.