Bug 1310590 - [scale] - rhevm unreachable due to 'The connection attempt failed.' [NEEDINFO]
[scale] - rhevm unreachable due to 'The connection attempt failed.'
Status: CLOSED INSUFFICIENT_DATA
Product: ovirt-engine
Classification: oVirt
Component: Backend.Core (Show other bugs)
3.6.2
x86_64 Linux
unspecified Severity high (vote)
: ---
: ---
Assigned To: nobody nobody
Eldad Marciano
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2016-02-22 05:15 EST by Eldad Marciano
Modified: 2016-03-06 07:39 EST (History)
4 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-03-04 12:44:08 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
oourfali: needinfo?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?


Attachments (Terms of Use)
engine & server logs (17.55 MB, application/zip)
2016-02-22 05:15 EST, Eldad Marciano
no flags Details

  None (edit)
Description Eldad Marciano 2016-02-22 05:15:39 EST
Created attachment 1129210 [details]
engine & server logs

Description of problem:
some load test running over night.
after ~12 hours, engine web interface found as unreachable, due to The connection attempt failed.

the load test keeps running and has no errors.

when browsing to RHEVM all the JS and CSS files are unreachable due to 500 HTTP response, and the following message has printed '???obrand.welcome.welcome.text??? ???obrand.welcome.version???'

when browsing to RHEVM/api 
the following message has printed:
JBWEB000065: HTTP Status 500 - The connection attempt failed.

JBWEB000309: type JBWEB000066: Exception report

JBWEB000068: message The connection attempt failed.

JBWEB000069: description JBWEB000145: The server encountered an internal error that prevented it from fulfilling this request.

JBWEB000070: exception

Class: class org.ovirt.engine.core.extensions.mgr.ExtensionInvokeCommandFailedException
Input:
{Extkey[name=AAA_AUTHN_CREDENTIALS;type=class java.lang.String;uuid=AAA_AUTHN_CREDENTIALS[03b96485-4bb5-4592-8167-810a5c909706];]=***, Extkey[name=EXTENSION_INVOKE_CONTEXT;type=class org.ovirt.engine.api.extensions.ExtMap;uuid=EXTENSION_INVOKE_CONTEXT[886d2ebb-312a-49ae-9cc3-e1f849834b7d];]={Extkey[name=EXTENSION_INTERFACE_VERSION_MAX;type=class java.lang.Integer;uuid=EXTENSION_INTERFACE_VERSION_MAX[f4cff49f-2717-4901-8ee9-df362446e3e7];]=0, Extkey[name=EXTENSION_LICENSE;type=class java.lang.String;uuid=EXTENSION_LICENSE[8a61ad65-054c-4e31-9c6d-1ca4d60a4c18];]=ASL 2.0, Extkey[name=EXTENSION_NOTES;type=class java.lang.String;uuid=EXTENSION_NOTES[2da5ad7e-185a-4584-aaff-97f66978e4ea];]=Display name: "ovirt-engine-extension-aaa-jdbc", Extkey[name=EXTENSION_HOME_URL;type=class java.lang.String;uuid=EXTENSION_HOME_URL[4ad7a2f4-f969-42d4-b399-72d192e18304];]=http://www.ovirt.org, Extkey[name=EXTENSION_LOCALE;type=class java.lang.String;uuid=EXTENSION_LOCALE[0780b112-0ce0-404a-b85e-8765d778bb29];]=en_US, Extkey[name=EXTENSION_NAME;type=class java.lang.String;uuid=EXTENSION_NAME[651381d3-f54f-4547-bf28-b0b01a103184];]="ovirt-engine-extension-aaa-jdbc".authn, Extkey[name=EXTENSION_INTERFACE_VERSION_MIN;type=class java.lang.Integer;uuid=EXTENSION_INTERFACE_VERSION_MIN[2b84fc91-305b-497b-a1d7-d961b9d2ce0b];]=0, Extkey[name=EXTENSION_CONFIGURATION;type=class java.util.Properties;uuid=EXTENSION_CONFIGURATION[2d48ab72-f0a1-4312-b4ae-5068a226b0fc];]=***, Extkey[name=EXTENSION_AUTHOR;type=class java.lang.String;uuid=EXTENSION_AUTHOR[ef242f7a-2dad-4bc5-9aad-e07018b7fbcc];]=The oVirt Project, Extkey[name=EXTENSION_INSTANCE_NAME;type=class java.lang.String;uuid=EXTENSION_INSTANCE_NAME[65c67ff6-aeca-4bd5-a245-8674327f011b];]=internal-authn, Extkey[name=EXTENSION_BUILD_INTERFACE_VERSION;type=class java.lang.Integer;uuid=EXTENSION_BUILD_INTERFACE_VERSION[cb479e5a-4b23-46f8-aed3-56a4747a8ab7];]=0, Extkey[name=EXTENSION_CONFIGURATION_SENSITIVE_KEYS;type=interface java.util.Collection;uuid=EXTENSION_CONFIGURATION_SENSITIVE_KEYS[a456efa1-73ff-4204-9f9b-ebff01e35263];]=[], Extkey[name=AAA_AUTHN_CAPABILITIES;type=class java.lang.Long;uuid=AAA_AUTHN_CAPABILITIES[9d16bee3-10fd-46f2-83f9-3d3c54cf258d];]=44, Extkey[name=EXTENSION_GLOBAL_CONTEXT;type=class org.ovirt.engine.api.extensions.ExtMap;uuid=EXTENSION_GLOBAL_CONTEXT[9799e72f-7af6-4cf1-bf08-297bc8903676];]=*skip*, Extkey[name=EXTENSION_VERSION;type=class java.lang.String;uuid=EXTENSION_VERSION[fe35f6a8-8239-4bdb-ab1a-af9f779ce68c];]="1.0.4", Extkey[name=EXTENSION_MANAGER_TRACE_LOG;type=interface org.slf4j.Logger;uuid=EXTENSION_MANAGER_TRACE_LOG[863db666-3ea7-4751-9695-918a3197ad83];]=org.slf4j.impl.Slf4jLogger(org.ovirt.engine.core.extensions.mgr.ExtensionsManager.trace."ovirt-engine-extension-aaa-jdbc".authn.internal-authn), Extkey[name=EXTENSION_PROVIDES;type=interface java.util.Collection;uuid=EXTENSION_PROVIDES[8cf373a6-65b5-4594-b828-0e275087de91];]=[org.ovirt.engine.api.extensions.aaa.Authn], Extkey[name=EXTENSION_CONFIGURATION_FILE;type=class java.lang.String;uuid=EXTENSION_CONFIGURATION_FILE[4fb0ffd3-983c-4f3f-98ff-9660bd67af6a];]=/etc/ovirt-engine/extensions.d/internal-authn.properties}, Extkey[name=AAA_AUTHN_USER;type=class java.lang.String;uuid=AAA_AUTHN_USER[1ceaba26-1bdc-4663-a3c6-5d926f9dd8f0];]=admin, Extkey[name=EXTENSION_INVOKE_COMMAND;type=class org.ovirt.engine.api.extensions.ExtUUID;uuid=EXTENSION_INVOKE_COMMAND[485778ab-bede-4f1a-b823-77b262a2f28d];]=AAA_AUTHN_AUTHENTICATE_CREDENTIALS[d9605c75-6b43-4b00-b32c-06bdfa80244c]}
Output:
{Extkey[name=EXTENSION_INVOKE_RESULT;type=class java.lang.Integer;uuid=EXTENSION_INVOKE_RESULT[0909d91d-8bde-40fb-b6c0-099c772ddd4e];]=2, Extkey[name=EXTENSION_INVOKE_MESSAGE;type=class java.lang.String;uuid=EXTENSION_INVOKE_MESSAGE[b7b053de-dc73-4bf7-9d26-b8bdb72f5893];]=The connection attempt failed.}

	org.ovirt.engine.core.extensions.mgr.ExtensionProxy.invoke(ExtensionProxy.java:91)
	org.ovirt.engine.core.extensions.mgr.ExtensionProxy.invoke(ExtensionProxy.java:109)
	org.ovirt.engine.core.aaa.filters.BasicAuthenticationFilter.handleCredentials(BasicAuthenticationFilter.java:134)
	org.ovirt.engine.core.aaa.filters.BasicAuthenticationFilter.doFilter(BasicAuthenticationFilter.java:84)
	org.ovirt.engine.core.aaa.filters.SessionValidationFilter.doFilter(SessionValidationFilter.java:77)
	org.ovirt.engine.core.aaa.filters.EngineSessionTokenAuthenticationFilter.doFilter(EngineSessionTokenAuthenticationFilter.java:31)
	org.ovirt.engine.core.aaa.filters.RestApiSessionValidationFilter.doFilter(RestApiSessionValidationFilter.java:35)
	org.ovirt.engine.api.common.security.CSRFProtectionFilter.doFilter(CSRFProtectionFilter.java:111)
	org.ovirt.engine.api.common.security.CSRFProtectionFilter.doFilter(CSRFProtectionFilter.java:102)
	org.ovirt.engine.api.common.security.CORSSupportFilter.doFilter(CORSSupportFilter.java:183)


* no abnormal performance resources were found. *


there is many exception like this in the engine log:
ERROR [org.ovirt.engine.extension.aaa.jdbc.binding.api.AuthnExtension] (ajp-/127.0.0.1:8702-29) [] Unexpected Exception invoking: AAA_AUTHN_AUTHENTICATE_CREDENTIALS[d9605c75-6b43-4b00-b32c-06bdfa80244c]

and from the server logs:
2016-02-22 09:54:09,712 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[default-host].[/ovirt-engine/api].[org.ovirt.engine.api.restapi.BackendApplication]] (ajp-/127.0.0.1:8702-10) JBWEB000236: Servlet.service() for servlet org.ovirt.engine.api.restapi.BackendApplication threw exception: Class: class org.ovirt.engine.core.extensions.mgr.ExtensionInvokeCommandFailedException

*logs attached*



Version-Release number of selected component (if applicable):
rhevm-3.6.2-0.1.el6.noarch

How reproducible:
unknown, we never faced it before even if using higher scale env. in the same version

Steps to Reproduce:
1. not clear yet, we just ran a rest api load for create & remove VM's, that scenario was tested many times before, for the same version.


Actual results:


Expected results:


Additional info:
Comment 1 Eldad Marciano 2016-02-22 05:50:18 EST
When httpd restart, engine back to normal.
Comment 2 Eldad Marciano 2016-02-22 05:59:22 EST
(In reply to Eldad Marciano from comment #1)
> When httpd restart, engine back to normal.

Im bad, back to normal just for very short time, afterwards it becomes unreachable again.

the HotSpotVm is unreachable as well and thread dumps cannot be taken
Comment 3 Oved Ourfali 2016-02-23 01:56:06 EST
Eldad - Please try to reproduce again, with latest 3.6.3, as this might have been caused by network glitches or something else in the environment. Reducing severity until we see a valid reproducer.

Also, this keeps repeating:
Caused by: java.net.UnknownHostException: intel-brickland-02.lab.eng.rdu.redhat.com

Is your DNS functioning properly? Is that just a host with a wrong FQDN?
I suspect you're experiencing "Bug 1296930 - ovirt-engine fails with "too many open files"
Comment 4 Oved Ourfali 2016-03-04 12:44:08 EST
Please reopen if reproduces in latest 3.6 build.

Note You need to log in before you can comment on or make changes to this bug.