Bug 1403956 - After logging in to the engine, the "Operation Canceled: Error while executing action: A Request to the Server failed: Unable to evaluate payload" error appears and no further actions are possible.
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Frontend.WebAdmin
Version: 4.1.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.1.0-beta
Target Release: ---
Assignee: Alexander Wels
QA Contact: Pavel Stehlik
URL:
Whiteboard:
Depends On:
Blocks: 1264085 1313916 1361511
 
Reported: 2016-12-12 17:21 UTC by Nikolai Sednev
Modified: 2016-12-19 14:39 UTC
CC: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-19 14:39:01 UTC
oVirt Team: UX
Embargoed:
oourfali: ovirt-4.1?
nsednev: planning_ack?
nsednev: devel_ack?
nsednev: testing_ack?


Attachments
Screenshot from 2016-12-12 19-22-22.png (16.75 KB, image/png), attached 2016-12-12 17:23 UTC by Nikolai Sednev
Screenshot from 2016-12-12 19-24-00.png (13.28 KB, image/png), attached 2016-12-12 17:24 UTC by Nikolai Sednev
engine.log (427.03 KB, text/plain), attached 2016-12-12 17:26 UTC by Nikolai Sednev
sosreport from alma04 (8.63 MB, application/x-xz), attached 2016-12-12 17:31 UTC by Nikolai Sednev
server.log (341.79 KB, text/plain), attached 2016-12-13 15:46 UTC by Nikolai Sednev
ui.log (1.11 KB, text/plain), attached 2016-12-13 15:46 UTC by Nikolai Sednev

Description Nikolai Sednev 2016-12-12 17:21:30 UTC
Description of problem:
After logging in to the engine, the "Operation Canceled: Error while executing action: A Request to the Server failed: Unable to evaluate payload" error appears and nothing further can be done in the UI.

Version-Release number of selected component (if applicable):
These components were installed on the host:
ovirt-engine-appliance-4.1-20161202.1.el7.centos.noarch
ovirt-imageio-common-0.5.0-0.201611201242.gitb02532b.el7.centos.noarch
ovirt-setup-lib-1.1.0-0.0.master.20161107100014.gitb73abeb.el7.centos.noarch
ovirt-hosted-engine-setup-2.1.0-0.0.master.20161130101611.gitb3ad261.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
qemu-kvm-rhev-2.6.0-28.el7_3.2.x86_64
ovirt-host-deploy-1.6.0-0.0.master.20161107121647.gitfd7ddcd.el7.centos.noarch
rhev-release-4.0.6-6-001.noarch
sanlock-3.4.0-1.el7.x86_64
mom-0.5.8-1.el7ev.noarch
ovirt-imageio-daemon-0.5.0-0.201611201242.gitb02532b.el7.centos.noarch
vdsm-4.18.999-1020.git1ff41b1.el7.centos.x86_64
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
ovirt-release41-pre-4.1.0-0.0.beta.20161201085255.git731841c.el7.centos.noarch
ovirt-hosted-engine-ha-2.1.0-0.0.master.20161130135331.20161130135328.git3541725.el7.centos.noarch
libvirt-client-2.0.0-10.el7_3.2.x86_64
ovirt-vmconsole-1.0.4-1.el7ev.noarch
Linux version 3.10.0-514.2.2.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Wed Nov 16 13:15:13 EST 2016
Linux 3.10.0-514.2.2.el7.x86_64 #1 SMP Wed Nov 16 13:15:13 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.3 (Maipo)

Used ovirt-engine-appliance-4.1-20161202.1.el7.centos.noarch for the engine.
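
For reference, a hedged way to collect the same component inventory on the host; the grep pattern and exact package set are an assumption, not taken from this report:

rpm -qa | grep -Ei 'ovirt|vdsm|libvirt|qemu|sanlock|mom' | sort
uname -a
cat /etc/redhat-release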

How reproducible:
100%

Steps to Reproduce:
1. Deploy HE using ovirt-engine-appliance-4.1-20161202.1.el7.centos.noarch over NFS (a command-level sketch follows below).
2. Once the deployment has finished, try logging in to the engine's webadmin portal.
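
A minimal sketch of step 1, assuming the standard hosted-engine deployment flow; the repository setup and the NFS export path are hypothetical and the interactive answers are abbreviated:

# on a clean RHEL/CentOS 7.3 host with the oVirt 4.1 pre-release repos enabled
yum install -y ovirt-hosted-engine-setup ovirt-engine-appliance
hosted-engine --deploy
# when prompted for the storage type, choose the NFS option and point it at the
# export backing the hosted-engine storage domain
# (hypothetical example: storage.example.com:/exports/he)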

Actual results:
Operation Canceled
Error while executing action: A Request to the Server failed: Unable to evaluate payload

Expected results:
The webadmin UI should load normally.

Additional info:

Comment 1 Nikolai Sednev 2016-12-12 17:23:42 UTC
Created attachment 1230857 [details]
Screenshot from 2016-12-12 19-22-22.png

Comment 2 Nikolai Sednev 2016-12-12 17:24:27 UTC
Created attachment 1230858 [details]
Screenshot from 2016-12-12 19-24-00.png

Comment 3 Nikolai Sednev 2016-12-12 17:26:37 UTC
Created attachment 1230859 [details]
engine.log

Comment 4 Nikolai Sednev 2016-12-12 17:31:30 UTC
Created attachment 1230873 [details]
sosreport from alma04

Comment 5 Oved Ourfali 2016-12-13 13:02:06 UTC
Please also attach the server.log and ui.log.
Also, which JBoss version are you working with?

Comment 6 Nikolai Sednev 2016-12-13 15:46:06 UTC
Created attachment 1231255 [details]
server.log

Comment 7 Nikolai Sednev 2016-12-13 15:46:27 UTC
Created attachment 1231256 [details]
ui.log

Comment 8 Nikolai Sednev 2016-12-13 15:49:53 UTC
(In reply to Oved Ourfali from comment #5)
> Please also attach the server.log and ui.log.
> Also, which JBoss version are you working with?

From /var/log/ovirt-engine/boot.log:

19:10:29,183 INFO  [org.jboss.modules] JBoss Modules version 1.5.2.Final
19:10:29,502 INFO  [org.jboss.msc] JBoss MSC version 1.2.6.Final
19:10:29,626 INFO  [org.jboss.as] WFLYSRV0049: WildFly Full 10.1.0.Final (WildFly Core 2.2.0.Final) starting
19:10:31,902 INFO  [org.jboss.as.server.deployment.scanner] WFLYDS0004: Found restapi.war in deployment directory. To trigger deployment create a file called restapi.war.dodeploy
19:10:31,903 INFO  [org.jboss.as.server.deployment.scanner] WFLYDS0004: Found engine.ear in deployment directory. To trigger deployment create a file called engine.ear.dodeploy
19:10:31,903 INFO  [org.jboss.as.server.deployment.scanner] WFLYDS0004: Found apidoc.war in deployment directory. To trigger deployment create a file called apidoc.war.dodeploy
19:10:31,934 INFO  [org.jboss.as.server] WFLYSRV0039: Creating http management service using socket-binding (management)
19:10:31,962 INFO  [org.xnio] XNIO version 3.4.0.Final
19:10:31,973 INFO  [org.xnio.nio] XNIO NIO Implementation Version 3.4.0.Final
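
A hedged one-liner to pull just those version lines out of boot.log (the grep pattern is an assumption):

grep -E 'JBoss Modules version|JBoss MSC version|WildFly' /var/log/ovirt-engine/boot.log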

Comment 9 Alexander Wels 2016-12-13 16:31:57 UTC
It looks like there might be an issue with the database connection:

2016-12-12 19:07:50,310+02 ERROR [org.quartz.impl.jdbcjobstore.JobStoreCMT] (QuartzScheduler_QuartzOvirtDBScheduler-NON_CLUSTERED_MisfireHandler) MisfireHandler: Error handling misfires: Failed to obtain DB connection from data source 'NMEngineDS': java.sql.SQLException: Could not retrieve datasource via JNDI url 'java:/ENGINEDataSourceNoJTA' java.sql.SQLException: javax.resource.ResourceException: IJ000470: You are trying to use a connection factory that has been shut down: java:/ENGINEDataSourceNoJTA: org.quartz.JobPersistenceException: Failed to obtain DB connection from data source 'NMEngineDS': java.sql.SQLException: Could not retrieve datasource via JNDI url 'java:/ENGINEDataSourceNoJTA' java.sql.SQLException: javax.resource.ResourceException: IJ000470: You are trying to use a connection factory that has been shut down: java:/ENGINEDataSourceNoJTA [See nested exception: java.sql.SQLException: Could not retrieve datasource via JNDI url 'java:/ENGINEDataSourceNoJTA' java.sql.SQLException: javax.resource.ResourceException: IJ000470: You are trying to use a connection factory that has been shut down: java:/ENGINEDataSourceNoJTA]
	at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:168) [quartz.jar:]
	at org.quartz.impl.jdbcjobstore.JobStoreSupport.doRecoverMisfires(JobStoreSupport.java:3169) [quartz.jar:]
	at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.manage(JobStoreSupport.java:3946) [quartz.jar:]
	at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.run(JobStoreSupport.java:3967) [quartz.jar:]
Caused by: java.sql.SQLException: Could not retrieve datasource via JNDI url 'java:/ENGINEDataSourceNoJTA' java.sql.SQLException: javax.resource.ResourceException: IJ000470: You are trying to use a connection factory that has been shut down: java:/ENGINEDataSourceNoJTA
	at org.quartz.utils.JNDIConnectionProvider.getConnection(JNDIConnectionProvider.java:163) [quartz.jar:]
	at org.quartz.utils.DBConnectionManager.getConnection(DBConnectionManager.java:108) [quartz.jar:]
	at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:165) [quartz.jar:]
	... 3 more
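
A minimal sketch of how to verify whether the database behind that datasource is reachable at all; the service name and the database name "engine" are the usual defaults and are assumptions here:

systemctl status postgresql
su - postgres -c "psql -d engine -c 'SELECT 1;'"
# if this succeeds, the IJ000470 message suggests the WildFly connection pool was
# shut down rather than PostgreSQL itself being unreachable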

Comment 10 Alexander Wels 2016-12-13 16:32:54 UTC
Are you sure you configured everything correctly? Judging by the database connection issue, that looks like the root problem, and everything else follows from it.

Comment 11 Nikolai Sednev 2016-12-14 07:46:09 UTC
(In reply to Alexander Wels from comment #10)
> Are you sure you configured everything correctly? Judging by the database
> connection issue, that looks like the root problem, and everything else
> follows from it.

This was a simple deployment on a fresh environment.
I used the repos from Mojo for the host and http://www.ovirt.org/release/4.1.0/ for the upstream engine. This deployment flow worked correctly before. BTW, I can get into the User Portal without any issue; the only affected part here is the webadmin portal.

Comment 12 Oved Ourfali 2016-12-14 07:55:11 UTC
Can you try to restart the DB and the engine?
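
A minimal sketch of that on the engine VM, assuming the bundled PostgreSQL runs under the default el7 service name:

systemctl restart postgresql
systemctl restart ovirt-engine
systemctl status postgresql ovirt-engine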

Comment 13 Oved Ourfali 2016-12-14 07:55:47 UTC
Sandro - can you also take a look at the HE deployment?

Comment 14 Nikolai Sednev 2016-12-14 09:01:51 UTC
(In reply to Oved Ourfali from comment #12)
> Can you try to restart the DB and the engine?

I've already tried restarting the whole environment several times, without any effect.

Comment 15 Oved Ourfali 2016-12-14 09:04:14 UTC
Restoring needinfo on Sandro.

Comment 16 Simone Tiraboschi 2016-12-14 10:01:13 UTC
From the hosted-engine-setup point of view, everything here worked as expected.

2016-12-12 19:04:32 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND                 |- [ INFO  ] Stage: Termination
2016-12-12 19:04:32 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND                 |- [ INFO  ] Execution of setup completed successfully
2016-12-12 19:04:32 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND                 |- HE_APPLIANCE_ENGINE_SETUP_SUCCESS
...
2016-12-12 19:08:01 INFO otopi.plugins.gr_he_common.core.misc misc._terminate:188 Hosted Engine successfully deployed
2016-12-12 19:08:01 DEBUG otopi.context context._executeMethod:128 Stage terminate METHOD otopi.plugins.otopi.dialog.human.Plugin._terminate
2016-12-12 19:08:02 DEBUG otopi.context context._executeMethod:128 Stage terminate METHOD otopi.plugins.otopi.dialog.machine.Plugin._terminate
2016-12-12 19:08:02 DEBUG otopi.context context._executeMethod:134 condition False
2016-12-12 19:08:02 DEBUG otopi.context context._executeMethod:128 Stage terminate METHOD otopi.plugins.otopi.core.log.Plugin._terminate
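
For the record, a hedged way to confirm the same outcome directly on the host; the log location is the standard hosted-engine-setup directory and is assumed here:

grep -l 'Hosted Engine successfully deployed' /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-*.log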

Comment 17 Simone Tiraboschi 2016-12-14 10:09:57 UTC
In my opinion, the issue is that at some point, for some reason, the DBMS died:

LOG:  database system was interrupted; last known up at 2016-12-14 10:16:09 IST
FATAL:  the database system is starting up
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 0/180CD728
LOG:  record with zero length at 0/1812C488
LOG:  redo done at 0/1812C458
LOG:  last completed transaction was at log time 2016-12-14 10:17:29.59582+02
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
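
A hedged sketch of where to look for the cause of that interruption; the paths assume the default el7 PostgreSQL layout and are not taken from this report:

# PostgreSQL server log: look for FATAL/PANIC entries around the interruption time
grep -iE 'FATAL|PANIC' /var/lib/pgsql/data/pg_log/postgresql-*.log
# check whether the OOM killer or a host reboot took the DB down
grep -iE 'out of memory|killed process' /var/log/messages
last -x | head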

Comment 18 Nikolai Sednev 2016-12-14 10:18:17 UTC
After comment 14 (https://bugzilla.redhat.com/show_bug.cgi?id=1403956#c14) I purged my Firefox cache and was then able to log in to the webadmin portal normally. But we still need to investigate why we hit the issue in comment 17 (https://bugzilla.redhat.com/show_bug.cgi?id=1403956#c17) after a successful HE deployment, which completed OK as I saw it via the CLI.

Comment 19 Alexander Wels 2016-12-19 13:52:11 UTC
Okay, I am going to close this as not a bug, since clearing the cache fixed it.

Comment 20 Nikolai Sednev 2016-12-19 14:32:38 UTC
(In reply to Alexander Wels from comment #19)
> Okay, I am going to close this as not a bug, since clearing the cache fixed it.

Why close this bug?! Look at comments 16 and 17: the DBMS died.
This should be investigated using the logs and resolved.

I'm returning this bug to ASSIGNED.

Comment 21 Oved Ourfali 2016-12-19 14:39:01 UTC
The issue with the cache has nothing to do with the DB.
For some reason your DB was down, and we have no way of knowing why.
If it reproduces, please re-open and give us access to the environment immediately.

