Bug 1292544 - [Upgrade][Engine] ovirt-engine does not starts after upgrading 3.5->3.6
Summary: [Upgrade][Engine] ovirt-engine does not starts after upgrading 3.5->3.6
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Setup.Engine
Version: 3.6.1
Hardware: Unspecified
OS: Unspecified
urgent
urgent
Target Milestone: ovirt-3.6.3
: ---
Assignee: Yedidyah Bar David
QA Contact: Pavel Stehlik
URL:
Whiteboard: integration
Depends On:
Blocks: RHEV3.6Upgrade
TreeView+ depends on / blocked
 
Reported: 2015-12-17 17:58 UTC by Gil Klein
Modified: 2019-12-16 05:16 UTC (History)
10 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-01-25 12:59:06 UTC
oVirt Team: Integration
Embargoed:
gklein: ovirt-3.6.z?
sbonazzo: blocker?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?


Attachments (Terms of Use)
log collector (9.12 MB, application/x-xz)
2015-12-17 17:58 UTC, Gil Klein
no flags Details
yum.log (48.40 KB, text/plain)
2015-12-17 18:03 UTC, Gil Klein
no flags Details
messages (158.97 KB, text/plain)
2015-12-17 18:03 UTC, Gil Klein
no flags Details
ovirt-engine startup trace requested by didi (6.37 MB, application/x-gzip)
2016-01-25 07:21 UTC, eric.wong
no flags Details
strace until error in /var/log/messages (12.27 MB, application/x-gzip)
2016-01-25 08:05 UTC, eric.wong
no flags Details
contents of /var/log/ovirt-engine/setup/* (867.38 KB, application/x-gzip)
2016-01-25 08:48 UTC, eric.wong
no flags Details
ovirt-engine.xml.in (13.83 KB, application/xml)
2016-01-25 08:51 UTC, eric.wong
no flags Details

Description Gil Klein 2015-12-17 17:58:07 UTC
Created attachment 1106798 [details]
log collector

Description of problem:
After upgrading the engine from 3.5 -> 3.6 it failed to start. 
 

Version-Release number of selected component (if applicable):


How reproducible:
On one system


Steps to Reproduce:
1. Install 3.5 engine
2. Install DWH + reports
3. Setup the engine 
4. Add 3.6 repos
5. yum update "rhevm-setup*"
6. engine-setup


Actual results:
engine does not start after the upgrade finishes, server.log and engine.log are empty

/var/log/messages shows:

Dec 17 19:31:32 jenkins-vm-02 2015-12-17 19:31:32,419 ovirt-engine: ERROR run:532 Error: process terminated with status code 1


Expected results:
engine should start after a system upgrade


Additional info:
log collector results are attached

Comment 1 Gil Klein 2015-12-17 18:02:00 UTC
Components versions via yum.log

Dec 16 15:28:07 Installed: rhevm-3.5.7-0.1.el6ev.noarch
Dec 17 17:30:14 Installed: rhevm-dwh-setup-3.5.5-1.el6ev.noarch
Dec 17 17:30:15 Installed: rhevm-dwh-3.5.5-1.el6ev.noarch
Dec 17 17:31:12 Installed: rhevm-reports-setup-3.5.5-2.el6ev.noarch
Dec 17 17:31:15 Installed: rhevm-reports-3.5.5-2.el6ev.noarch
Dec 17 18:49:42 Updated: rhevm-reports-setup-3.6.1.1-1.el6ev.noarch
Dec 17 18:49:43 Updated: rhevm-dwh-setup-3.6.1-1.el6ev.noarch
Dec 17 19:03:57 Updated: rhevm-3.6.1.3-0.1.el6.noarch
Dec 17 19:03:59 Updated: rhevm-reports-3.6.1.1-1.el6ev.noarch
Dec 17 19:04:00 Updated: rhevm-dwh-3.6.1-1.el6ev.noarch

Comment 2 Gil Klein 2015-12-17 18:03:20 UTC
Created attachment 1106800 [details]
yum.log

Comment 3 Gil Klein 2015-12-17 18:03:58 UTC
Created attachment 1106801 [details]
messages

Comment 4 Yedidyah Bar David 2015-12-20 10:31:10 UTC
Did you have good logs with 3.5? no old logs in the attached sosreport.

If this is repeatable, please try starting the engine under strace and attach its log.

Comment 5 Oved Ourfali 2015-12-21 08:58:33 UTC
Moving to integration in case this is a setup issue.
Didi - if it turns out to be infra-related, please let me know.

Comment 6 Yedidyah Bar David 2016-01-14 08:31:25 UTC
I connected to the machine where this happened and it seems to have been reinstalled since. Closing for now. Please reopen if it happens again.

Leaving needinfo on Gil in case he wants to comment.

Comment 7 Gil Klein 2016-01-14 11:03:05 UTC
(In reply to Yedidyah Bar David from comment #6)
> I connected to the machine where this happened and it seems to have been
> reinstalled since. Closing for now. Please reopen if it happens again.
> 
> Leaving needinfo on Gil in case he wants to comment.
I'll try to reproduce, and reopen if it happen again.

Comment 8 eric.wong 2016-01-25 07:19:37 UTC
Seem to run into this same issue again.  After upgrading from 3.5 to 3.6.1, ovirt-engine cannot start.  server.log & engine.log both empty.  Only logged event on the system is in /var/log/messages:

 ovirt-engine: ERROR run:532 Error: process terminated with status code 1

system os: CentOS 6.5
selinux: permissive

Comment 9 eric.wong 2016-01-25 07:21:26 UTC
Created attachment 1117829 [details]
ovirt-engine startup trace requested by didi

Comment 10 Yedidyah Bar David 2016-01-25 07:41:49 UTC
Did you get again this error on /var/log/messages?

I think the attached strace still has the engine starting (or trying to), before failing.

Please attach again an strace trace, but keep running strace until you see this error in /var/log/messages, not merely until the shell prompt returns. Thanks!

Comment 11 eric.wong 2016-01-25 08:05:09 UTC
Created attachment 1117833 [details]
strace until error in /var/log/messages

David:

Please find the new strace I did.  This strace is terminated until I saw the error messages in syslog.

Thanks,
Eric

Comment 12 Yedidyah Bar David 2016-01-25 08:29:10 UTC
This strace has:

28787 07:50:55.571717 write(8, "07:50:55,490 ERROR [org.jboss.as.server] JBAS015956: Caught exception during boot: org.jboss.as.controller.persistence.ConfigurationPersistenceException: JBAS014676: Failed to parse configuration
        at org.jboss.as.controller.persistence.XmlConfigurationPersister.load(XmlConfigurationPersister.java:112) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final]
        at org.jboss.as.server.ServerService.boot(ServerService.java:331) [wildfly-server-8.2.1.Final.jar:8.2.1.Final]
        at org.jboss.as.controller.AbstractControllerService$1.run(AbstractControllerService.java:259) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final]
        at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_51]
Caused by: javax.xml.stream.XMLStreamException: JBAS014674: Failed to load module org.jboss.as.web
        at org.jboss.as.controller.parsing.ExtensionXml.parseExtensions(ExtensionXml.java:155) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final]
        at org.jboss.as.server.parsing.StandaloneXml.readServerElement_1_1(StandaloneXml.java:325) [wildfly-server-8.2.1.Final.jar:8.2.1.Final]
        at org.jboss.as.server.parsing.StandaloneXml.readElement(StandaloneXml.java:137) [wildfly-server-8.2.1.Final.jar:8.2.1.Final]
        at org.jboss.as.server.parsing.StandaloneXml.readElement(StandaloneXml.java:107) [wildfly-server-8.2.1.Final.jar:8.2.1.Final]
        at org.jboss.staxmapper.XMLMapperImpl.processNested(XMLMapperImpl.java:110) [staxmapper-1.1.0.Final.jar:1.1.0.Final]
        at org.jboss.staxmapper.XMLMapperImpl.parseDocument(XMLMapperImpl.java:69) [staxmapper-1.1.0.Final.jar:1.1.0.Final]
        at org.jboss.as.controller.persistence.XmlConfigurationPersister.load(XmlConfigurationPersister.java:104) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final]
        ... 3 more
Caused by: java.util.concurrent.ExecutionException: java.lang.UnsupportedOperationException: JBAS013452: Legacy extension 'org.jboss.as.web' is not supported on servers running this version. The extension is only supported for use by hosts running a previous release in a mixed-version managed domain
        at java.util.concurrent.FutureTask.report(FutureTask.java:122) [rt.jar:1.7.0_51]
        at java.util.concurrent.FutureTask.get(FutureTask.java:188) [rt.jar:1.7.0_51]
        at org.jboss.as.controller.parsing.ExtensionXml.parseExtensions(ExtensionXml.java:147) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final]
        ... 9 more
Caused by: java.lang.UnsupportedOperationException: JBAS013452: Legacy extension 'org.jboss.as.web' is not supported on servers running this version. The extension is only supported for use by hosts running a previous release in a mixed-version managed domain
        at org.jboss.as.controller.extension.AbstractLegacyExtension.initializeParsers(AbstractLegacyExtension.java:82) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final]
        at org.jboss.as.controller.parsing.ExtensionXml.loadModule(ExtensionXml.java:183) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final]
        at org.jboss.as.controller.parsing.ExtensionXml.access$000(ExtensionXml.java:69) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final]
        at org.jboss.as.controller.parsing.ExtensionXml$1.call(ExtensionXml.java:127) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final]
        at org.jboss.as.controller.parsing.ExtensionXml$1.call(ExtensionXml.java:124) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final]
        at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_51]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_51]
        at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_51]
        at org.jboss.threads.JBossThread.run(JBossThread.java:122) [jboss-threads-2.1.1.Final.jar:2.1.1.Final]

", 3699 <unfinished ...>

Comment 13 Yedidyah Bar David 2016-01-25 08:36:52 UTC
Eric: Please attach:

/var/log/ovirt-engine/setup/*
/usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in
/var/lib/ovirt-engine/jboss_runtime/config/ovirt-engine.xml

Oved: Any idea?

Comment 14 eric.wong 2016-01-25 08:48:58 UTC
Created attachment 1117860 [details]
contents of /var/log/ovirt-engine/setup/*

Comment 15 eric.wong 2016-01-25 08:51:00 UTC
Created attachment 1117862 [details]
ovirt-engine.xml.in

as for /var/lib/ovirt-engine/jboss_runtime/config/ovirt-engine.xml, I do not even find jboss_runtime under /var/lib/ovirt-engine.

[08:50:34 root.internal:/var/lib/ovirt-engine]
# ll
total 704
drwxr-x---.  2 root  root    4096 Jan 25 06:03 backups
drwxr-xr-x.  2 ovirt ovirt   4096 Dec 10 18:12 content
drwxr-xr-x.  2 ovirt ovirt   4096 Apr  9  2014 deployments
-rw-r--r--.  1 ovirt ovirt      6 Jan 25 07:50 engine.up
-rw-r--r--.  1 ovirt ovirt     32 Apr  9  2014 external_truststore
-rw-r--r--.  1 root  root   18259 Apr 10  2014 reports.xml
drwxr-xr-x.  3 root  root    4096 Dec 10 18:12 setup
-rw-r--r--.  1 root  root  660573 Jul 23  2015 setup-history.txt
drwxr-xr-x. 10 ovirt ovirt   4096 Apr  9  2014 timer-service-data
drwxr-xr-x.  3 ovirt ovirt   4096 Apr  9  2014 tx-object-store

Comment 16 eric.wong 2016-01-25 08:59:45 UTC
Just uploaded the sos report to my DropBox.  Note that query to REST API failed because ovirt is not running.

https://dl.dropboxusercontent.com/u/16707247/sosreport-LogCollector-20160125081534.tar.xz.  

Thanks,
Eric

Comment 17 Yedidyah Bar David 2016-01-25 10:06:17 UTC
(In reply to eric.wong from comment #15)
> Created attachment 1117862 [details]
> ovirt-engine.xml.in
> 
> as for /var/lib/ovirt-engine/jboss_runtime/config/ovirt-engine.xml, I do not
> even find jboss_runtime under /var/lib/ovirt-engine.

That's expected, this directory is created by engine start and removed on stop.
Just wanted to make sure.

However, your ovirt-engine.xml.in is a bit weird. Any chance you modified it before or after the upgrade?

Comment 18 eric.wong 2016-01-25 10:08:10 UTC
No.  I would not dare to hand edit any files in ovirt.

Comment 19 Yedidyah Bar David 2016-01-25 10:27:31 UTC
(In reply to eric.wong from comment #18)
> No.  I would not dare to hand edit any files in ovirt.

What's the output of:

rpm -qfV /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in
ls -l /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in*
rpm -qi ovirt-engine-backend

? Thanks.

Comment 20 eric.wong 2016-01-25 10:29:45 UTC
[08:55:24 root.internal:/var/lib/ovirt-engine]
# rpm -qfV /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in
S.5....T.    /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in


[10:28:36 root.internal:/var/lib/ovirt-engine]
# ls -l /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in*
-rw-r--r--. 1 root root 14158 Jan 21 20:28 /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in



[10:28:50 root.internal:/var/lib/ovirt-engine]
# rpm -qi ovirt-engine-backend
Name        : ovirt-engine-backend         Relocations: (not relocatable)
Version     : 3.6.1.3                           Vendor: (none)
Release     : 1.el6                         Build Date: Thu 10 Dec 2015 06:13:17 PM UTC
Install Date: Thu 21 Jan 2016 01:32:11 AM UTC      Build Host: fc21-vm13.phx.ovirt.org
Group       : Virtualization/Management     Source RPM: ovirt-engine-3.6.1.3-1.el6.src.rpm
Size        : 15204622                         License: ASL 2.0
Signature   : RSA/SHA1, Fri 11 Dec 2015 07:24:10 AM UTC, Key ID ab8c4f9dfe590cb7
URL         : http://www.ovirt.org
Summary     : Engine core of oVirt Engine
Description :
The backend engine of oVirt Engine

Comment 21 Yedidyah Bar David 2016-01-25 10:48:06 UTC
(In reply to eric.wong from comment #20)
> [08:55:24 root.internal:/var/lib/ovirt-engine]
> # rpm -qfV /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in
> S.5....T.   
> /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in

This means that it actually was changed. If not by you personally, then by someone else who has access to this machine (your team/organization/puppet/whatever).

> 
> 
> [10:28:36 root.internal:/var/lib/ovirt-engine]
> # ls -l /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in*
> -rw-r--r--. 1 root root 14158 Jan 21 20:28
> /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in
> 

And this is the (last) time this happened.

> 
> 
> [10:28:50 root.internal:/var/lib/ovirt-engine]
> # rpm -qi ovirt-engine-backend
> Name        : ovirt-engine-backend         Relocations: (not relocatable)
> Version     : 3.6.1.3                           Vendor: (none)
> Release     : 1.el6                         Build Date: Thu 10 Dec 2015

It should have had this timestamp.

> 06:13:17 PM UTC
> Install Date: Thu 21 Jan 2016 01:32:11 AM UTC      Build Host:
> fc21-vm13.phx.ovirt.org
> Group       : Virtualization/Management     Source RPM:
> ovirt-engine-3.6.1.3-1.el6.src.rpm
> Size        : 15204622                         License: ASL 2.0
> Signature   : RSA/SHA1, Fri 11 Dec 2015 07:24:10 AM UTC, Key ID
> ab8c4f9dfe590cb7
> URL         : http://www.ovirt.org
> Summary     : Engine core of oVirt Engine
> Description :
> The backend engine of oVirt Engine

Looking at the file, it seems you wanted to change something about the logging.

I suggest to check the original file from the package and adapt accordingly.

Or better, consider using ovirt-engine-extension-logger-log4j. See e.g. this:

http://lists.ovirt.org/pipermail/users/2016-January/037172.html

Comment 22 eric.wong 2016-01-25 10:53:09 UTC
Hmmm. So will re-run engine-setup get us the original file back

Comment 23 Yedidyah Bar David 2016-01-25 11:00:34 UTC
(In reply to eric.wong from comment #22)
> Hmmm. So will re-run engine-setup get us the original file back

I think it should be enough to:

yum reinstall ovirt-engine-backend

'yum update' will not work, because engine-setup adds it to versionlock. But reinstall should work.

Comment 24 Yedidyah Bar David 2016-01-25 11:01:49 UTC
Of course, if you have something automatic changing it, you'll have to also handle that one.

Comment 25 eric.wong 2016-01-25 11:56:57 UTC
Thanks.  reinstall ovirt-engine-backend & restart ovirt-engine twice.  now we can get back into the GUI once again.  I need to trace what caused the problem first place.  But will do so tomorrow.  long day.  Thank you guys for jumping onto our issue right away.  Very happy with you guys :)

Comment 26 Yedidyah Bar David 2016-01-25 12:58:49 UTC
Very well. Closing for now. Please reopen if relevant. Thanks!


Note You need to log in before you can comment on or make changes to this bug.