Created attachment 1106798 [details] log collector Description of problem: After upgrading the engine from 3.5 -> 3.6 it failed to start. Version-Release number of selected component (if applicable): How reproducible: On one system Steps to Reproduce: 1. Install 3.5 engine 2. Install DWH + reports 3. Setup the engine 4. Add 3.6 repos 5. yum update "rhevm-setup*" 6. engine-setup Actual results: engine does not start after the upgrade finishes, server.log and engine.log are empty /var/log/messages shows: Dec 17 19:31:32 jenkins-vm-02 2015-12-17 19:31:32,419 ovirt-engine: ERROR run:532 Error: process terminated with status code 1 Expected results: engine should start after a system upgrade Additional info: log collector results are attached
Components versions via yum.log Dec 16 15:28:07 Installed: rhevm-3.5.7-0.1.el6ev.noarch Dec 17 17:30:14 Installed: rhevm-dwh-setup-3.5.5-1.el6ev.noarch Dec 17 17:30:15 Installed: rhevm-dwh-3.5.5-1.el6ev.noarch Dec 17 17:31:12 Installed: rhevm-reports-setup-3.5.5-2.el6ev.noarch Dec 17 17:31:15 Installed: rhevm-reports-3.5.5-2.el6ev.noarch Dec 17 18:49:42 Updated: rhevm-reports-setup-3.6.1.1-1.el6ev.noarch Dec 17 18:49:43 Updated: rhevm-dwh-setup-3.6.1-1.el6ev.noarch Dec 17 19:03:57 Updated: rhevm-3.6.1.3-0.1.el6.noarch Dec 17 19:03:59 Updated: rhevm-reports-3.6.1.1-1.el6ev.noarch Dec 17 19:04:00 Updated: rhevm-dwh-3.6.1-1.el6ev.noarch
Created attachment 1106800 [details] yum.log
Created attachment 1106801 [details] messages
Did you have good logs with 3.5? no old logs in the attached sosreport. If this is repeatable, please try starting the engine under strace and attach its log.
Moving to integration in case this is a setup issue. Didi - if it turns out to be infra-related, please let me know.
I connected to the machine where this happened and it seems to have been reinstalled since. Closing for now. Please reopen if it happens again. Leaving needinfo on Gil in case he wants to comment.
(In reply to Yedidyah Bar David from comment #6) > I connected to the machine where this happened and it seems to have been > reinstalled since. Closing for now. Please reopen if it happens again. > > Leaving needinfo on Gil in case he wants to comment. I'll try to reproduce, and reopen if it happen again.
Seem to run into this same issue again. After upgrading from 3.5 to 3.6.1, ovirt-engine cannot start. server.log & engine.log both empty. Only logged event on the system is in /var/log/messages: ovirt-engine: ERROR run:532 Error: process terminated with status code 1 system os: CentOS 6.5 selinux: permissive
Created attachment 1117829 [details] ovirt-engine startup trace requested by didi
Did you get again this error on /var/log/messages? I think the attached strace still has the engine starting (or trying to), before failing. Please attach again an strace trace, but keep running strace until you see this error in /var/log/messages, not merely until the shell prompt returns. Thanks!
Created attachment 1117833 [details] strace until error in /var/log/messages David: Please find the new strace I did. This strace is terminated until I saw the error messages in syslog. Thanks, Eric
This strace has: 28787 07:50:55.571717 write(8, "07:50:55,490 ERROR [org.jboss.as.server] JBAS015956: Caught exception during boot: org.jboss.as.controller.persistence.ConfigurationPersistenceException: JBAS014676: Failed to parse configuration at org.jboss.as.controller.persistence.XmlConfigurationPersister.load(XmlConfigurationPersister.java:112) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final] at org.jboss.as.server.ServerService.boot(ServerService.java:331) [wildfly-server-8.2.1.Final.jar:8.2.1.Final] at org.jboss.as.controller.AbstractControllerService$1.run(AbstractControllerService.java:259) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final] at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_51] Caused by: javax.xml.stream.XMLStreamException: JBAS014674: Failed to load module org.jboss.as.web at org.jboss.as.controller.parsing.ExtensionXml.parseExtensions(ExtensionXml.java:155) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final] at org.jboss.as.server.parsing.StandaloneXml.readServerElement_1_1(StandaloneXml.java:325) [wildfly-server-8.2.1.Final.jar:8.2.1.Final] at org.jboss.as.server.parsing.StandaloneXml.readElement(StandaloneXml.java:137) [wildfly-server-8.2.1.Final.jar:8.2.1.Final] at org.jboss.as.server.parsing.StandaloneXml.readElement(StandaloneXml.java:107) [wildfly-server-8.2.1.Final.jar:8.2.1.Final] at org.jboss.staxmapper.XMLMapperImpl.processNested(XMLMapperImpl.java:110) [staxmapper-1.1.0.Final.jar:1.1.0.Final] at org.jboss.staxmapper.XMLMapperImpl.parseDocument(XMLMapperImpl.java:69) [staxmapper-1.1.0.Final.jar:1.1.0.Final] at org.jboss.as.controller.persistence.XmlConfigurationPersister.load(XmlConfigurationPersister.java:104) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final] ... 3 more Caused by: java.util.concurrent.ExecutionException: java.lang.UnsupportedOperationException: JBAS013452: Legacy extension 'org.jboss.as.web' is not supported on servers running this version. The extension is only supported for use by hosts running a previous release in a mixed-version managed domain at java.util.concurrent.FutureTask.report(FutureTask.java:122) [rt.jar:1.7.0_51] at java.util.concurrent.FutureTask.get(FutureTask.java:188) [rt.jar:1.7.0_51] at org.jboss.as.controller.parsing.ExtensionXml.parseExtensions(ExtensionXml.java:147) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final] ... 9 more Caused by: java.lang.UnsupportedOperationException: JBAS013452: Legacy extension 'org.jboss.as.web' is not supported on servers running this version. The extension is only supported for use by hosts running a previous release in a mixed-version managed domain at org.jboss.as.controller.extension.AbstractLegacyExtension.initializeParsers(AbstractLegacyExtension.java:82) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final] at org.jboss.as.controller.parsing.ExtensionXml.loadModule(ExtensionXml.java:183) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final] at org.jboss.as.controller.parsing.ExtensionXml.access$000(ExtensionXml.java:69) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final] at org.jboss.as.controller.parsing.ExtensionXml$1.call(ExtensionXml.java:127) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final] at org.jboss.as.controller.parsing.ExtensionXml$1.call(ExtensionXml.java:124) [wildfly-controller-8.2.1.Final.jar:8.2.1.Final] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_51] at org.jboss.threads.JBossThread.run(JBossThread.java:122) [jboss-threads-2.1.1.Final.jar:2.1.1.Final] ", 3699 <unfinished ...>
Eric: Please attach: /var/log/ovirt-engine/setup/* /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in /var/lib/ovirt-engine/jboss_runtime/config/ovirt-engine.xml Oved: Any idea?
Created attachment 1117860 [details] contents of /var/log/ovirt-engine/setup/*
Created attachment 1117862 [details] ovirt-engine.xml.in as for /var/lib/ovirt-engine/jboss_runtime/config/ovirt-engine.xml, I do not even find jboss_runtime under /var/lib/ovirt-engine. [08:50:34 root.internal:/var/lib/ovirt-engine] # ll total 704 drwxr-x---. 2 root root 4096 Jan 25 06:03 backups drwxr-xr-x. 2 ovirt ovirt 4096 Dec 10 18:12 content drwxr-xr-x. 2 ovirt ovirt 4096 Apr 9 2014 deployments -rw-r--r--. 1 ovirt ovirt 6 Jan 25 07:50 engine.up -rw-r--r--. 1 ovirt ovirt 32 Apr 9 2014 external_truststore -rw-r--r--. 1 root root 18259 Apr 10 2014 reports.xml drwxr-xr-x. 3 root root 4096 Dec 10 18:12 setup -rw-r--r--. 1 root root 660573 Jul 23 2015 setup-history.txt drwxr-xr-x. 10 ovirt ovirt 4096 Apr 9 2014 timer-service-data drwxr-xr-x. 3 ovirt ovirt 4096 Apr 9 2014 tx-object-store
Just uploaded the sos report to my DropBox. Note that query to REST API failed because ovirt is not running. https://dl.dropboxusercontent.com/u/16707247/sosreport-LogCollector-20160125081534.tar.xz. Thanks, Eric
(In reply to eric.wong from comment #15) > Created attachment 1117862 [details] > ovirt-engine.xml.in > > as for /var/lib/ovirt-engine/jboss_runtime/config/ovirt-engine.xml, I do not > even find jboss_runtime under /var/lib/ovirt-engine. That's expected, this directory is created by engine start and removed on stop. Just wanted to make sure. However, your ovirt-engine.xml.in is a bit weird. Any chance you modified it before or after the upgrade?
No. I would not dare to hand edit any files in ovirt.
(In reply to eric.wong from comment #18) > No. I would not dare to hand edit any files in ovirt. What's the output of: rpm -qfV /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in ls -l /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in* rpm -qi ovirt-engine-backend ? Thanks.
[08:55:24 root.internal:/var/lib/ovirt-engine] # rpm -qfV /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in S.5....T. /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in [10:28:36 root.internal:/var/lib/ovirt-engine] # ls -l /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in* -rw-r--r--. 1 root root 14158 Jan 21 20:28 /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in [10:28:50 root.internal:/var/lib/ovirt-engine] # rpm -qi ovirt-engine-backend Name : ovirt-engine-backend Relocations: (not relocatable) Version : 3.6.1.3 Vendor: (none) Release : 1.el6 Build Date: Thu 10 Dec 2015 06:13:17 PM UTC Install Date: Thu 21 Jan 2016 01:32:11 AM UTC Build Host: fc21-vm13.phx.ovirt.org Group : Virtualization/Management Source RPM: ovirt-engine-3.6.1.3-1.el6.src.rpm Size : 15204622 License: ASL 2.0 Signature : RSA/SHA1, Fri 11 Dec 2015 07:24:10 AM UTC, Key ID ab8c4f9dfe590cb7 URL : http://www.ovirt.org Summary : Engine core of oVirt Engine Description : The backend engine of oVirt Engine
(In reply to eric.wong from comment #20) > [08:55:24 root.internal:/var/lib/ovirt-engine] > # rpm -qfV /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in > S.5....T. > /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in This means that it actually was changed. If not by you personally, then by someone else who has access to this machine (your team/organization/puppet/whatever). > > > [10:28:36 root.internal:/var/lib/ovirt-engine] > # ls -l /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in* > -rw-r--r--. 1 root root 14158 Jan 21 20:28 > /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.xml.in > And this is the (last) time this happened. > > > [10:28:50 root.internal:/var/lib/ovirt-engine] > # rpm -qi ovirt-engine-backend > Name : ovirt-engine-backend Relocations: (not relocatable) > Version : 3.6.1.3 Vendor: (none) > Release : 1.el6 Build Date: Thu 10 Dec 2015 It should have had this timestamp. > 06:13:17 PM UTC > Install Date: Thu 21 Jan 2016 01:32:11 AM UTC Build Host: > fc21-vm13.phx.ovirt.org > Group : Virtualization/Management Source RPM: > ovirt-engine-3.6.1.3-1.el6.src.rpm > Size : 15204622 License: ASL 2.0 > Signature : RSA/SHA1, Fri 11 Dec 2015 07:24:10 AM UTC, Key ID > ab8c4f9dfe590cb7 > URL : http://www.ovirt.org > Summary : Engine core of oVirt Engine > Description : > The backend engine of oVirt Engine Looking at the file, it seems you wanted to change something about the logging. I suggest to check the original file from the package and adapt accordingly. Or better, consider using ovirt-engine-extension-logger-log4j. See e.g. this: http://lists.ovirt.org/pipermail/users/2016-January/037172.html
Hmmm. So will re-run engine-setup get us the original file back
(In reply to eric.wong from comment #22) > Hmmm. So will re-run engine-setup get us the original file back I think it should be enough to: yum reinstall ovirt-engine-backend 'yum update' will not work, because engine-setup adds it to versionlock. But reinstall should work.
Of course, if you have something automatic changing it, you'll have to also handle that one.
Thanks. reinstall ovirt-engine-backend & restart ovirt-engine twice. now we can get back into the GUI once again. I need to trace what caused the problem first place. But will do so tomorrow. long day. Thank you guys for jumping onto our issue right away. Very happy with you guys :)
Very well. Closing for now. Please reopen if relevant. Thanks!