Bug 889217 - RHEV-M 3.1 instance died after RHSA-2012:1592-1 installation
Summary: RHEV-M 3.1 instance died after RHSA-2012:1592-1 installation
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine-setup
Version: 3.1.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.2.0
Assignee: Juan Hernández
QA Contact: Martin Pavlik
URL:
Whiteboard: infra
Depends On:
Blocks: 891632 915537
Reported: 2012-12-20 14:35 UTC by Petr Spacek
Modified: 2016-02-10 19:33 UTC

Fixed In Version: sf-5
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 891632 (view as bug list)
Environment:
Last Closed:
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Links
System: Red Hat Issue Tracker
ID: MODULES-105
Private: 0
Priority: Major
Status: Resolved
Summary: Package indexes older than the jar should be ignored
Last Updated: 2016-11-17 16:05:15 UTC

Description Petr Spacek 2012-12-20 14:35:17 UTC
Description of problem:
Our RHEV-M 3.1 instance died after installing RHSA-2012:1592-1 (plus updates to some other packages, such as the kernel). I ran yum update and then rebooted. Since the reboot, all web pages served by RHEV-M are blank and return 404. RHEV-M is unusable.

The logs point to some database issue, but there is absolutely nothing related to postgresql in /var/log/messages.

Version-Release number of selected component (if applicable):
rhevm-3.1.0-32.el6ev.noarch

How reproducible:
?

Steps to Reproduce (in my case):
1. yum update
2. reboot
  
Actual results:
wget 'https://rhevm.example.com/' --no-check-certificate
2012-12-20 15:31:16 ERROR 404: Not Found.


Expected results:
RHEV-M survived this update :-)

Comment 2 Juan Hernández 2012-12-20 19:59:12 UTC
The relevant error message in this case is this one:

2012-12-20 20:10:16,887 ERROR [org.jboss.as.server.deployment] (MSC service thread 1-1) JBAS015892: Deployment unit processor org.jboss.as.web.deployment.ELExpressionFactoryProcessor@6ca85bcc unexpectedly threw an exception during undeploy phase POST_MODULE of deployment "engine.ear": java.lang.NoClassDefFoundError: org/jboss/el/cache/FactoryFinderCache

That missing class is part of the jboss-el-api_2.2_spec-1.0.2-2.Final_redhat_1.ep6.el6 package but not of the previous version of the package, and the index file that jboss generates and uses to speed up class loading has not been updated:

# ls -l /usr/share/jbossas/modules/javax/el/api/main/jboss-el-api_2.2_spec.jar.index
-rw-r--r--. 1 jbossas jbossas 140 Dec 20 20:33 /usr/share/jbossas/modules/javax/el/api/main/jboss-el-api_2.2_spec.jar.index
# cat /usr/share/jbossas/modules/javax/el/api/main/jboss-el-api_2.2_spec.jar.index

META-INF/maven
javax
javax/el
META-INF/maven/org.jboss.spec.javax.el
META-INF/maven/org.jboss.spec.javax.el/jboss-el-api_2.2_spec
META-INF

These files were not refreshed when the jboss package was updated because they are not part of the package; they are generated when the jboss service is started. We don't start the jboss service, though, but the ovirt-engine service, which runs as user ovirt and has no permission to write to /usr/share/jbossas/modules, so it cannot rewrite them.

Users that have started the jbossas service will have this same problem.

The workaround to the problem is to remove those index files and restart the engine:

# find /usr/share/jbossas/modules -name '*.jar.index' -delete
# service ovirt-engine restart
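
Before deleting anything, it may help to see which index files exist. The following is a minimal sketch, assuming a POSIX shell with find available. The function name list_jar_indexes is hypothetical and is not part of ovirt-engine or jboss:

```shell
# Hypothetical helper: print the *.jar.index files under a modules
# directory without deleting anything. The directory argument defaults
# to the path used in the workaround above.
list_jar_indexes() {
    modules_dir="${1:-/usr/share/jbossas/modules}"
    find "$modules_dir" -type f -name '*.jar.index' -print | sort
}
```

Running list_jar_indexes with no argument lists the same files that the find command in the workaround would delete.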

To fix this in ovirt-engine we can modify the service script to remove or update these files before starting, while it is still running as root.

Comment 3 Juan Hernández 2012-12-20 21:50:06 UTC
The proposed change to solve this issue is here:

http://gerrit.ovirt.org/10292

The service script will remove the out of date indexes before starting the engine.

Comment 5 Juan Hernández 2012-12-21 10:36:51 UTC
Note the behaviour of jboss modules has been changed upstream to skip out of date indexes as well:

https://issues.jboss.org/browse/MODULES-105

According to the comments in that bug the external indexes will probably disappear in the future.

Comment 7 Alon Bar-Lev 2013-01-07 10:38:09 UTC
Hi,

This issue is rhel specific.

On fedora:
# ls -lad /usr/share/jboss-as/modules/
drwxr-xr-x. 12 root root 4096 Aug  2 10:09 /usr/share/jboss-as/modules/

On rhel:
# ls -lad /usr/share/jbossas/modules/
drwxr-xr-x. 13 jboss jboss 4096 Jun 19  2012 /usr/share/jbossas/modules/

The jboss service does not run as root, so on fedora it won't create index files, while on rhel it will.

Had jboss given group write permission, the solution would have been to add ovirt-engine user to the jboss group.

I think this is actually a jboss packaging bug: the package should have removed the indexes when it was updated.

The workaround proposed in comment#5 is acceptable for downstream only, but I suggest opening two issues for jboss:

1. Writable jboss:jboss directory should have g+w.
2. Post install action of jboss should clean up any cache created.

Thanks,
Alon

Comment 8 Juan Hernández 2013-01-08 13:05:41 UTC
The jboss process runs with the identity selected by the user, it is just a matter of running /usr/share/jboss-as/bin/standalone.sh directly. That identity can well be root. So this issue is less likely in Fedora, but it can happen as well.

Comment 9 Alon Bar-Lev 2013-01-08 13:12:06 UTC
(In reply to comment #8)
> The jboss process runs with the identity selected by the user, it is just a
> matter of running /usr/share/jboss-as/bin/standalone.sh directly. That
> identity can well be root. So this issue is less likely in Fedora, but it
> can happen as well.

Right. There are many daemons out there that, once executed as root, have issues when later executed as the standard user.

In the case you mention, after running standalone.sh as root, trying to start the standard jboss daemon results in a failure. I am not sure we should solve fundamentally wrong design issues of jboss.

On rhel it is even worse, as running standalone.sh as root will create indexes owned by root, which is a violation of the permissions of the directory structure, right?

Comment 10 Juan Hernández 2013-01-09 13:42:19 UTC
The patch proposed in comment #3 has been modified to create a temporary copy of the modules directory without the indexes and it has been merged upstream:

http://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=commit;h=a07ba1df50e34e2d76e824e9b1d054249e696299

Comment 11 Juan Hernández 2013-01-09 13:50:28 UTC
Also note that removing the indexes that are older than the .jar files is not enough, as the modification date of the files is the date the .rpm package was created, and the .index files can be older than that but still out of date. That is why we need to either remove *all* the indexes or, as we do in the patch, create a temporary copy of the modules directory without them.
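
The "temporary copy without indexes" idea can be sketched roughly as follows. This is an illustration only and not the code of the merged patch: the function name copy_modules_without_indexes and the plain cp approach are assumptions, and the merged change may build the copy differently:

```shell
# Hypothetical sketch: rebuild a temporary copy of a modules tree and
# drop every *.jar.index file from the copy. The source tree is never
# modified, so no write permission on it is needed.
copy_modules_without_indexes() {
    src="$1"
    dst="$2"
    # Refuse to run with missing arguments (avoids rm -rf on an empty path)
    if [ -z "$src" ] || [ -z "$dst" ]; then
        return 1
    fi
    # Start from a clean destination on every invocation
    rm -rf "$dst"
    mkdir -p "$dst"
    # Copy the whole tree, then remove all indexes from the copy,
    # regardless of their modification time
    cp -a "$src/." "$dst/"
    find "$dst" -type f -name '*.jar.index' -delete
}
```

Because every index is dropped from the copy, the result does not depend on comparing file dates, which is the weakness described above.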

Comment 14 Juan Hernández 2013-05-20 10:39:35 UTC
To verify this, start the engine and make sure that it is using the /var/tmp/ovirt-engine/modules directory and that the directory is created every time the engine starts:

1. With the engine stopped make sure that the /var/tmp/ovirt-engine/modules directory does *not* exist.

2. Start the engine and make sure that it is running with the option "-mp /usr/share/ovirt-engine/modules:/var/tmp/ovirt-engine/modules" (you can use "ps -u ovirt -f" to check that).

If the engine is using /usr/share/jbossas/modules instead of /var/tmp/ovirt-engine/modules then it isn't working.
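
Step 2 can also be checked mechanically. The following is a minimal sketch, assuming the engine command line is captured with "ps -u ovirt -o args="; the function name engine_uses_temp_modules is hypothetical:

```shell
# Hypothetical check: succeed only if the given command line contains
# the module path option expected in step 2.
engine_uses_temp_modules() {
    # $1: full command line of the engine process
    case "$1" in
        *"-mp /usr/share/ovirt-engine/modules:/var/tmp/ovirt-engine/modules"*)
            return 0 ;;
        *)
            return 1 ;;
    esac
}
```

A possible use is: engine_uses_temp_modules "$(ps -u ovirt -o args=)" && echo "temporary modules directory in use".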

Comment 15 Martin Pavlik 2013-05-20 14:28:08 UTC
works in SF17

[root@mp-rhevm32 ~]# service ovirt-engine stop
Stopping engine-service:                                    [  OK  ]

[root@mp-rhevm32 ~]# ls -lad /var/tmp/ovirt-engine/modules
ls: cannot access /var/tmp/ovirt-engine/modules: No such file or directory

[root@mp-rhevm32 ~]# service ovirt-engine start
Starting engine-service:                                    [  OK  ]

ps -u ovirt -fc | less

UID        PID  PPID  C STIME TTY          TIME CMD
ovirt    22810     1 99 16:21 ?        00:00:10 engine-service -server -XX:+TieredCompilation -Xms1g -Xmx1g -XX:PermSize=256m -XX:MaxPermSize=256m -Djava.net.preferIPv4Stack=true -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -Djava.awt.headless=true -Djava.util.logging.manager=org.jboss.logmanager -Dlogging.configuration=file:///var/tmp/ovirt-engine/engine-service-logging.properties -Dorg.jboss.resolver.warning=true -Djboss.modules.system.pkgs=org.jboss.byteman -Djboss.server.default.config=engine-service -Djboss.home.dir=/usr/share/jbossas -Djboss.server.base.dir=/usr/share/ovirt-engine -Djboss.server.config.dir=/var/tmp/ovirt-engine -Djboss.server.data.dir=/var/lib/ovirt-engine -Djboss.server.log.dir=/var/log/ovirt-engine -Djboss.server.temp.dir=/var/tmp/ovirt-engine -Djboss.controller.temp.dir=/var/tmp/ovirt-engine -jar /usr/share/jbossas/jboss-modules.jar -mp /usr/share/ovirt-engine/modules:/var/tmp/ovirt-engine/modules -jaxpmodule javax.xml.jaxp-provider org.jboss.as.standalone -c engine-service.xml

[root@mp-rhevm32 ~]# ls /var/tmp/ovirt-engine/modules/
system

Comment 16 Itamar Heim 2013-06-11 08:37:33 UTC
3.2 has been released

