Description of problem:

- Cannot open Spice Console to VM in RHEV-M web GUI.

2015-12-21 13:31:49,001 WARN [org.apache.catalina.core.ContainerBase.[jboss.web].[default-host].[/ovirt-engine/services].[pki-resource]] (ajp-/127.0.0.1:8702-13) JBWEB000234: Servlet pki-resource is currently unavailable

# wget -O rhevm.cer http://rhevm/ca.crt
Resolving rhevm... A.B.C.D
Connecting to rhevm|A.B.C.D|:80... connected.
HTTP request sent, awaiting response... 503 Service Unavailable
2015-12-22 12:06:21 ERROR 503: Service Unavailable.

- It's failing because the pki servlet is unavailable.

How reproducible:
100% (customer site)

Actual results:
Error 500

Expected results:
Open VM Spice Console

Additional info:

The servlet is marked as unavailable because it failed initialization. The initial problem is this:

--
2015-12-19 11:34:56,114 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[default-host].[/ovirt-engine/services].[pki-resource]] (ajp-/127.0.0.1:8702-4) JBWEB000235: Allocate exception for servlet pki-resource: java.lang.NullPointerException
    at org.ovirt.engine.core.common.config.Config.getValue(Config.java:22) [common.jar:]
    at org.ovirt.engine.core.common.config.Config.getValue(Config.java:18) [common.jar:]
    at org.ovirt.engine.core.utils.PKIResources$Resource.<clinit>(PKIResources.java:84) [utils.jar:]
    at org.ovirt.engine.core.services.PKIResourceServlet.<clinit>(PKIResourceServlet.java:26) [classes:]
--

<clinit> is a class's static initializer, and if a static initializer throws any exception, the entire class is unusable for the lifetime of its class loader. There is no way to recover from this; the JVM marks the class as bad.

A NullPointerException in Config.getValue() means that getConfigUtils() returned null, so either Config.setConfigUtils() has not been called or it was called with null. This may be a race condition where the configuration is used before Config.setConfigUtils() has been called.
The only setConfigUtils() call outside of test code is made from an @Singleton @Startup EJB. So this looks like a simple race condition between a servlet and an EJB starting up.

The best solution is to fix the design, but a simple solution would be to make the servlet depend on the EJB, for example by putting this in PKIResourceServlet:

@EJB BackendLocal backend;
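To illustrate the point about <clinit> made in the description, here is a minimal, self-contained sketch in plain Java. FakeConfig and FakeResource are hypothetical stand-ins invented for this example (they are not the engine's Config or PKIResources classes), and the "ca.path" key and its value are made up. The sketch only shows the general JVM behaviour: a class whose static initializer fails stays unusable even after the missing configuration is supplied.

```java
import java.util.Collections;
import java.util.Map;

public class StaticInitDemo {
    // Hypothetical stand-in for a config holder that is populated at startup.
    static class FakeConfig {
        static volatile Map<String, String> values; // null until "startup" sets it

        static String getValue(String key) {
            return values.get(key); // NullPointerException while values is null
        }
    }

    // Hypothetical stand-in for a class that reads config in its static initializer.
    static class FakeResource {
        // Not a compile-time constant, so reading it triggers class initialization.
        static final String PATH = FakeConfig.getValue("ca.path");
    }

    // Tries to use FakeResource and reports what happened.
    static String attempt() {
        try {
            return "ok: " + FakeResource.PATH;
        } catch (Throwable t) { // catching Error is acceptable only in a demo
            return t.getClass().getSimpleName();
        }
    }

    // First attempt runs before config is set, second attempt after.
    static String[] run() {
        String early = attempt();
        FakeConfig.values = Collections.singletonMap("ca.path", "/hypothetical/ca.pem");
        String late = attempt();
        return new String[] { early, late };
    }

    public static void main(String[] args) {
        for (String result : run()) {
            System.out.println(result);
        }
    }
}
```

On a fresh JVM the first attempt reports ExceptionInInitializerError (wrapping the NullPointerException) and the second reports NoClassDefFoundError, even though the configuration is available by then. This matches the observation that only a restart brings the servlet back.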
Something wrong with the deployment, perhaps? I suppose http://rhevm/ca.crt should always work.

Oved, PKI issues are your area, moving to infra.
If that's in deployment then it is integration. Didi, can you take a look? Move back to infra if needed.
Can't see an integration problem. If ca.crt is unreadable, that's noted in server.log, and current log does not mention that.
Hi Germano,

your initial analysis was correct: if you try to access PKIResourceServlet before the Backend EJB is initialized, PKIResourceServlet is inaccessible until the engine instance is restarted. I was able to reproduce it even on latest master using these steps:

1. Start the engine service
2. At the same time as step 1, execute the following script to access PKIResourceServlet as soon as it is available:

for i in {1..8192} ; do wget -O rhevm.cer http://localhost/ca.crt ; done

3. Even after the engine has started up successfully, PKIResourceServlet is inaccessible

The only workaround until a proper fix is posted is either "don't access the engine until it's properly started" or to block HTTP access completely:

1. Stop the Apache service, so nothing can access the engine
2. Start the ovirt-engine service and wait until it's properly started
3. Start the Apache service

Thanks

Martin
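For contrast with the behaviour reproduced above, the following sketch shows why deferring the configuration read from a static initializer to first use would let early requests fail without poisoning the class. It is only an illustration under assumptions, not the fix that was applied to the engine: FakeConfig, LazyResource, the "ca.path" key and its value are all hypothetical names invented for this example.

```java
import java.util.Collections;
import java.util.Map;

public class LazyLookupDemo {
    // Hypothetical stand-in for a config holder that is populated at startup.
    static class FakeConfig {
        static volatile Map<String, String> values; // null until "startup" sets it

        static String getValue(String key) {
            Map<String, String> v = values;
            if (v == null) {
                throw new IllegalStateException("configuration not initialized yet");
            }
            return v.get(key);
        }
    }

    // Hypothetical resource that reads config on first use, not in <clinit>.
    static class LazyResource {
        private static volatile String path; // cached only after a successful read

        static String path() {
            String p = path;
            if (p == null) {
                p = FakeConfig.getValue("ca.path"); // may throw; nothing is cached then
                path = p;
            }
            return p;
        }
    }

    // Simulates one request: fails cleanly if config is not ready.
    static String attempt() {
        try {
            return "ok: " + LazyResource.path();
        } catch (IllegalStateException e) {
            return "unavailable: " + e.getMessage();
        }
    }

    // First request arrives before startup finished, second one after.
    static String[] run() {
        String early = attempt();
        FakeConfig.values = Collections.singletonMap("ca.path", "/hypothetical/ca.pem");
        String late = attempt();
        return new String[] { early, late };
    }

    public static void main(String[] args) {
        for (String result : run()) {
            System.out.println(result);
        }
    }
}
```

Here the early request is rejected, but the later one succeeds because no class initialization failed along the way. A real servlet would additionally have to decide what HTTP status to return while the backend is still starting.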
Yedidyah Bar David,

Sorry for not responding to your questions right away.

1) I suppose it's still failing every single time he tries it. It did fail 100% during troubleshooting. Do you have anything in mind that might also be failing? The customer seems happy using VNC and did not come back.
2) Yes, always the same.
3) We were also unable to understand why it suddenly started failing.
4) We checked these permissions, they were the same as in our labs (working).

Martin,

It was actually James's analysis, I just asked him to check this since I don't know much Java. I'm glad you could reproduce this. According to 1075013, step (2) returns before the engine is properly started, so this is a bit tricky.

From what I understand, 1075013 will not fix this (unless the apache service depends on the engine, which I am not sure is a good idea).

Cheers,
Germano
(In reply to Germano Veit Michel from comment #14)
> It was actually James analysis, I just asked him to check this since I don't
> know much Java. I'm glad you could reproduce this. According to 1075013 step
> (2) returns before engine is properly started so this is a bit tricky.
>
> From what I understand 1075013 will not fix this (unless apache service
> depends on engine, which I am not sure is a good idea).
>
> Cheers,
> Germano

Hi,

there is no easy or reliable way to detect whether a J2EE application has finished its deployment successfully, so I doubt we could fix 1075013. But for this bug the fix is not that complex, because it's not only about the PKIResourceServlet <-> Backend dependency, but also about improper usage of an internal API, and that can be fixed easily.

Martin
ok in rhevm-3.6.3.2-0.1.el6.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-0376.html