Hide Forgot
Description of problem: We are currently facing issues with JCR session handling in our application. The root cause of this issue is that the JCR session handling in the eXo framework does not handle closed sessions correctly, and this exception is thrown randomly: 2016-03-02 06:39:49,961 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[default -host]] (http-/0.0.0.0:8443-1) JBWEB000313: Exception processing error page /WEB-INF/pages/error/general-error.xhtml: javax.servlet.ServletException: org.chromattic.api.UndeclaredRepositoryException: javax.jcr.InvalidItemStateException: This kind of operation is forbidden after a session.logout().: This kind of operation is forbidden after a session.logout().: This kind of operation is forbidden after a session.logout(). at org.exoplatform.portal.application.PortalController.onService(PortalController.java:112) [exo.portal.webui.portal-3.6.5.Final-redhat-1-BZ-1066502.jar:3.6.5.Final-redhat-1-BZ-1066502] at org.exoplatform.container.web.AbstractHttpServlet.service(AbstractHttpServlet.java:132) [exo.kernel.container-2.4.7-GA-redhat-1.jar:2.4.7-GA-redhat-1] at javax.servlet.http.HttpServlet.service(HttpServlet.java:847) [jboss-servlet-api_3.0_spec-1.0.2.Final-redhat-1.jar:1.0.2.Final-redhat-1] at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:295) [jbossweb-7.2.2.Final-redhat-1.jar:7.2.2.Final-redhat-1] at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214) [jbossweb-7.2.2.Final-redhat-1.jar:7.2.2.Final-redhat-1] at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:832) [jbossweb-7.2.2.Final-redhat-1.jar:7.2.2.Final-redhat-1] at org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:622) [jbossweb-7.2.2.Final-redhat-1.jar:7.2.2.Final-redhat-1] However, the error only appears on days, when there have been the following NPEs at server startup (we do nightly server restarts). These exceptions could be reproduced by invoking any of the pages of our application during server startup: 2016-03-02 01:01:33,020 ERROR [org.exoplatform.services.organization.idm.PicketLinkIDMOrganizationServiceImpl] (http-/0.0.0.0:8443-1) null: java.lang.NullPointerException at org.exoplatform.services.organization.idm.PicketLinkIDMServiceImpl.getIdentitySession(PicketLinkIDMServiceImpl.java:211) [exo.portal.component.identity-3.6.5.Final-redhat-1.jar:3.6.5.Final-redhat-1] at org.exoplatform.services.organization.idm.PicketLinkIDMOrganizationServiceImpl.startRequest(PicketLinkIDMOrganizationSer viceImpl.java:140) [exo.portal.component.identity-3.6.5.Final-redhat-1.jar:3.6.5.Final-redhat-1] at org.exoplatform.container.component.RequestLifeCycle.doBegin(RequestLifeCycle.java:57) [exo.kernel.container-2.4.7-GA-redhat-1.jar:2.4.7-GA-redhat-1] at org.exoplatform.container.component.RequestLifeCycleStack.begin(RequestLifeCycleStack.java:94) [exo.kernel.container-2.4.7-GA-redhat-1.jar:2.4.7-GA-redhat-1] at org.exoplatform.container.component.RequestLifeCycle.begin(RequestLifeCycle.java:107) [exo.kernel.container-2.4.7-GA-redhat-1.jar:2.4.7-GA-redhat-1] at org.exoplatform.container.component.RequestLifeCycle.begin(RequestLifeCycle.java:141) [exo.kernel.container-2.4.7-GA-redhat-1.jar:2.4.7-GA-redhat-1] at org.exoplatform.web.WebAppController.service(WebAppController.java:335) [exo.portal.component.web.controller-3.6.5.Final-redhat-1.jar:3.6.5.Final-redhat-1] at org.exoplatform.portal.application.PortalController.onService(PortalController.java:110) [exo.portal.webui.portal-3.6.5.Final-redhat-1-BZ-1066502.jar:3.6.5.Final-redhat-1-BZ-1066502] Version-Release number of selected component (if applicable): How reproducible: Intermittently Steps to Reproduce: 1. Deploy sample-portal (*) 2. Start the server 3. Start a script that attempts to access the sample-portal repeatedly, eg: $ while true; do wget -nv -a wget.log "http://localhost:8080/sample-portal/" ; rm index.html*; sleep 1; done * not 100% sure if this is necessary, but I could not reproduce with the default portal only Actual results: Intermittently, the requests result in the outlined NPE Expected results: Requests are either blocked by the PortalCheckInitFilter, or return the requested page once the portal is started. No NPE should happen Additional info: Further debug logging revealed that the PortalController is not fully initialized, but the PortalContainer.isStarted already returns true: The request arrives when the PortalContainer is in "started" state: 2016-03-23 12:53:38,118 DEBUG [org.gatein.portal.init.PortalCheckInitFilter] (http-/0.0.0.0:8181-1) Request 'http://host:8181/sample-portal' - context= sample-portal - isPortalStarted=false 2016-03-23 12:53:38,118 DEBUG [org.gatein.portal.init.PortalCheckInitFilter] (http-/0.0.0.0:8181-1) PortalContainer 'sample-portal' - context= sample-portal - isStarted=true Then the NPE occurs: 2016-03-23 12:53:38,221 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[default-host].[/sample-portal].[portal]] (http-/0.0.0.0:8181-1) JBWEB000236: Servlet.service() for servlet portal threw exception: java.lang.NullPointerException at org.exoplatform.portal.application.PortalRequestContext.getUserPortalConfig(PortalRequestContext.java:268) [exo.portal.webui.portal-3.6.5.Final-redhat-1-BZ-1066502.jar:3.6.5.Final-redhat-1-BZ-1066502] at org.exoplatform.portal.application.PortalRequestHandler.execute(PortalRequestHandler.java:156) [exo.portal.webui.portal-3.6.5.Final-redhat-1-BZ-1066502.jar:3.6.5.Final-redhat-1-BZ-1066502] In my opinion, the problem is that the PortalController goes into state "started" when there is still initialization going on, which is only completed seconds later: 2016-03-23 12:53:41,752 INFO [org.gatein.portal.application.PortalController] (MSC service thread 1-2) The WebAppController has been successfully initialized for the portal 'sample-portal'
Note: The issue is observed in JPP 6.1.1 and 6.2.0.
Analysis: I see PortalCheckInitFilter has 2 problems. 1. PortalCheckInitFilter flips the flag isPortalStarted to true once it confirms ONE PortalContainer is started. So if it confirms "portal" PortalContainer is started, it will allow "sample-portal" requests too. https://github.com/gatein/gatein-portal/blob/3.6.x/component/portal/src/main/java/org/gatein/portal/init/PortalCheckInitFilter.java We can fix it by checking per PortalContainer. But the issue which the customer is hitting is the next one. 2. The gap between "PortalContainer.isStarted" and deployment complete. As you can find in the RootContainer.createPortalContainer(), there is a gap between "pcontainer.start(true);" and "executeInitTasks(pcontainer, PortalContainerPostCreateTask.TYPE);" where PortalController.onInit() is actually executed. https://github.com/exoplatform/kernel/blob/stable/2.4.x/exo.kernel.container/src/main/java/org/exoplatform/container/RootContainer.java#L759-L766 So we need to add (or replace with) another status check.
This issue is resolved by a custom filter which waits for server startup. The following article contains the filter and instructions: * NullPointerExceptions in log files when JPP 6 is accessed before fully started https://access.redhat.com/solutions/497133