Bug 826503 - Log in screen available but user is not able to log in for several minutes after installation
Log in screen available but user is not able to log in for several minutes af...
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: Installer (Show other bugs)
4.4
Unspecified Unspecified
urgent Severity urgent (vote)
: ---
: RHQ 4.5.0
Assigned To: Jay Shaughnessy
Mike Foley
:
Depends On:
Blocks: 828140
  Show dependency treegraph
 
Reported: 2012-05-30 07:25 EDT by Filip Brychta
Modified: 2013-08-31 05:55 EDT (History)
3 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 828140 (view as bug list)
Environment:
Last Closed: 2013-08-31 05:55:19 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments (Terms of Use)

  None (edit)
Description Filip Brychta 2012-05-30 07:25:03 EDT
Description of problem:
Log in screen is available before the installation process is complete. Attempts to log in end with no result, no error message is displayed. This lasted for several minutes until log in was successful.

Version-Release number of selected component (if applicable):
Version: 3.1.0.CR1
Build Number: 4bc4270:1b85993

How reproducible:
Always

Steps to Reproduce:
1.follow JON installation process (include several plugins), choose overwrite data (if there is some previous installation) 
2.try to log in as soon as log in page is avaiable
  
Actual results:
user stays on log in page with no result, no error message is shown
following error is thrown to server log: 
2012-05-30 12:31:12,336 ERROR [org.apache.catalina.connector.CoyoteAdapter] An exception or error occurred in the container during the request processing
java.lang.NullPointerException
        at org.apache.catalina.connector.CoyoteAdapter.parseSessionCookiesId(CoyoteAdapter.java:507)
        at org.apache.catalina.connector.CoyoteAdapter.postParseRequest(CoyoteAdapter.java:449)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:239)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:679)


Expected results:
user is logged in

Additional info:
I could see that installation was still in progress (messages like 2012-05-30 12:31:20,693 INFO  [org.rhq.enterprise.server.resource.metadata.ResourceMetadataManagerBean] Persisting new ResourceType [JBossAS7:Param(id=0)]...
) in server log. So the problem is that a log in screen is available before installation process is actually finished.
Comment 1 Libor Zoubek 2012-06-01 13:07:56 EDT
Also in this time, when you attempt to download agent, you get 200 from server, but response-length is 0.
Comment 2 Jay Shaughnessy 2012-06-01 17:28:47 EDT
This is due to the problem described in bug 684459. The problem is that when we switch from the installer to portal war and change the root context we hit that problem.  That causes a 200 to be returned (why why why?) even though the server hits an exception.  Which makes the installer think that startupServlet is responding.
Comment 3 Jay Shaughnessy 2012-06-02 17:24:51 EDT
abridged IRC Chat re: the approach to solve:

(4:41:55 PM) ips: jshaughn: i think part of the problem is StartupServlet is in portal-war. at this point it really should be in coregui instead
(4:42:29 PM) jshaughn: yeah
(4:43:05 PM) jshaughn: there are actually two problems here, though
(4:43:17 PM) ips: the other way to do it might be for the installer to ping the startupservlet w/ GET requests in a loop
(4:43:41 PM) jshaughn: the first is that the jsp is saying we're done. the second is that coregui actually throws up the login box before we can actually log in
(4:43:41 PM) ips: and only display the Get Started! link after it gets a 200 response
(4:43:53 PM) jshaughn: yeah
(4:44:02 PM) jshaughn: the fix is to delay installer
(4:44:44 PM) jshaughn: I'm looking at header.jsp
(4:46:09 PM) ips: interestingly, here's the mapping for the startupservlet in portal-war's web.xml:
(4:46:11 PM) ips:     <!-- provides startup progress information (mainly for the installer) -->
(4:46:12 PM) ips:     <servlet-mapping>
(4:46:12 PM) ips:       <servlet-name>initializer</servlet-name>
(4:46:12 PM) ips:       <url-pattern>/startupstatus</url-pattern>
(4:46:12 PM) ips:    </servlet-mapping>
(4:46:39 PM) ips: is the installer actually already pinging /startupstatus as i suggested above?
(4:46:43 PM) jshaughn: yeah, it's doing it
(4:46:46 PM) jshaughn: see header.jsp
(4:46:53 PM) ips: if so, it's not working right  :)
(4:47:05 PM) jshaughn: it's because of that coyote NPE
(4:47:17 PM) jshaughn: thats messing things up
(4:47:37 PM) jshaughn: As soon as we hit that the installer says we're done
(4:47:54 PM) jshaughn: it must result from our pinging
(4:49:43 PM) ips: hmm
(4:51:04 PM) mfoley: ah ... the coyote NPE is from the startup servelt ... i did not know that 
(4:52:18 PM) jshaughn: I'm not sure how it actually manages to pass if an exception is thrown
(4:52:30 PM) jshaughn: I'd have more expectd it to stay in the not ready state
(4:53:18 PM) ips: yeah, me too. you'd probably have to attach w/ a javascript debugger and step through header.jsp to figure it out
(4:54:04 PM) ips: did you google that coyote exception? i vaguely recall we may have traced that to being a jbossweb bug
(4:54:16 PM) ips: did you ever find the original bz for this?
(4:56:43 PM) ips: yeah, it is ugly in the server log an potentially alarming to users
(4:56:55 PM) jshaughn: yeah, it's bad looking
(4:57:04 PM) ips: would be nice to clear it up if there's a reasonably easy way
(4:57:05 PM) jshaughn: I think we have a release not about it
(4:57:36 PM) jshaughn: ips: https://bugzilla.redhat.com/show_bug.cgi?id=684459
(4:58:24 PM) jshaughn: it's still NEW
(5:00:09 PM) jshaughn: don't see much help in https://issues.jboss.org/browse/JBWEB-188
(5:00:49 PM) jshaughn: This will likely not go away any time soon
(5:10:14 PM) ips: yep. JBWEB-188 is almost 2 years old, so i'm not going to hold my breath for a fix
(5:10:34 PM) jshaughn: we'd likely have to go to AS7
(5:10:49 PM) ips: and unfortunately, NPE's in the jon server log make us look bad 
(5:14:30 PM) jshaughn: it seems to come bak with a 200
(5:14:39 PM) jshaughn: which is bogus
(5:14:51 PM) ips: if anything it would just send a 404 or some other error response to the client
(5:15:04 PM) ips: yeah if it's sending a 200 that's really wacky
(5:15:27 PM) jshaughn: I'm just trying to understand how it gets the code to think we're done
(5:15:36 PM) ips: but it does seem that only  a 200 or a 401 would get it to show the link
(5:20:59 PM) jshaughn: ips, looking at that jbossweb bug, it actually states that a 200 is returned :(
(5:25:00 PM) ips: jshaughn: argh
(5:27:48 PM) ips: but coyote exception aside, i think we could fix this by returning some other status code other than 200 in startupservlet and then checking for that code in header.jsp
(5:28:44 PM) ips: e.g. 204
(5:29:18 PM) jshaughn: hmm, that's a good idea
(5:29:27 PM) ips: or even a funky non standard code like 299 :)
(5:30:01 PM) jshaughn: How about 188 in honor of jbossweb-188
(5:30:11 PM) jshaughn: that may not be valid
(5:30:20 PM) jshaughn: I don't know my response codes
(5:31:39 PM) ips: hehee, i like it
(5:31:59 PM) jshaughn: ips, so would that be a change it StartupServlet:109?
(5:32:23 PM) ips: yeah and in header.jsp
(5:32:31 PM) jshaughn: instead of SC_OK
(5:32:37 PM) jshaughn: right, and on the receiving end
(5:34:50 PM) ips: jshaughn: we also may want to increase the delay in header.jsp a bit to eg 7 or 10s
(5:35:03 PM) ips: that may at least reduce the # of Coyote NPE's
(5:35:47 PM) ips: then maybe post 3.1 we can look into patching jbossweb
(5:37:39 PM) ips: we know it's going to take like 10 mins to fully deploy thanks to the as7 plugin, so we could probably even use a 30s or 60s delay
(5:37:57 PM) jshaughn: ips, true
(5:38:07 PM) jshaughn: no real reason to rush
(5:38:13 PM) jshaughn: although upgrades may go faster
(5:38:33 PM) jshaughn: although, not everyone uses AS7 plugin
(5:38:38 PM) ips: we can also change it to not do the initial ping until after the delay has transpired
(5:39:00 PM) ips: it looks like right now it will send the initial ping as soon as the page has loaded
Comment 4 Jay Shaughnessy 2012-06-02 17:25:40 EDT
master commit 8de7734a94adcec32de43ca436a12c7edecc48d9

This is due to the fact that the problem in bug 684450 is
still unresolved, and will likely remain that way until we
move to a new version of AS as the underlying RHQ server
platform.  When we switch contexts between the installer and
portal war we start having the problem described in
https://issues.jboss.org/browse/JBWEB-188.  The fact that it
returns a 200 response code even though the request fails is
what fools the installer into thinking that the server is
ready. So:
- Change Startup servlet to pass back an unusual response code
  (288)on success
- Change the installer's header.jsp to look for 288 as opposed
  to 200.
- Change StartupServlet the polling time from 5 to 15 seconds to
  reduce the number of coyote exceptions.
- Update the installer message to let users know that it could be
  several minutes in this state.
Comment 5 Jay Shaughnessy 2012-06-04 11:57:54 EDT
As it turns out this should only affect JON and not RHQ (at least not at this time: RHQ 4.4).  The reason is that JON uses a later version of jboss-web, which has the problem described above.  RHQ does not.  This is due to bug 773739 and 773737.

So, QA for Master/RHQ should simply confirm that the installer still works as expected.
Comment 6 Filip Brychta 2013-08-26 05:17:05 EDT
Verified on 
Version: 4.9.0-SNAPSHOT
Build Number: d428376

Note You need to log in before you can comment on or make changes to this bug.