Bug 828140 - Log in screen available but user is not able to log in for several minutes after installation
Summary: Log in screen available but user is not able to log in for several minutes af...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Installer
Version: 4.4
Hardware: All
OS: All
urgent
urgent
Target Milestone: ---
: JON 3.1.0
Assignee: Jay Shaughnessy
QA Contact: Filip Brychta
URL:
Whiteboard:
Depends On: 826503
Blocks: jon310-sprint11, rhq44-sprint11
TreeView+ depends on / blocked
 
Reported: 2012-06-04 10:52 UTC by Charles Crouch
Modified: 2015-02-01 23:28 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 826503
Environment:
Last Closed: 2013-09-03 15:19:45 UTC
Embargoed:


Attachments (Terms of Use)

Description Charles Crouch 2012-06-04 10:52:56 UTC
+++ This bug was initially created as a clone of Bug #826503 +++

Description of problem:
Log in screen is available before the installation process is complete. Attempts to log in end with no result, no error message is displayed. This lasted for several minutes until log in was successful.

Version-Release number of selected component (if applicable):
Version: 3.1.0.CR1
Build Number: 4bc4270:1b85993

How reproducible:
Always

Steps to Reproduce:
1.follow JON installation process (include several plugins), choose overwrite data (if there is some previous installation) 
2.try to log in as soon as log in page is avaiable
  
Actual results:
user stays on log in page with no result, no error message is shown
following error is thrown to server log: 
2012-05-30 12:31:12,336 ERROR [org.apache.catalina.connector.CoyoteAdapter] An exception or error occurred in the container during the request processing
java.lang.NullPointerException
        at org.apache.catalina.connector.CoyoteAdapter.parseSessionCookiesId(CoyoteAdapter.java:507)
        at org.apache.catalina.connector.CoyoteAdapter.postParseRequest(CoyoteAdapter.java:449)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:239)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:679)


Expected results:
user is logged in

Additional info:
I could see that installation was still in progress (messages like 2012-05-30 12:31:20,693 INFO  [org.rhq.enterprise.server.resource.metadata.ResourceMetadataManagerBean] Persisting new ResourceType [JBossAS7:Param(id=0)]...
) in server log. So the problem is that a log in screen is available before installation process is actually finished.

--- Additional comment from lzoubek on 2012-06-01 13:07:56 EDT ---

Also in this time, when you attempt to download agent, you get 200 from server, but response-length is 0.

--- Additional comment from jshaughn on 2012-06-01 17:28:47 EDT ---


This is due to the problem described in bug 684459. The problem is that when we switch from the installer to portal war and change the root context we hit that problem.  That causes a 200 to be returned (why why why?) even though the server hits an exception.  Which makes the installer think that startupServlet is responding.

--- Additional comment from jshaughn on 2012-06-02 17:24:51 EDT ---


abridged IRC Chat re: the approach to solve:

(4:41:55 PM) ips: jshaughn: i think part of the problem is StartupServlet is in portal-war. at this point it really should be in coregui instead
(4:42:29 PM) jshaughn: yeah
(4:43:05 PM) jshaughn: there are actually two problems here, though
(4:43:17 PM) ips: the other way to do it might be for the installer to ping the startupservlet w/ GET requests in a loop
(4:43:41 PM) jshaughn: the first is that the jsp is saying we're done. the second is that coregui actually throws up the login box before we can actually log in
(4:43:41 PM) ips: and only display the Get Started! link after it gets a 200 response
(4:43:53 PM) jshaughn: yeah
(4:44:02 PM) jshaughn: the fix is to delay installer
(4:44:44 PM) jshaughn: I'm looking at header.jsp
(4:46:09 PM) ips: interestingly, here's the mapping for the startupservlet in portal-war's web.xml:
(4:46:11 PM) ips:     <!-- provides startup progress information (mainly for the installer) -->
(4:46:12 PM) ips:     <servlet-mapping>
(4:46:12 PM) ips:       <servlet-name>initializer</servlet-name>
(4:46:12 PM) ips:       <url-pattern>/startupstatus</url-pattern>
(4:46:12 PM) ips:    </servlet-mapping>
(4:46:39 PM) ips: is the installer actually already pinging /startupstatus as i suggested above?
(4:46:43 PM) jshaughn: yeah, it's doing it
(4:46:46 PM) jshaughn: see header.jsp
(4:46:53 PM) ips: if so, it's not working right  :)
(4:47:05 PM) jshaughn: it's because of that coyote NPE
(4:47:17 PM) jshaughn: thats messing things up
(4:47:37 PM) jshaughn: As soon as we hit that the installer says we're done
(4:47:54 PM) jshaughn: it must result from our pinging
(4:49:43 PM) ips: hmm
(4:51:04 PM) mfoley: ah ... the coyote NPE is from the startup servelt ... i did not know that 
(4:52:18 PM) jshaughn: I'm not sure how it actually manages to pass if an exception is thrown
(4:52:30 PM) jshaughn: I'd have more expectd it to stay in the not ready state
(4:53:18 PM) ips: yeah, me too. you'd probably have to attach w/ a javascript debugger and step through header.jsp to figure it out
(4:54:04 PM) ips: did you google that coyote exception? i vaguely recall we may have traced that to being a jbossweb bug
(4:54:16 PM) ips: did you ever find the original bz for this?
(4:56:43 PM) ips: yeah, it is ugly in the server log an potentially alarming to users
(4:56:55 PM) jshaughn: yeah, it's bad looking
(4:57:04 PM) ips: would be nice to clear it up if there's a reasonably easy way
(4:57:05 PM) jshaughn: I think we have a release not about it
(4:57:36 PM) jshaughn: ips: https://bugzilla.redhat.com/show_bug.cgi?id=684459
(4:58:24 PM) jshaughn: it's still NEW
(5:00:09 PM) jshaughn: don't see much help in https://issues.jboss.org/browse/JBWEB-188
(5:00:49 PM) jshaughn: This will likely not go away any time soon
(5:10:14 PM) ips: yep. JBWEB-188 is almost 2 years old, so i'm not going to hold my breath for a fix
(5:10:34 PM) jshaughn: we'd likely have to go to AS7
(5:10:49 PM) ips: and unfortunately, NPE's in the jon server log make us look bad 
(5:14:30 PM) jshaughn: it seems to come bak with a 200
(5:14:39 PM) jshaughn: which is bogus
(5:14:51 PM) ips: if anything it would just send a 404 or some other error response to the client
(5:15:04 PM) ips: yeah if it's sending a 200 that's really wacky
(5:15:27 PM) jshaughn: I'm just trying to understand how it gets the code to think we're done
(5:15:36 PM) ips: but it does seem that only  a 200 or a 401 would get it to show the link
(5:20:59 PM) jshaughn: ips, looking at that jbossweb bug, it actually states that a 200 is returned :(
(5:25:00 PM) ips: jshaughn: argh
(5:27:48 PM) ips: but coyote exception aside, i think we could fix this by returning some other status code other than 200 in startupservlet and then checking for that code in header.jsp
(5:28:44 PM) ips: e.g. 204
(5:29:18 PM) jshaughn: hmm, that's a good idea
(5:29:27 PM) ips: or even a funky non standard code like 299 :)
(5:30:01 PM) jshaughn: How about 188 in honor of jbossweb-188
(5:30:11 PM) jshaughn: that may not be valid
(5:30:20 PM) jshaughn: I don't know my response codes
(5:31:39 PM) ips: hehee, i like it
(5:31:59 PM) jshaughn: ips, so would that be a change it StartupServlet:109?
(5:32:23 PM) ips: yeah and in header.jsp
(5:32:31 PM) jshaughn: instead of SC_OK
(5:32:37 PM) jshaughn: right, and on the receiving end
(5:34:50 PM) ips: jshaughn: we also may want to increase the delay in header.jsp a bit to eg 7 or 10s
(5:35:03 PM) ips: that may at least reduce the # of Coyote NPE's
(5:35:47 PM) ips: then maybe post 3.1 we can look into patching jbossweb
(5:37:39 PM) ips: we know it's going to take like 10 mins to fully deploy thanks to the as7 plugin, so we could probably even use a 30s or 60s delay
(5:37:57 PM) jshaughn: ips, true
(5:38:07 PM) jshaughn: no real reason to rush
(5:38:13 PM) jshaughn: although upgrades may go faster
(5:38:33 PM) jshaughn: although, not everyone uses AS7 plugin
(5:38:38 PM) ips: we can also change it to not do the initial ping until after the delay has transpired
(5:39:00 PM) ips: it looks like right now it will send the initial ping as soon as the page has loaded

--- Additional comment from jshaughn on 2012-06-02 17:25:40 EDT ---


master commit 8de7734a94adcec32de43ca436a12c7edecc48d9

This is due to the fact that the problem in bug 684450 is
still unresolved, and will likely remain that way until we
move to a new version of AS as the underlying RHQ server
platform.  When we switch contexts between the installer and
portal war we start having the problem described in
https://issues.jboss.org/browse/JBWEB-188.  The fact that it
returns a 200 response code even though the request fails is
what fools the installer into thinking that the server is
ready. So:
- Change Startup servlet to pass back an unusual response code
  (288)on success
- Change the installer's header.jsp to look for 288 as opposed
  to 200.
- Change StartupServlet the polling time from 5 to 15 seconds to
  reduce the number of coyote exceptions.
- Update the installer message to let users know that it could be
  several minutes in this state.

Comment 1 Mike Foley 2012-06-04 15:44:19 UTC
triaged into JON 3.1 per BZ triage ccrouch, asantos, loleary, mfoley.

QE ..this will be in the next release candidate.

Comment 2 Jay Shaughnessy 2012-06-04 15:59:55 UTC
As it turns out this should only affect JON and not RHQ (at least not at this time: RHQ 4.4).  The reason is that JON uses a later version of jboss-web, which has the problem described above.  RHQ does not.  This is due to bug 773739 and bug 773737.

So, QA for Master/RHQ should simply confirm that the installer still works as expected.

Comment 3 Jay Shaughnessy 2012-06-04 16:19:48 UTC
Git cherry-pick of master commit: 8de7734a94adcec32de43ca436a12c7edecc48d9

  Release-branch commit: 7207583e459c6a76145f96bccf3642d962cccfc2

This is due to the fact that the problem in bug 684450 is
still unresolved, and will likely remain that way until we
move to a new version of AS as the underlying RHQ server
platform.  When we switch contexts between the installer and
portal war we start having the problem described in
https://issues.jboss.org/browse/JBWEB-188.  The fact that it
returns a 200 response code even though the request fails is
what fools the installer into thinking that the server is
ready. So:
- Change Startup servlet to pass back an unusual response code
  (288)on success
- Change the installer's header.jsp to look for 288 as opposed
  to 200.
- Change StartupServlet the polling time from 5 to 15 seconds to
  reduce the number of coyote exceptions.
- Update the installer message to let users know that it could be
  several minutes in this state.

Comment 4 Filip Brychta 2012-06-05 08:03:52 UTC
Verified on 3.1.0.CR2. Main problem fixed (login screen isn't displayed before installation is finished now). 'Also in this time, when you attempt to download agent, you get 200 from server, but response-length is 0.' not fixed

Comment 5 Heiko W. Rupp 2013-09-03 15:19:45 UTC
Bulk closing of old issues in VERIFIED state.


Note You need to log in before you can comment on or make changes to this bug.