Bug 108722 - Public site does not work on Tomcat 4
Summary: Public site does not work on Tomcat 4
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Red Hat Enterprise CMS
Classification: Retired
Component: other
Version: nightly
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Vadim Nasardinov
QA Contact: Jon Orris
URL:
Whiteboard:
Depends On: 109211
Blocks: 106597
Reported: 2003-10-31 15:37 UTC by Jon Orris
Modified: 2005-10-31 22:00 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-11-13 22:17:38 UTC
Embargoed:


Attachments
The directory listing that appears instead of the public site (1.78 KB, text/html)
2003-10-31 15:39 UTC, Jon Orris
ugly root folder page (with no index item) (38.77 KB, image/png)
2003-11-05 23:44 UTC, Vadim Nasardinov

Description Jon Orris 2003-10-31 15:37:48 UTC
Description of problem:
@37530/oracle-se/Linux

The public site for content sections does not display on Tomcat 4. Instead, it
brings up a directory listing.

Comment 1 Jon Orris 2003-10-31 15:39:25 UTC
Created attachment 95628
The directory listing that appears instead of the public site

The directory listing that appears instead of the public site is attached.

Comment 2 Richard Li 2003-10-31 16:18:21 UTC
Dennis might be a more appropriate person to fix this.

Comment 3 Richard Li 2003-11-05 13:46:23 UTC
also see bug 108581

Comment 4 Vadim Nasardinov 2003-11-05 14:26:26 UTC
This may be remotely related to bug 108581, but doesn't appear to be.

Comment 5 Vadim Nasardinov 2003-11-05 16:25:59 UTC
Jon,

What does your $CCM_HOME/conf/server.xml look like?



Comment 6 Vadim Nasardinov 2003-11-05 17:43:42 UTC
Jon, I am blocking on bug 109211.  Please take a look.


Comment 7 Jon Orris 2003-11-05 21:28:13 UTC
OK, so this bug shows up on both Tomcat _and_ Resin.
The cause appears to be that if there is no published content, then
there is no index page for the Public Site.

As a fallback, both Resin and Tomcat display a 'directory listing'. This
feels like a regression, as I seem to recall running into this when
the index page was changed back in Troika development.


Resin:
Directory of /packages/content-section/www/

    * assets
    * admin
    * generate-category-javascript.jsp
    * components 


Tomcat:
Directory Listing For /packages/content-section/www/
Up To /packages/content-section
Filename 	Size 	Last Modified
   admin/ 	  	Wed, 05 Nov 2003 20:46:04 GMT
   assets/ 	  	Wed, 05 Nov 2003 20:46:04 GMT
   components/ 	  	Wed, 05 Nov 2003 20:46:04 GMT
   generate-category-javascript.jsp 	1.4 kb 	Wed, 05 Nov 2003 20:46:04 GMT
 
Apache Tomcat/4.0.6


Comment 8 Vadim Nasardinov 2003-11-05 22:05:58 UTC
With the following loggers enabled,
 log4j.logger.com.arsdigita.web.DispatcherServlet=DEBUG
 log4j.logger.com.arsdigita.web.BaseDispatcher=DEBUG
 log4j.logger.com.arsdigita.cms.ContentSectionServlet=DEBUG

This is what I get, when loading
http://el-vadimo.home:9000/ccm/content/

DispatcherServlet - Servicing request '/ccm/content/'
BaseDispatcher - Dispatching request /ccm/content/ [,/ccm,/content/,null]
BaseDispatcher - Checking if this request needs a trailing slash
BaseDispatcher - The path already ends in '/'
BaseDispatcher - Storing the path elements of the current request as
the original path elements
BaseDispatcher - Using path '/content/' to lookup application
BaseDispatcher - *** Starting application lookup for path '/content/' ***
BaseDispatcher - Is there an application at path '/content/'?
BaseDispatcher - The cache does not have an application at this path;
looking in the database
BaseDispatcher - Application found at path '/content/'; storing it in
the cache
BaseDispatcher - Found application 1021; dispatching to its servlet
BaseDispatcher - Building the target path from the request path
'/content/' and the spec [Ljava.lang.Object;@14feea
BaseDispatcher - Returning target value
'/__ccm__/servlet/content-section/?app-id=1021'
BaseDispatcher - Forwarding by path to target
'/__ccm__/servlet/content-section/?app-id=1021'
BaseDispatcher - The context path is: 
ContentSectionServlet - Resolving item URL 
ContentSectionServlet - using ItemResolver
com.arsdigita.cms.dispatcher.MultilingualItemResolver
ContentSectionServlet - getting item at url 
ContentSectionServlet - Trying to get LIVE item
ContentSectionServlet - Trying to get content item for URL / from cache
ContentSectionServlet - Did not find content item in cache, so trying
to retrieve and cache...
ContentSectionServlet - NOT serving content item
ContentSectionServlet - Forwarding onto file
/packages/content-section/www/
DispatcherServlet - Successfully dispatched to an application

The second-to-last logging statement was generated by


  if (s_log.isDebugEnabled()) {
      s_log.debug("Forwarding onto file " + packageURL);
  }

This was added at version #15:

$ p4 annotate \
 //cms/dev/src/com/arsdigita/cms/ContentSectionServlet.java | grep -C1 "Forwarding onto"
15:if (s_log.isDebugEnabled()) {
15:    s_log.debug("Forwarding onto file " + packageURL);
15:}

#15 corresponds to change 37340 (the landing of the test-packaging
branch).

In
//cms/test-packaging/src/com/arsdigita/cms/ContentSectionServlet.java,
the above modification was introduced at #3, which is change 36693.

$ p4 describe -s 36693
Change 36693 by dan@dan-aplaws-rickshaw on 2003/10/03 07:23:17

Fix numerous bugs in the dispatcher code:

    * Incorrectly calculating the cache expiry time on items.
      It was using the old code from ItemDispatcher which bascailly
      just took the lifecycle expiry date, instead of the current
      ItemDispatcher code which does:

      minimum(default expiry, lifecycle expiry).

    * DispatcherChain class provides no way for the caller to
      determine if the request was actually processed by one
      of the dispatchers in the chain. Thus if none of the 
      dispatchers processed it, rather than the user getting
      a 404, they just got a blank page.

    * The FileDispatcher class is not safe for a WAF scenario
      since it uses java.io.File to see if the JSP physically
      exists on disk before dispatching. It also did not handle
      processing of dictory index files (eg visiting /content/admin/
      failed, but /content/admin/index.jsp works).

   These last two problems were solved by removing the use of
   the dispatcher chain altogether & just doing a straight 
   forward of all non-item requests to /packages/content-section/www
   Thus allowing the servlet's builtin file dispatcher to just
   'do the right thing'. 

   NB. CMSDispatcher is no longer needed now that the JSP
   files for www/admin/index.jsp & www/admin/item.jsp correctly
   handle multi-part POST requests - see also p4 36178. 

Affected files ...

... //cms/test-packaging/src/com/arsdigita/cms/ContentSectionServlet.java#3 edit


Comment 9 Vadim Nasardinov 2003-11-05 23:41:09 UTC
Quoting from 36693,

 >  Thus if none of the dispatchers processed it, rather than the user
 >  getting a 404, they just got a blank page.

So, Jon's recollection is correct.  We used to get a blank page if
there were no published items.

If there is a published item, then the dispatch process for
/ccm/content/ goes like so:

web.DispatcherServlet - Servicing request '/ccm/content/'
...
web.BaseDispatcher
  Forwarding by path to target
'/__ccm__/servlet/content-section/?app-id=1021'
web.BaseDispatcher
  The context path is: 
cms.ContentSectionServlet
  Resolving item URL 
cms.ContentSectionServlet
  using ItemResolver com.arsdigita.cms.dispatcher.MultilingualItemResolver
cms.ContentSectionServlet
  getting item at url 
cms.ContentSectionServlet
  Trying to get LIVE item
cms.ContentSectionServlet
  Trying to get content item for URL / from cache
cms.ContentSectionServlet
  Did not find content item in cache, so trying to retrieve and cache...
cms.ContentSectionServlet
  adding cached entry for url /
cms.ContentSectionServlet
  Sanity check: item.getPath() is 
cms.ContentSectionServlet
  Content Item is not null
cms.ContentSectionServlet
  serving content item: com.arsdigita.cms.Folder;
  [com.arsdigita.cms.Folder:{id=10017}]
cms.ContentSectionServlet
  using ItemResolver com.arsdigita.cms.dispatcher.MultilingualItemResolver
cms.ContentSectionServlet
  setting template context to null
dispatcher.ContentItemDispatcher
  fetching URL for item / with ID 10017
dispatcher.ContentItemDispatcher
  templateURL is /packages/content-section/templates/default/folder.jsp
dispatcher.ContentItemDispatcher
  normal dispatching.
  templateURL=/packages/content-section/templates/default/folder.jsp

I am attaching a screenshot of what the served page looks like.

Not sure what would be the right thing to do when no items are
published.

There is also the question of what to do about the ugly Root folder
page that is served.  (See the screenshot.)

Dan / Jon, any opinions?


Comment 10 Vadim Nasardinov 2003-11-05 23:44:05 UTC
Created attachment 95744
ugly root folder page (with no index item)

This screenshot shows what the root folder page for /ccm/content/ looks like
when there is a published item.  Compare this to attachment 95628.

Comment 11 Daniel Berrangé 2003-11-06 00:30:02 UTC
What it *ought* to do when requesting /content/ is dispatch to
/packages/content-section/templates/folder.jsp with the root folder
set as the current item in CMSContext. Of course since in this case
there are no items published at all, there will be no live folder to
set in the context. Thus, the only action that makes any sense is to
display a 404 not found page.


Comment 12 Vadim Nasardinov 2003-11-06 01:06:10 UTC
In other words, should we remove the "forward onto a file" bit
and dispatch thusly:

final ItemResolver itemResolver = getItemResolver(section);
final ContentItem item = getItem(section, url, itemResolver);

if (item == null) {
    // display a 404    
} else {
    serveItem(sreq, sresp, section, item);
}


Comment 13 Daniel Berrangé 2003-11-06 09:24:34 UTC
Hmm, no, such a blanket change would mean nothing in
'/packages/content-section/www' is ever served. The process it is
currently following is thus:

 * If there is a live item or folder, dispatch to
/packages/content-section/templates/item.jsp (or folder.jsp)
 * Otherwise dispatch to /packages/content-section/www/

Now, since there are no items published at all, there is no live root
folder, and thus a visit to '/' hits the second stage and is sent to
'www', where Tomcat then generates a directory listing.

We basically need to stop this directory listing in the case where no
items are published, while still leaving the ability to serve stuff
out of 'www' when no item matches.

I can think of two ways to accomplish this:

 a. Add an index.jsp to /packages/content-section/www that merely
contains 'response.setStatus(HttpServletResponse.SC_NOT_FOUND)'. If
there are any live folders, then it is impossible for this to be
dispatched to, so it will only ever be invoked when there is no live
root folder.

 b. Add an extra test into the dispatch process

    * If there is a live item / folder matching current URL dispatch
to /packages/content-section/templates
    * If there is a draft item / folder matching current URL send 404
    * Otherwise send to /packages/content-section/www


Personally I prefer b., since it has saner semantics and is less of a
hack. On the other hand, a. is really trivial and won't require anywhere
near the amount of code change that b. would.

So if stability of the dispatcher in the short term is important, I'd
go for a. and switch to b. later on (i.e. after the Nov deliverable).
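
For illustration only, the three-way test described in b. might be sketched
as below. This is not code from the CCM tree: the class, enum, and method
names are hypothetical, and the two boolean flags stand in for whatever
live/draft item lookup the servlet would actually perform.

```java
// Hypothetical sketch of option b.; none of these names exist in the CCM code base.
final class DispatchSketch {

    /** The three possible outcomes of the dispatch decision. */
    enum Action { SERVE_TEMPLATE, SEND_404, FORWARD_TO_WWW }

    /**
     * Decides what to do with a request.
     *
     * @param hasLiveItem  true if a live item or folder matches the URL
     * @param hasDraftItem true if a draft item or folder matches the URL
     */
    static Action decide(boolean hasLiveItem, boolean hasDraftItem) {
        if (hasLiveItem) {
            // Live content wins: dispatch to /packages/content-section/templates.
            return Action.SERVE_TEMPLATE;
        }
        if (hasDraftItem) {
            // Something exists at this URL but is unpublished: answer 404.
            return Action.SEND_404;
        }
        // Nothing matches: let /packages/content-section/www handle it.
        return Action.FORWARD_TO_WWW;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, true));
        System.out.println(decide(false, true));
        System.out.println(decide(false, false));
    }
}
```

With no published items, a request for '/' would take the second branch
(assuming a draft root folder exists), so it would never reach 'www' and
no directory listing would be generated.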


Comment 14 Jon Orris 2003-11-06 15:22:42 UTC
Given our time constraints, I think that stability is a more important
goal. Therefore, I vote for the short term, stable hack (a) for the
current deliverable, and a full fix (b) for final release.

Comment 15 Vadim Nasardinov 2003-11-06 15:30:17 UTC
Sounds reasonable.  My only comment is, it doesn't seem like a good idea to
send a 404.  The /ccm/content/ URL is linked off of the
/ccm/content-center/ page.  If it serves a 404, people will think
something is broken.  I think I am going to serve a page that says
"nothing has been published yet" or something like that.

Comments/objections?


Comment 16 Daniel Berrangé 2003-11-06 15:34:31 UTC
You can serve content along with a 404, so how about sending a '404' and
adding HTML to the body of the response saying 'nothing has been published
yet'...

Comment 17 Vadim Nasardinov 2003-11-06 15:37:51 UTC
Doesn't IE discard the response body and show its own generic "page
not found" page when it gets "HTTP/1.x 404 oopsie-daisy"?


Comment 18 Daniel Berrangé 2003-11-06 15:42:53 UTC
I believe it'll follow the same procedure as per its 500 handling, as
detailed by resin:

<!--
   - Unfortunately, Microsoft has added a clever new
   - "feature" to Internet Explorer.  If the text in
   - an error's message is "too small", specifically
   - less than 512 bytes, Internet Explorer returns
   - its own error message.  Yes, you can turn that
   - off, but *surprise* it's pretty tricky to find
   - buried as a switch called "smart error
   - messages"  That means, of course, that many of
   - Resin's error messages are censored by default.
   - And, of course, you'll be shocked to learn that
   - IIS always returns error messages that are long
   - enough to make Internet Explorer happy.  The
   - workaround is pretty simple: pad the error
   - message with a big comment to push it over the
   - five hundred and twelve byte minimum.  Of course,
   - that's exactly what you're reading right now.
   -->
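
A minimal sketch of the padding workaround described in that comment,
assuming the 512-byte threshold it quotes. The class and method names are
hypothetical, and the message is assumed to be trusted plain text (it is
not HTML-escaped here). The returned string would be written as the body
of the 404 response.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of CCM, Resin, or Tomcat.
final class PaddedErrorPage {

    // Threshold taken from the Resin comment quoted above (an assumption here).
    static final int MIN_BYTES = 512;

    /** Builds an HTML error body and pads it with a comment if it is too short. */
    static String build(String message) {
        StringBuilder html = new StringBuilder();
        html.append("<html><body><p>").append(message).append("</p></body></html>\n");

        int length = html.toString().getBytes(StandardCharsets.UTF_8).length;
        if (length <= MIN_BYTES) {
            // Append an HTML comment of filler so the body exceeds the threshold.
            html.append("<!-- ");
            for (int i = length; i <= MIN_BYTES; i++) {
                html.append('x'); // one byte each in UTF-8
            }
            html.append(" -->\n");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        String page = build("Nothing has been published yet.");
        System.out.println(page.getBytes(StandardCharsets.UTF_8).length > MIN_BYTES);
    }
}
```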


Comment 19 Vadim Nasardinov 2003-11-06 16:49:40 UTC
Fixed with change 37748.  Works on Resin. Haven't tested on Tomcat.


