Bug 974640 - RFE: Permalinks and human-readable URLs [NEEDINFO]
Status: CLOSED CURRENTRELEASE
Product: Publican
Classification: Community
Component: publican
Version: future
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Jeff Fearn
QA Contact: Ruediger Landmann
Keywords: FutureFeature, Reopened
Depends On:
Blocks:
Reported: 2013-06-14 13:16 EDT by Mark Caron
Modified: 2014-04-08 21:55 EDT

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-04-08 21:55:27 EDT
Type: Bug
r.landmann: needinfo? (mcaron)


Description Mark Caron 2013-06-14 13:16:15 EDT
Description of problem:

It seems that we have and will continue to have many broken links not just in the Customer Portal, but internet-wide.

While I do love the human-readability of the current metadata-based structure (and so do search engines), it doesn't provide any permanence for documentation. This lack of a permalink causes issues in the Customer Portal, when referencing particular 'books'. It also affects many websites outside of Red Hat, e.g. Stack Overflow, resulting in 404 errors when users provide links to our own documentation.

I also imagine that this metadata-based structure poses problems for retiring documentation as well.

I believe this is why many Content Management Systems and blog systems provide both URL structures. Pages can have human-readable URLs *and* permalinks that are based on IDs. When the human-readable URL changes, the ID stays the same, which allows permanent redirects (on the server) to point to moved or retired documents. Both WordPress and Drupal do this.
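To make the ID-based idea concrete, here is a minimal sketch of how a server-side permalink lookup could work. Everything in it is an assumption for illustration: the `/permalink/<id>` URL scheme, the document IDs, the paths, and the retired-page location are all hypothetical, and this is not how WordPress, Drupal, or Publican implement it.

```python
from typing import Optional, Tuple

# Hypothetical registry: permanent document ID -> current human-readable path.
# A value of None marks a document that has been retired.
CURRENT_PATHS = {
    "doc-1001": "/docs/Example_Product/2/html/Admin_Guide/index.html",
    "doc-1002": None,
}

# Hypothetical landing page for retired documents.
RETIRED_PAGE = "/docs/retired.html"


def resolve_permalink(path: str) -> Tuple[int, Optional[str]]:
    """Map a /permalink/<id> request to an (HTTP status, Location) pair."""
    prefix = "/permalink/"
    if not path.startswith(prefix):
        return 404, None
    doc_id = path[len(prefix):]
    if doc_id not in CURRENT_PATHS:
        return 404, None
    target = CURRENT_PATHS[doc_id]
    if target is None:
        # Retired document: send readers to a stable landing page.
        return 301, RETIRED_PAGE
    # Live document: permanent redirect to its current pretty URL.
    return 301, target
```

When a book is renamed, only the registry entry changes; links that use the ID keep working.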

It would be worth looking into a way to run both structures simultaneously, relying on the IDs for permanence and the prettier metadata URLs for SEO/robots/humans.

This will most likely require a server-side approach rather than statically generated HTML.
Comment 1 Joshua Wulf 2013-06-28 00:01:13 EDT
A suggestion:

It could be done with static html like this:

publican --product-rename <new_name>

This generates two versions of the book, one with the new product name, and one with the old product name where the html output is a set of pages that redirect to their equivalent in the new book.

You publish both, and the redirect book handles the redirects for you.

In that case you'd hide the redirect book in the website TOC.

It can all be done from one place in that case. Separating it into a server-side operation and an authoring-side operation will put its execution across two groups, which will have latency and friction.
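A rough sketch of what generating such a "redirect book" could look like, assuming the redirect pages use an HTML meta refresh tag. This does not use Publican itself (the `--product-rename` option above is a proposal, not an existing flag); the function names, the page list, and the base URL are hypothetical.

```python
import html
from pathlib import Path
from typing import Iterable

# Static page that sends the browser to the new location immediately.
REDIRECT_TEMPLATE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="refresh" content="0; url={url}">
<link rel="canonical" href="{url}">
<title>Page moved</title>
</head>
<body>
<p>This page has moved to <a href="{url}">{url}</a>.</p>
</body>
</html>
"""


def redirect_page(new_url: str) -> str:
    """Return the HTML for one redirect page pointing at new_url."""
    return REDIRECT_TEMPLATE.format(url=html.escape(new_url, quote=True))


def write_redirect_book(old_root: Path, new_base_url: str, pages: Iterable[str]) -> None:
    """Write one redirect page per old page, mirroring the old book's layout."""
    for page in pages:
        out = old_root / page
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(redirect_page(f"{new_base_url}/{page}"), encoding="utf-8")
```

For example, `write_redirect_book(Path("Old_Name/html"), "https://docs.example.com/New_Name/html", ["index.html", "ch01.html"])` would fill the old book's directory with pages that forward to their equivalents under the new name.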
Comment 2 Jeff Fearn 2013-07-17 07:28:11 EDT
The 'static website' constraint is imposed on publican's design. You will need to discuss that with ECS management directly if you want that changed. Such a discussion is best held outside of Bugzilla.

"Faking" a web service is not scalable or maintainable; we are definitely not going to try that.
Comment 3 Jeff Fearn 2013-07-22 23:47:45 EDT
Rudi has asked for this to be reopened and more information supplied.
Comment 4 Jeff Fearn 2013-07-23 02:28:59 EDT
Hi, I've sent Rudi a long email with a bunch of stuff that is IMO best left out of Bugzilla. He will forward that on as needed in other discussions.

A few of the options depend on how the Customer Portal exposes redirection. Does the CP have an API for this? If not, are there suggested methods for doing this in the CP?
Comment 5 Chris Bredesen 2013-07-23 13:31:26 EDT
The Customer Portal, as you may know, is not just one thing. Individual apps (case mgt, kbase, subscriptions, docs, etc) should keep their own documents in order and that includes handling redirects to renamed/retired/moved content. There is no central API for this mainly because it should be handled as close to the source content as it can be. What would such an API do?
Comment 6 Chris Bredesen 2013-07-23 13:40:08 EDT
Josh's proposal in comment 1 is spot on IMO.
Comment 7 Jeff Fearn 2013-07-23 18:44:07 EDT
(In reply to Chris Bredesen from comment #5)
> The Customer Portal, as you may know, is not just one thing. Individual apps
> (case mgt, kbase, subscriptions, docs, etc) should keep their own documents
> in order and that includes handling redirects to renamed/retired/moved
> content. There is no central API for this mainly because it should be
> handled as close to the source content as it can be. What would such an API
> do?

The publican website is not an app, it is static content. The discussion here is how we bridge the gap. Is there a way to update the Apache redirect rules? Is it using Apache at all, or is there some other system being used at that level?
Comment 8 Chris Bredesen 2013-07-23 18:53:19 EDT
The issue isn't whether we can update the rules. We live behind a global (*.redhat.com) unified proxy layer provided by an F5 appliance. That level in the stack is not an appropriate place to handle detailed application-level concerns like moved book content, which happens rather frequently. This will need to be done elsewhere. Josh's suggestion is the best one IMO, with .htaccess files in the Publican Apache instance also viable.
Comment 9 Jeff Fearn 2013-07-23 19:17:22 EDT
(In reply to Chris Bredesen from comment #8)
> The issue isn't whether we can update the rules. We live behind a global
> (*.redhat.com) unified proxy layer provided by an F5 appliance. That level
> in the stack is not an appropriate place to handle detailed
> application-level concerns like moved book content, which happens rather
> frequently. This will need to be done elsewhere. Josh's suggestion is the
> best one IMO, with .htaccess files in the Publican Apache instance also
> viable.

"The Publican Apache instance": a knowledge gap is filled.

Josh did not mention .htaccess; to me he seemed to be talking about making static HTML files that have a redirect meta tag. There is no need for a separate package/payload with .htaccess files.

.htaccess files would be a reasonable approach for the subset of URL changes that can be managed at the directory/page level. I'm not sure it can handle the subset of URL changes caused by changes in chunking, renaming of internal links, or restructuring of content.

e.g. 

http://www.example.com/test.html#section5 becomes http://www.example.com/section5.html or vice versa

http://www.example.com/test.html#section5 becomes  http://www.example.com/index.html#section5_The_Life_Of_Brian

http://www.example.com/test.html#section5 becomes  http://www.example.com/index.html#section6 
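
One detail that bears on these cases, offered as background rather than something established in this thread: browsers keep the fragment (the part after #) on the client and do not send it in the HTTP request, so a server-side rule only ever sees the page path. The sketch below uses Python's standard `urllib.parse` to show what the server would receive for the old URLs; `request_target` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path (plus query, if any) that a browser sends to the server."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


old_urls = [
    "http://www.example.com/test.html#section5",
    "http://www.example.com/test.html#section6",
]

for url in old_urls:
    # Both URLs produce the same request target; the fragment never leaves the browser.
    print(request_target(url), "| fragment kept client-side:", urlsplit(url).fragment)
```

Under that assumption, a rule in .htaccess cannot tell test.html#section5 apart from test.html#section6, so the first and third cases above would need either one redirect per page or client-side handling.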

To the googleatron!
Comment 10 Jeff Fearn 2013-07-23 19:26:48 EDT
http://httpd.apache.org/docs/2.0/rewrite/rewrite_guide.html#redirectanchors


Redirecting Anchors

Description:

    By default, redirecting to an HTML anchor doesn't work, because mod_rewrite escapes the # character, turning it into %23. This, in turn, breaks the redirection.
Solution:

    Use the [NE] flag on the RewriteRule. NE stands for No Escape.

Gold.
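
A small sketch of how redirect rules using that flag might be generated for a .htaccess file. The helper name and the page mapping are hypothetical; `NE`, `R=301`, and `L` are standard mod_rewrite flags, and the pattern form assumes per-directory (.htaccess) context with `RewriteEngine On` already set. Note that the rule matches on the page name only, so it redirects every request for the old page to the one anchor given.

```python
import re


def rewrite_rule(old_page: str, new_target: str) -> str:
    """Build a RewriteRule line that permanently redirects old_page to new_target.

    NE stops mod_rewrite from escaping '#' in the target as %23.
    """
    pattern = "^" + re.escape(old_page) + "$"
    return f"RewriteRule {pattern} {new_target} [NE,R=301,L]"


# Hypothetical mapping from old page names to new locations with anchors.
moves = {
    "test.html": "/index.html#section6",
}

for old_page, new_target in moves.items():
    print(rewrite_rule(old_page, new_target))
```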
