Red Hat Bugzilla – Bug 872672
RFE: Add support for olinks
Last modified: 2013-07-03 19:09:19 EDT
Olinks are a valuable alternative to xrefs in a modular publishing environment, since individual modules, stored in XML documents, that link to targets in different modules will validate at creation time. The olink targets are resolved at generation time and can also produce valid link text in the event that the external target is resolvable but not generated. (This can cause issues at final publication time if not used with caution, but is very valuable when generating parts of a larger work for review purposes.)
Olinks also enable "deep linking" across a collection of works. While the Red Hat doc team does not support "deep linking", there are many writing teams that make use of it. Adding support will be a step that makes Publican a fully featured DocBook publishing chain and increases its viability as a single-source publishing tool chain beyond the Red Hat ecosystem.
Fintan Bolton has a working solution for adding olink support to Publican 3 that he is ready to submit as a part of work he has done to get DocBook 5 support working.
Has this issue been given any consideration? The Fuse team will be migrating to Publican in the near future and would love to see olink support implemented.
If you are asking about a general policy on this, then my position is the same as it's always been: how do you keep it up to date, and how do you ensure mass rebuilds in brew/koji work?
If you have answers for those questions then we are happy to review any patches to enable this.
The way the proposed olink feature works, the link database is regenerated every time you build the library. So keeping the *.db files up-to-date is not an issue. In other words, when you execute 'publican build' the build proceeds in two phases:
1. Generate all of the link database files and put them under tmp/en-US/xml/<BookName>/ for *every* book in the library.
2. Generate the selected output formats either for a single book or for every book in the library, depending on the selected option.
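The two phases can be illustrated with a minimal sketch. This is not the Publican patch itself: the in-memory `LIBRARY`, the function names, and the shape of the link database are all assumptions made for illustration, and real link databases are the *.db files written by the DocBook stylesheets.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory "library": book name -> DocBook XML source.
LIBRARY = {
    "BookA": '<book id="a"><chapter id="start"><title>Getting Started</title>'
             '<para>See <olink targetdoc="BookB" targetptr="tuning"/>.</para>'
             '</chapter></book>',
    "BookB": '<book id="b"><chapter id="tuning"><title>Tuning</title>'
             '<para>Back to <olink targetdoc="BookA" targetptr="start"/> or '
             '<olink targetdoc="BookA" targetptr="missing"/>.</para>'
             '</chapter></book>',
}


def build_link_db(library):
    """Phase 1: record every id (and its title, if any) for *every* book."""
    db = {}
    for name, source in library.items():
        targets = {}
        for elem in ET.fromstring(source).iter():
            if "id" in elem.attrib:
                targets[elem.get("id")] = elem.findtext("title", default="")
        db[name] = targets
    return db


def resolve_olinks(name, library, db):
    """Phase 2: resolve one book's olinks against the full database."""
    resolved, broken = [], []
    for olink in ET.fromstring(library[name]).iter("olink"):
        doc, ptr = olink.get("targetdoc"), olink.get("targetptr")
        if ptr in db.get(doc, {}):
            resolved.append((doc, ptr, db[doc][ptr]))
        else:
            broken.append((doc, ptr))
    return resolved, broken


# The database is rebuilt on every run, so it cannot go stale.
db = build_link_db(LIBRARY)
resolved_b, broken_b = resolve_olinks("BookB", LIBRARY, db)
print("resolved:", resolved_b)
print("broken:", broken_b)
```

Because phase 1 always covers the whole library, phase 2 can build a single book and still resolve links into books that are not being generated in that run.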
I guess another aspect of this question is: what happens when a link gets broken due to the actions of a writer editing a document? Well, those broken olinks are easy to find, because the DocBook XSLT templates spit out a Warning for every broken olink. Moreover, if you prefer, we could even tweak Publican so that it refuses to finish building a book if it detects any broken olinks.
Also, speaking of 'mass rebuilds in brew' sounds scary. But in practice, you wouldn't (and certainly shouldn't) allow olinks to be made within a vast collection of books. The way the proposed feature works, you must decide in advance which books you want to olink between (you must re-arrange the directory structure appropriately). In this way, I think you can keep the scale of the olinking manageable.
Can the srpms generated this way be individually rebuilt in brew/koji? We require this regardless of any change in the books content.
That is, books must be individual source rpms and if Publican or the brands get updated, we need to be able to regenerate all books, by bumping the revision numbers, from the source rpms in brew/koji.
There should be a parameter to control dying on broken links, limited to brands. (Check out web_cfg in Publican.pm).
Publican has to supply and support features for a broad audience. This approach seems to enforce a non-scalable and somewhat fragile workflow that may be extremely difficult to support for general use. It is somewhat less attractive when the vision of the future involves a lot of flame wars over tying this kind of functionality to such workflows.
Thanks for the clarification, Jeff! I am just getting to grips with the brew/koji part of the workflow and am not entirely clear what is involved in rebuilding srpms in brew/koji. I'll get back to you as soon as I have figured out the implications of this.
By the way, what is typically packaged into an srpm? Does it include the original source XML?
(In reply to comment #8)
> By the way, what is typically packaged into an srpm? Does it include the
> original source XML?
It's basically the content of $tmp_dir/$lang/xml, which is the sanitized xml.
Ok, I just realised that supporting rebuilds in brew/koji is probably not such a tricky hurdle. A brew rebuild featuring a new Publican version or a new branding would not break existing links (since the linking mechanism is orthogonal to branding and so on). The essential requirement is that it should be possible to build SRPMs in brew/koji and the key to making that possible is by including the complete olink database in each book's SRPM. This should not be very difficult to do, I think.
It remains the case that the only way to break existing links is when a writer uploads a book (or books) which is inconsistent with the other books in the library. So, the workflow should make it difficult for writers to do this (e.g. by refusing to finish building a library with inconsistent links).
I don't yet have the code for packaging books built with olinks, so that's something I'll have to work on.
(In reply to comment #10)
> Ok, I just realised that supporting rebuilds in brew/koji is probably not
> such a tricky hurdle. A brew rebuild featuring a new Publican version or a
> new branding would not break existing links (since the linking mechanism is
> orthogonal to branding and so on). The essential requirement is that it
> should be possible to build SRPMs in brew/koji and the key to making that
> possible is by including the complete olink database in each book's SRPM.
> This should not be very difficult to do, I think.
This covers the easy option, allowing clean rebuilding of books, but it doesn't cover the premium feature which would allow updating the links in a mass rebuild.
For example, big changes that have happened:
1: base URL for site changed, thus all links may need to be regenerated
2: host changed, thus all links may need to be updated
3: product name changed (yes this has happened after GA O_O)
Clearly if similar things happen again it would be good to be able to automate the rebuild without having to go and get writers and translators to regenerate from the source repository.
This is not a blocker, it would just be extra cool if we could do that.
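One way such an automated rebuild could work, sketched under assumptions: if the packaged link database stored only site-relative paths, the absolute URLs could be recomputed from a single site setting at rebuild time. The `LINK_DB` layout, the function name, and the example URLs below are hypothetical and are not part of Publican.

```python
from urllib.parse import urljoin

# Hypothetical link database packaged with a book: site-relative paths only.
LINK_DB = {
    ("BookA", "start"): "BookA/start.html",
    ("BookB", "tuning"): "BookB/tuning.html",
}


def absolute_links(link_db, base_url):
    """Recompute absolute URLs from the current base URL (assumed site setting)."""
    if not base_url.endswith("/"):
        base_url += "/"  # urljoin would otherwise drop the last path segment
    return {key: urljoin(base_url, path) for key, path in link_db.items()}


# The same database yields correct links before and after a host change.
old_site = absolute_links(LINK_DB, "https://docs.example.com/en-US")
new_site = absolute_links(LINK_DB, "https://access.example.org/documentation/")
print(old_site[("BookA", "start")])
print(new_site[("BookA", "start")])
```

With this layout, a change of host or base URL only touches the site setting, so a mass rebuild from SRPMs would not need writers or translators to regenerate anything. A product name change would still require new content.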
(In reply to comment #10)
> So, the workflow should make it difficult for writers to do
> this (e.g. by refusing to finish building a library with inconsistent links).
How would Brew/Koji verify this consistency?
(In reply to comment #12)
> (In reply to comment #10)
> > So, the workflow should make it difficult for writers to do
> > this (e.g. by refusing to finish building a library with inconsistent links).
> How would Brew/Koji verify this consistency?
When building a specific output format, the underlying DocBook XSL script generates a Warning every time it hits a broken Olink. Publican could be programmed to detect these warnings during the build and croak if they occur.
That's the theory anyway!
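A minimal sketch of that theory, assuming the stylesheet messages are captured as text during the build. The warning pattern, the sample log, and the names `check_build_log` and `BrokenOlinkError` are assumptions for illustration. The exact wording of the DocBook XSL message should be checked against the stylesheets in use, and the `fatal` switch stands in for the parameter requested earlier in this thread.

```python
import re

# Assumed message shape; the real DocBook XSL wording may differ.
OLINK_WARNING = re.compile(r"^(?:Warning|Error):.*\bolink\b", re.IGNORECASE)

# Hypothetical captured stylesheet output.
SAMPLE_LOG = """\
Writing ch01.html for chapter(start)
Warning: unresolved olink: targetdoc/targetptr = 'BookA/missing'
Writing index.html for book
"""


class BrokenOlinkError(RuntimeError):
    """Raised when the build log reports at least one broken olink."""


def check_build_log(log_text, fatal=True):
    """Return the olink warnings found in log_text; raise if fatal is set."""
    hits = [line.strip() for line in log_text.splitlines()
            if OLINK_WARNING.search(line.strip())]
    if hits and fatal:
        raise BrokenOlinkError(
            f"{len(hits)} broken olink(s):\n" + "\n".join(hits))
    return hits


warnings_found = check_build_log(SAMPLE_LOG, fatal=False)
print(warnings_found)
```

Calling `check_build_log(SAMPLE_LOG)` with the default `fatal=True` raises `BrokenOlinkError`, which is the point at which a build would stop before producing any output.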
Is there an easy way to find out how brew/koji processes publican SRPMs? E.g. is it documented anywhere?
I'm guessing that brew executes the command in the %build section of the RPM spec file to build the documentation. But I'm not so sure how it installs the resulting output to the external (or internal) web. The instructions under the %install section of the SRPM look like they are relevant only for installing the doc on a desktop machine.
(In reply to comment #13)
> When building a specific output format, the underlying DocBook XSL script
> generates a Warning every time it hits a broken Olink. Publican could be
> programmed to detect these warnings during the build and croak if they occur.
Right; but if books are packaged in individual SRPMs, this easily breaks when books are rebuilt from SRPMs that contain out-of-date link databases. There's no way that I can see for Brew/Koji to know whether the database is up-to-date or not, or valid or not.
(In reply to comment #14)
> Is there an easy way to find out how brew/koji processes publican SRPMs?
> E.g. is it documented anywhere?
There's absolutely nothing special about how Publican-generated packages get built in a Koji instance (like Brew) compared to how any other packages get built. The Koji documentation is here: https://fedorahosted.org/koji/wiki if you want to go deeper than the answer below.
> I'm guessing that brew executes the command in the %build section of the RPM
> spec file to build the documentation. But I'm not so sure how it installs
> the resulting output to the external (or internal) web. The instructions
> under the %install section of the SRPM look like they are relevant only for
> installing the doc on a desktop machine.
Almost! That's how the RPM package gets built. The instructions in the %install section install the files within a chroot environment *on the builder*. This filesystem is then compressed into the RPM package along with the various scriptlets in the %pre, %post, %preun, and %postun sections (we only have %post and %preun, which add and remove entries from the Publican database on the target system).
When that RPM package is installed on a webserver, rpm on that machine executes any %pre scriptlets (we don't have any), unpacks the files to their correct spot in the local filesystem (in our case, under /var/www/html/docs/), then executes any %post scriptlets (in our case, updating the local Publican database).
Thanks for the explanation Rudi!
In the meantime, I have read up a bit about RPMs. One interesting possibility would be to exploit the capability to generate RPM sub-packages. You could create an SRPM that contains the complete library (that is, the XML source for _all_ of the books in a particular library) and this SRPM could generate binary RPMs (as sub-packages) for each of the books in the library. The binary RPMs thus generated would be completely ordinary, single-book RPMs.
This seems to be feasible. The only question is, would brew/koji take all this in its stride and deploy the resulting binary RPMs correctly? I'm guessing the answer is yes, because it is all based on standard RPM functionality.
In the experimental branch I am working on, I have added a draft implementation of packaging for the Library type. This addresses the concerns about preserving olink integrity in the following ways:
* The 'publican package ...' command generates an SRPM that contains the DocBook XML source for the _whole_ library (i.e. multiple books which can have olinks between them). This ensures that the SRPM contains a consistent set of interlinked books.
* When the SRPM gets built, the olink database is created on the fly, which ensures that the database is up-to-date.
* If any broken olinks are detected while building the SRPM, the build is aborted and the binary RPMs are NOT generated. This provides a failsafe mechanism for preventing broken olinks from getting published.
* For the desktop docs, I implemented it so that all of the books in the library are packaged into a _single_ desktop RPM. Thus, installing and removing the library becomes an atomic operation. When you install the desktop RPM, all olinks are valid, because the entire library is available; and when you remove the RPM, the entire library is cleanly removed.
Does that cover all of the issues around olink integrity?
(In reply to comment #18)
> Does that cover all of the issues around olink integrity?
Bundling an entire library into a single deliverable does fulfil the rebuild requirements.
It is also unscalable and will cripple QE and Translation efforts for non-trivial libraries.