Red Hat Bugzilla – Bug 705956
publican print_unused parses all files in the en-US and subfolders
Last modified: 2015-08-09 21:22:15 EDT
Description of problem:
When checking for unused DocBook XML files in a user guide, the publican print_unused command parses every xi:included file in en-US and the sub-directories below it, regardless of the attributes set on the xi:include. This causes problems for non-DocBook XML files stored in the /extras directory: print_unused detects these as invalid XML and abends.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
You can test this on books from Middleware that contain XML configuration samples, such as the Hibernate Core Reference Guide (this guide contains only one configuration sample, so it should be easy to test).
1. xi:include files containing XML configuration samples.
2. Set the parse="text" attribute on the xi:include.
3. Ensure the configuration sample uses the .xml extension (not .xml_sample or some other arbitrary file extension name).
4. Execute "publican print_unused" on the document.
Actual results:
You get a validation error, and it is difficult to see what the problem is.
Expected results:
The print_unused command does not parse any xi:include marked as parse="text", and reports only the DocBook XML files that are genuinely unused.
Additional info:
Original email sent to the list. The information above summarises it better, but the email is included here for extra detail.
Don't name your XML code example files using the .xml filename extension. Choose a consistent alternative filename for your XML files, such as .xml_sample.
The useful "publican print_unused" command displays any files that are not used in your XML.
It does this by parsing all files with a .xml extension. Depending on your naming convention, this may also include any XML code examples you have xi:included in your documentation.
If you leave your XML code example files with a .xml extension (instead of something like .xml_sample), the command will fail because it is trying to parse XML that does not contain valid markup. This is particularly relevant if your code samples have ellipses in them to show that content has been removed for readability.
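To illustrate why an elided sample trips up an XML parser, here is a minimal Python sketch. It is not Publican's own code (Publican is written in Perl), and the sample content and the is_well_formed helper are hypothetical. It only shows that a bare ellipsis between elements leaves text outside the root element, so the file is no longer a well-formed XML document.

```python
import xml.etree.ElementTree as ET

# Hypothetical code sample as it might be stored under extras/.
# The bare "..." stands for content removed for readability.
SAMPLE = """<property name="hibernate.dialect">org.hibernate.dialect.H2Dialect</property>
...
<mapping resource="Example.hbm.xml"/>
"""


def is_well_formed(text):
    """Return True if text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(SAMPLE))  # prints False
```

Included with parse="text", the same file is harmless, because an XInclude processor inserts it as plain character data and never parses it.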
publican print_unused currently disregards the parse="text" attribute set on the xi:include.
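For comparison, the behaviour one would expect can be sketched in a few lines of Python. This is an assumption-laden illustration, not Publican's implementation: the find_unused function and its arguments are hypothetical, and it assumes every href is a plain relative file path. It follows xi:include elements from the main file, records parse="text" targets as referenced without ever parsing them, and reports the remaining .xml files.

```python
import os
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"


def find_unused(lang_dir, main_file):
    """Return the .xml files under lang_dir never reached from main_file."""
    lang_dir = os.path.abspath(lang_dir)
    referenced = set()  # every file named by an xi:include (or the main file)
    parsed = set()      # files actually loaded as XML
    queue = [os.path.normpath(os.path.join(lang_dir, main_file))]
    while queue:
        path = queue.pop()
        referenced.add(path)
        if path in parsed or not os.path.isfile(path):
            continue
        parsed.add(path)
        for inc in ET.parse(path).iter(XI_INCLUDE):
            href = inc.get("href")
            if not href:
                continue
            target = os.path.normpath(
                os.path.join(os.path.dirname(path), href))
            if inc.get("parse") == "text":
                # Referenced, but deliberately never parsed as XML.
                referenced.add(target)
            else:
                queue.append(target)
    all_xml = set()
    for root, _dirs, files in os.walk(lang_dir):
        for name in files:
            if name.endswith(".xml"):
                all_xml.add(os.path.normpath(os.path.join(root, name)))
    return sorted(all_xml - referenced)
```

Calling find_unused("en-US", "Book.xml") (both names are placeholders) would list the unused files even when an extras/ sample with a .xml extension is not well-formed, provided that sample is only ever included with parse="text".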
**How I Discovered This**
The translation team discovered some unused files in the branch and wanted me to remove them so they didn't waste translation effort.
I tried to execute publican print_unused, but it failed with an error message.
Ryan helped me work out what the problem was, as described in $BACKGROUND. The error message is not that descriptive.
I've just spent a full day going back through the EAP 5.1.0 branch for the translators.
Stopped print_unused from loading XML files that are included with parse="text".
Committed revision 1774.
Backported to branches/publican-2x.
Committed revision 1813.