Bug 705956 - publican print_unused parses all files in the en-US and subfolders
Summary: publican print_unused parses all files in the en-US and subfolders
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Publican
Classification: Community
Component: publican
Version: 2.5
Hardware: i386
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Jeff Fearn 🐞
QA Contact: Ruediger Landmann
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-05-19 00:28 UTC by Jared MORGAN
Modified: 2015-08-10 01:22 UTC (History)

Fixed In Version: 2.6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-07-26 00:42:30 UTC
Embargoed:



Description Jared MORGAN 2011-05-19 00:28:54 UTC
Description of problem:

When checking for unused DocBook XML files in a user guide, the publican print_unused command parses every .xml file in en-US and its sub-directories, including xi:included files, regardless of the attributes set on the xi:include. This causes problems for non-DocBook XML files stored in the /extras directory: print_unused detects these as invalid XML and abends.

Version-Release number of selected component (if applicable):

publican v2.5

How reproducible:

100%

Steps to Reproduce:
You can test this on books from Middleware that contain XML configuration samples, such as the Hibernate Core Reference Guide (this guide only contains one configuration sample so should be easy to test).

1. Use xi:include to pull in files containing XML configuration samples.
2. Set the parse="text" attribute on each xi:include.
3. Ensure the configuration sample uses the .xml extension (not .xml_sample or some other arbitrary file extension).
4. Execute "publican print_unused" on the document.
  
Actual results:

You get a validation error, and it is difficult to see what the problem is.

Expected results:

The print_unused command ignores any xi:include marked as parse="text" and continues searching for unused DocBook XML files.

Additional info:

Original email sent out to the list. The description above summarises the problem better, but the email is included here for extra detail.

==========================
**Short Answer**

Don't name your XML code example files using the .xml filename extension. Choose a consistent alternative filename for your XML files, such as .xml_sample.

**Background**

The useful "publican print_unused" command displays any files that are not used in your XML.

It does this by parsing all files with a .xml extension. Depending on your naming convention, this may also include any XML code examples you have xi:included in your documentation.

If you leave your XML code example files with a .xml extension (instead of something like .xml_sample), the command will fail because it is trying to parse XML that does not contain valid markup. This is particularly relevant if your code samples have ellipses in them to show that content has been removed for readability.

publican print_unused currently disregards the parse="text" parameter set on the xi:include.

**How I Discovered This**

Translation discovered some unused files in the branch, and wanted me to remove them so they didn't waste translation effort.

I tried to execute publican print_unused, but it failed with an error message.

Ryan helped me work out what the problem was, as described in $BACKGROUND. The error message is not that descriptive.

I've just spent a full day going back through the EAP 5.1.0 branch for the translators. 
============================
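
To illustrate why such a sample trips up an XML parser, here is a minimal Python sketch. It is not Publican code, and the sample content and the is_well_formed helper are hypothetical. It only shows that a fragment shortened with an ellipsis is no longer a single well-formed XML document, so any tool that parses it as XML rather than including it as text will fail.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration sample, shortened with an ellipsis for readability.
# The property names are illustrative only.
SAMPLE = """<property name="connection.driver_class">org.hsqldb.jdbcDriver</property>
...
<property name="show_sql">true</property>
"""


def is_well_formed(text):
    """Return True if text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Raised here because text and a second element follow the first element.
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed(SAMPLE))
    print(is_well_formed("<para>Valid content</para>"))
```

Included with parse="text", the same fragment is only ever treated as literal text, which is why the attribute matters.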

Comment 2 Jeff Fearn 🐞 2011-05-19 09:47:47 UTC
Stopped print_unused from loading xml files parsed as text.

Committed revision 1774.
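
For readers unfamiliar with the behaviour described in this comment, the idea can be sketched in Python as follows. This is not the code from revision 1774. The function name xml_includes and the sample book are assumptions made for illustration. The sketch collects the href of every xi:include that should be loaded as XML and skips those marked parse="text".

```python
import xml.etree.ElementTree as ET

# Fully qualified tag name for xi:include in the XInclude namespace.
XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"

# Hypothetical book source with one XML include and one text include.
BOOK = """<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="Chapter.xml"/>
  <programlisting><xi:include href="extras/sample.xml" parse="text"/></programlisting>
</book>"""


def xml_includes(source):
    """Return hrefs of xi:include elements that are to be parsed as XML."""
    root = ET.fromstring(source)
    hrefs = []
    for include in root.iter(XI_INCLUDE):
        # XInclude treats a missing parse attribute as parse="xml".
        if include.get("parse", "xml") == "text":
            continue  # text includes are never loaded as XML
        href = include.get("href")
        if href:
            hrefs.append(href)
    return hrefs


if __name__ == "__main__":
    print(xml_includes(BOOK))
```

With a filter like this, only Chapter.xml would be opened and parsed further, while extras/sample.xml would be left alone.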

Comment 3 Jeff Fearn 🐞 2011-07-07 11:08:49 UTC
Back ported to branches/publican-2x

Committed revision 1813.

