Bug 689690

Summary: XInclude sometimes fails even though file exists
Product: [Community] Publican
Reporter: Dana Mison <dmison>
Component: publican
Assignee: Jeff Fearn <jfearn>
Status: CLOSED NOTABUG
QA Contact: Ruediger Landmann <rlandman+disabled>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 2.5
CC: mmcallis, publican-list
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Linux
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-03-22 18:29:33 EDT
Type: ---

Description Dana Mison 2011-03-22 02:26:06 EDT
Description of problem:
Using <xi:include> to include a text file sometimes fails with the same error that it produces when the file is not found. However, the file is in the correct place, and it seems that the content of the text file prevents it from being parsed. The only file I have encountered this problem with is the LGPL 2.1 text file from the gnu.org site, http://www.gnu.org/licenses/old-licenses/lgpl-2.1.txt

Version-Release number of selected component (if applicable):

How reproducible:
Every time.

Steps to Reproduce:
1. Create a Publican book
2. Download http://www.gnu.org/licenses/old-licenses/lgpl-2.1.txt and add it to the extras directory
3. Add the following line to the book in a suitable place:
<programlisting><xi:include href="extras/lgpl-2.1.txt" parse="text" xmlns:xi="http://www.w3.org/2001/XInclude" /></programlisting>
4. Attempt to build the book

Actual results:
Beginning work on en-US
FATAL ERROR: XInclude:1604 in Chapter.xml on line 37: could not load extras/lgpl-2.1.txt, and no fallback was found
 at /usr/bin/publican line 672

Expected results:
The book builds, with the contents of the file included as a program listing.

Additional info:
There is an odd character in this file that seems to prevent it from being parsed correctly. It appears as ^L when viewed in vi. It occurs on lines by itself, at semi-regular intervals throughout the file, about once every 50 lines on average. Removing all instances of this character results in the book building as expected.
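To confirm where the character sits without opening the file in an editor, a short script can list the affected lines. This is only a sketch: it assumes the ^L shown by vi is the form feed byte 0x0C, and `form_feed_lines` is a hypothetical helper name, not part of Publican.

```python
from typing import List


def form_feed_lines(data: bytes) -> List[int]:
    """Return 1-based numbers of the lines that contain a form feed (0x0C)."""
    hits = []
    # Work on bytes and split on b"\n" only; str.splitlines() would treat
    # the form feed itself as a line boundary and skew the numbering.
    for number, line in enumerate(data.split(b"\n"), start=1):
        if b"\x0c" in line:
            hits.append(number)
    return hits


# Small in-memory sample shaped like the extract from the licence text.
sample = b"introduced by others.\n\x0c\n  Finally, software patents pose\n"
print(form_feed_lines(sample))  # prints [2]
```

For the real file, the bytes could be read with `open("en-US/extras/lgpl-2.1.txt", "rb").read()` and passed to the same function.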

Extract from file:

that what they have is not the original version, so that the original
author's reputation will not be affected by problems that might be
introduced by others.
^L
  Finally, software patents pose a constant threat to the existence of
any free program.  We wish to make sure that a company cannot
effectively restrict the users of a free program by obtaining a
restrictive license from a patent holder.  Therefore, we insist that

Also, running dos2unix on this file produces this error:
[testbook]$ dos2unix en-US/extras/lgpl-2.1.txt
dos2unix: Skipping binary file en-US/extras/lgpl-2.1.txt

Comment 1 Dana Mison 2011-03-22 02:32:56 EDT
Just tested the following, they don't have the character and build ok.

Comment 2 Jeff Fearn 2011-03-22 02:45:33 EDT
^L is the form feed character (0x0C), a printing control character. It is not a valid character in XML 1.0, so you will need to remove every occurrence from the file.

You could try running 'col -b' over the file.

Comment 3 Dana Mison 2011-03-22 03:32:08 EDT
Cool, that works.
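Where col is not available, a similar cleanup can be approximated in a few lines of Python. This is a sketch and not a replacement for 'col -b': col also processes backspaces and reverse line feeds, whereas this only drops the control characters that XML 1.0 disallows. The helper name `strip_invalid_xml_chars` is made up for illustration.

```python
import re

# Characters outside the XML 1.0 "Char" production: the C0 controls other
# than tab, line feed and carriage return, plus U+FFFE and U+FFFF.
# (XML also excludes surrogates; they are not handled here.)
_INVALID_XML_CHARS = re.compile("[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with characters that XML 1.0 does not allow removed."""
    return _INVALID_XML_CHARS.sub("", text)


page = "introduced by others.\n\x0c\n  Finally, software patents\n"
print(repr(strip_invalid_xml_chars(page)))
```

Only the form feed is removed from the sample; the newline that followed it stays, so the page break becomes an empty line.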

It would be nice for Publican to report a different error message for files that cannot be parsed because they are invalid than for files that cannot be found. I don't know what data you get back from the parser when it fails.

Comment 4 Jeff Fearn 2011-03-22 18:29:33 EDT
Those error messages come from libxml2; you should open a bug against that component if you want them changed.
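Until the message changes upstream, the two cases can be told apart with a check run before the build. The sketch below is a hypothetical helper, not part of Publican or libxml2; it assumes the included file is UTF-8 and that the parser rejects any character outside the XML 1.0 Char production.

```python
import os
import tempfile


def is_xml_char(code: int) -> bool:
    """True if the code point matches the XML 1.0 Char production."""
    return (
        code in (0x9, 0xA, 0xD)
        or 0x20 <= code <= 0xD7FF
        or 0xE000 <= code <= 0xFFFD
        or 0x10000 <= code <= 0x10FFFF
    )


def diagnose_text_include(path: str) -> str:
    """Return 'missing', 'invalid-char' or 'ok' for a parse="text" target."""
    if not os.path.isfile(path):
        return "missing"
    # Assumption: UTF-8 input. Undecodable bytes become lone surrogates,
    # which are not XML characters, so they are reported as invalid too.
    with open(path, encoding="utf-8", errors="surrogateescape",
              newline="") as handle:
        text = handle.read()
    if any(not is_xml_char(ord(ch)) for ch in text):
        return "invalid-char"
    return "ok"


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "lgpl-2.1.txt")
    print(diagnose_text_include(target))  # prints missing
    with open(target, "w", encoding="utf-8", newline="") as handle:
        handle.write("introduced by others.\n\x0c\n")
    print(diagnose_text_include(target))  # prints invalid-char
```

Running such a check over each file in the extras directory before invoking the build would show whether a "could not load" error points at a missing file or at file content the parser refuses.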