Bug 953618 - Non-ASCII characters in image file names get mangled in publish
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Publican
Classification: Community
Component: publican
Version: 3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 3.2
Assignee: Jeff Fearn 🐞
QA Contact: tools-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-04-18 16:38 UTC by Stephen Gordon
Modified: 2013-08-09 04:49 UTC

Fixed In Version: 3.2.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-08-09 04:49:09 UTC
Embargoed:


Description Stephen Gordon 2013-04-18 16:38:54 UTC
Description of problem:

As part of my ongoing quest to build the books from the openstack-manuals project with Publican, I ran into this warning:

    WARNING: Image missing: tmp/en-US/xml/images/Login-–-OpenStack-Dashboard.png

On inspecting en-US/images in my source, I found that an image with that filename does exist, and that it is the filename the XML source uses to include the image.

Actual results:

When I poked into the tmp directory, though, I could see that the post-copy filename is:

    tmp/en-US/xml/images/Login-â??-OpenStack-Dashboard.png

Obviously this doesn't match the filename the XML refers to, hence the warning.

Expected results:

Names of images in the tmp directory should match those from the source, even when they include Unicode characters.
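
The mangled name above is consistent with the UTF-8 bytes of the en dash being decoded one byte at a time. The following is only a sketch of that mechanism in Python, not Publican code; the report does not state the actual cause at this point, so treat the single-byte decoding step as an assumption.

```python
# Hypothetical illustration, not Publican code: decode the UTF-8 bytes of a
# file name with a single-byte encoding and see what comes out.
name = "Login-\u2013-OpenStack-Dashboard.png"  # U+2013 is the en dash

utf8_bytes = name.encode("utf-8")
mangled = utf8_bytes.decode("latin-1")  # wrong decoder: one character per byte

# The en dash (bytes E2 80 93) turns into U+00E2 plus two control characters.
print([hex(ord(c)) for c in mangled[6:9]])

# Show non-printable characters as "?", as a terminal listing may do.
shown = "".join(c if c.isprintable() else "?" for c in mangled)
print(shown)
```

Under that assumption, the three bytes of the en dash account for the one visible character and the two question marks in the listing.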

Comment 1 Jeff Fearn 🐞 2013-07-16 00:48:19 UTC
[Users_Guide]$ cp en-US/images/drupal_book.png en-US/images/Login-–-OpenStack-Dashboard.png
[Users_Guide]$ publican build --formats xml 
...
[Users_Guide]$ ls build/en-US/xml/images/L*
build/en-US/xml/images/Login-–-OpenStack-Dashboard.png

Build works for me, but publish does not.

[Users_Guide]$ publican build --formats html --publish
...
[Users_Guide]$ ls publish/en-US/Publican/3.1/html/Users_Guide/images/L*
publish/en-US/Publican/3.1/html/Users_Guide/images/Login-â??-OpenStack-Dashboard.png

Comment 2 HSS Product Manager 2013-07-16 00:57:35 UTC
HSS-QE has reviewed and declined this request. QE for this bug will be handled by IED.

Comment 3 Jeff Fearn 🐞 2013-07-16 02:55:37 UTC
$ publican clean
$ publican build --formats html --publish
...
$ ls build/en-US/xml/images/L*; ls build/en-US/html/images/L*; ls publish/en-US/Publican/3.1/html/Users_Guide/images/L*;
build/en-US/xml/images/Login-–-OpenStack-Dashboard.png
build/en-US/html/images/Login-–-OpenStack-Dashboard.png
publish/en-US/Publican/3.1/html/Users_Guide/images/Login-–-OpenStack-Dashboard.png


To ssh://git.fedorahosted.org/git/publican.git
   783564d..d45cab3  HEAD -> devel

Comment 4 Tomas Capek 2013-07-22 13:30:39 UTC
This failed on my attempt:

$ rpm -q publican
publican-3.1.5-0.fc19.t62.noarch

$ publican clean
$ publican build --langs en-US --formats html-single

Setting up en-US
	Processing file tmp/en-US/xml/Common_Content/Conventions.xml -> tmp/en-US/xml/Common_Content/Conventions.xml
	Processing file tmp/en-US/xml/Common_Content/Feedback.xml -> tmp/en-US/xml/Common_Content/Feedback.xml
	Processing file tmp/en-US/xml/Common_Content/Legal_Notice.xml -> tmp/en-US/xml/Common_Content/Legal_Notice.xml
	Processing file tmp/en-US/xml/Common_Content/Program_Listing.xml -> tmp/en-US/xml/Common_Content/Program_Listing.xml
	Processing file tmp/en-US/xml/Common_Content/Revision_History.xml -> tmp/en-US/xml/Common_Content/Revision_History.xml
	Processing file tmp/en-US/xml_tmp/Author_Group.xml -> tmp/en-US/xml/Author_Group.xml
	Processing file tmp/en-US/xml_tmp/Book_Info.xml -> tmp/en-US/xml/Book_Info.xml
	Processing file tmp/en-US/xml_tmp/Chapter.xml -> tmp/en-US/xml/Chapter.xml
	WARNING: Image missing: tmp/en-US/xml/images/Login-–-OpenStack-Dashboard.png
	Processing file tmp/en-US/xml_tmp/Preface.xml -> tmp/en-US/xml/Preface.xml
	Processing file tmp/en-US/xml_tmp/Revision_History.xml -> tmp/en-US/xml/Revision_History.xml
	Processing file tmp/en-US/xml_tmp/Test_3.2.xml -> tmp/en-US/xml/Test_3.2.xml
Beginning work on en-US
DTD Validation OK
	Starting html-single
	Using XML::LibXSLT on /usr/share/publican/xsl/html-single.xsl
	Finished html-single

$ ls tmp/en-US/html-single/images/
icon.svg  Login-â??-OpenStack-Dashboard.png


Same result with "publican build --formats html --publish"

Comment 5 Jeff Fearn 🐞 2013-07-23 01:40:58 UTC
This is definitely working for me on RHEL6.

$ publican clean

$ cp en-US/images/cover_thumbnail.png en-US/images/Login-–-OpenStack-Dashboard.png

$ publican build --formats html --publish 
...

$ ls build/en-US/xml/images/L*; ls build/en-US/html/images/L*; ls publish/en-US/Publican/3.2/html/Users_Guide/images/L*;
build/en-US/xml/images/Login-–-OpenStack-Dashboard.png
build/en-US/html/images/Login-–-OpenStack-Dashboard.png
publish/en-US/Publican/3.2/html/Users_Guide/images/Login-–-OpenStack-Dashboard.png

$ echo $LANG
en_US.utf8

Comment 6 Petr Bokoc 2013-07-24 12:00:44 UTC
Hey Jeff, Tomas asked me to test this as well, and I can confirm that it still doesn't work, at least not on Fedora 17. I get the same error whether publishing or just building into tmp/.

$ rpm -q publican
publican-3.1.5-0.fc17.t65.noarch

$ echo $LANG
en_US.utf8

$ ls en-US/images/L*
-rw-rw-r--. 1 pbokoc pbokoc 27577 Jul 15 15:56 en-US/images/Login-–-OpenStack-Dashboard.jpg

$ publican clean

$ publican build --publish --langs=en-US --formats=html
Setting up en-US
        ...snip...
        WARNING: Image missing: tmp/en-US/xml/images/Login-–-OpenStack-Dashboard.jpg
        ...snip...

$ ls publish/en-US/Documentation/0.1/html/Test_Book/images/L*
-rw-rw-r--. 1 pbokoc pbokoc 27577 Jul 24 13:54 publish/en-US/Documentation/0.1/html/Test_Book/images/Login-â??-OpenStack-Dashboard.png.jpg

Comment 7 Petr Bokoc 2013-07-24 12:08:00 UTC
I think I found where the problem is. 

Jeff, you've been testing this by simply renaming an image. For this bug to manifest, you need to actually use the renamed image in the book. If you just rename the file and don't change anything else, everything works, because (I'm assuming) the file just gets copied from images/ to publish/en-US/images. If the image is used in the book with this filename, that's where the problem happens.

Comment 8 Petr Bokoc 2013-07-24 12:25:27 UTC
Ah, disregard the last comment. I rechecked and it still happens every time, even if the image isn't used in the book. I must have been seeing things.

Comment 9 Jeff Fearn 🐞 2013-08-06 05:35:59 UTC
I wrote wrapper functions for the downstream module to ensure utf8 is always set. Should definitely work now on all platforms.

To ssh://git.fedorahosted.org/git/publican.git
   42bfa0d..2fbf002  devel -> devel
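
The wrapper approach described in this comment can be sketched roughly as follows. This is a Python analogue under stated assumptions, not the actual Publican change: the report does not show the wrapper code or name the downstream module, and `utf8_path` and `copy_utf8` are hypothetical names introduced only for illustration.

```python
import os
import shutil
import tempfile


def utf8_path(path):
    """Return the path as UTF-8 bytes, independent of the locale encoding."""
    return path.encode("utf-8") if isinstance(path, str) else path


def copy_utf8(src, dst):
    """Hypothetical wrapper: names always reach the copy routine as UTF-8."""
    return shutil.copyfile(utf8_path(src), utf8_path(dst))


# Copy a file whose name contains an en dash and list the result as bytes.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "Login-\u2013-OpenStack-Dashboard.png")
    dst_dir = os.path.join(tmp, "publish")
    os.mkdir(dst_dir)
    with open(utf8_path(src), "wb") as fh:
        fh.write(b"placeholder image data")
    copy_utf8(src, os.path.join(dst_dir, os.path.basename(src)))
    names = os.listdir(utf8_path(dst_dir))
    print(names)
```

Routing every call through one wrapper means the encoding decision is made in a single place instead of depending on how each caller or platform happens to pass the name.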

Comment 10 Ruediger Landmann 2013-08-07 06:41:49 UTC
Verified in publican-3.1.5-0.fc19.t72.noarch with perl-Archive-Tar-1.90-3.fc19.noarch

Comment 11 Jeff Fearn 🐞 2013-08-09 04:49:09 UTC
The fix for this bug has been shipped in publican 3.2.0

