Bug 589333 - Publican doesn't follow standards for ePub files
Summary: Publican doesn't follow standards for ePub files
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Publican
Classification: Community
Component: publican
Version: 1.6
Hardware: All
OS: Linux
low
low
Target Milestone: ---
Assignee: Jeff Fearn 🐞
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2010-05-05 21:20 UTC by Jared Smith
Modified: 2010-11-24 04:19 UTC (History)
4 users (show)

Fixed In Version: 1.6.3
Clone Of:
Environment:
Last Closed: 2010-06-22 03:13:55 UTC
Embargoed:


Attachments (Terms of Use)

Description Jared Smith 2010-05-05 21:20:04 UTC
Description of problem:

When Publican builds the zip file for an ePub file, it does so in a way that is not compliant with the ePub standard.

Version-Release number of selected component (if applicable):

publican-1.6.2-0.fc13.noarch

How reproducible:

Every time

Steps to Reproduce:
1. pushd /tmp/
2. publican create --name testepub
3. publican build -f epub -l all
4. Run resulting epub against the online ePub validator at http://threepress.org/document/epub-validate/, or download epubcheck from http://code.google.com/p/epubcheck/ and run manually.
  
Actual results:

The "mimetype" file is not the first file in the Zip archive, and is most likely compressed.

Expected results:

The "mimetype" file should be the very first file in the Zip archive, and should not be compressed.

Additional info:

From the Linux command line, the proper way to create the Zip archive would be:

zip -q0X  testepub.epub mimetype
zip -qXr9D  testepub.epub *

I obviously don't know much about Perl's Archive::Zip module, and whether or not it will handle the above requirements.  It appears at first glance, however, that it would simply require something like:

        $member = $zip->addFile( "$tmp_dir/$lang/$format/mimetype" );
        $member->desiredCompressionMethod( COMPRESSION_STORED );

immedately after creating the zip archive, and then excluding the "mimetype" file when you call:

        my @filelist = File::Find::Rule->file->in(".");
        foreach my $file (@filelist) {
            $member = $zip->addFile($file);
        }

Thoughts?  Am I going about this the wrong way?

Comment 1 Jeff Fearn 🐞 2010-05-06 00:26:11 UTC
Hi, pretty close! I did it this way since it made it clear that mimetype is being treated special and it allows the remaining content to be compressed.

@@ -820,17 +820,18 @@
         $dir = pushd("$tmp_dir/$lang/$format");
 
         my $zip    = Archive::Zip->new();
+
+        my $mimetype = $zip->addFile( "mimetype" );
+        $mimetype->desiredCompressionMethod( COMPRESSION_STORED );
+
         my $member = $zip->addDirectory("OEBPS/");
         $member = $zip->addDirectory("META-INF/");
 
-##     $member = $zip->addFile( "$tmp_dir/$lang/$format/mimetype" );
-
-        my @filelist = File::Find::Rule->file->in(".");
+        my @filelist = File::Find::Rule->file->not_name('mimetype')->in(".");
         foreach my $file (@filelist) {
             $member = $zip->addFile($file);
         }

Also I had to make a few other changes to get rid of some other errors:

A: Changed paths to the CSS and common images since the validation tool chokes on './foo' so it has to be 'foo'.

B: The default xsl spams xml:lang and lang a lot, which is often invalid. So I copied it to our epub.xsl and commented it out since it's not really useful anyway.

C: xsl:template name="opf.manifest" doesn't handle print CSS, so I copied that to our epub.xsl and added xsl:if test="$html.stylesheet.print != ''". This should probably go upstream.

D: Modified xsl:template name="body.attributes" so if class is null it doesn't get set, instead of outputting class="".

E: The validator still complains about 'label' tag, but this is a valid tag, so I'm ignoring it.

Comment 2 Ruediger Landmann 2010-05-06 06:31:05 UTC
Errors noted in the original report no longer visible in EPUBs built with 1.6.3.t150

However, <preface>s and <appendix>es lead to "duplicate id" errors:

Fedora-13-Burning_ISO_images_to_disc-en-US.epub: could not parse OEBPS/appe-Burning_ISO_images_to_disc-Revision_History.html: duplicate id: appe-Burning_ISO_images_to_disc-Revision_History

And books seem to get a lot of "fragment identifier is not defined" errors:

Fedora-13-User_Guide-en-US.epub/OEBPS/toc.ncx(37): 'id377438': fragment identifier is not defined in 'OEBPS/pref-User_Guide-Preface.html'

Fedora-13-User_Guide-en-US.epub/OEBPS/toc.ncx(43): 'id542324': fragment identifier is not defined in 'OEBPS/pref-User_Guide-Preface.html' 

Finally, it looks like we'll have to look closely at the contents of the SVGs used in brands if we want EPUBs to validate, because these show a wide variety of errors.

Comment 3 Jeff Fearn 🐞 2010-05-10 06:20:37 UTC
(In reply to comment #2)
> Errors noted in the original report no longer visible in EPUBs built with
> 1.6.3.t150
> 
> However, <preface>s and <appendix>es lead to "duplicate id" errors:
> 
> Fedora-13-Burning_ISO_images_to_disc-en-US.epub: could not parse
> OEBPS/appe-Burning_ISO_images_to_disc-Revision_History.html: duplicate id:
> appe-Burning_ISO_images_to_disc-Revision_History

FIXED.

> And books seem to get a lot of "fragment identifier is not defined" errors:
> 
> Fedora-13-User_Guide-en-US.epub/OEBPS/toc.ncx(37): 'id377438': fragment
> identifier is not defined in 'OEBPS/pref-User_Guide-Preface.html'
> 
> Fedora-13-User_Guide-en-US.epub/OEBPS/toc.ncx(43): 'id542324': fragment
> identifier is not defined in 'OEBPS/pref-User_Guide-Preface.html' 

FIXED
 
> Finally, it looks like we'll have to look closely at the contents of the SVGs
> used in brands if we want EPUBs to validate, because these show a wide variety
> of errors.    

This error is wrong IMHO , the text in the svg is:

   version="1.0"

Which is correct according to:

http://www.w3.org/TR/2002/WD-SVG11-20020108/#version-att

Comment 4 Ruediger Landmann 2010-05-11 04:50:07 UTC
Yeah, "version" should be fine in SVGs according to the spec. Duplicate IDs and "fragment identifier not defined" errors verified fixed in version 1.6.2.t169


Note You need to log in before you can comment on or make changes to this bug.