Bug 589333 - Publican doesn't follow standards for ePub files
Status: CLOSED ERRATA
Product: Publican
Classification: Community
Component: publican
Version: 1.6
Hardware: All Linux
Priority: low
Severity: low
Assigned To: Jeff Fearn
QA Contact: Fedora Extras Quality Assurance
Reported: 2010-05-05 17:20 EDT by Jared Smith
Modified: 2010-11-23 23:19 EST
CC List: 4 users
Fixed In Version: 1.6.3
Doc Type: Bug Fix
Last Closed: 2010-06-21 23:13:55 EDT


Description Jared Smith 2010-05-05 17:20:04 EDT
Description of problem:

When Publican builds the zip file for an ePub file, it does so in a way that is not compliant with the ePub standard.

Version-Release number of selected component (if applicable):

publican-1.6.2-0.fc13.noarch

How reproducible:

Every time

Steps to Reproduce:
1. pushd /tmp/
2. publican create --name testepub
3. publican build -f epub -l all
4. Run resulting epub against the online ePub validator at http://threepress.org/document/epub-validate/, or download epubcheck from http://code.google.com/p/epubcheck/ and run manually.
  
Actual results:

The "mimetype" file is not the first file in the Zip archive, and is most likely compressed.

Expected results:

The "mimetype" file should be the very first file in the Zip archive, and should not be compressed.
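
A quick way to check this layout without running a full validator is to inspect the first local file header of the archive. The sketch below assumes the standard ZIP layout (compression method at byte offset 8, first file name at offset 30) and that the mimetype entry carries no extra field. The function name epub_mimetype_ok is hypothetical; it is not part of Publican or epubcheck, and it is not a substitute for either.

```shell
#!/bin/sh
# epub_mimetype_ok FILE  (hypothetical helper)
# Succeeds only if the first ZIP entry is named "mimetype", is stored
# uncompressed, and contains exactly "application/epub+zip".
epub_mimetype_ok() {
    f=$1
    # Bytes 8-9 of the first local file header: compression method (0 = stored).
    method=$(dd if="$f" bs=1 skip=8 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    # Bytes 30-37: file name of the first entry (assumes an 8-byte name).
    name=$(dd if="$f" bs=1 skip=30 count=8 2>/dev/null | tr -d '\000')
    # Bytes 38-57: entry data, assuming no extra field follows the name.
    body=$(dd if="$f" bs=1 skip=38 count=20 2>/dev/null | tr -d '\000')
    [ "$method" = "0000" ] &&
        [ "$name" = "mimetype" ] &&
        [ "$body" = "application/epub+zip" ]
}
```

For example: epub_mimetype_ok testepub.epub && echo "mimetype entry looks right"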

Additional info:

From the Linux command line, the proper way to create the Zip archive would be:

zip -q0X  testepub.epub mimetype
zip -qXr9D  testepub.epub *
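
One variation on the two commands above, offered only as a sketch: wrap them in a function and exclude mimetype from the second pass, so the stored entry is never handed to zip again. The function name build_epub is hypothetical, and the sketch assumes Info-ZIP zip is installed and that the output file lives outside the source directory.

```shell
#!/bin/sh
# build_epub SRC_DIR OUT_FILE  (hypothetical helper)
# Assumes OUT_FILE is outside SRC_DIR and that SRC_DIR contains "mimetype".
build_epub() {
    src=$1
    out=$2
    # Make OUT_FILE absolute so it still resolves after the cd below.
    case $out in
        /*) ;;
        *) out=$PWD/$out ;;
    esac
    # Start from scratch; zip would otherwise update an existing archive.
    rm -f "$out"
    (
        cd "$src" || exit 1
        # First entry: mimetype, stored (-0), no extra fields (-X).
        zip -q -0 -X "$out" mimetype || exit 1
        # Everything else: recursive (-r), best compression (-9),
        # no directory entries (-D), skipping mimetype (-x).
        zip -q -X -r -9 -D "$out" * -x mimetype
    )
}
```

For example: build_epub /tmp/testepub/tmp/en-US/epub /tmp/testepub.epub (the source path here is illustrative, not taken from Publican).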

I obviously don't know much about Perl's Archive::Zip module, and whether or not it will handle the above requirements.  It appears at first glance, however, that it would simply require something like:

        $member = $zip->addFile( "$tmp_dir/$lang/$format/mimetype" );
        $member->desiredCompressionMethod( COMPRESSION_STORED );

immediately after creating the zip archive, and then excluding the "mimetype" file when you call:

        my @filelist = File::Find::Rule->file->in(".");
        foreach my $file (@filelist) {
            $member = $zip->addFile($file);
        }

Thoughts?  Am I going about this the wrong way?
Comment 1 Jeff Fearn 2010-05-05 20:26:11 EDT
Hi, pretty close! I did it this way because it makes clear that mimetype is being treated specially, and it allows the remaining content to be compressed.

@@ -820,17 +820,18 @@
         $dir = pushd("$tmp_dir/$lang/$format");
 
         my $zip    = Archive::Zip->new();
+
+        my $mimetype = $zip->addFile( "mimetype" );
+        $mimetype->desiredCompressionMethod( COMPRESSION_STORED );
+
         my $member = $zip->addDirectory("OEBPS/");
         $member = $zip->addDirectory("META-INF/");
 
-##     $member = $zip->addFile( "$tmp_dir/$lang/$format/mimetype" );
-
-        my @filelist = File::Find::Rule->file->in(".");
+        my @filelist = File::Find::Rule->file->not_name('mimetype')->in(".");
         foreach my $file (@filelist) {
             $member = $zip->addFile($file);
         }

Also I had to make a few other changes to get rid of some other errors:

A: Changed paths to the CSS and common images since the validation tool chokes on './foo' so it has to be 'foo'.

B: The default xsl spams xml:lang and lang a lot, which is often invalid. So I copied it to our epub.xsl and commented it out since it's not really useful anyway.

C: xsl:template name="opf.manifest" doesn't handle print CSS, so I copied that to our epub.xsl and added xsl:if test="$html.stylesheet.print != ''". This should probably go upstream.

D: Modified xsl:template name="body.attributes" so if class is null it doesn't get set, instead of outputting class="".

E: The validator still complains about 'label' tag, but this is a valid tag, so I'm ignoring it.
Comment 2 Ruediger Landmann 2010-05-06 02:31:05 EDT
Errors noted in the original report no longer visible in EPUBs built with 1.6.3.t150

However, <preface>s and <appendix>es lead to "duplicate id" errors:

Fedora-13-Burning_ISO_images_to_disc-en-US.epub: could not parse OEBPS/appe-Burning_ISO_images_to_disc-Revision_History.html: duplicate id: appe-Burning_ISO_images_to_disc-Revision_History
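
To locate the offending ids in an unpacked EPUB, a rough check like the following can help. This is only a sketch: the function name dup_ids is hypothetical, it assumes double-quoted id attributes preceded by whitespace, and it relies on grep -o, which GNU and BSD grep provide.

```shell
#!/bin/sh
# dup_ids FILE  (hypothetical helper)
# Prints each id="..." attribute that occurs more than once in FILE.
dup_ids() {
    grep -oE '[[:space:]]id="[^"]*"' "$1" |
        sed 's/^[[:space:]]*//' |
        sort |
        uniq -d
}
```

For example: dup_ids OEBPS/appe-Burning_ISO_images_to_disc-Revision_History.html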

And books seem to get a lot of "fragment identifier is not defined" errors:

Fedora-13-User_Guide-en-US.epub/OEBPS/toc.ncx(37): 'id377438': fragment identifier is not defined in 'OEBPS/pref-User_Guide-Preface.html'

Fedora-13-User_Guide-en-US.epub/OEBPS/toc.ncx(43): 'id542324': fragment identifier is not defined in 'OEBPS/pref-User_Guide-Preface.html' 

Finally, it looks like we'll have to look closely at the contents of the SVGs used in brands if we want EPUBs to validate, because these show a wide variety of errors.
Comment 3 Jeff Fearn 2010-05-10 02:20:37 EDT
(In reply to comment #2)
> Errors noted in the original report no longer visible in EPUBs built with
> 1.6.3.t150
> 
> However, <preface>s and <appendix>es lead to "duplicate id" errors:
> 
> Fedora-13-Burning_ISO_images_to_disc-en-US.epub: could not parse
> OEBPS/appe-Burning_ISO_images_to_disc-Revision_History.html: duplicate id:
> appe-Burning_ISO_images_to_disc-Revision_History

FIXED.

> And books seem to get a lot of "fragment identifier is not defined" errors:
> 
> Fedora-13-User_Guide-en-US.epub/OEBPS/toc.ncx(37): 'id377438': fragment
> identifier is not defined in 'OEBPS/pref-User_Guide-Preface.html'
> 
> Fedora-13-User_Guide-en-US.epub/OEBPS/toc.ncx(43): 'id542324': fragment
> identifier is not defined in 'OEBPS/pref-User_Guide-Preface.html' 

FIXED
 
> Finally, it looks like we'll have to look closely at the contents of the SVGs
> used in brands if we want EPUBs to validate, because these show a wide variety
> of errors.    

This error is wrong IMHO; the text in the SVG is:

   version="1.0"

Which is correct according to:

http://www.w3.org/TR/2002/WD-SVG11-20020108/#version-att
Comment 4 Ruediger Landmann 2010-05-11 00:50:07 EDT
Yeah, "version" should be fine in SVGs according to the spec. Duplicate IDs and "fragment identifier not defined" errors verified fixed in version 1.6.2.t169
