Bug 691246

Summary: file(1) does not recognize .epub e-books (patch attached)
Product: [Fedora] Fedora
Reporter: Jan "Yenya" Kasprzak <kas>
Component: file
Assignee: Jan Kaluža <jkaluza>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 14
CC: jkaluza
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: file-5.07-2.fc15
Doc Type: Bug Fix
Last Closed: 2011-05-25 02:30:45 UTC
Type: ---

Description Jan "Yenya" Kasprzak 2011-03-27 20:57:51 UTC
Description of problem:
file(1) does not recognize epub e-books

Version-Release number of selected component (if applicable):
file-5.04-16.fc14.x86_64

I have fixed this using the following patch:
--- a/usr/share/magic	2011-02-10 10:11:13.000000000 +0100
+++ b/usr/share/magic	2011-03-27 22:32:59.171800238 +0200
@@ -2055,6 +2055,11 @@
 >>>>78	string	-template		Template
 !:mime	application/vnd.oasis.opendocument.image-template
 
+# Added by Jan "Yenya" Kasprzak <kas.cz>, based on output of Callibre
+# (http://callibre-ebook.com).
+>>38	string	application/epub+zip	EPUB Electronic Book
+!:mime	application/epub+zip
+
 # Zoo archiver
 20	lelong		0xfdc4a7dc	Zoo archive data
 !:mime	application/x-zoo

If possible, please add the above entry and send it upstream (I am not sure where the upstream is).
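
For anyone who wants to sanity-check the idea outside of file(1), here is a minimal Python sketch of what the patch entry tests. It assumes the entry sits under parent rules that have already matched the ZIP signature at offset 0 and the 8-byte member name "mimetype" at offset 26 (the patch context suggests this but does not show it). The function names looks_like_epub and make_archive are hypothetical, not part of file(1) or libmagic.

```python
import io
import zipfile

EPUB_MIME = b"application/epub+zip"


def looks_like_epub(data: bytes) -> bool:
    """Approximate the proposed magic entry with plain byte comparisons."""
    return (
        data[0:4] == b"PK\x03\x04"                      # ZIP local file header
        and data[26:38] == b"\x08\x00\x00\x00mimetype"  # name length 8, no extra field
        and data[38:38 + len(EPUB_MIME)] == EPUB_MIME   # the string test added by the patch
    )


def make_archive(first_member: str, payload: bytes) -> bytes:
    """Build an in-memory ZIP whose only member is stored uncompressed."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr(first_member, payload)
    return buf.getvalue()


if __name__ == "__main__":
    # A ZIP whose first member is an uncompressed "mimetype" file.
    print(looks_like_epub(make_archive("mimetype", EPUB_MIME)))
    # An ordinary ZIP without that member.
    print(looks_like_epub(make_archive("README", b"hello")))
```

The check only works when "mimetype" is the first member and is stored without compression, which is the layout visible in the hex dump in this report.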

Additional info:
It would be nice if file(1) could say something other than "data" for ZIP archives that carry a mimetype member, at least something like "ZIP archive with unknown MIME type %s" (and report that MIME type with file --mime-type). I am not sure how to do this in magic(5) syntax, because the MIME type is not null-terminated. The following rule therefore prints "application/...PK\003\004\024", because the next PK entry follows immediately after the actual MIME type:

--- a/usr/share/magic	2011-02-10 10:11:13.000000000 +0100
+++ b/usr/share/magic	2011-03-27 22:53:33.168386361 +0200
@@ -2055,6 +2055,10 @@
 >>>>78	string	-template		Template
 !:mime	application/vnd.oasis.opendocument.image-template
 
+# If still not known, print at least the MIME type from the archive
+>>38	string	>\0		ZIP archive with MIME type %s
+# !:mime  %s
+
 # Zoo archiver
 20	lelong		0xfdc4a7dc	Zoo archive data
 !:mime	application/x-zoo

FWIW, od -tx1 somefile.epub gives the following (offsets are octal):
0000000 50 4b 03 04 14 00 00 08 00 00 74 aa 7b 3e 6f 61
0000020 ab 2c 14 00 00 00 14 00 00 00 08 00 00 00 6d 69
0000040 6d 65 74 79 70 65 61 70 70 6c 69 63 61 74 69 6f
0000060 6e 2f 65 70 75 62 2b 7a 69 70 50 4b 03 04 14 00
0000100 00 08 08 00 74 aa 7b 3e 00 00 00 00 02 00 00 00
0000120 00 00 00 00 09 00 00 00 4d 45 54 41 2d 49 4e 46
0000140 2f 03 00 50 4b 03 04 14 00 00 08 08 00 74 aa 7b
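
To make the dump easier to read, here is a small Python sketch that decodes its first 64 bytes as a ZIP local file header. It shows that the header's compressed-size field delimits the MIME type exactly, while reading up to the next NUL byte picks up the start of the following entry. The helper names first_member and read_until_nul are hypothetical, and read_until_nul is only a rough stand-in for how a "string" test with %s behaves.

```python
import struct

# First 64 bytes of the od -tx1 output above.
DUMP = bytes.fromhex(
    "50 4b 03 04 14 00 00 08 00 00 74 aa 7b 3e 6f 61 "
    "ab 2c 14 00 00 00 14 00 00 00 08 00 00 00 6d 69 "
    "6d 65 74 79 70 65 61 70 70 6c 69 63 61 74 69 6f "
    "6e 2f 65 70 75 62 2b 7a 69 70 50 4b 03 04 14 00"
)

# Signature, version, flags, method, time, date, CRC-32,
# compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct("<4s5H3I2H")  # 30 bytes, little-endian


def first_member(data: bytes):
    """Return (name, method, body_offset, body) of the first ZIP member."""
    (sig, _ver, _flags, method, _time, _date, _crc,
     csize, _usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(data, 0)
    if sig != b"PK\x03\x04":
        raise ValueError("not a ZIP local file header")
    name_start = LOCAL_HEADER.size
    body_start = name_start + name_len + extra_len
    name = data[name_start:name_start + name_len]
    body = data[body_start:body_start + csize]
    return name, method, body_start, body


def read_until_nul(data: bytes, offset: int) -> bytes:
    """Read from offset up to (not including) the next NUL byte."""
    end = data.find(b"\x00", offset)
    return data[offset:] if end == -1 else data[offset:end]


if __name__ == "__main__":
    name, method, body_start, body = first_member(DUMP)
    print(name, method, body_start, body)
    print(read_until_nul(DUMP, body_start))
```

The member body starts at offset 38 because the fixed header is 30 bytes and the name "mimetype" is 8 bytes with no extra field. magic(5) has no obvious way to use the size field as a string length, which is why the rule above overruns into the next "PK" signature.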

Comment 1 Jan Kaluža 2011-03-28 09:48:25 UTC
I've just checked the upstream repository (https://github.com/glensc/file) and support for both cases is already there: EPUB is recognized, and other ZIP-with-mimetype files get a generic description. The generic case is matched by these patterns:

#  Catch other ZIP-with-mimetype formats
#	In a ZIP file, the bytes immediately after a member's contents are
#	always "PK". The 2 regex rules here print the "mimetype" member's
#	contents up to the first 'P'. Luckily, most MIME types don't contain
#	any capital 'P's. This is a kludge.
#    (mimetype contains "application/<OTHER>")
>>50		string	!epub+zip
>>>50		string	!vnd.oasis.opendocument.
>>>>50		string	!vnd.sun.xml.
>>>>>50		string	!vnd.kde.
>>>>>>38	regex	[!-OQ-~]+		Zip data (MIME type "%s"?)
!:mime	application/zip
#    (mimetype contents other than "application/*")
>26		string	\x8\0\0\0mimetype
>>38		string	!application/
>>>38		regex	[!-OQ-~]+		Zip data (MIME type "%s"?)
!:mime	application/zip
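
As a rough illustration of the kludge described in the comments above, the following Python sketch applies the same character class to the bytes at offset 38. It is only an approximation: Python's re module stands in for libmagic's regex handling, the match is anchored at the offset, and guess_mime and the sample MIME types are hypothetical.

```python
import re

# Printable ASCII from '!' to '~', excluding space and capital 'P'.
MIME_RUN = re.compile(rb"[!-OQ-~]+")


def guess_mime(data: bytes, offset: int = 38):
    """Return the run of allowed characters starting at offset, or None."""
    m = MIME_RUN.match(data, offset)
    return m.group().decode("ascii") if m else None


def fake_zip_prefix(mime: bytes) -> bytes:
    """38 placeholder bytes, the mimetype body, then the next entry's signature."""
    return b"\x00" * 38 + mime + b"PK\x03\x04\x14\x00"


if __name__ == "__main__":
    # The run stops at the 'P' of the following "PK" signature.
    print(guess_mime(fake_zip_prefix(b"application/x-example")))
    # A capital 'P' inside the MIME type truncates the result early.
    print(guess_mime(fake_zip_prefix(b"application/x-Photo")))
```

The second call shows why the upstream comment calls this a kludge: any MIME type that contains a capital 'P' is cut off at that character.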

Comment 2 Jan "Yenya" Kasprzak 2011-03-28 10:06:43 UTC
Excellent. Just make sure the Fedora package gets updated to the upstream version eventually :-).

Thanks!

Comment 3 Jan Kaluža 2011-05-11 06:57:38 UTC
I updated file to 5.07 in rawhide, which fixes your issue.

Comment 4 Fedora Update System 2011-05-11 08:54:15 UTC
file-5.07-1.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/file-5.07-1.fc15

Comment 5 Fedora Update System 2011-05-14 03:08:19 UTC
Package file-5.07-1.fc15:
* should fix your issue,
* was pushed to the Fedora 15 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing file-5.07-1.fc15'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/file-5.07-1.fc15
then log in and leave karma (feedback).

Comment 6 Fedora Update System 2011-05-23 11:11:32 UTC
file-5.07-2.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/file-5.07-2.fc15

Comment 7 Fedora Update System 2011-05-25 02:30:12 UTC
file-5.07-2.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.