Bug 691246 - file(1) does not recognize .epub e-books (patch attached)
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: file
Version: 14
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Jan Kaluža
QA Contact: Fedora Extras Quality Assurance
Reported: 2011-03-27 20:57 UTC by Jan "Yenya" Kasprzak
Modified: 2011-05-25 02:30 UTC (History)

Fixed In Version: file-5.07-2.fc15
Doc Type: Bug Fix
Last Closed: 2011-05-25 02:30:45 UTC

Description Jan "Yenya" Kasprzak 2011-03-27 20:57:51 UTC
Description of problem:
file(1) does not recognize epub e-books

Version-Release number of selected component (if applicable):
file-5.04-16.fc14.x86_64

I have fixed this using the following patch:
--- a/usr/share/magic	2011-02-10 10:11:13.000000000 +0100
+++ b/usr/share/magic	2011-03-27 22:32:59.171800238 +0200
@@ -2055,6 +2055,11 @@
 >>>>78	string	-template		Template
 !:mime	application/vnd.oasis.opendocument.image-template
 
+# Added by Jan "Yenya" Kasprzak <kas.cz>, based on output of Calibre
+# (http://callibre-ebook.com).
+>>38	string	application/epub+zip	EPUB Electronic Book
+!:mime	application/epub+zip
+
 # Zoo archiver
 20	lelong		0xfdc4a7dc	Zoo archive data
 !:mime	application/x-zoo

If possible, please add the above entry and send it upstream (I am not sure where the upstream is).
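
For illustration, the fixed-offset test that the proposed entry performs can be sketched in Python. This is only a sketch, not how file(1) is implemented: it assumes the enclosing magic rules have already matched the ZIP signature at offset 0 and a first member named "mimetype" at offset 30, which is the layout visible in the od dump in this report. The names looks_like_epub and SAMPLE are hypothetical.

```python
EPUB_MIME = b"application/epub+zip"

def looks_like_epub(data: bytes) -> bool:
    """Check the same fixed offsets the proposed magic entry relies on."""
    return (
        data[0:4] == b"PK\x03\x04"          # ZIP local file header signature
        and data[30:38] == b"mimetype"       # name of the first archive member
        and data[38:38 + len(EPUB_MIME)] == EPUB_MIME  # stored MIME string
    )

# First 64 bytes of the od -tx1 dump quoted in this report.
SAMPLE = bytes.fromhex(
    "50 4b 03 04 14 00 00 08 00 00 74 aa 7b 3e 6f 61 "
    "ab 2c 14 00 00 00 14 00 00 00 08 00 00 00 6d 69 "
    "6d 65 74 79 70 65 61 70 70 6c 69 63 61 74 69 6f "
    "6e 2f 65 70 75 62 2b 7a 69 70 50 4b 03 04 14 00"
)

if __name__ == "__main__":
    print(looks_like_epub(SAMPLE))
```

The check only works when "mimetype" is the first member and is stored uncompressed, which is why the offsets can be hard-coded.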

Additional info:
It would be nice if file(1) could report something more specific than "data" for ZIP archives that carry a MIME type, for example "ZIP archive with unknown MIME type %s" (and return that MIME type for file --mime-type). I am not sure how to do this in magic(5) syntax, because the MIME type is not null-terminated: another PK entry follows the actual MIME type, so the following rule prints "application/...PK\003\004\024":

--- a/usr/share/magic	2011-02-10 10:11:13.000000000 +0100
+++ b/usr/share/magic	2011-03-27 22:53:33.168386361 +0200
@@ -2055,6 +2055,10 @@
 >>>>78	string	-template		Template
 !:mime	application/vnd.oasis.opendocument.image-template
 
+# If still not known, print at least the MIME type from the archive
+>>38	string	>\0		ZIP archive with MIME type %s
+# !:mime  %s
+
 # Zoo archiver
 20	lelong		0xfdc4a7dc	Zoo archive data
 !:mime	application/x-zoo

FWIW, od -tx1 somefile.epub gives the following (offsets are in octal):
0000000 50 4b 03 04 14 00 00 08 00 00 74 aa 7b 3e 6f 61
0000020 ab 2c 14 00 00 00 14 00 00 00 08 00 00 00 6d 69
0000040 6d 65 74 79 70 65 61 70 70 6c 69 63 61 74 69 6f
0000060 6e 2f 65 70 75 62 2b 7a 69 70 50 4b 03 04 14 00
0000100 00 08 08 00 74 aa 7b 3e 00 00 00 00 02 00 00 00
0000120 00 00 00 00 09 00 00 00 4d 45 54 41 2d 49 4e 46
0000140 2f 03 00 50 4b 03 04 14 00 00 08 08 00 74 aa 7b
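
The dump also shows why the MIME string has a known length even though it is not null-terminated: the local file header records the member's size (14 00 00 00, i.e. 20 bytes) before the name. A minimal Python sketch of reading the string bounded by that field follows. It assumes the standard 30-byte ZIP local file header layout, and first_member_mimetype and SAMPLE are hypothetical names; magic(5) rules themselves cannot be written this way.

```python
import struct

# ZIP local file header: signature, version, flags, method, time, date,
# crc32, compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")  # 30 bytes, little-endian

def first_member_mimetype(data: bytes):
    """Return the stored contents of a leading 'mimetype' member, or None."""
    if len(data) < LOCAL_HEADER.size:
        return None
    (sig, _version, flags, method, _mtime, _mdate, _crc,
     csize, _usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(data)
    if sig != b"PK\x03\x04":
        return None
    # Sizes are only usable when the member is stored (method 0) and the
    # "data descriptor" flag (bit 3) is not set.
    if method != 0 or flags & 0x0008:
        return None
    name_start = LOCAL_HEADER.size
    if data[name_start:name_start + name_len] != b"mimetype":
        return None
    body_start = name_start + name_len + extra_len
    body = data[body_start:body_start + csize]
    if len(body) != csize:
        return None  # input truncated before the end of the member
    return body.decode("ascii", errors="replace")

# First 64 bytes of the od -tx1 dump above.
SAMPLE = bytes.fromhex(
    "50 4b 03 04 14 00 00 08 00 00 74 aa 7b 3e 6f 61 "
    "ab 2c 14 00 00 00 14 00 00 00 08 00 00 00 6d 69 "
    "6d 65 74 79 70 65 61 70 70 6c 69 63 61 74 69 6f "
    "6e 2f 65 70 75 62 2b 7a 69 70 50 4b 03 04 14 00"
)

if __name__ == "__main__":
    print(first_member_mimetype(SAMPLE))
```

With the bytes from the dump, the name starts at offset 30 and the contents at offset 38, matching the offsets used in the patches above.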

Comment 1 Jan Kaluža 2011-03-28 09:48:25 UTC
I've just checked the upstream repository (https://github.com/glensc/file) and support for both is already there. They should be matched by these patterns:

#  Catch other ZIP-with-mimetype formats
#	In a ZIP file, the bytes immediately after a member's contents are
#	always "PK". The 2 regex rules here print the "mimetype" member's
#	contents up to the first 'P'. Luckily, most MIME types don't contain
#	any capital 'P's. This is a kludge.
#    (mimetype contains "application/<OTHER>")
>>50		string	!epub+zip
>>>50		string	!vnd.oasis.opendocument.
>>>>50		string	!vnd.sun.xml.
>>>>>50		string	!vnd.kde.
>>>>>>38	regex	[!-OQ-~]+		Zip data (MIME type "%s"?)
!:mime	application/zip
#    (mimetype contents other than "application/*")
>26		string	\x8\0\0\0mimetype
>>38		string	!application/
>>>38		regex	[!-OQ-~]+		Zip data (MIME type "%s"?)
!:mime	application/zip
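
The effect of the [!-OQ-~]+ kludge can be tried out with a short Python sketch. It assumes Python's re module treats this character class the same way as the regex engine used by file(1), and it matches only at the start of the bytes found at offset 38. The name mime_prefix and the second input are hypothetical.

```python
import re

# Printable ASCII from '!' to '~', excluding the capital 'P'.
KLUDGE = re.compile(rb"[!-OQ-~]+")

def mime_prefix(data_from_offset_38: bytes):
    """Return the leading run of non-'P' printable bytes, or None."""
    match = KLUDGE.match(data_from_offset_38)
    return match.group() if match else None

if __name__ == "__main__":
    # The MIME string is followed directly by the next "PK" header,
    # so the match stops exactly at the end of the MIME type.
    print(mime_prefix(b"application/epub+zipPK\x03\x04\x14"))
    # A hypothetical MIME type containing a capital 'P' is cut short,
    # which is the limitation the upstream comment calls a kludge.
    print(mime_prefix(b"application/x-PythonPK\x03\x04"))
```

The first call yields the full MIME type, while the second stops at "application/x-", which explains the question mark in the upstream message text.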

Comment 2 Jan "Yenya" Kasprzak 2011-03-28 10:06:43 UTC
Excellent. Just make sure the Fedora package gets updated to the upstream version eventually :-).

Thanks!

Comment 3 Jan Kaluža 2011-05-11 06:57:38 UTC
I updated rawhide to file 5.07, which fixes your issue.

Comment 4 Fedora Update System 2011-05-11 08:54:15 UTC
file-5.07-1.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/file-5.07-1.fc15

Comment 5 Fedora Update System 2011-05-14 03:08:19 UTC
Package file-5.07-1.fc15:
* should fix your issue,
* was pushed to the Fedora 15 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing file-5.07-1.fc15'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/file-5.07-1.fc15
then log in and leave karma (feedback).

Comment 6 Fedora Update System 2011-05-23 11:11:32 UTC
file-5.07-2.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/file-5.07-2.fc15

Comment 7 Fedora Update System 2011-05-25 02:30:12 UTC
file-5.07-2.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.

