Bug 849621 - file is coming back with 'LaTeX document text' instead of 'XML document text'
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: file
Hardware: x86_64 Linux
Priority: unspecified Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Jan Kaluža
QA Contact: Stanislav Zidek
Depends On:
Blocks: 849641
Reported: 2012-08-20 07:27 EDT by james.hn.sears
Modified: 2014-10-14 04:29 EDT (History)

See Also:
Fixed In Version: file-5.04-16.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 849641
Last Closed: 2014-10-14 04:29:06 EDT
Type: Bug

Attachments
'LaTeX' example (219 bytes, text/xml)
2012-08-20 07:27 EDT, james.hn.sears
proposed patch (1.19 KB, patch)
2012-08-20 08:33 EDT, Jan Kaluža
a.xml (227 bytes, application/octet-stream)
2012-08-20 11:22 EDT, james.hn.sears

Description james.hn.sears 2012-08-20 07:27:17 EDT
Created attachment 605678
'LaTeX' example

I've noticed that file may be reporting the wrong file type: I was expecting 'XML document text' but instead get 'LaTeX document text'.

Various Ubuntu releases report 'XML document text'; why doesn't RHEL 6.3?

I've attached an example that produces the 'wrong' response from file.
Comment 2 Jan Kaluža 2012-08-20 08:32:30 EDT
I think this is not valid XML:

<?version xml="1.0" encoding="UTF-8"?>

The proper XML declaration is:

<?xml version="1.0" encoding="UTF-8"?>

However, even with the proper header it doesn't work correctly, because it misdetects "\chapter" as a LaTeX command. The attached patch against file-5.11 fixes that.
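
The difference between the two headers quoted in this comment can be checked mechanically. Below is a minimal Python sketch, not taken from file(1) or its magic database: the helper name `has_xml_declaration` and the regular expression are illustrative assumptions, and the pattern is a simplified reading of the XMLDecl production in the XML 1.0 specification. It only tests whether the text starts with an XML declaration; it does not validate the document.

```python
import re

# Simplified form of the XML 1.0 "XMLDecl" production (an approximation
# written for illustration, not the pattern file(1) uses):
#   <?xml version="1.x" [encoding="..."] [standalone="yes|no"] ?>
XML_DECL = re.compile(
    r'''^<\?xml\s+version\s*=\s*(["'])1\.[0-9]+\1'''
    r'''(\s+encoding\s*=\s*(["'])[A-Za-z][A-Za-z0-9._-]*\3)?'''
    r'''(\s+standalone\s*=\s*(["'])(yes|no)\5)?\s*\?>'''
)


def has_xml_declaration(text: str) -> bool:
    """Return True if text begins with something shaped like an XML declaration."""
    return XML_DECL.match(text) is not None


if __name__ == "__main__":
    for header in (
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<?version xml="1.0" encoding="UTF-8"?>',
    ):
        print(header, "->", has_xml_declaration(header))
```

The second header fails the check because the target after `<?` is `version` rather than `xml`.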
Comment 3 Jan Kaluža 2012-08-20 08:33:01 EDT
Created attachment 605696
proposed patch
Comment 4 james.hn.sears 2012-08-20 10:07:52 EDT
Hi Jan - thank you for getting back to me so promptly.

I went into /usr/share/misc/magic, manually applied the patch, and rebuilt a .mgc file via: file -C -m 

Before doing this I ran file against the files that trigger this LaTeX problem and got 21 misdetections.

After manually applying the patch I got 1 misdetection.

I think the proposed patch needs a bit more work.

- apologies for the typo in the first attachment on the metadataInfo element.
- my kernel version is 2.6.32-279.5.1.el6.x86_64 if that helps.
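
The before/after count described in this comment could be scripted along the following lines. This is a hedged Python sketch, not the reporter's actual procedure: the function names `describe` and `misdetected` are hypothetical, and it assumes the `file` binary is on the PATH. It relies only on the `-b` (brief output) and `-m` (alternative magic file) options of file(1).

```python
import subprocess
from typing import Callable, Iterable, List, Optional


def describe(path: str, magic: Optional[str] = None) -> str:
    """Return the description file(1) prints for path, optionally using another magic file."""
    cmd = ["file", "-b"]
    if magic:
        cmd += ["-m", magic]  # e.g. a patched magic database
    cmd.append(path)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def misdetected(
    paths: Iterable[str],
    classify: Callable[[str], str] = describe,
    unwanted: str = "LaTeX",
) -> List[str]:
    """Return the paths whose description contains the unwanted label."""
    return [p for p in paths if unwanted in classify(p)]
```

Running `misdetected` once with the default database and once with `classify=lambda p: describe(p, magic=...)` pointing at a patched magic file gives the two counts to compare.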
Comment 5 Jan Kaluža 2012-08-20 10:20:05 EDT
Can you attach the file for which it's still broken?
Comment 6 james.hn.sears 2012-08-20 11:22:18 EDT
Created attachment 605721
a.xml

Hi Jan - I can't provide any of the original files; however, using vi I've stripped out our proprietary information while keeping what causes 'file' to give the 'wrong' answer.

Put another way, when I run 'file' on the attached a.xml I get 'LaTeX' whether I use the default magic database or the one I patched (by following your instructions).

If it helps I can, relatively easily, regression test another patch of yours.
Comment 7 Jan Kaluža 2012-08-21 01:47:33 EDT
The last attachment still contains "<?version xml="1.0" encoding="UTF-8"?>". This is not valid XML. If I change it to "<?xml version="1.0" encoding="UTF-8"?>", the patched file(1) detects the file correctly.
Comment 8 james.hn.sears 2012-08-21 04:34:57 EDT
Hi Jan - after doing some more research I agree with you. The a.xml file is not valid when compared against http://www.w3.org/TR/REC-xml/#sec-prolog-dtd

Hence the patch, at least from my perspective, works.

Thank you for your help.

Any estimate as to when the patch will be released?
Comment 11 Jindrich Novy 2012-09-08 06:21:09 EDT
Not sure how this bug is related to latex2html. Reassigning back to file(1).
Comment 13 james.hn.sears 2013-01-14 10:16:19 EST
Hi - for the record, we've got a similar magic-database-related defect at: https://bugzilla.redhat.com/show_bug.cgi?id=873997

There was a case for this - https://access.redhat.com/support/cases/00742615 - but at the moment we haven't upgraded our support package away from 'self support'.
Comment 23 errata-xmlrpc 2014-10-14 04:29:06 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.