Bug 849621
Summary: | file is coming back with 'LaTeX document text' instead of 'XML document text' | |
---|---|---|---
Product: | Red Hat Enterprise Linux 6 | Reporter: | james.hn.sears
Component: | file | Assignee: | Jan Kaluža <jkaluza>
Status: | CLOSED ERRATA | QA Contact: | Stanislav Zidek <szidek>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 6.3 | CC: | ksrot, szidek
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | file-5.04-16.el6 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | |
: | 849641 (view as bug list) | Environment: |
Last Closed: | 2014-10-14 08:29:06 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 849641 | |
Attachments: | | |
I think this is not valid XML: <?version xml="1.0" encoding="UTF-8"?> The proper XML declaration is: <?xml version="1.0" encoding="UTF-8"?>

However, even with the proper header, detection doesn't work correctly, because file misdetects "\chapter" as a LaTeX command. The attached patch against file-5.11 fixes that.

Created attachment 605696 [details]
proposed patch
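
The difference between the two headers quoted above can be illustrated with a small check. This is only a sketch in Python: the regular expression is a simplified, hypothetical stand-in for a prefix test on the XML declaration, not the rule shipped in the magic database or the content of the proposed patch, and the sample documents are invented for illustration.

```python
import re

# Simplified, hypothetical prefix test for an XML declaration.
# This is NOT the actual magic rule used by file(1).
XML_DECL = re.compile(rb"""<\?xml\s+version\s*=\s*["']1\.[0-9]+["']""")


def has_xml_declaration(data: bytes) -> bool:
    """Return True if data opens with a declaration like <?xml version="1.0"?>."""
    # re.match anchors at the start of the data, where the declaration must sit.
    return XML_DECL.match(data) is not None


# Invented sample documents; both contain "\chapter" in the body.
good = b'<?xml version="1.0" encoding="UTF-8"?>\n<doc>\\chapter</doc>'
bad = b'<?version xml="1.0" encoding="UTF-8"?>\n<doc>\\chapter</doc>'

print(has_xml_declaration(good))  # True
print(has_xml_declaration(bad))   # False
```

The second header reads as a processing instruction named "version" rather than an XML declaration, so a prefix test of this kind has nothing to match and a later text rule (here, one keyed on "\chapter") can win instead.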
Hi Jan - thank you for getting back to me so promptly. I went into /usr/share/misc/magic, manually patched it, and rebuilt a .mgc file via: file -C -m

Before I did this I ran file against the files that are producing this LaTeX problem and got 21 misdetections. After I manually applied the patch I got 1 misdetection, so I think the proposed patch needs a bit more work.

PS - apologies for the typo in the first attachment on the metadataInfo element. My kernel version is 2.6.32-279.5.1.el6.x86_64, if that helps.

Can you attach the file for which it's still broken?

Created attachment 605721 [details]
a.xml
Hi Jan - I can't provide you with any original file; however, by using vi I've managed to strip away our proprietary information while keeping what causes 'file' to come out with the 'wrong' answer.
Put another way, when I run 'file' on the attached a.xml I get 'LaTeX' whether I use the default magic database or the one I patched (by following your instructions).
If it helps I can, relatively easily, regression test another patch of yours.
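
A regression check of the kind offered above could be scripted roughly as follows. This is a sketch only: the helper names are hypothetical, it assumes file(1) is on the PATH, and it relies only on the standard `-b` (brief output) and `-m` (alternate magic file) options.

```python
import subprocess
from typing import Callable, Iterable, List, Optional


def describe_with_file(path: str, magic: Optional[str] = None) -> str:
    """Return the brief description that file(1) prints for path."""
    cmd = ["file", "-b"]          # -b: do not prepend the file name
    if magic is not None:
        cmd += ["-m", magic]      # -m: use an alternate magic file
    cmd.append(path)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def find_misdetections(
    paths: Iterable[str],
    describe: Callable[[str], str] = describe_with_file,
    unwanted: str = "LaTeX",
) -> List[str]:
    """Return the paths whose description contains the unwanted label."""
    return [p for p in paths if unwanted in describe(p)]
```

Running `find_misdetections` once with the default database and once with `describe=lambda p: describe_with_file(p, magic=...)` pointing at a patched magic file gives the before and after counts to compare.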
In the last attachment there's still "<?version xml="1.0" encoding="UTF-8"?>". This is not valid XML. If I change it to "<?xml version="1.0" encoding="UTF-8"?>", the patched file is able to detect that file correctly.

Hi Jan - after doing some more research I agree with you. The a.xml file is not valid when compared against http://www.w3.org/TR/REC-xml/#sec-prolog-dtd Hence the patch, at least from my perspective, works. Thank you for your help. Any estimate as to when the patch will be released?

Not sure how this bug is related to latex2html. Reassigning back to file(1).

Hi - for the sake of history, I just want to record the fact that we've got a similar (magic database) related defect at: https://bugzilla.redhat.com/show_bug.cgi?id=873997 There was a case for this - https://access.redhat.com/support/cases/00742615 - but at the moment we haven't upgraded our support package away from 'self support'.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2014-1606.html
Created attachment 605678 [details]
'LaTeX' example

I've noticed that file is, possibly, coming back with the wrong MIME type: I was expecting 'XML document text' but instead get 'LaTeX document text'. Ubuntu, in various releases, comes back with 'XML document text'; why doesn't 6.3? I've attached an example that produces the 'wrong' response from file.