Description of problem:
The file command identifies any text file starting with "Má" (ISO-8859-1) as an MPEG-4 LO-EP audio stream. Since that character combination is fairly common in several languages, this can cause serious problems.

Version-Release number of selected component (if applicable):
file-4.17-9

How reproducible:
Attached file

Steps to Reproduce:
1. $ file magic-bug-sample.txt

Actual results:
magic-bug-sample.txt: MPEG-4 LO-EP audio stream

Expected results:
magic-bug-sample.txt: ISO-8859 text

Additional info:
Created attachment 159888: Sample file to reproduce the bug
You're right. I think we should remove support for the MPEG-4 LO-EP format instead: it is very rare, and if it really is identified by its first two bytes alone, it makes more sense to report the text files correctly.
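To illustrate why a two-byte signature collides with ordinary text, here is a minimal Python sketch. In ISO-8859-1, "M" is byte 0x4D and "á" is byte 0xE1. The value 0x4DE1 used as the LO-EP signature below is an assumption based on the "first two bytes" remark in this thread, not a quote from the shipped magic file, and the helper names are hypothetical.

```python
import struct

# Assumption: the LO-EP rule compares the first two bytes, read as a
# big-endian 16-bit integer, against 0x4DE1. The thread only says the
# format is identified by its first two bytes.
ASSUMED_LO_EP_MAGIC = 0x4DE1


def first_two_bytes_be(data: bytes) -> int:
    """Return the first two bytes of data as a big-endian unsigned short."""
    (value,) = struct.unpack(">H", data[:2])
    return value


def looks_like_lo_ep(data: bytes) -> bool:
    """Apply the assumed two-byte test; shorter inputs never match."""
    return len(data) >= 2 and first_two_bytes_be(data) == ASSUMED_LO_EP_MAGIC


# A text file beginning with "Má", encoded as ISO-8859-1.
sample = "Más información".encode("iso-8859-1")
print(sample[:2].hex())          # 4de1
print(looks_like_lo_ep(sample))  # True
```

Under the same assumption, the UTF-8 encoding of the same text would not match, because "á" becomes the two bytes 0xC3 0xA1 there.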
That is pretty much what I have been doing on my server: removing support for that format and recompiling the mgc file. Interestingly, this only happens with the regular (non-MIME) file detection. The MIME type detection identifies the file correctly, which points to a further problem: the regular and MIME type magic files are inconsistent with each other.
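The workaround described above could be sketched roughly as follows in Python. This is not the script actually used on that server. The function name and the example entries are hypothetical, and the magic lines are illustrative rather than copied from the shipped magic file. The sketch drops a top-level magic entry whose line contains a given description, along with its continuation lines.

```python
def drop_magic_entry(magic_source: str, description: str) -> str:
    """Remove top-level magic entries whose line contains description,
    together with their continuation lines ('>' or '!:' prefixed)."""
    kept = []
    skipping = False
    for line in magic_source.splitlines(keepends=True):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            # Blank lines and comments are always kept.
            kept.append(line)
            continue
        if not line.startswith((">", "!:")):
            # A new top-level entry starts here: decide whether to drop it.
            skipping = description in line
        if not skipping:
            kept.append(line)
    return "".join(kept)


# Illustrative entries only (assumed layout: offset, type, test, message).
EXAMPLE = (
    "0\tbeshort&0xFFE0\t0x56E0\tMPEG-4 LOAS\n"
    ">3\tbyte&0xE0\t0x40\n"
    "0\tbeshort\t0x4DE1\tMPEG-4 LO-EP audio stream\n"
    "0\tstring\tMThd\tStandard MIDI data\n"
)

filtered = drop_magic_entry(EXAMPLE, "MPEG-4 LO-EP")
print(filtered)
```

The filtered source would then need to be recompiled into an mgc file, for example with the compile option of file (file -C -m followed by the magic source path).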
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2009-0208.html