Bug 733229
Summary: | specific python script looks like PASCAL program | |
---|---|---|---
Product: | Red Hat Enterprise Linux 6 | Reporter: | Petr Sklenar <psklenar>
Component: | file | Assignee: | Jan Kaluža <jkaluza>
Status: | CLOSED ERRATA | QA Contact: | BaseOS QE Security Team <qe-baseos-security>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 6.1 | CC: | azelinka, dapospis, ksrot, mfojtik, ovasik
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | file-5.04-12.el6 | Doc Type: | Bug Fix
Doc Text: | Previously, "magic" patterns for Python were insufficient. The file utility was therefore unable to detect a Python script according to the Python function definition. With this update, detection of Python is improved, and Python scripts are properly recognized. | Story Points: | ---
Clone Of: | | |
: | 826900 830808 | Environment: |
Last Closed: | 2012-03-15 08:23:37 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 826900 | |
Attachments: | | |
This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux. If you would like it considered as an exception in the current release, please ask your support representative.

Created attachment 531134
proposed patch

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: File magic patterns for Python were insufficient.

Consequence: File was not able to detect Python script according to the Python function definition.

Fix: New magic pattern has been added to detect Python script according to the Python function definition.

Result: Python detection improved.

Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,7 +1 @@
-Cause: File magic patterns for Python were insufficient.
+Previously, "magic" patterns for Python were insufficient. The file utility was therefore unable to detect a Python script according to the Python function definition. With this update, detection of Python is improved, and Python scripts are properly recognized.
-
-Consequence: File was not able to detect Python script according to the Python function definition.
-
-Fix: New magic pattern has been added to detect Python script according to the Python function definition.
-
-Result: Python detection improved.

Created attachment 569432
sample files
I am afraid that the fix has introduced some regressions in text file detection.
Although the old file package did not identify all of these files correctly, the new version does a worse job with them.
[ksrot@dhcp-30-102 samples]$ rpm -q file
file-5.04-13.el6.x86_64
[ksrot@dhcp-30-102 samples]$ file *
API_CHANGES.txt: Python script text executable
artist-tmpl.html: Python script text executable
capi.txt: Python script text executable
dstat-paper.txt: Python script text executable
extend.txt: Python script text executable
FAQ.txt: Python script text executable
index.txt: Python script text executable
INTERACTIVE: Python script text executable
lxml-ep2008.txt: Python script text executable
lxmlhtml.txt: Python script text executable
PKG-INFO: Python script text executable
pkg_resources.txt: Python script text executable
programmers-guide: Python script text executable
README.txt: Python script text executable
syntax.html: Python script text executable
tutorial.txt: Python script text executable
[ksrot@dhcp-30-102 samples]$ rpm -q file
file-5.04-11.el6.x86_64
[ksrot@dhcp-30-102 samples]$ file *
API_CHANGES.txt: ASCII English text
artist-tmpl.html: ASCII Java program text
capi.txt: FORTRAN program
dstat-paper.txt: ASCII English text
extend.txt: ASCII English text
FAQ.txt: ISO-8859 English text
index.txt: ASCII English text
INTERACTIVE: ASCII English text
lxml-ep2008.txt: UTF-8 Unicode English text
lxmlhtml.txt: ASCII English text
PKG-INFO: ASCII C++ program text
pkg_resources.txt: ASCII English text
programmers-guide: ASCII English text
README.txt: ASCII English text
syntax.html: ASCII English text, with very long lines
tutorial.txt: ASCII English text

OK, to give the previous comment some context: on my filesystem the new file version fixed recognition for ~150 Python files and introduced a "regression" for 27 files. Most of those text files actually contain pieces of Python code, so the new recognition is not that odd. It might be more obvious from the full paths:

/usr/lib64/python2.6/idlelib/extend.txt:
/usr/lib64/rhythmbox/plugins/context/tmpl/artist-tmpl.html:
/usr/lib/mailman/scripts/driver:
/usr/lib/python2.6/site-packages/DecoratorTools-1.7-py2.6.egg-info/PKG-INFO:
/usr/share/doc/dbus-python-0.83.0/API_CHANGES.txt:
/usr/share/doc/dbus-python-0.83.0/tutorial.txt:
/usr/share/doc/dstat-0.7.0/dstat-paper.txt:
/usr/share/doc/numpy-1.3.0/docs-f2py/FAQ.txt:
/usr/share/doc/pykickstart-1.74.6/programmers-guide:
/usr/share/doc/python-lxml-2.2.3/doc/capi.txt:
/usr/share/doc/python-lxml-2.2.3/doc/FAQ.txt:
/usr/share/doc/python-lxml-2.2.3/doc/lxmlhtml.txt:
/usr/share/doc/python-lxml-2.2.3/doc/performance.txt:
/usr/share/doc/python-lxml-2.2.3/doc/s5/lxml-ep2008.txt:
/usr/share/doc/python-mako-0.3.4/doc/build/content/filtering.txt:
/usr/share/doc/python-mako-0.3.4/doc/build/content/namespaces.txt:
/usr/share/doc/python-mako-0.3.4/doc/build/content/syntax.txt:
/usr/share/doc/python-mako-0.3.4/doc/build/output/syntax.html:
/usr/share/doc/python-mako-0.3.4/doc/build/templates/formatting.html:
/usr/share/doc/python-matplotlib-0.99.1.2/INTERACTIVE:
/usr/share/doc/python-nose-0.10.4/README.txt:
/usr/share/doc/python-setuptools-0.6.10/docs/build/html/_sources/pkg_resources.txt:
/usr/share/doc/python-setuptools-0.6.10/docs/build/html/_sources/setuptools.txt:
/usr/share/doc/python-setuptools-0.6.10/docs/pkg_resources.txt:
/usr/share/doc/python-setuptools-0.6.10/docs/setuptools.txt:
/usr/share/doc/python-simplejson-2.0.9/docs/_sources/index.txt:
/usr/src/debug/xulrunner-1.9.2.17/mozilla-1.9.2/js/src/prmjtime.cpp:

I believe it makes sense to proceed with the fix.
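
The before/after comparison described in this comment can be sketched in a few lines of Python. This is only an illustration, not the tooling used for this bug: the helper names `parse_file_output` and `changed_to_python` are hypothetical, and the sample input is a three-line excerpt of the two `file *` listings quoted earlier in this report.

```python
def parse_file_output(text):
    """Map each file name to the type that `file` printed for it."""
    results = {}
    for line in text.splitlines():
        # `file` prints "name: description"; split on the first ": ".
        name, sep, kind = line.partition(": ")
        if sep:
            results[name.strip()] = kind.strip()
    return results


def changed_to_python(before, after):
    """Return names that are reported as Python only in the newer run."""
    marker = "Python script"
    return sorted(
        name
        for name, kind in after.items()
        if marker in kind and marker not in before.get(name, "")
    )


# Excerpt of the output from file-5.04-11.el6 (old) and file-5.04-13.el6 (new).
old_run = """\
API_CHANGES.txt: ASCII English text
capi.txt: FORTRAN program
PKG-INFO: ASCII C++ program text
"""
new_run = """\
API_CHANGES.txt: Python script text executable
capi.txt: Python script text executable
PKG-INFO: Python script text executable
"""

changed = changed_to_python(parse_file_output(old_run), parse_file_output(new_run))
print(changed)
```

The script only lists files whose reported type changed to Python. Deciding whether each change is a fix or a regression still requires looking at the file itself, as was done for the 27 paths above.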

The file tool cannot be 100% accurate, especially when detecting source code. I admit this patch brings some regressions, but it still improves detection of Python scripts a lot. Given the way file works, it is not possible to do the kind of in-depth detection that would be beneficial in cases like this one.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0391.html

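
The trade-off discussed in this thread can be illustrated with a small Python sketch. It assumes a simplified stand-in for a "Python function definition" pattern; the regular expression `DEF_LINE` and the function `looks_like_python` are hypothetical and are not the rule shipped in the file package. The point is only that a line-level pattern cannot tell a script from documentation that quotes Python code.

```python
import re

# Hypothetical stand-in for a "function definition" pattern.
# This is NOT the magic rule shipped in the file package.
DEF_LINE = re.compile(
    r"^[ \t]*def[ \t]+[A-Za-z_][A-Za-z0-9_]*[ \t]*\(.*\)[ \t]*:",
    re.MULTILINE,
)


def looks_like_python(text):
    """Return True if any line resembles a Python function definition."""
    return DEF_LINE.search(text) is not None


# A real script, a tutorial that quotes Python code, and plain prose.
script = "import gtk\n\ndef wrap(label):\n    return label\n"
tutorial = (
    "Exporting methods\n"
    "=================\n"
    "Define a method like this:\n\n"
    "    def hello(self):\n"
    "        return 'Hello'\n"
)
plain = "This document defines the release process.\n"

print(looks_like_python(script))    # True
print(looks_like_python(tutorial))  # True: documentation with embedded code
print(looks_like_python(plain))     # False
```

Under this assumption the tutorial text is flagged exactly like the script, which mirrors why files such as tutorial.txt and capi.txt are reported as Python after the update.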
Created attachment 519784
python script which looks like Pascal

Description of problem:
python script looks like PASCAL program

Version-Release number of selected component (if applicable):
file-5.04-9.el6.s390x

How reproducible:
deterministic

Steps to Reproduce:
RHEL5:
# file gtk_label_autowrap.py
gtk_label_autowrap.py: ASCII English text
# rpm -q file
file-4.17-15.el5_3.1.x86_64

RHEL6:
# file gtk_label_autowrap.py
gtk_label_autowrap.py: ASCII Pascal program text
# rpm -q file
file-5.04-9.el6.s390x

Actual results:
ASCII Pascal program

Expected results:
python script text executable

Additional info:
# rpm -qf /usr/share/system-config-printer/gtk_label_autowrap.py
system-config-printer-1.1.16-22.el6.s390x