Created attachment 519784 [details]
python script which looks like Pascal

Description of problem:
python script looks like PASCAL program

Version-Release number of selected component (if applicable):
file-5.04-9.el6.s390x

How reproducible:
deterministic

Steps to Reproduce:
RHEL5:
# file gtk_label_autowrap.py
gtk_label_autowrap.py: ASCII English text
# rpm -q file
file-4.17-15.el5_3.1.x86_64

RHEL6:
# file gtk_label_autowrap.py
gtk_label_autowrap.py: ASCII Pascal program text
# rpm -q file
file-5.04-9.el6.s390x

Actual results:
ASCII Pascal program

Expected results:
python script text executable

Additional info:
# rpm -qf /usr/share/system-config-printer/gtk_label_autowrap.py
system-config-printer-1.1.16-22.el6.s390x
This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux. If you would like it considered as an exception in the current release, please ask your support representative.
Created attachment 531134 [details] proposed patch
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: File magic patterns for Python were insufficient.

Consequence: File was not able to detect Python script according to the Python function definition.

Fix: New magic pattern has been added to detect Python script according to the Python function definition.

Result: Python detection improved.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,7 +1 @@
-Cause: File magic patterns for Python were insufficient.
+Previously, "magic" patterns for Python were insufficient. The file utility was therefore unable to detect a Python script according to the Python function definition. With this update, detection of Python is improved, and Python scripts are properly recognized.
-
-Consequence: File was not able to detect Python script according to the Python function definition.
-
-Fix: New magic pattern has been added to detect Python script according to the Python function definition.
-
-Result: Python detection improved.
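To make "detect a Python script according to the Python function definition" concrete, here is a minimal sketch of that kind of check. It is not the magic entry from the proposed patch, which this report does not show; the regular expression, the `looks_like_python` helper, the 4096-character window, and both sample strings are assumptions made for illustration.

```python
import re

# Hypothetical stand-in for a "function definition" pattern. The real entry
# added to file's magic database is not shown in this bug report.
DEF_PATTERN = re.compile(
    r"^[ \t]*def[ \t]+[A-Za-z_]\w*[ \t]*\(.*\)[ \t]*:",
    re.MULTILINE,
)


def looks_like_python(text, window=4096):
    """Return True if a 'def name(...):' line occurs in the first `window` characters."""
    return DEF_PATTERN.search(text[:window]) is not None


# Illustrative inputs only; neither is the attached gtk_label_autowrap.py.
script = "import gtk\n\ndef autowrap(label):\n    label.set_line_wrap(True)\n"
pascal = "program Hello;\nbegin\n  writeln('hi');\nend.\n"

print(looks_like_python(script))
print(looks_like_python(pascal))
```

A check of this shape only asks whether one matching line exists somewhere in the inspected window, so any file containing such a line passes, regardless of what the rest of the file is.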
Created attachment 569432 [details]
sample files

I am afraid that the fix has introduced some regressions in text file detection. Although the old file package did not identify all of these files properly, the new version does a worse job with them.

[ksrot@dhcp-30-102 samples]$ rpm -q file
file-5.04-13.el6.x86_64
[ksrot@dhcp-30-102 samples]$ file *
API_CHANGES.txt: Python script text executable
artist-tmpl.html: Python script text executable
capi.txt: Python script text executable
dstat-paper.txt: Python script text executable
extend.txt: Python script text executable
FAQ.txt: Python script text executable
index.txt: Python script text executable
INTERACTIVE: Python script text executable
lxml-ep2008.txt: Python script text executable
lxmlhtml.txt: Python script text executable
PKG-INFO: Python script text executable
pkg_resources.txt: Python script text executable
programmers-guide: Python script text executable
README.txt: Python script text executable
syntax.html: Python script text executable
tutorial.txt: Python script text executable

[ksrot@dhcp-30-102 samples]$ rpm -q file
file-5.04-11.el6.x86_64
[ksrot@dhcp-30-102 samples]$ file *
API_CHANGES.txt: ASCII English text
artist-tmpl.html: ASCII Java program text
capi.txt: FORTRAN program
dstat-paper.txt: ASCII English text
extend.txt: ASCII English text
FAQ.txt: ISO-8859 English text
index.txt: ASCII English text
INTERACTIVE: ASCII English text
lxml-ep2008.txt: UTF-8 Unicode English text
lxmlhtml.txt: ASCII English text
PKG-INFO: ASCII C++ program text
pkg_resources.txt: ASCII English text
programmers-guide: ASCII English text
README.txt: ASCII English text
syntax.html: ASCII English text, with very long lines
tutorial.txt: ASCII English text
OK, to give the previous comment some context: on my filesystem, the new file version fixed recognition for ~150 Python files and introduced a "regression" for 27 files. Most of those text files actually contain pieces of Python code, so the new recognition is not that odd. It might be more obvious from the full paths:

/usr/lib64/python2.6/idlelib/extend.txt:
/usr/lib64/rhythmbox/plugins/context/tmpl/artist-tmpl.html:
/usr/lib/mailman/scripts/driver:
/usr/lib/python2.6/site-packages/DecoratorTools-1.7-py2.6.egg-info/PKG-INFO:
/usr/share/doc/dbus-python-0.83.0/API_CHANGES.txt:
/usr/share/doc/dbus-python-0.83.0/tutorial.txt:
/usr/share/doc/dstat-0.7.0/dstat-paper.txt:
/usr/share/doc/numpy-1.3.0/docs-f2py/FAQ.txt:
/usr/share/doc/pykickstart-1.74.6/programmers-guide:
/usr/share/doc/python-lxml-2.2.3/doc/capi.txt:
/usr/share/doc/python-lxml-2.2.3/doc/FAQ.txt:
/usr/share/doc/python-lxml-2.2.3/doc/lxmlhtml.txt:
/usr/share/doc/python-lxml-2.2.3/doc/performance.txt:
/usr/share/doc/python-lxml-2.2.3/doc/s5/lxml-ep2008.txt:
/usr/share/doc/python-mako-0.3.4/doc/build/content/filtering.txt:
/usr/share/doc/python-mako-0.3.4/doc/build/content/namespaces.txt:
/usr/share/doc/python-mako-0.3.4/doc/build/content/syntax.txt:
/usr/share/doc/python-mako-0.3.4/doc/build/output/syntax.html:
/usr/share/doc/python-mako-0.3.4/doc/build/templates/formatting.html:
/usr/share/doc/python-matplotlib-0.99.1.2/INTERACTIVE:
/usr/share/doc/python-nose-0.10.4/README.txt:
/usr/share/doc/python-setuptools-0.6.10/docs/build/html/_sources/pkg_resources.txt:
/usr/share/doc/python-setuptools-0.6.10/docs/build/html/_sources/setuptools.txt:
/usr/share/doc/python-setuptools-0.6.10/docs/pkg_resources.txt:
/usr/share/doc/python-setuptools-0.6.10/docs/setuptools.txt:
/usr/share/doc/python-simplejson-2.0.9/docs/_sources/index.txt:
/usr/src/debug/xulrunner-1.9.2.17/mozilla-1.9.2/js/src/prmjtime.cpp:

I believe it makes sense to proceed with the fix.
The File tool cannot be 100% accurate, especially when it comes to detecting source code. I admit this patch brings some regressions, but it still improves detection of Python scripts a lot. Given the way File works, it is not possible to do the kind of in-depth detection that would be beneficial in cases like this one.
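The trade-off discussed in the comments above can be sketched in a few lines. This is only an illustration under stated assumptions: `shallow_match` stands in for a single-pattern check, while `mostly_code` is a hypothetical "deeper" heuristic that File does not implement. The regular expressions, the 0.5 threshold, and both sample texts are invented for the example.

```python
import re

# Hypothetical single-line pattern, standing in for a shallow magic-style test.
DEF_LINE = re.compile(r"^[ \t]*def[ \t]+[A-Za-z_]\w*[ \t]*\(.*\)[ \t]*:")
# Hypothetical notion of a "code-like" line, used only by the deeper heuristic.
CODE_LINE = re.compile(r"^[ \t]*(def|class|import|return)\b")


def shallow_match(text):
    """True if any single line looks like a Python function definition."""
    return any(DEF_LINE.match(line) for line in text.splitlines())


def mostly_code(text, threshold=0.5):
    """True if at least `threshold` of the non-blank lines look code-like."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return False
    code_lines = sum(1 for line in lines if CODE_LINE.match(line))
    return code_lines / len(lines) >= threshold


# Documentation that merely quotes a Python snippet (invented sample).
doc = (
    "Tutorial\n"
    "========\n"
    "\n"
    "To wrap a label, define a helper such as:\n"
    "\n"
    "    def autowrap(label):\n"
    "        label.set_line_wrap(True)\n"
    "\n"
    "Call it once for every label in the dialog.\n"
    "The helper has no return value.\n"
)
# An actual script (invented sample).
script = (
    "import gtk\n"
    "\n"
    "def autowrap(label):\n"
    "    label.set_line_wrap(True)\n"
    "    return label\n"
)

print(shallow_match(doc), mostly_code(doc))
print(shallow_match(script), mostly_code(script))
```

The shallow check fires on both inputs, which mirrors how documentation with embedded snippets ends up reported as a Python script. The density heuristic separates these two samples, but it needs a full pass over the text and a tuned threshold, and it would misjudge other inputs in its own ways.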
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-0391.html