Bug 733229 - specific python script looks like PASCAL program
Summary: specific python script looks like PASCAL program
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: file
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Jan Kaluža
QA Contact: BaseOS QE Security Team
URL:
Whiteboard:
Depends On:
Blocks: 826900
 
Reported: 2011-08-25 08:18 UTC by Petr Sklenar ⛄
Modified: 2012-11-05 15:33 UTC
CC List: 5 users

Fixed In Version: file-5.04-12.el6
Doc Type: Bug Fix
Doc Text:
Previously, "magic" patterns for Python were insufficient. The file utility was therefore unable to detect a Python script according to the Python function definition. With this update, detection of Python is improved, and Python scripts are properly recognized.
Clone Of:
Clones: 826900 830808
Environment:
Last Closed: 2012-03-15 08:23:37 UTC


Attachments
python script which looks like Pascal (2.24 KB, text/plain), 2011-08-25 08:18 UTC, Petr Sklenar ⛄
proposed patch (2.15 KB, patch), 2011-11-01 13:40 UTC, Jan Kaluža
sample files (97.90 KB, application/x-gzip), 2012-03-12 15:12 UTC, Karel Srot


Links
System: Red Hat Product Errata
ID: RHBA-2012:0391
Priority: normal
Status: SHIPPED_LIVE
Summary: file bug fix update
Last Updated: 2012-03-15 12:23:01 UTC

Description Petr Sklenar ⛄ 2011-08-25 08:18:09 UTC
Created attachment 519784
python script which looks like Pascal

Description of problem:
The file utility identifies a specific Python script as a Pascal program.

Version-Release number of selected component (if applicable):
file-5.04-9.el6.s390x

How reproducible:
deterministic

Steps to Reproduce:
RHEL5:
# file gtk_label_autowrap.py 
gtk_label_autowrap.py: ASCII English text
# rpm -q file
file-4.17-15.el5_3.1.x86_64

RHEL6:
# file gtk_label_autowrap.py
gtk_label_autowrap.py: ASCII Pascal program text
# rpm -q file
file-5.04-9.el6.s390x


Actual results:
ASCII Pascal program

Expected results:
python script text executable


Additional info:
# rpm -qf /usr/share/system-config-printer/gtk_label_autowrap.py
system-config-printer-1.1.16-22.el6.s390x
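
The reproduction steps above can be scripted. The following is a minimal sketch, not part of the original report: the helper names `describe` and `is_misdetected` are hypothetical, and it assumes a `file` binary on PATH that supports the standard `-b` (brief) option.

```python
import subprocess


def describe(path):
    """Return the one-line description that `file` prints for path.

    Uses the -b (brief) option so the file name is not echoed back.
    """
    result = subprocess.run(
        ["file", "-b", path], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def is_misdetected(path, description):
    """Return True if a .py file was described as something other than Python."""
    return path.endswith(".py") and "python" not in description.lower()
```

For example, `is_misdetected(p, describe(p))` could be evaluated for every `.py` file under a directory to list candidates like the attached script. The check relies only on the `.py` suffix, so scripts without that extension are not covered.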

Comment 1 RHEL Product and Program Management 2011-08-25 08:28:03 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 2 Jan Kaluža 2011-11-01 13:40:27 UTC
Created attachment 531134
proposed patch

Comment 6 Jan Kaluža 2012-02-09 07:53:33 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: File magic patterns for Python were insufficient.

Consequence: File was not able to detect Python script according to the Python function definition.

Fix: New magic pattern has been added to detect Python script according to the Python function definition.

Result: Python detection improved.
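
To illustrate what detection "according to the Python function definition" might mean, here is a rough sketch in Python. It is not the pattern from the attached patch and not libmagic syntax: the regular expression `DEF_LINE`, the helper `has_python_def`, and the 4096-character window are all assumptions made for illustration.

```python
import re

# Hypothetical pattern (not the one from the patch): optional indentation,
# "def", an identifier, a parenthesised argument list, and a colon.
DEF_LINE = re.compile(
    r"^[ \t]*def[ \t]+[A-Za-z_]\w*[ \t]*\([^)]*\)[ \t]*:", re.MULTILINE
)


def has_python_def(text, limit=4096):
    """Return True if the first `limit` characters contain a def line."""
    return DEF_LINE.search(text[:limit]) is not None
```

A check of this kind looks only at a single line. It would therefore also match prose documents that quote an indented `def` snippet.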

Comment 8 Michal Fojtik 2012-02-13 12:54:27 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,7 +1 @@
-Cause: File magic patterns for Python were insufficient.
+Previously, "magic" patterns for Python were insufficient. The file utility was therefore unable to detect a Python script according to the Python function definition. With this update, detection of Python is improved, and Python scripts are properly recognized.
-
-Consequence: File was not able to detect Python script according to the Python function definition.
-
-Fix: New magic pattern has been added to detect Python script according to the Python function definition.
-
-Result: Python detection improved.

Comment 10 Karel Srot 2012-03-12 15:12:16 UTC
Created attachment 569432
sample files

I am afraid that the fix has introduced some regressions in text file detection.
Although the old file package did not identify all of these files correctly, the new version does a worse job with them.

[ksrot@dhcp-30-102 samples]$ rpm -q file
file-5.04-13.el6.x86_64
[ksrot@dhcp-30-102 samples]$ file *
API_CHANGES.txt:   Python script text executable
artist-tmpl.html:  Python script text executable
capi.txt:          Python script text executable
dstat-paper.txt:   Python script text executable
extend.txt:        Python script text executable
FAQ.txt:           Python script text executable
index.txt:         Python script text executable
INTERACTIVE:       Python script text executable
lxml-ep2008.txt:   Python script text executable
lxmlhtml.txt:      Python script text executable
PKG-INFO:          Python script text executable
pkg_resources.txt: Python script text executable
programmers-guide: Python script text executable
README.txt:        Python script text executable
syntax.html:       Python script text executable
tutorial.txt:      Python script text executable


[ksrot@dhcp-30-102 samples]$ rpm -q file
file-5.04-11.el6.x86_64
[ksrot@dhcp-30-102 samples]$ file *
API_CHANGES.txt:   ASCII English text
artist-tmpl.html:  ASCII Java program text
capi.txt:          FORTRAN program
dstat-paper.txt:   ASCII English text
extend.txt:        ASCII English text
FAQ.txt:           ISO-8859 English text
index.txt:         ASCII English text
INTERACTIVE:       ASCII English text
lxml-ep2008.txt:   UTF-8 Unicode English text
lxmlhtml.txt:      ASCII English text
PKG-INFO:          ASCII C++ program text
pkg_resources.txt: ASCII English text
programmers-guide: ASCII English text
README.txt:        ASCII English text
syntax.html:       ASCII English text, with very long lines
tutorial.txt:      ASCII English text
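
A comparison like the one above can be automated. The sketch below is an illustration only: the function names `parse_listing` and `changed` are hypothetical, and it assumes each output line has the form `name: description` with no colon in the file name.

```python
def parse_listing(text):
    """Map file name -> description from lines of `file` output."""
    results = {}
    for line in text.splitlines():
        # Split at the first colon; lines without one are skipped.
        name, sep, description = line.partition(":")
        if sep:
            results[name.strip()] = description.strip()
    return results


def changed(old_text, new_text):
    """Return {name: (old, new)} for names whose description differs."""
    old, new = parse_listing(old_text), parse_listing(new_text)
    return {
        name: (old[name], new[name])
        for name in sorted(old.keys() & new.keys())
        if old[name] != new[name]
    }
```

Feeding it the captured output of `file *` from two package versions would list only the files whose classification changed.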

Comment 11 Karel Srot 2012-03-13 10:16:45 UTC
OK, to give the previous comment some context:
On my filesystem, the new file version fixed recognition of ~150 Python files and introduced a "regression" for 27 files. Most of those text files actually contain pieces of Python code, so the new classification is not that odd. The full paths make this more obvious:

/usr/lib64/python2.6/idlelib/extend.txt:
/usr/lib64/rhythmbox/plugins/context/tmpl/artist-tmpl.html:
/usr/lib/mailman/scripts/driver:
/usr/lib/python2.6/site-packages/DecoratorTools-1.7-py2.6.egg-info/PKG-INFO:
/usr/share/doc/dbus-python-0.83.0/API_CHANGES.txt:
/usr/share/doc/dbus-python-0.83.0/tutorial.txt:
/usr/share/doc/dstat-0.7.0/dstat-paper.txt:
/usr/share/doc/numpy-1.3.0/docs-f2py/FAQ.txt:
/usr/share/doc/pykickstart-1.74.6/programmers-guide:
/usr/share/doc/python-lxml-2.2.3/doc/capi.txt:
/usr/share/doc/python-lxml-2.2.3/doc/FAQ.txt:
/usr/share/doc/python-lxml-2.2.3/doc/lxmlhtml.txt:
/usr/share/doc/python-lxml-2.2.3/doc/performance.txt:
/usr/share/doc/python-lxml-2.2.3/doc/s5/lxml-ep2008.txt:
/usr/share/doc/python-mako-0.3.4/doc/build/content/filtering.txt:
/usr/share/doc/python-mako-0.3.4/doc/build/content/namespaces.txt:
/usr/share/doc/python-mako-0.3.4/doc/build/content/syntax.txt:
/usr/share/doc/python-mako-0.3.4/doc/build/output/syntax.html:
/usr/share/doc/python-mako-0.3.4/doc/build/templates/formatting.html:
/usr/share/doc/python-matplotlib-0.99.1.2/INTERACTIVE:
/usr/share/doc/python-nose-0.10.4/README.txt:
/usr/share/doc/python-setuptools-0.6.10/docs/build/html/_sources/pkg_resources.txt:
/usr/share/doc/python-setuptools-0.6.10/docs/build/html/_sources/setuptools.txt:
/usr/share/doc/python-setuptools-0.6.10/docs/pkg_resources.txt:
/usr/share/doc/python-setuptools-0.6.10/docs/setuptools.txt:
/usr/share/doc/python-simplejson-2.0.9/docs/_sources/index.txt:
/usr/src/debug/xulrunner-1.9.2.17/mozilla-1.9.2/js/src/prmjtime.cpp:

I believe it makes sense to proceed with the fix.

Comment 12 Jan Kaluža 2012-03-13 10:35:56 UTC
The file tool cannot be 100% accurate, especially when detecting source code. I admit this patch brings some regressions, but it still improves detection of Python scripts a lot. Given how file works, it cannot do the kind of in-depth detection that would help in cases like this one.
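
For contrast, an "in-depth" check could try to parse the whole file as Python instead of matching a single line. The following is a sketch of that idea only and not something file does: the helper name `parses_as_python` is hypothetical, and it uses the grammar of whichever Python interpreter runs it, so Python 2 scripts such as those shipped in RHEL 6 may be rejected under Python 3.

```python
import ast


def parses_as_python(text):
    """Return True only if the whole text is syntactically valid Python."""
    try:
        ast.parse(text)
    except (SyntaxError, ValueError):
        return False
    return True
```

Documentation that merely quotes a `def` snippet between paragraphs of prose would normally fail this check. It is still only a rough signal, because an empty file or a single bare word is also syntactically valid Python.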

Comment 13 errata-xmlrpc 2012-03-15 08:23:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0391.html

