Bug 2029975

Summary: Regression: file no longer detects javascript executables as application/javascript
Product: [Fedora] Fedora
Component: file
Version: rawhide
Status: CLOSED RAWHIDE
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Miro Hrončok <mhroncok>
Assignee: Vincent Mihalkovič <vmihalko>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: jkaluza, kdudka, odubaj, svashisht, vmihalko
Fixed In Version: file-5.41-2.fc36
Last Closed: 2021-12-08 15:07:14 UTC
Type: Bug

Description Miro Hrončok 2021-12-07 17:38:05 UTC
Description of problem:
/usr/lib/node_modules/yarn/bin/yarn.js and other JavaScript files used to be detected as application/javascript in Fedora 34/35, but are detected as text/plain in Fedora 36.

Version-Release number of selected component:
file-5.41-1.fc36.x86_64

How reproducible: Easy

Steps to Reproduce:
1. mock -r fedora-36-x86_64 install yarnpkg file
2. mock -r fedora-36-x86_64 shell
3. file --mime-type /usr/lib/node_modules/yarn/bin/yarn.js

(You can also repeat with fedora-35-x86_64 for comparison.)

Actual results:
/usr/lib/node_modules/yarn/bin/yarn.js: text/plain

Expected results:
/usr/lib/node_modules/yarn/bin/yarn.js: application/javascript

Additional info:
I see no relevant notice in https://fossies.org/linux/misc/file-5.41.tar.gz/file-5.41/ChangeLog so I assume this is a regression.
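
The comparison between the actual and expected results above could be automated roughly as follows. This is only a sketch: the helper names (parse_mime_line, detect_mime, is_regressed) are hypothetical and not part of file or any Fedora tooling, and the only file option used is --mime-type, as in the steps to reproduce.

```python
import subprocess

# Expected MIME type for JavaScript executables, taken from the report above.
EXPECTED_MIME = "application/javascript"


def parse_mime_line(line):
    """Split one line of `file --mime-type` output into (path, mime_type).

    MIME types contain no colon, so the last colon on the line separates
    the path from the type even if the path itself contains colons.
    """
    path, _, mime = line.rstrip("\n").rpartition(":")
    return path, mime.strip()


def detect_mime(path, run=subprocess.run):
    """Run `file --mime-type PATH` and return the reported MIME type.

    `run` is injectable so the function can be exercised without `file`
    being installed (hypothetical convenience for testing).
    """
    result = run(
        ["file", "--mime-type", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_mime_line(result.stdout)[1]


def is_regressed(mime, expected=EXPECTED_MIME):
    """Return True when the detected type differs from the expected one."""
    return mime != expected
```

Calling is_regressed(detect_mime("/usr/lib/node_modules/yarn/bin/yarn.js")) inside the mock shell would then distinguish the two results shown above.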

Comment 1 Vincent Mihalkovič 2021-12-07 21:27:54 UTC
This regression was introduced with file-5.41-1.fc36; in file-5.40-9.fc35, JavaScript detection works as expected.

The problematic commit & line is https://github.com/file/file/commit/c07b2a18eb1c5d3854e3ecc72319a2336e361d9e#diff-85466710385fb2ac02303e18020a937c563abbea6d4050ba3aff96cf6c8e6866R100
The "wild-card match for interpreters" pattern, which has a very high strength, is the cause of the regression. Running file --checking-printout --list shows:

100: > 0 string/wt,=#! ,"a"]                                                <-- used detection pattern
101: >> 1 string,x,"%s script text executable"] 
...
16: > 0 search/1,=#!/usr/bin/env nodejs,"Node.js script text executable"]   <-- expected detection pattern

I'm going to ask upstream about the proper fix for this: whether to increase the strength of the JavaScript detection patterns or to remove the "wild-card match for interpreters" pattern...
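
To illustrate why a stronger pattern can shadow the Node.js one, here is a toy model of strength-ordered matching. It is a simplified sketch, not how file is implemented: the Rule class, the strength numbers, and the prefix-based matchers are all made up for illustration, and the wildcard rule is assumed to carry no MIME type, which would be consistent with the text/plain result reported above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Fallback type for text that no MIME-bearing rule claimed (assumption).
DEFAULT_TEXT_MIME = "text/plain"


@dataclass
class Rule:
    name: str
    strength: int                   # illustrative value, not a real strength
    matches: Callable[[str], bool]  # simplified matcher on the first line
    mime: Optional[str]             # None models a rule that sets no MIME type


def classify(first_line, rules):
    """Try rules from strongest to weakest; the first match decides."""
    for rule in sorted(rules, key=lambda r: r.strength, reverse=True):
        if rule.matches(first_line):
            return rule.mime or DEFAULT_TEXT_MIME
    return DEFAULT_TEXT_MIME


# Hypothetical stand-ins for the two patterns discussed above.
node_rule = Rule(
    name="Node.js script",
    strength=50,
    matches=lambda s: s.startswith("#!/usr/bin/env node"),
    mime="application/javascript",
)
wildcard_rule = Rule(
    name="wild-card match for interpreters",
    strength=500,
    matches=lambda s: s.startswith("#!"),
    mime=None,
)

shebang = "#!/usr/bin/env node"
before = classify(shebang, [node_rule])                 # only the specific rule
after = classify(shebang, [node_rule, wildcard_rule])   # wildcard wins on strength
```

In this model, both fixes mentioned above have the same effect: removing wildcard_rule, or giving node_rule a strength above it, makes classify return application/javascript again.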

Comment 3 Miro Hrončok 2021-12-08 15:13:56 UTC
That was fast. Thanks!