Bug 2020715 - The file command identifies a JSON file as "JSON data" without including "text" in the output
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: file
Version: 35
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Vincent Mihalkovič
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-11-05 17:29 UTC by Erkki Ruohtula
Modified: 2021-12-14 12:36 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2021-12-14 12:36:29 UTC
Type: Bug
Embargoed:



Description Erkki Ruohtula 2021-11-05 17:29:45 UTC
Description of problem:

The command "file" applied to a JSON file now outputs "JSON data", without
including the string "text" in the output. Earlier versions behaved differently,
and this is also contrary to the documentation of file (man file), which states

     The type printed will usually contain one of the words text (the file
     contains only printing characters and a few common control characters and
     is probably safe to read on an ASCII terminal), executable (the file
     contains the result of compiling a program in a form understandable to
     some UNIX kernel or another), or data meaning anything else (data is
     usually “binary” or non-printable).  Exceptions are well-known file
     formats (core files, tar archives) that are known to contain binary data.
     When modifying magic files or the program itself, make sure to preserve
     these keywords.  Users depend on knowing that all the readable files in a
     directory have the word “text” printed.  Don't do as Berkeley did and
     change “shell commands text” to “shell script”.

Apparently file has now done as Berkeley did...

Version-Release number of selected component (if applicable):

file-5.39

How reproducible:

Steps to Reproduce:
1. Create a file with JSON contents, e.g.:

{
	"a": 1,
	"b": 2
}

2. Apply the command file to it.

Actual results:

$ file foo.json 
foo.json: JSON data

Expected results:

The output should contain "text" somewhere (it could also contain "JSON").
E.g., on CentOS 6, file returned just "ASCII text" for this file.
The change broke a script that used "file" to find plain-text files.
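
The kind of check that breaks can be sketched as follows. This is a minimal Python illustration only, not the actual script mentioned above (which is not shown in this report); the function name has_text_keyword is hypothetical. It inspects only the description string that file prints, so it runs without file installed.

```python
def has_text_keyword(description: str) -> bool:
    """Return True if a description printed by file(1) contains the word "text"."""
    # Split on whitespace and strip common punctuation so that "text," still matches.
    words = [word.strip(",;:()") for word in description.split()]
    return "text" in words


# The two descriptions quoted in this report for the same JSON file.
for description in ("ASCII text", "JSON data"):
    print(f"{description!r}: {has_text_keyword(description)}")
```

A script relying on such a check accepts the file under the older output and silently skips it under the newer one.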

Comment 1 Ben Cotton 2021-11-30 16:12:22 UTC
Fedora 33 changed to end-of-life (EOL) status on 2021-11-30. Fedora 33 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 2 Erkki Ruohtula 2021-12-07 17:16:13 UTC
The problem can also be reproduced on Fedora 35 with file-5.40.

Comment 3 Vincent Mihalkovič 2021-12-10 14:05:52 UTC
Hi, 

so I talked to upstream and the result is this commit: https://github.com/file/file/commit/c49e7805fd8aa48b8d2afad98d2115560ffaaf21

We changed the output from "JSON data" to "JSON text data".
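
For scripts that have to run against file versions printing either string, one option is to avoid the description wording and ask file for a MIME type instead. The sketch below is a suggestion under stated assumptions, not part of the upstream commit: the helper names and the EXTRA_TEXTUAL_TYPES set are hypothetical, and it assumes file reports application/json for JSON input, which should be verified on the target system. --brief and --mime-type are documented options of file(1).

```python
import subprocess

# Assumption: MIME types outside text/* that the caller wants to treat as
# readable text. Adjust this set for the file types the script cares about.
EXTRA_TEXTUAL_TYPES = {"application/json"}


def mime_is_text(mime: str) -> bool:
    """Classify a MIME type string such as the one printed by file --mime-type."""
    mime = mime.strip().lower()
    return mime.startswith("text/") or mime in EXTRA_TEXTUAL_TYPES


def is_text_file(path: str) -> bool:
    """Run file(1) on path and classify the MIME type it reports."""
    result = subprocess.run(
        ["file", "--brief", "--mime-type", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return mime_is_text(result.stdout)
```

Calling is_text_file("foo.json") requires file to be installed; mime_is_text can be checked on its own with plain strings.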

Comment 4 Erkki Ruohtula 2021-12-10 17:49:57 UTC
Thanks, the correction looks good.

Comment 5 Vincent Mihalkovič 2021-12-14 12:36:29 UTC
dist-git commit: https://src.fedoraproject.org/rpms/file/c/411766c4848c80eb8d94b0e12f48013e7ceb9de1

