Bug 465994 - file --mime-encoding seems broken
Product: Fedora
Classification: Fedora
Component: file
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: Daniel Novotny
QA Contact: Fedora Extras Quality Assurance
Reported: 2008-10-07 13:07 EDT by Lubomir Rintel
Modified: 2008-10-16 07:04 EDT

Doc Type: Bug Fix
Last Closed: 2008-10-16 07:04:29 EDT

Attachments
a patch for --mime-encoding (1.27 KB, patch)
2008-10-16 05:55 EDT, Daniel Novotny

Description Lubomir Rintel 2008-10-07 13:07:09 EDT
While investigating a regression in file --mime behavior (type and encoding are no longer separated with ";", unlike in 4.17), I noticed a behavior that seems somewhat odd to me:

[lkundrak@trurl ~]$ file --mime-type /etc/passwd
/etc/passwd: text/plain
[lkundrak@trurl ~]$ file --mime-type --mime-encoding /etc/passwd
/etc/passwd: text/plain charset=us-ascii
[lkundrak@trurl ~]$ file --mime-encoding /etc/passwd
/etc/passwd: binary
[lkundrak@trurl ~]$ 

Looking at lines 259-278 of ./src/ascmagic.c, this is no surprise, yet I think it makes little sense. Even though charset=us-ascii is not an encoding but part of the content type, file added it when I specified the --mime-encoding option. Contrary to what one would expect, --mime-encoding on its own does not output the charset, but the (correct?) "binary" encoding (which is hardcoded in the source code).

[lkundrak@trurl ~]$ file --mime-type /bin/ls
/bin/ls: application/x-executable
[lkundrak@trurl ~]$ file --mime-type --mime-encoding /bin/ls
/bin/ls: application/x-executable
[lkundrak@trurl ~]$ file --mime-encoding /bin/ls
/bin/ls: application/x-executable
[lkundrak@trurl ~]$ 

For non-text (binary) files the situation seems different -- the --mime-encoding option makes no difference at all, and never seems to yield a result that could be considered correct.

Any thoughts on this?
Comment 1 Daniel Novotny 2008-10-15 06:14:46 EDT
Yes, acknowledged. The --mime-encoding option does not do anything useful right now: outputting "binary" in all cases does not help the user at all, and the other values such as "base64" or "8bit" do not appear anywhere in the code.

The question is how deep you want to go: distinguishing "8bit" from "binary", or "7bit" from "base64", can be quite a task. I am analyzing the possible approaches.

Another thing: the -i option is the same as turning on both --mime-type and --mime-encoding together. Most people use -i, even in scripts, so breaking its output could be a bad idea. The best approach would be to treat --mime-encoding on its own as a separate case. What do you think? I can also ask upstream.
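
For scripts that consume the combined output, a tolerant parser avoids depending on whether the type and charset are separated by ";" or by a space only. The following is a minimal sketch in Python, not part of file itself; the function name parse_file_mime is hypothetical, and it assumes the path contains no ": " sequence.

```python
def parse_file_mime(line):
    """Split one line of `file -i` style output into (path, mime_type, charset).

    Accepts both "type; charset=x" and "type charset=x".
    charset is None when the line carries no charset parameter.
    Assumption: the path itself does not contain ": ".
    """
    path, sep, rest = line.rstrip("\n").partition(": ")
    if not sep:
        raise ValueError("no ': ' separator in %r" % line)

    # The first whitespace-delimited token is the type; drop a trailing ';'.
    mime_type, _, params = rest.strip().partition(" ")
    mime_type = mime_type.rstrip(";")

    charset = None
    for token in params.split():
        if token.startswith("charset="):
            charset = token[len("charset="):]
    return path, mime_type, charset
```

Called on the line "/etc/passwd: text/plain charset=us-ascii" from the report above, this returns ("/etc/passwd", "text/plain", "us-ascii"); the same line with "; " before charset gives the same result.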
Comment 2 Daniel Novotny 2008-10-16 05:55:54 EDT
Created attachment 320535 [details]
a patch for --mime-encoding

This patch distinguishes between "7bit" and "binary" when you run the --mime-encoding command-line option on its own.
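
The attached patch is not reproduced here. As a rough illustration of the kind of distinction involved, the sketch below follows the RFC 2045 definitions of "7bit", "8bit" and "binary" rather than the patch's actual logic. The function name transfer_encoding is hypothetical, and accepting bare LF line endings is an assumption made so that ordinary Unix text files qualify.

```python
def transfer_encoding(data, max_line=998):
    """Classify a bytes buffer as '7bit', '8bit' or 'binary' (RFC 2045 terms).

    Simplification: bare LF is accepted as a line ending, and stray CR
    characters are not checked, which is looser than the RFC.
    """
    # NUL bytes are not allowed in 7bit or 8bit data.
    if b"\x00" in data:
        return "binary"

    # 7bit and 8bit data must consist of lines of at most 998 octets.
    lines = data.replace(b"\r\n", b"\n").split(b"\n")
    if any(len(line) > max_line for line in lines):
        return "binary"

    # Any octet above 127 rules out 7bit but still allows 8bit.
    if any(byte > 127 for byte in data):
        return "8bit"
    return "7bit"
```

With this sketch, a buffer such as b"root:x:0:0:root:/root:/bin/bash\n" is classified as "7bit", while the first bytes of an ELF header, which contain NUL, are classified as "binary".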
Comment 3 Daniel Novotny 2008-10-16 05:59:14 EDT
Comment on attachment 320535 [details]
a patch for --mime-encoding

Oops, I was too quick: it segfaults with directories and such (fsmagic). I will look into this, but the overall direction is right.
Comment 4 Daniel Novotny 2008-10-16 06:29:03 EDT
Comment on attachment 320535 [details]
a patch for --mime-encoding

Oops #2: the patch is right. The segfault also occurs on vanilla file 4.26 without it; I will create an additional bz item for this.
Comment 5 Daniel Novotny 2008-10-16 07:04:29 EDT
file-4.26-3.fc10 in rawhide now
