Red Hat Bugzilla – Bug 174348
file munges non-ASCII filenames in output
Last modified: 2007-11-30 17:07:21 EST
Using the provided testcase:
symbolic link to '/tmp/¥µ
$ file /tmp/sln.test
/tmp/sln.test: symbolic link to
$ file -r /tmp/sln.test
/tmp/sln.test: symbolic link to `/tmp/¥µ
file munges what it considers "non-printable" characters in file_getbuffer().
Removing the for loop and returning ms->o.buf instead of ms->o.pbuf "fixes" the
problem.
file shouldn't use "isprint()" to check if a character is printable.
Created attachment 121536 [details]
Well, this solves the problem with UTF, but what if the file had \n
embedded in it, or other terminal escape sequences? Also, what if the string
did not come from a symlink but from a %s magic? Is it really UTF then?
What about using iswctype(), after converting each mb sequence to a wchar_t,
instead of using isprint()?
Created attachment 121591 [details]
Try to parse the output buffer as a multi-byte sequence. If it fails at any
point, fall back on the old ASCII-based escaping.
A slightly modified patch has been applied in rawhide, file-4.16-4.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.