Bug 235420 - File does not recognize a UTF-16, little-endian encoded XML file
Summary: File does not recognize a UTF-16, little-endian encoded XML file
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: file
Version: 5.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Tomas Smetana
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-04-05 17:29 UTC by Dave Malcolm
Modified: 2009-01-20 22:02 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-01-20 22:02:49 UTC
Target Upstream Version:
Embargoed:


Attachments
Input file (sticking to 7-bit characters) (49 bytes, text/plain)
2007-04-05 17:29 UTC, Dave Malcolm
output file, should be UTF-16 little-endian encoding (136 bytes, text/xml)
2007-04-05 17:30 UTC, Dave Malcolm


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2009:0208 0 normal SHIPPED_LIVE file bug fix and enhancement update 2009-01-20 16:06:14 UTC

Description Dave Malcolm 2007-04-05 17:29:46 UTC
Description of problem:
Attached is an XML file that I believe is correctly encoded as UTF-16,
little-endian (cf. http://www.w3.org/TR/REC-xml/#sec-guessing ).
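For reference, the byte-signature check described in that section of the XML
specification can be sketched roughly as follows. This is only an illustration
of the autodetection idea, not the logic used by "file" or xmllint; the
function name guess_xml_encoding and the returned labels are made up for this
example.

```python
def guess_xml_encoding(head: bytes) -> str:
    """Guess an encoding family from the first bytes of an XML entity.

    Hypothetical helper: a simplified subset of the signatures listed in
    the XML specification's autodetection appendix.
    """
    # Byte order marks. The 4-byte marks are tested first because
    # FF FE 00 00 also begins with the 2-byte UTF-16 LE mark FF FE.
    if head.startswith(b"\xff\xfe\x00\x00"):
        return "UCS-4 little-endian (BOM)"
    if head.startswith(b"\x00\x00\xfe\xff"):
        return "UCS-4 big-endian (BOM)"
    if head.startswith(b"\xff\xfe"):
        return "UTF-16 little-endian (BOM)"
    if head.startswith(b"\xfe\xff"):
        return "UTF-16 big-endian (BOM)"
    if head.startswith(b"\xef\xbb\xbf"):
        return "UTF-8 (BOM)"
    # No byte order mark: look at how the leading "<?" is laid out.
    if head.startswith(b"<\x00?\x00"):
        return "UTF-16 little-endian (no BOM)"
    if head.startswith(b"\x00<\x00?"):
        return "UTF-16 big-endian (no BOM)"
    if head.startswith(b"<?xm"):
        return "ASCII-compatible (read the encoding declaration)"
    return "unknown"


if __name__ == "__main__":
    sample = '<?xml version="1.0" encoding="UTF-16"?>'.encode("utf-16-le")
    print(guess_xml_encoding(b"\xff\xfe" + sample))
```

Either UTF-16 little-endian signature (with or without a byte order mark) is
enough to pick a decoder before reading the XML declaration itself.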

xmllint reads it fine.

However running "file" on it gives blank output:
file output.xml
output.xml: 

Version-Release number of selected component (if applicable):
file-4.17-8

How reproducible:
100%


Steps to Reproduce:
1. Take an XML file, run "xmllint --encode UTF-16 input.xml > output.xml"
2. xmllint output.xml
3. file output.xml
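
If xmllint is not at hand, a comparable test file can be generated with a few
lines of Python. This is a sketch under assumptions: the document body is
invented, and the leading byte order mark is an assumption, since the report
does not say whether the attached file starts with one. The helper names
write_utf16le_xml and run_file are hypothetical.

```python
import codecs
import shutil
import subprocess
from pathlib import Path


def write_utf16le_xml(path, body="<doc>hello</doc>\n"):
    """Write an XML document as UTF-16 little-endian with a leading BOM."""
    text = '<?xml version="1.0" encoding="UTF-16"?>\n' + body
    # Assumption: prepend the BOM explicitly; "utf-16-le" does not add one.
    data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    Path(path).write_bytes(data)
    return data


def run_file(path):
    """Return the output of file(1) for path, or None if it is not installed."""
    exe = shutil.which("file")
    if exe is None:
        return None
    result = subprocess.run([exe, str(path)], capture_output=True, text=True)
    return result.stdout.rstrip("\n")


if __name__ == "__main__":
    data = write_utf16le_xml("output.xml")
    print(data[:8].hex(" "))       # first bytes of the generated file
    print(run_file("output.xml"))  # whatever the installed file(1) reports
```

On an affected version, the second line printed would presumably match the
blank result shown under "Actual results".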
  
Actual results:
xmllint reads the file fine but file gives no useful output:
file output.xml 
output.xml: 
(i.e. blank output)

Expected results:
either:
  output.xml: XML 1.0 document text
or:
  output.xml: XML 1.0 document text (UTF-16 little-endian encoding)
or somesuch
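
To make the expected output concrete, here is a rough sketch of how such a
description line could be derived from the file contents. It is not how "file"
builds its output (that is driven by its magic database); the function
describe_xml, its fallback string "data", and the default file name are
assumptions for illustration only.

```python
import re


def describe_xml(data: bytes, name: str = "output.xml") -> str:
    """Return a file(1)-style description line for XML data (hypothetical)."""
    # Pick a decoder from the byte order mark, if there is one.
    if data.startswith(b"\xff\xfe"):
        text = data[2:].decode("utf-16-le", errors="replace")
        label = "UTF-16 little-endian encoding"
    elif data.startswith(b"\xfe\xff"):
        text = data[2:].decode("utf-16-be", errors="replace")
        label = "UTF-16 big-endian encoding"
    else:
        text = data.decode("utf-8", errors="replace")
        label = None
    # Pull the version number out of the XML declaration.
    match = re.match(r"<\?xml\s+version=[\"']([0-9.]+)[\"']", text)
    if match is None:
        return f"{name}: data"
    description = f"XML {match.group(1)} document text"
    if label is not None:
        description += f" ({label})"
    return f"{name}: {description}"


if __name__ == "__main__":
    decl = '<?xml version="1.0" encoding="UTF-16"?>\n<a/>\n'
    print(describe_xml(b"\xff\xfe" + decl.encode("utf-16-le")))
```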

Comment 1 Dave Malcolm 2007-04-05 17:29:46 UTC
Created attachment 151788
Input file (sticking to 7-bit characters)

Comment 2 Dave Malcolm 2007-04-05 17:30:35 UTC
Created attachment 151789
output file, should be UTF-16 little-endian encoding

Comment 3 RHEL Program Management 2008-06-04 22:49:44 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 8 errata-xmlrpc 2009-01-20 22:02:49 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0208.html

