Bug 444342 - sealert: Input is not proper UTF-8, indicate encoding
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: setroubleshoot
Version: 11
Hardware: All Linux
Priority: low
Severity: high
Assigned To: Daniel Walsh
QA Contact: Fedora Extras Quality Assurance
Blocks: F9Target
Reported: 2008-04-27 10:33 EDT by Robert Scheck
Modified: 2010-04-08 10:39 EDT
Doc Type: Bug Fix
Last Closed: 2010-04-08 10:39:21 EDT

Attachments:
/var/lib/setroubleshoot/audit_listener_database.xml (50.26 KB, text/plain)
2008-04-28 14:27 EDT, Robert Scheck

Description Robert Scheck 2008-04-27 10:33:51 EDT
Description of problem:
$ sealert -v -l 27829382-ddb4-42b4-a1f5-dc02c1d2754b
Entity: line 58: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xF6 0x6E 0x6E 0x65
    Sie können ein lokales Richtlinienmodul generieren, um diesen Zugriff
         ^
2008-04-27 16:30:53,017 [rpc.ERROR] exception parserError: xmlParseDoc() failed
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 940, in handle_client_io
    self.receiver.feed(data)
  File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 762, in feed
    self.process()
  File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 754, in process
    self.dispatchFunc(self.header, self.body)
  File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 972, in default_request_handler
    self.handle_return(type, rpc_id, body)
  File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 958, in handle_return
    interface, method, args = convert_rpc_xml_to_args(body)
  File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 143, in convert_rpc_xml_to_args
    doc = libxml2.parseDoc(cmd)
  File "/usr/lib/python2.5/site-packages/libxml2.py", line 1263, in parseDoc
    if ret is None:raise parserError('xmlParseDoc() failed')
parserError: xmlParseDoc() failed
failed to connect to server: xmlParseDoc() failed
$

Version-Release number of selected component (if applicable):
setroubleshoot-2.0.6-1
setroubleshoot-plugins-2.0.4-5

How reproducible:
Every time (I have LANG=de_DE@euro)

Actual results:
sealert: Input is not proper UTF-8, indicate encoding

Expected results:
Just working... ;-)
Comment 1 Robert Scheck 2008-04-27 10:42:44 EDT
Apr 27 16:30:53 tux setroubleshoot: [rpc.ERROR] exception parserError:
xmlParseDoc() failed#012Traceback (most recent call last):#012  File
"/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 940, in
handle_client_io#012    self.receiver.feed(data)#012  File
"/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 762, in feed#012
   self.process()#012  File
"/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 754, in
process#012    self.dispatchFunc(self.header, self.body)#012  File
"/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 972, in
default_request_handler#012    self.handle_return(type, rpc_id, body)#012  File
"/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 958, in
handle_return#012    interface, method, args = convert_rpc_xml_to_args(body)#012
 File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 143, in
convert_rpc_xml_to_args#012    doc = libxml2.parseDoc(cmd)#012  File
"/usr/lib/python2.5/site-packages/libxml2.py", line 1263, in parseDoc#012    if
ret is None:raise parserError('xmlParseDoc() failed')#012parserError:
xmlParseDoc() failed
Comment 2 John Dennis 2008-04-28 14:13:47 EDT
Robert, would you please attach the contents of
/var/lib/setroubleshoot/audit_listener_database.xml using the "Create a New
Attachment" link below?

I need to see the data in that file to diagnose the problem. Thank you Robert.
Comment 3 Robert Scheck 2008-04-28 14:27:35 EDT
Created attachment 304022
/var/lib/setroubleshoot/audit_listener_database.xml

Of course, here it is. Looks like the translation is the problem.
Comment 4 John Dennis 2008-04-28 18:15:45 EDT
The problem originates in the de translation of the fix_description in the
plugins/catchall.py plugin, which uses umlaut o. Umlaut o should be encoded
in UTF-8 as 0xC3,0xB6 (i.e. 0303,0266 octal).

In my de.gmo file I have the following snippet (with the correct umlaut o):

    Sie k\303\266nnen ein lokales Richtlinienmodul generieren, um diesen Zugriff

But what appears in your xml database is this snippet:

    Sie k\366nnen ein lokales Richtlinienmodul generieren, um diesen Zugriff

which is wrong. I did a brief test using libxml2 to parse the above phrase with
the correct umlaut o encoding and it failed. Somehow the two-byte 0xC3,0xB6
sequence is being converted to the single byte 0xF6. FWIW I even notice
this in my emacs buffers. At this point I'm guessing there is some problem with
encoding/decoding the 0xC3,0xB6 umlaut o UTF-8 byte sequence, but I don't have a
handle on it yet.
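
The two byte forms discussed above can be reproduced with a few lines of Python. This is only an illustrative sketch written for Python 3 (the traceback in this report comes from Python 2.5); the helper name octal_escape is made up for this example and is not part of setroubleshoot or gettext.

```python
# Illustrative only: how "ö" (U+00F6) is represented in the two encodings.
text = "können"

utf8_bytes = text.encode("utf-8")         # ö becomes two bytes: 0xC3 0xB6
latin1_bytes = text.encode("iso-8859-1")  # ö becomes one byte:  0xF6


def octal_escape(data):
    """Render non-ASCII bytes as backslash-octal escapes (hypothetical helper)."""
    return "".join(chr(b) if b < 0x80 else "\\%03o" % b for b in data)


utf8_escaped = octal_escape(utf8_bytes)      # form seen in the de.gmo snippet
latin1_escaped = octal_escape(latin1_bytes)  # form seen in the xml database

# The single-byte form is not a valid UTF-8 sequence, so a strict
# UTF-8 decoder rejects it.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

print(utf8_escaped)
print(latin1_escaped)
print(latin1_is_valid_utf8)
```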
Comment 5 John Dennis 2008-04-28 18:36:14 EDT
Umlaut o in ISO-8859-1 (and ISO-8859-15) is 0xF6, so it appears that at some
point the text is being written as ISO-8859 rather than UTF-8.

However, everywhere in the code where we serialize xml we do so via:

serialize(encoding=i18n_encoding)

where the i18n_encoding comes from the config file and should be utf-8.

Still not sure where the encode/decode problem is, but just capturing the
investigation so far.
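
To make the suspected failure mode concrete, here is a small sketch using Python 3's standard xml.etree.ElementTree instead of libxml2 (an assumption made so the example runs without extra packages). It does not show where setroubleshoot goes wrong; the "bad" path below is purely hypothetical. It only shows that bytes written in ISO-8859-1 without a matching XML declaration are rejected by a parser that assumes UTF-8, and that declaring the encoding makes the same bytes parse.

```python
import xml.etree.ElementTree as ET

I18N_ENCODING = "utf-8"          # stands in for the config value mentioned above
LEGACY_ENCODING = "iso-8859-1"   # assumed culprit, for illustration only
TEXT = "Sie können ein lokales Richtlinienmodul generieren"

elem = ET.Element("fix_description")
elem.text = TEXT

# Intended path: serialize with the configured encoding.
good = ET.tostring(elem, encoding=I18N_ENCODING)

# Hypothetical faulty path: the text is encoded with a legacy encoding
# and written without any encoding declaration.
bad = (b"<fix_description>"
       + TEXT.encode(LEGACY_ENCODING)
       + b"</fix_description>")

# Same bytes as "bad", but with the encoding indicated in a declaration.
declared = (b"<?xml version='1.0' encoding='ISO-8859-1'?>" + bad)


def parses(data):
    """Return True if the bytes parse as XML, False on a parse error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


print(parses(good), parses(bad), parses(declared))
```

If something along the path re-encodes the text but keeps the document marked (or assumed) as UTF-8, the result looks like the "bad" case above.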
Comment 6 Robert Scheck 2008-04-29 05:00:06 EDT
All I can say is that my system locale is de_DE@euro, which is ISO-8859-15,
if that helps.
Comment 7 John Dennis 2008-04-29 09:10:41 EDT
Your system locale should in theory not be significant because internally we
force everything to utf-8. However, if you used a tool that touched the database
file, for example an editor, it probably would have rewritten the file in
iso-8859. By any chance, did you do something like that?
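
One way to check whether a file's contents are still valid UTF-8 is to attempt a strict decode and report the first offending offset. The following is a rough Python 3 sketch, not a setroubleshoot tool; find_invalid_utf8 and format_bytes are hypothetical helper names, and the sample bytes stand in for the database contents.

```python
def find_invalid_utf8(data):
    """Return (offset, next four bytes) for the first invalid UTF-8 byte, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start:err.start + 4]
    return None


def format_bytes(chunk):
    """Format bytes as space-separated 0xNN values."""
    return " ".join("0x%02X" % b for b in chunk)


# Sample input standing in for a line of the database file.
sample = b"Sie k\xf6nnen ein lokales Richtlinienmodul generieren"

result = find_invalid_utf8(sample)
if result is not None:
    offset, chunk = result
    print("invalid UTF-8 at byte %d: %s" % (offset, format_bytes(chunk)))
```

For a real file, the bytes would be read in binary mode (for example with open(path, "rb")) so that no decoding happens before the check.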
Comment 8 Robert Scheck 2008-04-29 11:35:38 EDT
I still hope less(1) only reads by default. If not, please open a
bug report against less :)
Comment 9 Bug Zapper 2008-05-14 06:15:27 EDT
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 10 Robert Scheck 2008-05-17 15:01:59 EDT
Ping?
Comment 11 Robert Scheck 2008-07-27 10:43:15 EDT
Ping?
Comment 12 Bug Zapper 2008-11-25 21:14:16 EST
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 13 Bug Zapper 2009-06-09 05:33:18 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 15 Daniel Walsh 2010-01-19 16:04:49 EST
Fixed in setroubleshoot-2.2.57-1.fc12
