Bug 444342

Summary: sealert: Input is not proper UTF-8, indicate encoding
Product: [Fedora] Fedora
Reporter: Robert Scheck <redhat-bugzilla>
Component: setroubleshoot
Assignee: Daniel Walsh <dwalsh>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low    
Version: 11   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2010-04-08 14:39:21 UTC
Bug Depends On:    
Bug Blocks: 235705    
Attachments:
Description Flags
/var/lib/setroubleshoot/audit_listener_database.xml none

Description Robert Scheck 2008-04-27 14:33:51 UTC
Description of problem:
$ sealert -v -l 27829382-ddb4-42b4-a1f5-dc02c1d2754b
Entity: line 58: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xF6 0x6E 0x6E 0x65
    Sie können ein lokales Richtlinienmodul generieren, um diesen Zugriff
         ^
2008-04-27 16:30:53,017 [rpc.ERROR] exception parserError: xmlParseDoc() failed
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 940, in handle_client_io
    self.receiver.feed(data)
  File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 762, in feed
    self.process()
  File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 754, in process
    self.dispatchFunc(self.header, self.body)
  File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 972, in default_request_handler
    self.handle_return(type, rpc_id, body)
  File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 958, in handle_return
    interface, method, args = convert_rpc_xml_to_args(body)
  File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 143, in convert_rpc_xml_to_args
    doc = libxml2.parseDoc(cmd)
  File "/usr/lib/python2.5/site-packages/libxml2.py", line 1263, in parseDoc
    if ret is None:raise parserError('xmlParseDoc() failed')
parserError: xmlParseDoc() failed
failed to connect to server: xmlParseDoc() failed
$

Version-Release number of selected component (if applicable):
setroubleshoot-2.0.6-1
setroubleshoot-plugins-2.0.4-5

How reproducible:
Every time (my locale is LANG=de_DE@euro)

Actual results:
sealert: Input is not proper UTF-8, indicate encoding

Expected results:
Just working... ;-)

Comment 1 Robert Scheck 2008-04-27 14:42:44 UTC
Apr 27 16:30:53 tux setroubleshoot: [rpc.ERROR] exception parserError:
xmlParseDoc() failed#012Traceback (most recent call last):#012  File
"/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 940, in
handle_client_io#012    self.receiver.feed(data)#012  File
"/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 762, in feed#012
   self.process()#012  File
"/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 754, in
process#012    self.dispatchFunc(self.header, self.body)#012  File
"/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 972, in
default_request_handler#012    self.handle_return(type, rpc_id, body)#012  File
"/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 958, in
handle_return#012    interface, method, args = convert_rpc_xml_to_args(body)#012
 File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 143, in
convert_rpc_xml_to_args#012    doc = libxml2.parseDoc(cmd)#012  File
"/usr/lib/python2.5/site-packages/libxml2.py", line 1263, in parseDoc#012    if
ret is None:raise parserError('xmlParseDoc() failed')#012parserError:
xmlParseDoc() failed
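
The log entry above is the same traceback as in the description, flattened onto one line. Assuming the syslog daemon escaped control characters as `#` followed by three octal digits (so `#012` stands for a newline), a small helper can restore the line breaks. This is only a reading aid: `unescape_syslog` is a hypothetical name, and a literal `#` followed by three octal digits in the message would be converted too.

```python
import re


def unescape_syslog(line):
    """Replace '#' plus three octal digits with the character it encodes."""
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), line)


# Shortened excerpt of the escaped log entry.
escaped = (
    "[rpc.ERROR] exception parserError: xmlParseDoc() failed"
    "#012Traceback (most recent call last):"
    "#012parserError: xmlParseDoc() failed"
)
print(unescape_syslog(escaped))
```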

Comment 2 John Dennis 2008-04-28 18:13:47 UTC
Robert, would you please attach the contents of
/var/lib/setroubleshoot/audit_listener_database.xml using the "Create a New
Attachment" link below?

I need to see the data in that file to diagnose the problem. Thank you, Robert.

Comment 3 Robert Scheck 2008-04-28 18:27:35 UTC
Created attachment 304022
/var/lib/setroubleshoot/audit_listener_database.xml

Of course, here it is. Looks like the translation is the problem.

Comment 4 John Dennis 2008-04-28 22:15:45 UTC
The problem originates in the de translation of the fix_description in the
plugins/catchall.py plugin, which uses umlaut o. Umlaut o should be encoded
in UTF-8 as 0xC3,0xB6 (i.e. 0303,0266 octal).

In my de.gmo file I have the following snippet (with the correct umlaut o):

    Sie k\303\266nnen ein lokales Richtlinienmodul generieren, um diesen Zugriff

But what appears in your XML database is this snippet:

    Sie k\366nnen ein lokales Richtlinienmodul generieren, um diesen Zugriff

which is wrong. I did a brief test using libxml2 to parse the above phrase with
the correct umlaut o encoding and it failed. Somehow the two-byte 0xC3,0xB6
sequence is being converted to the single byte 0xF6. FWIW I even notice
this in my emacs buffers. At this point I'm guessing there is some problem with
encoding/decoding the 0xC3,0xB6 UTF-8 byte sequence for umlaut o, but I don't
have a handle on it yet.
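
The byte values mentioned in this comment can be checked with a few lines of Python 3. This is a minimal sketch for illustration only; the code in this bug ran on Python 2.5, and the variable names are made up here.

```python
# Compare the UTF-8 and ISO-8859-1 encodings of "ö" (U+00F6).
word = "k\u00f6nnen"  # "können"

utf8_bytes = word.encode("utf-8")
latin1_bytes = word.encode("iso-8859-1")

# UTF-8 uses two bytes for ö: 0xC3 0xB6 (octal 303 266).
assert utf8_bytes == b"k\xc3\xb6nnen"
assert [oct(b) for b in "\u00f6".encode("utf-8")] == ["0o303", "0o266"]

# ISO-8859-1 uses the single byte 0xF6 (octal 366).
assert latin1_bytes == b"k\xf6nnen"

# A lone 0xF6 is not valid UTF-8, which matches the parser's complaint
# about the bytes 0xF6 0x6E 0x6E 0x65.
try:
    latin1_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    print("not UTF-8:", exc.reason, "at byte offset", exc.start)
```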

Comment 5 John Dennis 2008-04-28 22:36:14 UTC
In ISO-8859-1 (and ISO-8859-15), umlaut o is the single byte 0xF6, so it
appears that at some point the text is being written as ISO-8859 rather than
UTF-8.

However, every place in the code where we serialize XML does so via:

serialize(encoding=i18n_encoding)

where i18n_encoding comes from the config file and should be utf-8.

I'm still not sure where the encode/decode problem is; I'm just capturing the
investigation so far.
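
The failure mode described here can be reproduced in isolation. The sketch below uses the standard-library `xml.etree.ElementTree` as a stand-in for libxml2, so it is not the setroubleshoot code path; `serialize_fragment` and `parses_as_utf8` are hypothetical helpers. It shows that a document written as ISO-8859-1 without an encoding declaration is rejected by a parser that assumes UTF-8, and that the same bytes are accepted once the encoding is declared.

```python
import xml.etree.ElementTree as ET

TEXT = "Sie k\u00f6nnen ein lokales Richtlinienmodul generieren"


def serialize_fragment(text, encoding):
    """Return a tiny XML document as bytes in `encoding`, without an XML declaration."""
    body = "<fix_description>%s</fix_description>" % text
    return body.encode(encoding)


def parses_as_utf8(data):
    """Return True if a parser that defaults to UTF-8 accepts `data`."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


good = serialize_fragment(TEXT, "utf-8")
bad = serialize_fragment(TEXT, "iso-8859-1")

print(parses_as_utf8(good))
print(parses_as_utf8(bad))

# Declaring the real encoding makes the same bytes parseable.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + bad
print(ET.fromstring(declared).text == TEXT)
```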


Comment 6 Robert Scheck 2008-04-29 09:00:06 UTC
All I can say is that my system locale is de_DE@euro, which is ISO-8859-15,
in case that helps.

Comment 7 John Dennis 2008-04-29 13:10:41 UTC
Your system locale should in theory not be significant, because internally we
force everything to UTF-8. However, if you used a tool that touched the database
file, for example an editor, it probably would have rewritten the file in
ISO-8859. By any chance, did you do something like that?
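
One way to check whether a file on disk still holds valid UTF-8 is to decode it strictly and report the first offending byte. The following is a rough sketch, not part of setroubleshoot; `first_invalid_utf8` and `check_file` are hypothetical helpers, and the sample bytes are taken from the error message in the description.

```python
def first_invalid_utf8(data):
    """Return (offset, byte) of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


def check_file(path):
    """Read `path` as raw bytes and report the first invalid UTF-8 byte, if any."""
    with open(path, "rb") as handle:
        return first_invalid_utf8(handle.read())


# Sample with the single-byte 0xF6 that the parser rejected.
sample = b"<p>Sie k\xf6nnen ein lokales Richtlinienmodul generieren</p>"
print(first_invalid_utf8(sample))

# The same text re-encoded as UTF-8 passes the check.
print(first_invalid_utf8(sample.decode("iso-8859-1").encode("utf-8")))
```

Running `check_file` on a copy of the database file would show whether the file itself contains ISO-8859 bytes or whether the conversion happens later in the pipeline.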

Comment 8 Robert Scheck 2008-04-29 15:35:38 UTC
...I still hope less(1) is read-only by default. If not, please open a
bug report against less :)

Comment 9 Bug Zapper 2008-05-14 10:15:27 UTC
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 10 Robert Scheck 2008-05-17 19:01:59 UTC
Ping?

Comment 11 Robert Scheck 2008-07-27 14:43:15 UTC
Ping?

Comment 12 Bug Zapper 2008-11-26 02:14:16 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 13 Bug Zapper 2009-06-09 09:33:18 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 15 Daniel Walsh 2010-01-19 21:04:49 UTC
Fixed in setroubleshoot-2.2.57-1.fc12