Bug 444342
Summary: | sealert: Input is not proper UTF-8, indicate encoding | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Robert Scheck <redhat-bugzilla>
Component: | setroubleshoot | Assignee: | Daniel Walsh <dwalsh>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | high | Docs Contact: |
Priority: | low | |
Version: | 11 | |
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2010-04-08 14:39:21 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 235705 | |
Attachments: | | |
Description
Robert Scheck
2008-04-27 14:33:51 UTC
Apr 27 16:30:53 tux setroubleshoot: [rpc.ERROR] exception parserError: xmlParseDoc() failed#012Traceback (most recent call last):#012 File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 940, in handle_client_io#012 self.receiver.feed(data)#012 File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 762, in feed#012 self.process()#012 File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 754, in process#012 self.dispatchFunc(self.header, self.body)#012 File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 972, in default_request_handler#012 self.handle_return(type, rpc_id, body)#012 File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 958, in handle_return#012 interface, method, args = convert_rpc_xml_to_args(body)#012 File "/usr/lib/python2.5/site-packages/setroubleshoot/rpc.py", line 143, in convert_rpc_xml_to_args#012 doc = libxml2.parseDoc(cmd)#012 File "/usr/lib/python2.5/site-packages/libxml2.py", line 1263, in parseDoc#012 if ret is None:raise parserError('xmlParseDoc() failed')#012parserError: xmlParseDoc() failed

Robert, would you please attach the contents of /var/lib/setroubleshoot/audit_listener_database.xml using the "Create a New Attachment" link below? I need to see the data in that file to diagnose the problem. Thank you Robert.

Created attachment 304022
/var/lib/setroubleshoot/audit_listener_database.xml
Of course, here it is. Looks like the translation is the problem.
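The syslog entry in the description is hard to read because the syslog daemon has replaced each newline of the Python traceback with the escape #012 (a "#" followed by the three-digit octal code of the control character). The following is a minimal sketch for expanding such escapes, assuming Python 3 and that escaping convention. The helper name expand_syslog_escapes is hypothetical and not part of setroubleshoot, and the sample string is a shortened excerpt of the log line above.

```python
import re

# Matches '#' followed by exactly three octal digits, e.g. '#012'.
_ESCAPE = re.compile(r"#([0-7]{3})")


def expand_syslog_escapes(line):
    """Replace #NNN octal escapes with the character they stand for.

    Hypothetical helper for illustration only. A literal '#' followed by
    three octal digits in the original message would also be expanded.
    """
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), line)


# Shortened excerpt of the log line from the description.
sample = (
    "exception parserError: xmlParseDoc() failed#012"
    "Traceback (most recent call last):#012"
    ' File "/usr/lib/python2.5/site-packages/libxml2.py", line 1263, in parseDoc'
)

print(expand_syslog_escapes(sample))
```

Applied to the full log line, this prints the traceback one frame per line, ending in the parserError raised by libxml2.parseDoc.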
The problem originates in the de translation of the fix_description in the plugins/catchall.py plugin, which uses umlaut o. Umlaut o should be encoded in UTF-8 as 0xC3,0xB6 (i.e. 0303,0266 octal). In my de.gmo file I have the following snippet (with the correct umlaut o):

Sie k\303\266nnen ein lokales Richtlinienmodul generieren, um diesen Zugriff

But what appears in your xml database is this snippet:

Sie k\366nnen ein lokales Richtlinienmodul generieren, um diesen Zugriff

which is wrong. I did a brief test using libxml2 to parse the above phrase with the correct umlaut o encoding and it failed. Somehow the two-byte sequence 0xC3,0xB6 is being converted to the single byte 0xF6. FWIW I even notice this in my emacs buffers. At this point I'm guessing there is some problem with encoding/decoding the 0xC3,0xB6 umlaut o UTF-8 byte sequence, but I don't have a handle on it yet.

Umlaut o in ISO-8859 is 0xF6, so it appears as though at some point the text is being written as ISO-8859 rather than UTF-8. However, any place in the code where we serialize xml we do so via serialize(encoding=i18n_encoding), where the i18n_encoding comes from the config file and should be utf-8. Still not sure where the encode/decode problem is, but just capturing the investigation so far.

All I can say is that my system locale is de_DE@euro, which is ISO-8859(-15), if this maybe helps.

Your system locale should in theory not be significant because internally we force everything to utf-8. However, if you used a tool that touched the database file, for example an editor, it probably would have rewritten the file in iso-8859. By any chance, did you do something like that?

...I still hope less(1) is read-only by default - if not, please open a bug report against less :)

Changing version to '9' as part of upcoming Fedora 9 GA. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Ping?

Ping?
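The byte-level mismatch described in the investigation (0xC3,0xB6 in UTF-8 versus the single byte 0xF6 in ISO-8859) can be reproduced in a few lines. This is only an illustrative sketch, not the fix that was shipped. It assumes Python 3 and uses the standard-library xml.etree.ElementTree parser as a stand-in for the libxml2 bindings seen in the traceback. The helper names is_valid_utf8 and parses_as_utf8_xml are hypothetical.

```python
import xml.etree.ElementTree as ET

word = "k\u00f6nnen"  # "koennen" written with o-umlaut (U+00F6)

utf8_bytes = word.encode("utf-8")         # o-umlaut becomes 0xC3 0xB6
latin_bytes = word.encode("iso-8859-15")  # o-umlaut becomes the single byte 0xF6


def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def parses_as_utf8_xml(payload):
    """Wrap payload in an XML document declared as UTF-8 and try to parse it.

    Stand-in for libxml2.parseDoc(); the element name 'msg' is made up.
    """
    doc = b'<?xml version="1.0" encoding="utf-8"?><msg>' + payload + b"</msg>"
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


for label, data in (("utf-8", utf8_bytes), ("iso-8859-15", latin_bytes)):
    print(label, data, is_valid_utf8(data), parses_as_utf8_xml(data))
```

A check like is_valid_utf8 on the raw bytes of the database file would be one way to tell whether the file on disk already contains ISO-8859 bytes before the XML parser ever sees it.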
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle. Changing version to '10'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle. Changing version to '11'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fixed in setroubleshoot-2.2.57-1.fc12