Bug 1573555
Summary: | _util.py:67:ensure_unicode_string:UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 0: invalid start byte
---|---
Product: | [Fedora] Fedora
Reporter: | Torgeir Veimo <torgeir>
Component: | python-blivet
Assignee: | Blivet Maintenance Team <blivet-maint-list>
Status: | CLOSED EOL
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | unspecified
Docs Contact: |
Priority: | unspecified
Version: | 28
CC: | amulhern, anaconda-maint-list, apevec, blivet-maint-list, clockfor, dshea, jkonecny, jonathan, jskarvad, junli, kellin, mkolman, rvykydal, sbueno, torgeir, vanmeeuwen+fedora, v.podzimek+fedora, vponcova, vtrefny, wwoods
Target Milestone: | ---
Target Release: | ---
Hardware: | x86_64
OS: | Unspecified
Whiteboard: | abrt_hash:177a60b4a5a57f84c9fd51f8e2b741ba07ba8de1;VARIANT_ID=workstation;
Fixed In Version: |
Doc Type: | If docs needed, set a value
Doc Text: |
Story Points: | ---
Clone Of: |
Environment: |
Last Closed: | 2019-05-28 22:58:19 UTC
Type: | ---
Regression: | ---
Mount Type: | ---
Documentation: | ---
CRM: |
Verified Versions: |
Category: | ---
oVirt Team: | ---
RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | ---
Target Upstream Version: |
Embargoed: |
Attachments: |
Description
Torgeir Veimo
2018-05-01 16:54:48 UTC
Created attachment 1429371: File: cgroup
Created attachment 1429372: File: cpuinfo
Created attachment 1429373: File: environ
Created attachment 1429374: File: mountinfo
Created attachment 1429375: File: namespaces
Created attachment 1429376: File: open_fds
Created attachment 1429377: File: os_info
I can provide remote login to this machine if that helps. There are Windows 10 and Mac partitions on this computer (Dell 9010 SFF) as well, not sure if it's relevant.

Same thing happens in F27, but then the error comes in the console itself.

[root@hackintosh ~]# anaconda --loglevel debug
Starting installer, one moment...
anaconda 27.20.4-1 for anaconda bluesky (pre-release) started.
 * installation log files are stored in /tmp during the installation
 * shell is available on TTY2 and in second TMUX pane (ctrl+b, then press 2)
 * when reporting a bug add logs from /tmp as separate text/plain attachments
Traceback (most recent call last):
  File "/sbin/anaconda", line 658, in <module>
    matched = device_matches("LABEL=OEMDRV", disks_only=True)
  File "/usr/lib64/python3.6/site-packages/pyanaconda/storage_utils.py", line 897, in device_matches
    single_spec_matches = udev.resolve_glob(full_spec)
  File "/usr/lib/python3.6/site-packages/blivet/udev.py", line 155, in resolve_glob
    for dev in get_devices():
  File "/usr/lib/python3.6/site-packages/blivet/udev.py", line 73, in get_devices
    dev = device_to_dict(device)
  File "/usr/lib/python3.6/site-packages/blivet/udev.py", line 48, in device_to_dict
    result = dict(device.properties)
  File "/usr/lib/python3.6/site-packages/pyudev/device/_device.py", line 1085, in __getitem__
    return ensure_unicode_string(value)
  File "/usr/lib/python3.6/site-packages/pyudev/_util.py", line 67, in ensure_unicode_string
    value = value.decode(sys.getfilesystemencoding())
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 0: invalid start byte

Not entirely convinced this is due to errors in _util.py. I added logging to that file, but there's none produced.

Hi, pyudev maintainer here. I doubt this is a pyudev error, but it might be a libudev error.
Would you be able to run the following script

import pyudev
from pyudev import Context

def main():
    for device in Context().list_devices(subsystem="block"):
        properties = device.properties
        names = [n for n in properties]
        for prop_name in names:
            try:
                value = properties.get(prop_name)
            except UnicodeDecodeError as err:
                print("device: %s" % device)
                print("prop name: %s" % prop_name)
                raise

if __name__ == "__main__":
    main()

and let me know the output? Ideally, it will locate the particular block device and property value that is for some reason failing to be converted properly. Thanks!

[root@localhost-live ~]# python3 test.py
device: Device('/sys/devices/pci0000:00/0000:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sdb/sdb1')
prop name: PARTNAME
Traceback (most recent call last):
  File "test.py", line 20, in <module>
    main()
  File "test.py", line 12, in main
    value = properties.get(prop_name)
  File "/usr/lib64/python3.6/_collections_abc.py", line 660, in get
    return self[key]
  File "/usr/lib/python3.6/site-packages/pyudev/device/_device.py", line 1085, in __getitem__
    return ensure_unicode_string(value)
  File "/usr/lib/python3.6/site-packages/pyudev/_util.py", line 67, in ensure_unicode_string
    value = value.decode(sys.getfilesystemencoding())
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 0: invalid start byte

It looks like the string value of the property PARTNAME is simply "1".

Correction, the value causing the problem seems to be b'\xc0#\\!U0!\xea!!\xdck!C\xa8\xc6!\xe7\xc5k/l?!!!\xb6]'

I believe this might be a Windows 10 NTFS recovery partition. It does decode in "latin-1"
>>> b'\xc0#\\!U0!\xea!!\xdck!C\xa8\xc6!\xe7\xc5k/l?!!!\xb6]'.decode('latin-1')
'À#\\!U0!ê!!Ük!C¨Æ!çÅk/l?!!!¶]'
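The behaviour reported above can be reproduced without the affected disk. The following is a minimal sketch that uses only the byte string quoted in the report; try_decode is a helper name introduced here for illustration and is not part of pyudev or blivet.

```python
# The PARTNAME bytes quoted in the report above.
RAW = b'\xc0#\\!U0!\xea!!\xdck!C\xa8\xc6!\xe7\xc5k/l?!!!\xb6]'


def try_decode(raw, encoding):
    """Return (text, None) on success or (None, error) on failure.

    Hypothetical helper, used only to compare encodings side by side.
    """
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as err:
        return None, err


utf8_text, utf8_err = try_decode(RAW, "utf-8")
latin1_text, latin1_err = try_decode(RAW, "latin-1")

# UTF-8 rejects the very first byte; latin-1 accepts every byte.
print("utf-8 error:", utf8_err)
print("latin-1 text:", repr(latin1_text))

# Encoding the latin-1 text again gives back the original bytes unchanged.
assert latin1_text.encode("latin-1") == RAW
```

Because latin-1 assigns a character to every byte value, a successful latin-1 decode does not show that the partition name was written in latin-1. It only shows that the bytes can be carried through without loss.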
What's the best approach to have this code be resistant to such input data? Would it be better if it just gave a warning and used an undecoded string so that installation can proceed?

pyudev doesn't log at all. It isn't clear what it should do in this situation. If it is possible to find the "correct" encoding, it should do that. But that is unlikely to be always true. Probably the best thing to do for the installer at this time is for blivet to catch the exception at around:

  File "/usr/lib/python3.6/site-packages/blivet/udev.py", line 73, in get_devices
    dev = device_to_dict(device)

and take whatever it considers to be the appropriate action for a failed construction of the property table for a particular device. OR blivet could take the step-by-step approach of the tests I wrote for constructing its dict. Then it can do whatever it wants w/ the particular property that can't be decoded and take it from there. So it looks like it might make most sense to reassign to blivet at this time.

Reassigning, because I think blivet will always have to handle the possibility of pyudev decode failure, regardless of what changes may be made to pyudev.

What about telling decode to handle errors some way other than by raising an exception? See https://docs.python.org/2/library/codecs.html#codec-base-classes
Passing errors='replace' to decode would allow pyudev to always present valid data.

I think that that is probably not a good idea. Objections are:
 * It would constitute a significant change in behaviour. Clients typically object to that kind of thing.
 * I don't think it would actually solve any problems/fix the bug. I think the root cause of the problem is that values can be set under one encoding and then read using another and that it is never known what the proper encoding for decoding really is (because devices can move in time and space and the values in udev properties and attributes are taken from many things).
 * New clients of pyudev would rely on and be checking values that turned out to be surprising, because they had substitute characters. Eventually, but it would take longer than with an exception, they would notice that they were not getting what they expected and that would lead to a new set of bugs being filed.
Some sort of configuration parameter that allowed a client to explicitly change the behaviour in a global way might be possible, but it all seems like a long and tricky job.

dshea, just wondering if you had an opinion.

(In reply to mulhern from comment #22)
> dshea, just wondering if you had an opinion.

Do these strings need to be unique or reproducible? The problem I see with raising an exception is how is the caller supposed to handle it? You can't (or at least shouldn't) change the default encoding at runtime, so I don't see how blivet or another caller is expected to recover from the error.

Hi, any update or workaround for this bug?

This message is a reminder that Fedora 28 is nearing its end of life. On 2019-May-28 Fedora will stop maintaining and issuing updates for Fedora 28. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 28 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
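
For anyone still looking for a workaround, the step-by-step approach suggested earlier in this thread could look roughly like the sketch below: the property table is built one key at a time instead of with dict(device.properties), so one undecodable value does not abort the whole device scan. This is an assumption-laden illustration, not blivet's or pyudev's actual code. FakeProperties is a hypothetical stand-in for pyudev's properties mapping, tolerant_property_dict is an invented name, and the DEVNAME value is made up for the example.

```python
import logging
from collections.abc import Mapping

log = logging.getLogger("udev-sketch")


class FakeProperties(Mapping):
    """Hypothetical stand-in for pyudev's device.properties.

    Values are kept as bytes and decoded as UTF-8 on access, which is the
    behaviour the traceback in this report shows.
    """

    def __init__(self, raw):
        self._raw = dict(raw)

    def __getitem__(self, key):
        return self._raw[key].decode("utf-8")

    def __iter__(self):
        return iter(self._raw)

    def __len__(self):
        return len(self._raw)


def tolerant_property_dict(properties):
    """Copy properties key by key, skipping values that cannot be decoded.

    Returns the decoded table and the list of property names that failed.
    """
    result = {}
    skipped = []
    for name in properties:
        try:
            result[name] = properties[name]
        except UnicodeDecodeError as err:
            log.warning("cannot decode udev property %s: %s", name, err)
            skipped.append(name)
    return result, skipped


props = FakeProperties({
    "DEVNAME": b"/dev/sdb1",  # illustrative value, not taken from the report
    "PARTNAME": b"\xc0#\\!U0!\xea!!\xdck!C\xa8\xc6!\xe7\xc5k/l?!!!\xb6]",
})

table, skipped = tolerant_property_dict(props)
print(table)
print(skipped)
```

Whether a skipped property should be dropped, logged, or replaced with some fallback value is left to the caller here, which matches the point made above that blivet would have to decide what to do with a property that cannot be decoded.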