Bug 2016204
Summary: traceback if there is non-utf8 character in the /proc/PID/cmdline
Product: Red Hat Enterprise Linux 8
Component: python-linux-procfs
Version: 8.6
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Jaroslav Škarvada <jskarvad>
Assignee: John Kacur <jkacur>
QA Contact: Qiao Zhao <qzhao>
CC: bhu, jkacur, mhou, qzhao, rt-maint
Keywords: Triaged
Target Milestone: rc
Fixed In Version: python-linux-procfs-0.6.3-4.el8
Type: Bug
Last Closed: 2022-05-10 15:24:48 UTC
Bug Blocks: 2022530, 2031717

Description
Jaroslav Škarvada
2021-10-20 23:08:14 UTC
Created attachment 1835346: Helper script
Created attachment 1835347: Main reproducer to run
Created attachment 1835348: Python script called by the main reproducer

I'm not able to reproduce the problem with those scripts, is there something missing?

One potential solution is this, but I want to play with it a bit before I decide.

diff --git a/procfs/procfs.py b/procfs/procfs.py
index 3b7474cccb01..a297058d98dd 100755
--- a/procfs/procfs.py
+++ b/procfs/procfs.py
@@ -357,8 +357,9 @@ class process:
         return hasattr(self, attr)
 
     def load_cmdline(self):
-        f = open("/proc/%d/cmdline" % self.pid)
-        self.cmdline = f.readline().strip().split('\0')[:-1]
+        f = open("/proc/%d/cmdline" % self.pid, mode='rb')
+        cmdline = f.readline().decode(encoding='unicode_escape')
+        self.cmdline = cmdline.strip().split('\0')[:-1]
         f.close()
 
     def load_threads(self):

*** Bug 2028367 has been marked as a duplicate of this bug. ***

There is another bug 2028367. I am not sure whether it was reproduced with the fixed version of the python-linux-procfs (I asked reporter to specify the python-linux-procfs version he used), but the traceback is the following now:

2021-12-01 11:42:58,660 ERROR tuned.units.manager: Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tuned/units/manager.py", line 119, in _try_call
    return f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/instance/instance.py", line 87, in unapply_tuning
    self._plugin.instance_unapply_tuning(self, full_rollback)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/base.py", line 312, in instance_unapply_tuning
    self._instance_unapply_static(instance, full_rollback)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 720, in _instance_unapply_static
    self._restore_ps_affinity()
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 676, in _restore_ps_affinity
    ps = self.get_processes()
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 291, in get_processes
    cmd = self._get_cmdline(proc)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 279, in _get_cmdline
    cmdline = procfs.process_cmdline(process)
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 43, in process_cmdline
    if pid_info["cmdline"]:
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 356, in __getitem__
    self.load_cmdline()
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 374, in load_cmdline
    cmdline = f.readline().decode(encoding='unicode_escape')
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 22-23: truncated \uXXXX escape

I.e. there is now:
    cmdline = f.readline().decode(encoding='unicode_escape')
you mentioned in the comment 7.

(In reply to Jaroslav Škarvada from comment #14)
It was reproduced with the python3-linux-procfs-0.6.3-3.el8, so your fix is probably not OK, moving to assigned.

Created attachment 1845048: Helper to reproduce the problem with the python3-linux-procfs-0.6.3-3
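
For anyone trying to reproduce the UnicodeDecodeError from the traceback without the attachments, here is a minimal sketch of what is probably going on. The byte strings and the `try_decode` helper are made up for illustration (they are not taken from the reproducers): a cmdline that is perfectly valid UTF-8 but contains a backslash followed by `u` and fewer than four hex digits is rejected by the `unicode_escape` codec, while genuinely non-UTF-8 bytes are rejected by a strict `utf-8` decode.

```python
def try_decode(raw, encoding):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


# Hypothetical cmdline contents, NUL-separated as in /proc/PID/cmdline.
backslash_u = b"/usr/bin/example\0--pattern=\\u12\0"  # valid UTF-8, incomplete \u escape
non_utf8 = b"/usr/bin/example\0--name=\xff\xfe\0"     # bytes that are not valid UTF-8

for label, raw in (("backslash_u", backslash_u), ("non_utf8", non_utf8)):
    for encoding in ("utf-8", "unicode_escape"):
        text, err = try_decode(raw, encoding)
        status = "OK" if err is None else "FAILED: %s" % err.reason
        print("%-12s %-15s %s" % (label, encoding, status))
```

Neither codec accepts both inputs, which would be consistent with the traceback above appearing only after the switch to `unicode_escape`.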
This bug affects the RT kernel test on CTC1. I am considering escalating this bug's priority.

Hello folks, I find this bug blocks tuning on RHEL 8.6. Here is my test environment:

python3-linux-procfs-0.6.3-3.el8.noarch

# uname -r
4.18.0-353.rt7.138.el8.x86_64

# cat > /etc/tuned/realtime-virtual-host-variables.conf <<-EOF
isolated_cores=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
isolate_managed_irq=Y
EOF

# cat /etc/tuned/realtime-virtual-host-variables.conf
isolated_cores=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
isolate_managed_irq=Y

tuned failed log:

2021-12-08 07:51:04,487 ERROR tuned.units.manager: BUG: Unhandled exception in stop_tuning: 'unicodeescape' codec can't decode bytes in position 22-23: truncated \uXXXX escape
2021-12-08 07:51:04,487 ERROR tuned.units.manager: Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tuned/units/manager.py", line 119, in _try_call
    return f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/instance/instance.py", line 87, in unapply_tuning
    self._plugin.instance_unapply_tuning(self, full_rollback)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/base.py", line 312, in instance_unapply_tuning
    self._instance_unapply_static(instance, full_rollback)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 720, in _instance_unapply_static
    self._restore_ps_affinity()
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 676, in _restore_ps_affinity
    ps = self.get_processes()
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 291, in get_processes
    cmd = self._get_cmdline(proc)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 279, in _get_cmdline
    cmdline = procfs.process_cmdline(process)
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 43, in process_cmdline
    if pid_info["cmdline"]:
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 356, in __getitem__
    self.load_cmdline()
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 374, in load_cmdline
    cmdline = f.readline().decode(encoding='unicode_escape')
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 22-23: truncated \uXXXX escape

Another tool script also has this issue. I have uploaded dpdk_nic_bind.py as an attachment.

./dpdk_nic_bind.py -u 0000:5e:00.0
Traceback (most recent call last):
  File "./dpdk_nic_bind.py", line 1048, in <module>
    main()
  File "./dpdk_nic_bind.py", line 1042, in main
    do_arg_actions()
  File "./dpdk_nic_bind.py", line 1025, in do_arg_actions
    unbind_all(args, force_flag)
  File "./dpdk_nic_bind.py", line 713, in unbind_all
    unbind_one(d, force)
  File "./dpdk_nic_bind.py", line 599, in unbind_one
    (dev[b"Slot"], dev[b"Device_str"], dev[b"Interface"]))
KeyError: b'Slot'

# dpdk-devbind.py --status

Network devices using kernel driver
===================================
0000:18:00.0 'NetXtreme BCM5720 Gigabit Ethernet PCIe 165f' if=eno1 drv=tg3 unused=igb_uio,vfio-pci *Active*
0000:18:00.1 'NetXtreme BCM5720 Gigabit Ethernet PCIe 165f' if=eno2 drv=tg3 unused=igb_uio,vfio-pci
0000:19:00.0 'NetXtreme BCM5720 Gigabit Ethernet PCIe 165f' if=eno3 drv=tg3 unused=igb_uio,vfio-pci
0000:19:00.1 'NetXtreme BCM5720 Gigabit Ethernet PCIe 165f' if=eno4 drv=tg3 unused=igb_uio,vfio-pci
0000:3b:00.0 'Ethernet Controller XXV710 for 25GbE SFP28 158b' if=ens1f0 drv=i40e unused=igb_uio,vfio-pci
0000:3b:00.1 'Ethernet Controller XXV710 for 25GbE SFP28 158b' if=ens1f1 drv=i40e unused=igb_uio,vfio-pci
0000:5e:00.0 'Ethernet Controller XXV710 for 25GbE SFP28 158b' if=ens3f0 drv=i40e unused=igb_uio,vfio-pci
0000:5e:00.1 'Ethernet Controller XXV710 for 25GbE SFP28 158b' if=ens3f1 drv=i40e unused=igb_uio,vfio-pci

(In reply to mhou from comment #24)
Not exactly a standalone script. In any case the original "fix" actually increased the problem, but the original problem is hard to trigger. I have a patch that will "fix" this.

Commit 7570fc0d6082 was meant to solve the UnicodeDecodeError. Instead it actually increased the problem by reading lines as bytes and decoding them. The original problem is hard to trigger and doesn't trigger consistently with reproducers. In addition there seems to be a difference in how this is handled between python-3.6 and python-3.9.

For now, we should return the code to reading as utf-8 (the default) since that handles more cases than the current code. We can catch the UnicodeDecodeError and ignore it for now. It is not ideal because we are not handling some pids that trigger the error.

This patch also includes a fix for a FileNotFoundError which can occur if a pid exits and disappears before we try to read it in the /proc file system.

We are seeing this traceback also on RHEL-9 now (bug 2031645), cloning to RHEL-9.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (python-linux-procfs bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2064
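
For reference, the approach described in the patch note above (go back to reading the file as text, and skip pids that raise UnicodeDecodeError or FileNotFoundError) could look roughly like the sketch below. This is not the code shipped in python-linux-procfs: `read_cmdline` and its `path` parameter are hypothetical, chosen so the behaviour can be tried against ordinary files. The sketch also passes `encoding="utf-8"` explicitly, an assumption on my part, because the default text encoding of `open()` depends on the locale.

```python
import os
import tempfile


def read_cmdline(path):
    """Return the NUL-separated arguments stored in `path`.

    Hypothetical helper, not the procfs.py implementation. Returns an
    empty list when the file has vanished or cannot be decoded as UTF-8,
    mirroring the "catch and ignore for now" behaviour described above.
    """
    try:
        with open(path, encoding="utf-8") as f:
            data = f.readline()
    except (UnicodeDecodeError, FileNotFoundError):
        return []
    # Same parsing as the pre-fix code: split on NUL, drop the trailing empty item.
    return data.strip().split("\0")[:-1]


# Exercise the helper on stand-in files instead of real /proc entries.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good")
    bad = os.path.join(tmp, "bad")
    with open(good, "wb") as f:
        f.write(b"/usr/bin/example\0--flag\0")
    with open(bad, "wb") as f:
        f.write(b"/usr/bin/example\0\xff\xfe\0")

    good_args = read_cmdline(good)
    bad_args = read_cmdline(bad)
    missing_args = read_cmdline(os.path.join(tmp, "missing"))

print(good_args, bad_args, missing_args)
```

As the patch note says, the trade-off is that a process with an undecodable cmdline is simply reported with no arguments rather than being handled properly.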