Bug 2016204
| Summary: | traceback if there is non-utf8 character in the /proc/PID/cmdline | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Jaroslav Škarvada <jskarvad> |
| Component: | python-linux-procfs | Assignee: | John Kacur <jkacur> |
| Status: | CLOSED ERRATA | QA Contact: | Qiao Zhao <qzhao> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 8.6 | CC: | bhu, jkacur, mhou, qzhao, rt-maint |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | --- | Flags: | pm-rhel: mirror+ |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | python-linux-procfs-0.6.3-4.el8 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 2022530 2031717 (view as bug list) | Environment: | |
| Last Closed: | 2022-05-10 15:24:48 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2022530, 2031717 | | |
| Attachments: | | | |
Created attachment 1835346 [details]
Helper script
Created attachment 1835347 [details]
Main reproducer to run
Created attachment 1835348 [details]
Python script called by the main reproducer
I'm not able to reproduce the problem with those scripts, is there something missing? One potential solution is this, but I want to play with it a bit before I decide.
diff --git a/procfs/procfs.py b/procfs/procfs.py
index 3b7474cccb01..a297058d98dd 100755
--- a/procfs/procfs.py
+++ b/procfs/procfs.py
@@ -357,8 +357,9 @@ class process:
         return hasattr(self, attr)
 
     def load_cmdline(self):
-        f = open("/proc/%d/cmdline" % self.pid)
-        self.cmdline = f.readline().strip().split('\0')[:-1]
+        f = open("/proc/%d/cmdline" % self.pid, mode='rb')
+        cmdline = f.readline().decode(encoding='unicode_escape')
+        self.cmdline = cmdline.strip().split('\0')[:-1]
         f.close()
 
     def load_threads(self):
*** Bug 2028367 has been marked as a duplicate of this bug. ***
There is another bug 2028367. I am not sure whether it was reproduced with the fixed version of the python-linux-procfs (I asked reporter to specify the python-linux-procfs version he used), but the traceback is the following now:
2021-12-01 11:42:58,660 ERROR tuned.units.manager: Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tuned/units/manager.py", line 119, in _try_call
    return f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/instance/instance.py", line 87, in unapply_tuning
    self._plugin.instance_unapply_tuning(self, full_rollback)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/base.py", line 312, in instance_unapply_tuning
    self._instance_unapply_static(instance, full_rollback)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 720, in _instance_unapply_static
    self._restore_ps_affinity()
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 676, in _restore_ps_affinity
    ps = self.get_processes()
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 291, in get_processes
    cmd = self._get_cmdline(proc)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 279, in _get_cmdline
    cmdline = procfs.process_cmdline(process)
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 43, in process_cmdline
    if pid_info["cmdline"]:
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 356, in __getitem__
    self.load_cmdline()
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 374, in load_cmdline
    cmdline = f.readline().decode(encoding='unicode_escape')
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 22-23: truncated \uXXXX escape
I.e. there is now:
cmdline = f.readline().decode(encoding='unicode_escape')
you mentioned in the comment 7.
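The "truncated \uXXXX escape" error in the traceback above can be reproduced in isolation. The following is a minimal sketch, not the reporter's reproducer: the byte strings are hypothetical cmdline contents chosen for illustration, and `decode_like_patch` is a made-up helper that only mirrors the decode call from the proposed patch. It suggests that `unicode_escape` accepts a non-UTF-8 byte such as 0xc8 (non-escape bytes are decoded as Latin-1) but fails as soon as the data contains a literal backslash followed by `u` without four hex digits.

```python
# Hypothetical cmdline bytes (NUL-separated argv). The second argument holds a
# literal backslash followed by "u", which unicode_escape reads as the start
# of a \uXXXX escape sequence.
raw_with_backslash = b"/usr/bin/example\0--dir=C:\\users\0"

# A non-UTF-8 byte such as 0xc8 (the byte from the original report) does
# decode, because unicode_escape treats non-escape bytes as Latin-1.
raw_non_utf8 = b"\xc8\0"


def decode_like_patch(data):
    """Apply the same decode step as the proposed patch (illustrative helper)."""
    return data.decode(encoding="unicode_escape")


decoded_non_utf8 = decode_like_patch(raw_non_utf8)

try:
    decode_like_patch(raw_with_backslash)
    failure = None
except UnicodeDecodeError as exc:
    failure = exc

print(repr(decoded_non_utf8))
print(failure)
```

Under these assumptions, any process whose arguments contain a backslash sequence that is not a valid Python escape would make the patched `load_cmdline` raise, which would be consistent with the fix widening the problem rather than narrowing it.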
(In reply to Jaroslav Škarvada from comment #14)
It was reproduced with the python3-linux-procfs-0.6.3-3.el8, so your fix is probably not OK, moving to assigned.
Created attachment 1845048 [details]
Helper to reproduce the problem with the python3-linux-procfs-0.6.3-3
This bug affects the RT kernel test on CTC1. I am considering escalating the priority of this bug.
Hello folks,
I find that this bug blocks tuning on RHEL 8.6. Here is my test environment:
python3-linux-procfs-0.6.3-3.el8.noarch
# uname -r
4.18.0-353.rt7.138.el8.x86_64
# cat > /etc/tuned/realtime-virtual-host-variables.conf <<-EOF
isolated_cores=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
isolate_managed_irq=Y
EOF
# cat /etc/tuned/realtime-virtual-host-variables.conf
isolated_cores=2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
isolate_managed_irq=Y
tuned failed log:
2021-12-08 07:51:04,487 ERROR tuned.units.manager: BUG: Unhandled exception in stop_tuning: 'unicodeescape' codec can't decode bytes in position 22-23: truncated \uXXXX escape
2021-12-08 07:51:04,487 ERROR tuned.units.manager: Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tuned/units/manager.py", line 119, in _try_call
    return f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/instance/instance.py", line 87, in unapply_tuning
    self._plugin.instance_unapply_tuning(self, full_rollback)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/base.py", line 312, in instance_unapply_tuning
    self._instance_unapply_static(instance, full_rollback)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 720, in _instance_unapply_static
    self._restore_ps_affinity()
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 676, in _restore_ps_affinity
    ps = self.get_processes()
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 291, in get_processes
    cmd = self._get_cmdline(proc)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_scheduler.py", line 279, in _get_cmdline
    cmdline = procfs.process_cmdline(process)
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 43, in process_cmdline
    if pid_info["cmdline"]:
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 356, in __getitem__
    self.load_cmdline()
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 374, in load_cmdline
    cmdline = f.readline().decode(encoding='unicode_escape')
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 22-23: truncated \uXXXX escape
Another tool script also has this issue. I have uploaded dpdk_nic_bind.py as an attachment.
./dpdk_nic_bind.py -u 0000:5e:00.0
Traceback (most recent call last):
  File "./dpdk_nic_bind.py", line 1048, in <module>
    main()
  File "./dpdk_nic_bind.py", line 1042, in main
    do_arg_actions()
  File "./dpdk_nic_bind.py", line 1025, in do_arg_actions
    unbind_all(args, force_flag)
  File "./dpdk_nic_bind.py", line 713, in unbind_all
    unbind_one(d, force)
  File "./dpdk_nic_bind.py", line 599, in unbind_one
    (dev[b"Slot"], dev[b"Device_str"], dev[b"Interface"]))
KeyError: b'Slot'
# dpdk-devbind.py --status
Network devices using kernel driver
===================================
0000:18:00.0 'NetXtreme BCM5720 Gigabit Ethernet PCIe 165f' if=eno1 drv=tg3 unused=igb_uio,vfio-pci *Active*
0000:18:00.1 'NetXtreme BCM5720 Gigabit Ethernet PCIe 165f' if=eno2 drv=tg3 unused=igb_uio,vfio-pci
0000:19:00.0 'NetXtreme BCM5720 Gigabit Ethernet PCIe 165f' if=eno3 drv=tg3 unused=igb_uio,vfio-pci
0000:19:00.1 'NetXtreme BCM5720 Gigabit Ethernet PCIe 165f' if=eno4 drv=tg3 unused=igb_uio,vfio-pci
0000:3b:00.0 'Ethernet Controller XXV710 for 25GbE SFP28 158b' if=ens1f0 drv=i40e unused=igb_uio,vfio-pci
0000:3b:00.1 'Ethernet Controller XXV710 for 25GbE SFP28 158b' if=ens1f1 drv=i40e unused=igb_uio,vfio-pci
0000:5e:00.0 'Ethernet Controller XXV710 for 25GbE SFP28 158b' if=ens3f0 drv=i40e unused=igb_uio,vfio-pci
0000:5e:00.1 'Ethernet Controller XXV710 for 25GbE SFP28 158b' if=ens3f1 drv=i40e unused=igb_uio,vfio-pci
(In reply to mhou from comment #24)
Not exactly a standalone script. In any case the original "fix" actually increased the problem, but the original problem is hard to trigger. I have a patch that will "fix" this.
Commit 7570fc0d6082 meant to solve the UnicodeDecodeError. Instead it actually increased the problem by reading lines as bytes and decoding them.
The original problem is hard to trigger and doesn't trigger consistently with reproducers. In addition there seems to be a difference in how this is handled between python-3.6 and python-3.9.
For now, we should return the code to reading as utf-8 (the default) since that handles more cases than the current code. We can catch the UnicodeDecodeError and ignore it for now. It is not ideal because we are not handling some pids that trigger the error.
This patch also includes a fix for a FileNotFoundError which can occur if a pid exits and disappears before we try to read it in the /proc file system.
We are seeing this traceback also on RHEL-9 now (bug 2031645), cloning to RHEL-9.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (python-linux-procfs bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2022:2064
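For reference, the approach described in the patch comment above (read as UTF-8 text, ignore a UnicodeDecodeError, and tolerate a pid that has already exited) could look roughly like the sketch below. This is not the shipped patch: `load_cmdline` here is a standalone, hypothetical function that takes a file path so it can be exercised without /proc, returning `None` on failure is an assumption made for illustration, and `encoding="utf-8"` is passed explicitly rather than relying on the locale default.

```python
import os
import tempfile


def load_cmdline(path):
    """Return the argv list from a cmdline-style file.

    Returns None if the file has disappeared (process exited) or if its
    contents are not valid UTF-8. Both cases are silently skipped.
    """
    try:
        with open(path, encoding="utf-8") as f:
            # Arguments are NUL-separated and the data ends with a NUL,
            # so the last element after split() is empty and is dropped.
            return f.readline().strip().split("\0")[:-1]
    except (UnicodeDecodeError, FileNotFoundError):
        return None


def demo():
    """Exercise the three cases using temporary files instead of /proc."""
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, "good")
        bad = os.path.join(tmp, "bad")
        with open(good, "wb") as f:
            f.write(b"sleep\x00100\x00")
        with open(bad, "wb") as f:
            f.write(b"\xc8\x00")  # 0xc8 is the byte from the original report
        missing = os.path.join(tmp, "missing")
        return load_cmdline(good), load_cmdline(bad), load_cmdline(missing)


results = demo()
print(results)
```

As the comment notes, the trade-off is that processes whose cmdline cannot be decoded are skipped rather than handled.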
Description of problem:
Processes can set arbitrary characters in their cmdline, and procfs will traceback if there is a non-utf8 character in the cmdline.

Version-Release number of selected component (if applicable):
python3-linux-procfs-0.6.3-1.el8

How reproducible:
Always

Steps to Reproduce:
1. Copy three attached files (repr, test.py, helper) to the target system
2. ./repr

Actual results:
Traceback (most recent call last):
  File "./test.py", line 6, in <module>
    cmdline = procfs.process_cmdline(procfs.process(int(sys.argv[1])))
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 43, in process_cmdline
    if pid_info["cmdline"]:
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 343, in __getitem__
    self.load_cmdline()
  File "/usr/lib/python3.6/site-packages/procfs/procfs.py", line 361, in load_cmdline
    self.cmdline = f.readline().strip().split('\0')[:-1]
  File "/usr/lib64/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc8 in position 0: invalid continuation byte

Expected results:
No traceback

Additional info:
We use procfs in TuneD and we are seeing these tracebacks in customer reports. procfs should cope with arbitrary characters in the cmdline and not traceback, e.g. instead of utf strings it could handle the cmdline as a byte array.
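The byte-array suggestion in the additional info could be sketched as follows. This is an illustration only, not code from python-linux-procfs: `parse_cmdline`, `read_cmdline_bytes`, and `to_text` are hypothetical names, and using the `surrogateescape` error handler for display is one possible choice rather than anything the report prescribes.

```python
def parse_cmdline(data):
    """Split raw cmdline bytes into a list of bytes arguments.

    No decoding happens here, so arbitrary byte values cannot raise.
    """
    if not data:
        return []
    parts = data.split(b"\0")
    # The data normally ends with a NUL, which leaves an empty last element.
    if parts[-1] == b"":
        parts.pop()
    return parts


def read_cmdline_bytes(pid):
    """Read /proc/<pid>/cmdline in binary mode (Linux only)."""
    with open("/proc/%d/cmdline" % pid, "rb") as f:
        return parse_cmdline(f.read())


def to_text(arg):
    """Decode one argument for display without ever raising.

    surrogateescape maps undecodable bytes to lone surrogates, so the
    original bytes can be recovered by encoding with the same handler.
    """
    return arg.decode("utf-8", errors="surrogateescape")


# Hypothetical cmdline containing the 0xc8 byte from the traceback above.
args = parse_cmdline(b"python3\0\xc8x\0")
texts = [to_text(a) for a in args]
round_trip = [t.encode("utf-8", errors="surrogateescape") for t in texts]
```

Callers that compare or match command lines would then need to work with bytes, or with the surrogate-escaped strings, which is a behavior change for existing users of the `cmdline` attribute.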