Bug 577365 - python-linux-procfs: python traceback while monitoring system
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-utilities
Version: 1.2
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Arnaldo Carvalho de Melo
QA Contact: David Sommerseth
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-03-26 19:11 UTC by Clark Williams
Modified: 2016-05-22 23:30 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
On systems with a large number of CPUs (24 or more), Tuna could attempt to read procfs data for a process that had already terminated, and then terminate unexpectedly. With this update, Tuna has been modified to catch the exception and remove the terminated process from its data structures.
Clone Of:
Environment:
Last Closed: 2010-10-11 15:10:29 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2010:0762 | 0 | normal | SHIPPED_LIVE | Red Hat Enterprise MRG Realtime bug fix and enhancement update | 2010-10-11 15:10:12 UTC

Description Clark Williams 2010-03-26 19:11:46 UTC
Description of problem:

Ran tuna on a 24-core AMD system and encountered the following Python traceback:

# tuna
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/tuna/tuna_gui.py", line 93, in refresh
    self.irqview.refresh()
  File "/usr/lib/python2.4/site-packages/tuna/gui/irqview.py", line 259, in refresh
    self.show()
  File "/usr/lib/python2.4/site-packages/tuna/gui/irqview.py", line 240, in show
    self.set_irq_columns(row, irq, irq_info, nics)
  File "/usr/lib/python2.4/site-packages/tuna/gui/irqview.py", line 188, in set_irq_columns
    pids = self.ps.find_by_regex(irq_re)
  File "/usr/lib/python2.4/site-packages/procfs/procfs.py", line 229, in find_by_regex
    if regex.match(self.processes[pid]["stat"]["comm"]):
  File "/usr/lib/python2.4/site-packages/procfs/procfs.py", line 147, in __getitem__
    setattr(self, attr, sclass(self.pid, self.basedir))
  File "/usr/lib/python2.4/site-packages/procfs/procfs.py", line 60, in __init__
    self.load(basedir)
  File "/usr/lib/python2.4/site-packages/procfs/procfs.py", line 72, in load
    f = open("%s/%d/stat" % (basedir, self.pid))
IOError: [Errno 2] No such file or directory: '/proc/9461/stat'


Version-Release number of selected component (if applicable):

# rpm -q tuna
tuna-0.9.2-1.el5rt


How reproducible:

random

Steps to Reproduce:
1. run tuna on amd-istanbul-24.farm.hsv.redhat.com
2. let run for >10 minutes
3. hope to see traceback


Additional info:

Probably just another spot that needs to be guarded by try/except to catch process disappearance.
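
A minimal sketch of such a guard, for illustration only; it is not the upstream fix. The helper name read_comm and its basedir parameter are hypothetical, and the code is written for Python 3 rather than the Python 2.4 shown in the traceback. It treats a missing /proc/<pid>/stat as "the process has exited" instead of letting the error propagate:

```python
def read_comm(pid, basedir="/proc"):
    """Return the command name from <basedir>/<pid>/stat,
    or None if the process disappeared before it could be read.

    Hypothetical helper, not part of python-linux-procfs.
    """
    try:
        with open("%s/%d/stat" % (basedir, pid)) as f:
            data = f.read()
    except OSError:
        # The process exited between listing the pids and opening the
        # file. In Python 3, IOError is an alias of OSError, so this
        # also covers the "[Errno 2] No such file or directory" case.
        return None
    # The comm field is wrapped in parentheses and may itself contain
    # spaces or ')', so slice from the first '(' to the last ')'.
    return data[data.index("(") + 1:data.rindex(")")]
```

A caller would then skip any pid for which read_comm returns None rather than assuming every listed pid is still alive.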

Comment 1 Arnaldo Carvalho de Melo 2010-03-26 19:32:44 UTC
The problem is in python-linux-procfs. I committed a fix upstream and will provide a package to test on this machine.

Comment 2 Arnaldo Carvalho de Melo 2010-05-10 20:04:36 UTC
Tested on the Istanbul machine; couldn't reproduce. Also tested locally on a machine with 10 Gbit/s cards. Brew build at:

https://brewweb.devel.redhat.com/taskinfo?taskID=2433122

Will tag after some more testing.

Comment 3 Clark Williams 2010-10-04 19:30:42 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
* Cause: 
* Consequence: Python backtrace
* Fix: 
* Result: Works properly on large (>=24 core) cpu systems

Comment 5 Arnaldo Carvalho de Melo 2010-10-05 15:33:48 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,4 +1,4 @@
-* Cause: 
+* Cause: Processes can terminate while its procfs data is being read
 * Consequence: Python backtrace
-* Fix: 
+* Fix: Catch exception and remove dead process from data structures
 * Result: Works properly on large (>=24 core) cpu systems
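
The "Fix" line above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual python-linux-procfs patch: find_live_by_regex, the processes dict, and the load_comm callback are all hypothetical names standing in for the library's internals.

```python
import re


def find_live_by_regex(processes, regex, load_comm):
    """Return pids whose command name matches regex, pruning dead pids.

    processes: dict mapping pid -> cached per-process data (hypothetical).
    load_comm: callable taking a pid and returning its command name; it is
               assumed to raise OSError if the process has exited.
    """
    found = []
    # Iterate over a copy of the keys because entries are deleted below.
    for pid in list(processes):
        try:
            comm = load_comm(pid)
        except OSError:
            # The process terminated while its procfs data was being
            # read: drop it from the table and carry on with the rest.
            del processes[pid]
            continue
        if regex.match(comm):
            found.append(pid)
    return found
```

Removing the stale entry, rather than only skipping it, keeps later refreshes from retrying a pid that no longer exists.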

Comment 6 David Sommerseth 2010-10-07 14:32:25 UTC
Tried running tuna-0.9.2-1 and python-linux-procfs-0.4.2-1 on a 32-core box for 30 minutes without triggering this bug.

Ran tuna-0.9.4-1 and python-linux-procfs-0.4.5-1 for over 1 hour without any issues. As it seems to work reliably -> moving to VERIFIED.

Comment 7 Jaromir Hradilek 2010-10-11 14:20:04 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,4 +1 @@
-* Cause: Processes can terminate while its procfs data is being read
+On systems with a large number of CPUs (24 and more), Tuna may have attempted to read procfs data for a terminated process and terminate unexpectedly. With this update, Tuna has been modified to catch an exception and remove the terminated process from its data structures.
-* Consequence: Python backtrace
-* Fix: Catch exception and remove dead process from data structures
-* Result: Works properly on large (>=24 core) cpu systems

Comment 8 errata-xmlrpc 2010-10-11 15:10:29 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0762.html

