Description of problem:
Hi. A systemtap module that records file system I/O patterns (distribution of reads, writes, read seeks and write seeks, plus read/write throughput) has caused several kernel panics on our computing cluster.

Version-Release number of selected component (if applicable):
systemtap-1.6-7.el5_8.x86_64
systemtap-runtime-1.6-7.el5_8.x86_64

How reproducible:
30 kernel panics versus 20000+ successful runs (30 of 900 nodes).

Steps to Reproduce:
1. Run the script from a cron job that looks for unprobed jobs on the computing node every 15 minutes.

Actual results:
Some computing nodes crashed.

Expected results:

Additional info:
The script was compiled with the command
stap -r "2.6.18-348.3.1.el5" -DMAXMAPENTRIES=10240 client.stp -m iopattern.ko
and run with the command
staprun -o myoutput.raw -x pid iopattern.ko
Unfortunately, crash dumps are not configured on the client node. The last screen output is attached, along with a normal output of the script for comparison.
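For anyone trying to reproduce this, the build and run steps quoted above can be sketched as a small dry-run script. It only prints the two commands from the report and does not invoke stap or staprun itself. The function names and the TARGET_PID value are hypothetical placeholders; the report gives only "pid" and does not say how the cron job picks the target process.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands from the report, does not execute them.
KERNEL_RELEASE="2.6.18-348.3.1.el5"   # kernel release given in the report
MODULE="iopattern.ko"                 # module name given in the report
TARGET_PID="${TARGET_PID:-12345}"     # hypothetical; the report only says "pid"

# Build command, as quoted in the report.
build_cmd() {
    printf 'stap -r "%s" -DMAXMAPENTRIES=10240 client.stp -m %s\n' \
        "$KERNEL_RELEASE" "$MODULE"
}

# Run command, as quoted in the report, with the placeholder PID filled in.
run_cmd() {
    printf 'staprun -o myoutput.raw -x %s %s\n' "$TARGET_PID" "$MODULE"
}

build_cmd
run_cmd
```

Printing the commands before running them makes it easy to check that the kernel release and module name match the node under test.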
Please attach the last screen output and any other basic information you can, plus the output of "stap-report". Depending on the details, this may point to a problem in systemtap or in the kernel. Please note that a slightly newer systemtap, version 1.8, was included in a later version of RHEL5, and a newer one still is in the Developer Toolset package. One can also build a version from upstream git that will work on RHEL5.
Please note that without more information, we will be unable to diagnose the problem further. If this problem still occurs, please supply data as per comment #1.