Bug 624657 - Timing issue in systemtap initscript restart command
Summary: Timing issue in systemtap initscript restart command
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: systemtap
Version: 6.0
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: David Smith
QA Contact: qe-baseos-tools-bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-08-17 12:00 UTC by Petr Muller
Modified: 2016-09-20 02:07 UTC

Fixed In Version: systemtap-1.4-2.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 644350
Environment:
Last Closed: 2011-05-19 13:54:36 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHBA-2011:0651
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: systemtap bug fix and enhancement update
Last Updated: 2011-05-19 09:37:25 UTC

Description Petr Muller 2010-08-17 12:00:05 UTC
Description of problem:
Our Beaker test for the initscript sometimes fails to restart the service when some script is running. When investigating this issue, I've found out this is probably a timing issue: if I add a slight lag (like, sleep 3) between 'stop' and 'start' calls in 'restart' function, the issue disappears.

Version-Release number of selected component (if applicable):
systemtap-1.2-9.el6

How reproducible:
On some boxes, not always. When it appears, it is usually reproducible.

Steps to Reproduce:
1. cat > /etc/systemtap/script.d/heart.stp << EOF
> probe timer.ms(500){
>   print("Beat!\n");
> }
> EOF
2. # service systemtap start; sleep 1; service systemtap restart;
  
Actual results:
Starting systemtap:  Compiling heart ... done
 Starting heart ... done
                                                           [  OK  ]
Stopping systemtap:                                        [  OK  ]
Starting systemtap: heart is dead, but another script is running.
                                                           [FAILED]

Expected results:
Starting systemtap: [  OK  ]
Stopping systemtap: [  OK  ]
Starting systemtap:  Compiling heart ... done
 Starting heart ... done
[  OK  ]

Additional info:

Comment 1 David Smith 2010-11-23 20:41:33 UTC
I haven't been able to duplicate this (tried on 3 different machines).  On a machine where this happens, can you show me the new info added to /var/log/systemtap.log?

Comment 2 Petr Muller 2010-11-24 12:58:22 UTC
David,

I had a look at the issue and found that I had omitted quite an important piece of reproduction information: it was probably already configured by the automated test run, so I forgot to include it. Sorry about that. I see the issue after doing:

# echo "heart_OPT='-o /tmp/stap-test.log'" > /etc/systemtap/conf.d/heart.conf

before doing step 2. I haven't managed to reproduce the problem without this. Even with it, I had to run the start-sleep-restart triple in a loop, seeing it in about 1 of 5 cases on one box. I can see it failing consistently on s390x, though. 

This shows up in /var/log/systemtap.log:
# tail -f /var/log/systemtap.log
Nov 24 07:57:12: Starting systemtap: 
Nov 24 07:57:12:  Starting heart ... 
Nov 24 07:57:12: Exec: /usr/bin/staprun -o /tmp/stap-test.log -D /var/cache/systemtap/2.6.32-71.el6.ppc64/heart.ko
Nov 24 07:57:12: Exec: cp -f ./pid /var/run/systemtap/heart
Nov 24 07:57:12: done
Nov 24 07:57:12: Pass: systemtap startup
Nov 24 07:57:13: Stopping systemtap: 
Nov 24 07:57:13: Exec: kill -TERM 3787
Nov 24 07:57:13: Pass: systemtap stopping 
Nov 24 07:57:13: Starting systemtap: 
Nov 24 07:57:13: heart is dead, but another script is running.
Nov 24 07:57:13: Error: Failed to run "heart". (4)
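
For anyone trying to reproduce this, the start-sleep-restart loop described above could be scripted roughly as follows. This is only a sketch: the function name `repro_loop` and the `SERVICE_CMD` and `REPRO_SLEEP` variables are hypothetical, and it assumes the initscript returns a non-zero exit status when `restart` reports [FAILED].

```shell
#!/bin/sh
# Hypothetical reproduction helper, not part of systemtap.
# SERVICE_CMD can be overridden for a dry run; by default it is 'service'.
SERVICE_CMD=${SERVICE_CMD:-service}

# repro_loop RUNS: run the start/sleep/restart triple RUNS times and
# print how many of the restarts failed (assumes non-zero exit on failure).
repro_loop() {
    runs=$1
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        $SERVICE_CMD systemtap start >/dev/null 2>&1
        sleep "${REPRO_SLEEP:-1}"
        if ! $SERVICE_CMD systemtap restart >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        # Leave the service stopped so each iteration starts clean.
        $SERVICE_CMD systemtap stop >/dev/null 2>&1
        i=$((i + 1))
    done
    echo "$failures"
}

# Example (as root, with heart.stp and heart.conf in place):
#   repro_loop 20
```

Running it with a larger count on the affected box should make the intermittent failure rate easier to see than running the triple by hand.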

Comment 3 David Smith 2010-11-30 20:27:03 UTC
Fixed in upstream commit 671a1d8:

<http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=commitdiff;h=671a1d824ff1320f9e2fa3ed27d5458cc44a5dcc>

Using the 'heart_OPT' configuration allowed me to reproduce this problem.  Basically we were sending stapio a signal to make it unload the module, but not waiting on the module to unload.

While testing the solution to the stopping problem, I ran into a related, but different, problem when loading the module.  When the '-D' option is used, staprun detaches from the terminal and then prints the pid.  Then we'd check the contents of the pid file before it was written.

The above commit fixes both problems.
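
To illustrate the two races described above, here is a rough sketch of the kind of polling that avoids them. It is not the code from commit 671a1d8; the helper names (`wait_until`, `module_unloaded`, `pidfile_written`) and the `MODULES_FILE` and `POLL_INTERVAL` variables are made up for illustration, and it assumes the module name matches the script name as listed in /proc/modules.

```shell
#!/bin/sh
# Illustrative helpers only; names are hypothetical, not from the initscript.

# wait_until TRIES CMD [ARGS...]: poll CMD until it succeeds, giving up
# after TRIES attempts. Returns 0 on success, 1 on timeout.
wait_until() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        sleep "${POLL_INTERVAL:-1}"
    done
    return 1
}

# True once no module named $1 is listed in the kernel module table.
module_unloaded() {
    ! grep -q "^$1 " "${MODULES_FILE:-/proc/modules}"
}

# True once the pid file $1 exists and is non-empty.
pidfile_written() {
    [ -s "$1" ]
}

# Stop side: signal the process, then wait for the module to go away
# instead of returning immediately:
#   kill -TERM "$pid" && wait_until 10 module_unloaded heart
# Start side: wait for the pid file to be written before reading it:
#   wait_until 10 pidfile_written ./pid && cp -f ./pid /var/run/systemtap/heart
```

Bounding the number of attempts keeps a stuck module or a staprun that never writes its pid from hanging the initscript indefinitely.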

Comment 7 errata-xmlrpc 2011-05-19 13:54:36 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0651.html

