Bug 1008509 - beaker-provision does not kill power script child processes
Status: CLOSED CURRENTRELEASE
Product: Beaker
Classification: Community
Component: lab controller
Version: 0.14
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 0.14.2
Target Release: ---
Assigned To: Raymond Mancy
QA Contact: tools-bugs
Keywords: Reopened
Reported: 2013-09-16 09:51 EDT by Raymond Mancy
Modified: 2014-12-07 20:16 EST
CC List: 10 users

Doc Type: Bug Fix
Last Closed: 2013-11-06 20:47:49 EST
Type: Bug
Attachments: None
Description Raymond Mancy 2013-09-16 09:51:50 EDT
Description of problem:

Systems that share the same management interface can sometimes stop each other from running jobs if a blocking process goes AWOL.


Version-Release number of selected component (if applicable):

0.14.1

How reproducible:


Steps to Reproduce:
1. ?

Actual results:

Job was not running

Expected results:

beaker-provision should have timed out the 'Waiting' command.

Additional info:

The configure_netboot entries etc. were still in the Queued state (and had been for a couple of days). beaker-provision et al. were running, so there was no problem with Beaker per se.
Comment 4 Dan Callaghan 2013-09-16 18:42:50 EDT
Beaker-provision already enforces a timeout on power commands and kills the script if the timeout is exceeded. It sounds like the problem here is that the power script spawned a child process (telnet) which wasn't cleaned up.

Beaker-provision should make sure each power command is run in its own progress group and then kill the entire process group on timeout.
Comment 5 Dan Callaghan 2013-09-16 21:18:45 EDT
(In reply to Dan Callaghan from comment #4)
> Beaker-provision should make sure each power command is run in its own
> progress group

process group
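
For illustration, here is a minimal Python 3 sketch of the process-group approach suggested in comment 4. It is not taken from Beaker's code; the helper name run_power_command and its parameters are hypothetical. It assumes a POSIX system and uses only standard-library calls (subprocess.Popen with start_new_session=True, and os.killpg).

```python
import os
import signal
import subprocess


def run_power_command(argv, timeout):
    """Run argv in its own process group; kill the whole group on timeout.

    Hypothetical helper for illustration, not Beaker's actual code.
    """
    # start_new_session=True makes the child call setsid(), so it becomes the
    # leader of a new process group whose ID equals its own PID.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Signal the whole group rather than just the script, so that any
        # children it spawned (telnet in this report) are terminated too.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the script itself
        raise
```

A caller might invoke it as run_power_command(["/path/to/power-script"], timeout=300), where both the path and the timeout value are placeholders. Children that inherit the script's process group are killed along with it; a child that deliberately moves itself into a different group would not be.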
Comment 6 Nick Coghlan 2013-09-17 20:23:14 EDT
This is the kind of provisioning reliability fix I'd like us to focus on in 0.16 :)
Comment 8 Dan Callaghan 2013-09-23 19:41:03 EDT
The other case beaker-provision should handle better is when the power script crashes or is killed, and leaves behind child processes. So really it should kill the process group in all cases, not just when timeouts occur.
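
The "all cases" behaviour described here could be sketched by moving the group kill into a finally clause, so it runs whether the script exits normally, crashes, or times out. As before, this is only a sketch under assumptions (Python 3, POSIX); run_and_clean_up and kill_process_group are hypothetical names, not Beaker functions.

```python
import os
import signal
import subprocess


def kill_process_group(pgid):
    """Send SIGKILL to every process in the group; tolerate an empty group."""
    try:
        os.killpg(pgid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # nothing left in the group


def run_and_clean_up(argv, timeout):
    """Run argv in its own process group and always kill the group afterwards.

    Hypothetical helper for illustration, not Beaker's actual code.
    """
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    finally:
        # Runs on success, non-zero exit, timeout, or any other exception,
        # so children left behind by a crashed script are cleaned up as well.
        kill_process_group(proc.pid)
        proc.wait()  # no-op if the script was already reaped
```

One caveat with this sketch: if the group is already empty when the finally clause runs, the kill is a no-op, but there is a small theoretical window in which the freed PID could be reused by an unrelated process group leader.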
Comment 10 Raymond Mancy 2013-10-01 20:39:38 EDT
http://gerrit.beaker-project.org/#/c/2322/
Comment 14 Raymond Mancy 2013-10-23 03:00:19 EDT
beaker 0.15.1 has been released.
Comment 15 Raymond Mancy 2013-10-23 03:02:50 EDT
This change has been nominated to be backported to the 0.14 branch, to be released as part of the next maintenance release, 0.14.2.
Comment 16 Nick Coghlan 2013-10-25 02:36:38 EDT
Adjusting target milestone to make the changes backported to 0.14.2 easier to identify. 0.15.0 has enough significant regressions that it shouldn't be used, so the change means that 0.15.1 can be effectively reidentified as the union of that tag and the 0.14.2 target milestone.
Comment 18 Raymond Mancy 2013-10-29 01:55:27 EDT
Verified as per the original instructions in comment #13.
Comment 19 Nick Coghlan 2013-11-06 20:47:49 EST
Closing as addressed in Beaker 0.14.2.
