Bug 1008509 - beaker-provision does not kill power script child processes
Summary: beaker-provision does not kill power script child processes
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Beaker
Classification: Retired
Component: lab controller
Version: 0.14
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 0.14.2
Assignee: Raymond Mancy
QA Contact: tools-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-09-16 13:51 UTC by Raymond Mancy
Modified: 2018-02-06 00:41 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-11-07 01:47:49 UTC
Embargoed:



Description Raymond Mancy 2013-09-16 13:51:50 UTC
Description of problem:

Systems that share the same management interface can sometimes stop each other from running jobs if a blocking process goes awol.


Version-Release number of selected component (if applicable):

0.14.1

How reproducible:


Steps to Reproduce:
1. ?
2.
3.

Actual results:

Job was not running

Expected results:

beaker-provision should have timed out the 'Waiting' command.

Additional info:

The configure_netboot entries etc. were still in the Queued state (and had been for a couple of days). beaker-provision et al. were running, so there was no problem with Beaker per se.

Comment 4 Dan Callaghan 2013-09-16 22:42:50 UTC
Beaker-provision already enforces a timeout on power commands and kills the script if the timeout is exceeded. It sounds like the problem here is that the power script spawned a child process (telnet) which wasn't cleaned up.

Beaker-provision should make sure each power command is run in its own progress group and then kill the entire process group on timeout.

Comment 5 Dan Callaghan 2013-09-17 01:18:45 UTC
(In reply to Dan Callaghan from comment #4)
> Beaker-provision should make sure each power command is run in its own
> progress group

process group

Comment 6 Nick Coghlan 2013-09-18 00:23:14 UTC
This is the kind of provisioning reliability fix I'd like us to focus on in 0.16 :)

Comment 8 Dan Callaghan 2013-09-23 23:41:03 UTC
The other case beaker-provision should handle better is when the power script crashes or is killed, and leaves behind child processes. So really it should kill the process group in all cases, not just when timeouts occur.
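
A minimal sketch of the approach described in comments 4 and 8, assuming Python 3 on a POSIX system. This is not the actual Beaker patch: the function name `run_power_script` and its parameters are hypothetical, chosen for illustration. The script is started as the leader of a new process group, and the whole group is killed once the script exits, crashes, or times out, so stray children such as telnet cannot linger.

```python
import os
import signal
import subprocess


def run_power_script(argv, timeout):
    """Run argv in its own process group and always clean up the group.

    Hypothetical helper for illustration only. Returns the script's exit
    status, or None if it did not finish within `timeout` seconds.
    """
    # start_new_session=True makes the child call setsid(), so it becomes
    # the leader of a new process group whose id equals its pid. Children
    # it spawns inherit that group unless they deliberately leave it.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    finally:
        # Kill the entire group in all cases, not only on timeout, so that
        # children left behind by a crashed or finished script are removed.
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # no process left in the group
        proc.wait()  # reap the script itself if it was still running
```

For example, `run_power_script(["sh", "-c", "sleep 30 & wait"], timeout=0.5)` returns `None` after roughly half a second, and the background `sleep` is killed along with the shell. A child that moves itself into a different session or process group would escape this cleanup.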

Comment 10 Raymond Mancy 2013-10-02 00:39:38 UTC
http://gerrit.beaker-project.org/#/c/2322/

Comment 14 Raymond Mancy 2013-10-23 07:00:19 UTC
beaker 0.15.1 has been released.

Comment 15 Raymond Mancy 2013-10-23 07:02:50 UTC
This change has been nominated to be back ported to the 0.14 branch, to be released as part of the next maintenance release 0.14.2.

Comment 16 Nick Coghlan 2013-10-25 06:36:38 UTC
Adjusting target milestone to make the changes backported to 0.14.2 easier to identify. 0.15.0 has enough significant regressions that it shouldn't be used, so the change means that 0.15.1 can be effectively reidentified as the union of that tag and the 0.14.2 target milestone.

Comment 18 Raymond Mancy 2013-10-29 05:55:27 UTC
Verified as per the original instructions in comment #13.

Comment 19 Nick Coghlan 2013-11-07 01:47:49 UTC
Closing as addressed in Beaker 0.14.2.

