1008509 – beaker-provision does not kill power script child processes

Summary: beaker-provision does not kill power script child processes

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Beaker
Classification:	Retired
Component:	lab controller
Sub Component:
Version:	0.14
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	0.14.2
Assignee:	Raymond Mancy
QA Contact:	tools-bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2013-09-16 13:51 UTC by Raymond Mancy
Modified:	2018-02-06 00:41 UTC
CC List:	9 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2013-11-07 01:47:49 UTC
Embargoed:

Description Raymond Mancy 2013-09-16 13:51:50 UTC

Description of problem:

Systems that share the same management interface can sometimes stop each other from running jobs if a blocking process goes AWOL.


Version-Release number of selected component (if applicable):

0.14.1

How reproducible:


Steps to Reproduce:
1. ?
2.
3.

Actual results:

Job was not running

Expected results:

beaker-provision should have timed out the 'Waiting' command.

Additional info:

The configure_netboot entries etc. were still in the Queued state (and had been for a couple of days). beaker-provision et al. were running, so there was no problem with Beaker per se.

Comment 4 Dan Callaghan 2013-09-16 22:42:50 UTC

Beaker-provision already enforces a timeout on power commands and kills the script if the timeout is exceeded. It sounds like the problem here is that the power script spawned a child process (telnet) which wasn't cleaned up.

Beaker-provision should make sure each power command is run in its own progress group and then kill the entire process group on timeout.
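
The approach proposed above (a separate process group per power command, with the whole group killed on timeout) could look roughly like the following. This is only a sketch in present-day Python 3 and is not Beaker's actual code: the function name `run_power_command` and its parameters are hypothetical, chosen for illustration.

```python
import os
import signal
import subprocess


def run_power_command(argv, timeout, **popen_kwargs):
    """Run argv in its own process group and kill the whole group on timeout.

    Hypothetical helper for illustration; not taken from beaker-provision.
    """
    # start_new_session=True makes the child call setsid(), so it becomes the
    # leader of a new process group whose pgid equals its pid. Anything the
    # script spawns (for example telnet) inherits that group by default.
    proc = subprocess.Popen(argv, start_new_session=True, **popen_kwargs)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Signal every process in the group, not just the script itself.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the script so it does not linger as a zombie
        raise
```

A child that deliberately moves itself into a different process group or session would escape this, so the sketch only covers scripts whose children stay in the inherited group.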

Comment 5 Dan Callaghan 2013-09-17 01:18:45 UTC

(In reply to Dan Callaghan from comment #4)
> Beaker-provision should make sure each power command is run in its own
> progress group

process group

Comment 6 Nick Coghlan 2013-09-18 00:23:14 UTC

This is the kind of provisioning reliability fix I'd like us to focus on in 0.16 :)

Comment 8 Dan Callaghan 2013-09-23 23:41:03 UTC

The other case beaker-provision should handle better is when the power script crashes or is killed, and leaves behind child processes. So really it should kill the process group in all cases, not just when timeouts occur.
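
Killing the group in all cases, as suggested here, might be sketched as below. Again this is an illustrative Python 3 sketch under assumptions, not the patch that was merged: `run_power_script`, `poll_interval`, and the returned tuple are hypothetical. It polls with `os.waitid` and `WNOWAIT` so that the script's exit is noticed without reaping it, which keeps the process group id reserved until the group has been signalled.

```python
import os
import signal
import subprocess
import time


def run_power_script(argv, timeout, poll_interval=0.1, **popen_kwargs):
    """Run argv in its own process group; always kill the group afterwards.

    Hypothetical helper for illustration. Returns (returncode, timed_out).
    """
    proc = subprocess.Popen(argv, start_new_session=True, **popen_kwargs)
    deadline = time.monotonic() + timeout
    timed_out = False
    try:
        while True:
            # WNOWAIT leaves an exited script unreaped, so its pid (and thus
            # the group id) cannot be reused before killpg runs below.
            # WNOHANG makes waitid return None while the script is running.
            done = os.waitid(os.P_PID, proc.pid,
                             os.WEXITED | os.WNOWAIT | os.WNOHANG)
            if done is not None:
                break
            if time.monotonic() >= deadline:
                timed_out = True
                break
            time.sleep(poll_interval)
    finally:
        # Runs on normal exit, crash, timeout, or an exception in the loop:
        # any children left behind in the group are killed too.
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # nothing left in the group
        returncode = proc.wait()  # reap the script itself
    return returncode, timed_out
```

For example, `run_power_script(["sh", "-c", "sleep 30 & exit 3"], timeout=5)` would report the script's own exit status while still killing the leftover `sleep`.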

Comment 10 Raymond Mancy 2013-10-02 00:39:38 UTC

http://gerrit.beaker-project.org/#/c/2322/

Comment 14 Raymond Mancy 2013-10-23 07:00:19 UTC

beaker 0.15.1 has been released.

Comment 15 Raymond Mancy 2013-10-23 07:02:50 UTC

This change has been nominated for backporting to the 0.14 branch, to be released as part of the next maintenance release, 0.14.2.

Comment 16 Nick Coghlan 2013-10-25 06:36:38 UTC

Adjusting target milestone to make the changes backported to 0.14.2 easier to identify. 0.15.0 has enough significant regressions that it shouldn't be used, so the change means that 0.15.1 can be effectively reidentified as the union of that tag and the 0.14.2 target milestone.

Comment 18 Raymond Mancy 2013-10-29 05:55:27 UTC

Verified as per the original instructions in comment #13.

Comment 19 Nick Coghlan 2013-11-07 01:47:49 UTC

Closing as addressed in Beaker 0.14.2.