974319 – clear_running_commands XML-RPC call fails with MemoryError

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Beaker
Classification:	Retired
Component:	lab controller
Sub Component:
Version:	0.12
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	0.13.x
Assignee:	Nick Coghlan
QA Contact:	tools-bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2013-06-14 00:21 UTC by Nick Coghlan
Modified:	2018-02-06 00:41 UTC
CC List:	7 users
Fixed In Version:
Clone Of:
Clones:	974352
Environment:
Last Closed:	2013-07-11 02:44:31 UTC
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	974352	0	unspecified	CLOSED	Log implicit XML-RPC retries on lab controller	2021-02-22 00:41:40 UTC
Red Hat Bugzilla	975220	0	unspecified	CLOSED	createrepo 0.9.9 fails with MemoryError	2021-02-22 00:41:40 UTC

Internal Links: 974352 975220

Description Nick Coghlan 2013-06-14 00:21:05 UTC

Description of problem:

beaker-provision was restarted with 51 Running operations in the command queue. It failed to restart because the "clear_running_commands" call back to the main server was failing with MemoryError.

Version-Release number of selected component (if applicable):

0.12.1

How reproducible:

Always (with those 51 commands listed as "Running" on the server)

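Steps to Reproduce:
1. Have many commands listed as "Running" in the server's command queue for the lab controller (51 in this case).
2. Restart beaker-provision on the lab controller.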

Actual results:

XML-RPC call reports MemoryError and beaker-provision fails to start

Expected results:

Running commands are marked as Failed (aborting the associated recipes), beaker-provision starts up and begins processing commands.

Additional info:

Comment 1 Nick Coghlan 2013-06-20 04:48:44 UTC

Investigation has suggested that this only arises if many stale commands (see bug 974352) have accumulated in the server's command queue, and then the provisioning daemon on the lab controller is restarted.

Due to the lack of a direct link between commands and recipes, it's also risky to execute the state update callbacks for such stale commands - if the system has since been reallocated, the stale callback may end up aborting an unrelated recipe.

Accordingly, the "clear_running_commands" operation will be updated to directly execute the following SQL before moving on to handling more recent commands:

  UPDATE command_queue
  JOIN activity ON activity.id = command_queue.id
  SET command_queue.status = 'Aborted'
  WHERE command_queue.status = 'Running'
    AND activity.created < DATE_ADD(UTC_TIMESTAMP(), INTERVAL -1 DAY);

Any remaining commands (those less than 24 hours old) will then be processed in independent transactions.
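
The two-phase cleanup described above can be sketched roughly as follows. This is an illustration only, not the actual Beaker patch: it uses an in-memory SQLite database instead of MySQL, a cut-down version of the activity and command_queue tables, and hypothetical function names (build_demo_db, clear_running_commands). The real operation would also run the state update callbacks for the recent commands, which is only marked by a comment here.

```python
import sqlite3
from datetime import datetime, timedelta


def build_demo_db(now):
    """Create a simplified, hypothetical schema with three sample commands."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE activity (id INTEGER PRIMARY KEY, created TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE command_queue ("
        "id INTEGER PRIMARY KEY REFERENCES activity(id), status TEXT NOT NULL)"
    )
    samples = [
        (1, now - timedelta(days=3), "Running"),     # stale, should be aborted
        (2, now - timedelta(hours=2), "Running"),    # recent, handled individually
        (3, now - timedelta(days=5), "Completed"),   # not running, left untouched
    ]
    for command_id, created, status in samples:
        conn.execute(
            "INSERT INTO activity (id, created) VALUES (?, ?)",
            (command_id, created.isoformat(timespec="seconds")),
        )
        conn.execute(
            "INSERT INTO command_queue (id, status) VALUES (?, ?)",
            (command_id, status),
        )
    conn.commit()
    return conn


def clear_running_commands(conn, now, max_age=timedelta(days=1)):
    """Abort stale Running commands in bulk, then fail recent ones one by one.

    Returns a tuple (number aborted in bulk, number failed individually).
    """
    cutoff = (now - max_age).isoformat(timespec="seconds")

    # Phase 1: one bulk UPDATE for stale commands, with no callbacks, so a
    # stale command cannot abort a recipe now running on a reallocated system.
    with conn:
        aborted = conn.execute(
            "UPDATE command_queue SET status = 'Aborted' "
            "WHERE status = 'Running' "
            "AND id IN (SELECT id FROM activity WHERE created < ?)",
            (cutoff,),
        ).rowcount

    # Phase 2: each remaining Running command gets its own transaction, so a
    # failure while handling one command does not roll back the others.
    recent_ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM command_queue WHERE status = 'Running' ORDER BY id"
        )
    ]
    for command_id in recent_ids:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "UPDATE command_queue SET status = 'Failed' WHERE id = ?",
                (command_id,),
            )
            # The real system would run the state update callback here
            # (aborting the associated recipe); omitted in this sketch.
    return aborted, len(recent_ids)


if __name__ == "__main__":
    demo_now = datetime(2013, 6, 20, 4, 48, 44)
    demo_conn = build_demo_db(demo_now)
    print(clear_running_commands(demo_conn, demo_now))
    print(demo_conn.execute("SELECT id, status FROM command_queue ORDER BY id").fetchall())
```

Keeping the bulk update free of callbacks is what keeps the stale case cheap and safe, while the per-command transactions limit how much work any single transaction has to hold.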

Comment 4 Nick Coghlan 2013-06-24 05:22:51 UTC

On Gerrit: http://gerrit.beaker-project.org/#/c/2036/

Comment 7 xjia 2013-06-28 02:23:03 UTC

I will verify this bug on July 2nd during acceptance testing.

Comment 9 Amit Saha 2013-07-11 02:44:31 UTC

Beaker 0.13.2 has been released: http://beaker-project.org/docs/whats-new/release-0.13.html#beaker-0-13-2