Bug 836396 - beaker-provision does not sleep between retries of power commands
Status: CLOSED CURRENTRELEASE
Product: Beaker
Classification: Community
Component: lab controller
Version: 0.9
Hardware: Unspecified Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: 0.9.1
Target Release: ---
Assigned To: Dan Callaghan
Depends On:
Blocks:
 
Reported: 2012-06-28 18:36 EDT by Dan Callaghan
Modified: 2012-07-19 20:39 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 837460
Environment:
Last Closed: 2012-07-19 20:39:05 EDT
Type: Bug


Description Dan Callaghan 2012-06-28 18:36:35 EDT
If a power script fails, we retry it up to 5 times before considering it to have really failed. This is to work around some (apparently) flaky power commands/hardware. The behaviour is inherited from Cobbler. But Cobbler also sleeps for 2 seconds between retries, presumably to avoid hammering the power hardware if it is indeed flaking out. We should do the same.
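
For illustration only, the behaviour requested above (up to 5 attempts, with a 2 second pause between them) could be sketched roughly as follows. The function name `run_with_retries` and its parameters are hypothetical and are not taken from the beaker-provision source; the `run` and `sleep` arguments exist only so the sketch can be exercised without touching real power hardware.

```python
import subprocess
import time


def run_with_retries(argv, attempts=5, delay=2.0,
                     run=subprocess.call, sleep=time.sleep):
    """Run a command, retrying on a non-zero exit status.

    Pauses `delay` seconds between attempts, but not after the last one.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, attempts + 1):
        if run(argv) == 0:
            return attempt
        if attempt < attempts:
            # Give the power hardware a moment before hitting it again.
            sleep(delay)
    raise RuntimeError('command %r failed after %d attempts' % (argv, attempts))
```

The key point is that the sleep only happens between attempts, so a command that succeeds the first time is not slowed down at all.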

Bill also suggested that we could do exponential backoff rather than always waiting 2 seconds, which is a nice idea. Maybe with a small random component too, in case we are fighting with another command which is touching the same power hardware.
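
As a rough sketch of the exponential backoff idea, the pause after each failed attempt could be computed like this. Only the 2 second starting point and the 5 attempts come from this bug; the doubling factor, the half-second jitter ceiling, and the name `backoff_delays` are assumptions chosen for illustration.

```python
import random


def backoff_delays(attempts=5, base=2.0, factor=2.0, jitter=0.5, rng=random):
    """Yield the pause (in seconds) to take after each failed attempt.

    There is no pause after the final attempt, so `attempts - 1` values
    are produced. Each value is base * factor**n plus a random amount
    between 0 and `jitter`, so two callers fighting over the same power
    hardware are unlikely to keep retrying in lockstep.
    """
    for n in range(attempts - 1):
        yield base * factor ** n + rng.uniform(0, jitter)
```

With these assumed defaults the pauses grow as roughly 2, 4, 8 and 16 seconds, each with up to half a second of jitter added.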
Comment 1 Dan Callaghan 2012-06-28 19:26:31 EDT
Another idea I just had was to serialize the power commands by power address. Of course we already serialize them by system, but it might help with the APC PDUs for example which I imagine might have 20 different systems hooked up to the same PDU. It will prevent us trying to run multiple power commands against the same PDU concurrently, which I don't think they like.
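
A minimal sketch of serializing by power address, assuming the power commands run as threads inside a single process (which may not match how beaker-provision actually dispatches them). The names `lock_for` and `run_serialized` are hypothetical.

```python
import threading
from collections import defaultdict

# One lock per power address; the registry itself is guarded so that two
# threads asking for the same address always get the same lock object.
_registry_lock = threading.Lock()
_address_locks = defaultdict(threading.Lock)


def lock_for(power_address):
    """Return the lock shared by every command for this power address."""
    with _registry_lock:
        return _address_locks[power_address]


def run_serialized(power_address, command):
    """Call `command()` while holding the lock for its power address."""
    with lock_for(power_address):
        return command()
```

Commands for systems on different PDUs still run concurrently; only commands sharing a power address queue up behind one another.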
Comment 2 Dan Callaghan 2012-06-28 19:48:38 EDT
On Gerrit: http://gerrit.beaker-project.org/1166
Comment 3 Bill Peck 2012-06-29 10:52:46 EDT
(In reply to comment #1)
> Another idea I just had was to serialize the power commands by power
> address. Of course we already serialize them by system, but it might help
> with the APC PDUs for example which I imagine might have 20 different
> systems hooked up to the same PDU. It will prevent us trying to run multiple
> power commands against the same PDU concurrently, which I don't think they
> like.

I think this is a great idea.  Would you simply use the $power_address to determine that?
Comment 4 Dan Callaghan 2012-07-03 21:24:42 EDT
(In reply to comment #3)
> (In reply to comment #1)
> > Another idea I just had was to serialize the power commands by power
> > address. Of course we already serialize them by system, but it might help
> > with the APC PDUs for example which I imagine might have 20 different
> > systems hooked up to the same PDU. It will prevent us trying to run multiple
> > power commands against the same PDU concurrently, which I don't think they
> > like.
> 
> I think this is a great idea.  Would you simply use the $power_address to
> determine that?

Right.

This is a separate idea though, and it's not totally easy, so I've cloned it to bug 837460. Let's leave this bug for sleeping between retries.
Comment 5 Jeff Burke 2012-07-12 08:02:41 EDT
Dan,
 Do you know if the sleep change will be rolled out as a hotfix?

Thanks,
Jeff
Comment 9 Dan Callaghan 2012-07-19 20:39:05 EDT
Beaker 0.9.1 has been released.
