Bug 916302 - Unrecognised power commands should not lead to a system being marked Broken
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Beaker
Classification: Retired
Component: web UI
Version: 0.11
Hardware: Unspecified
OS: Unspecified
Target Milestone: 21.1
Assignee: Roman Joost
QA Contact:
URL:
Whiteboard: Provisioning
Depends On:
Blocks:
Reported: 2013-02-27 19:12 UTC by Jeff Burke
Modified: 2018-02-06 00:41 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-10-21 03:25:42 UTC
Embargoed:



Description Jeff Burke 2013-02-27 19:12:29 UTC
Description of problem:

I saw a ticket get opened today in the eng-ops queue:
 https://engineering.redhat.com/rt/Ticket/Display.html?id=189374

-------------------------------------------------------------------------
Beaker has automatically marked system
intel-s3ea2-02.rhts.eng.bos.redhat.com <https://beaker.engineering.redhat.com/view/intel-s3ea2-02.rhts.eng.bos.redhat.com>
as broken, due to:

Power command failed: ValueError: Power script /etc/beaker/power-scripts/apc_snmp failed after 5 attempts with exit status 1:
Failed: Unrecognised action 'interrupt'
Please use '-h' for usage

Please investigate this error and take appropriate action.

Power type: apc_snmp
Power address: pdu-l1a1.mgmt.lab.eng.bos.redhat.com
Power id: 4 5
-------------------------------------------------------------------------

interrupt is only valid for ipmi I believe. It should not be enabled for other power types.

Regards,
Jeff

Comment 1 Bill Therrien 2013-02-28 14:31:27 UTC
I saw a similar error for a system using ipmilan:

-------------------------------------------------------------------------
Beaker has automatically marked system
sun-x4600m2-01.rhts.eng.bos.redhat.com <https://beaker.engineering.redhat.com/view/sun-x4600m2-01.rhts.eng.bos.redhat.com>
as broken, due to:

Power command failed: ValueError: Power script /usr/lib/python2.6/site-packages/bkr/labcontroller/power-scripts/ipmilan failed after 5 attempts with exit status 1:
interrupt not supported by ipmilan


Please investigate this error and take appropriate action.

Power type: ipmilan
Power address: sun-x4600m2-01-ilom.rhts.eng.bos.redhat.com
Power id: None
-------------------------------------------------------------------------

Comment 2 Nick Coghlan 2013-03-04 03:24:48 UTC
We're deliberately trying to avoid Beaker having any special knowledge of which commands are supported by different power scripts (system administrators are supposed to be able to add new power types with full functionality without needing to update the Beaker server software). At the moment, sysadmins can add new power types without even needing to touch the Beaker *database*.

However, an unsupported command should *not* lead to systems being marked Broken - I have updated the issue title accordingly.

Unfortunately, there's no standard mechanism for power scripts to distinguish between "command is not recognised" and "command was recognised, it just didn't work" when reporting a failure back to Beaker.

Perhaps it would be possible to blacklist certain commands for certain power types. That way, the existing practices wouldn't need to change, but sysadmins would have a clear way to respond to the above kind of error: add the offending power type (the script name after the final "/" in the reported path) to the blacklist for "interrupt".
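
A minimal sketch of what such a blacklist lookup might look like, in Python. The names COMMAND_BLACKLIST, power_type_from_script and is_blacklisted are hypothetical and are not part of Beaker; the two blacklist entries are simply the power types from the error reports above.

```python
import posixpath

# Hypothetical blacklist: command name -> power types known not to support it.
# The entries are taken from the two reports in this bug; this is not
# Beaker's real configuration format.
COMMAND_BLACKLIST = {
    "interrupt": {"apc_snmp", "ipmilan"},
}


def power_type_from_script(script_path):
    """Return the script name, i.e. the part after the final "/"."""
    return posixpath.basename(script_path)


def is_blacklisted(action, script_path):
    """Return True if the action is blacklisted for the script's power type."""
    power_type = power_type_from_script(script_path)
    return power_type in COMMAND_BLACKLIST.get(action, set())
```

With a check like this, a blacklisted command could be rejected before the power script is ever run, so it would never reach the retry-and-mark-Broken path.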

Comment 3 Nick Coghlan 2013-03-04 03:34:25 UTC
Dan Callaghan pointed out I was mistaken above, and new power types *do* need to be registered in the database before they can be used. That means this can be implemented by either adding a simple flag to each power type to indicate whether or not it supports the "interrupt" command, or adding a more general whitelist feature to indicate the set of supported commands for each power type (with "interrupt" excluded from the whitelist by default).
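
As a rough illustration of the whitelist variant, assuming each registered power type carries its own set of supported commands: the PowerType class, DEFAULT_COMMANDS and the extra_commands parameter below are hypothetical and do not reflect Beaker's actual database schema, and the command names other than "interrupt" are only illustrative.

```python
# Hypothetical default whitelist: "interrupt" is deliberately left out, so a
# power type has to opt in to it explicitly.
DEFAULT_COMMANDS = frozenset({"on", "off", "reboot"})


class PowerType:
    """Illustrative stand-in for a power type record, not Beaker's model."""

    def __init__(self, name, extra_commands=()):
        self.name = name
        # Whitelist = defaults plus whatever the sysadmin enabled for this type.
        self.supported_commands = DEFAULT_COMMANDS | frozenset(extra_commands)

    def supports(self, action):
        """Return True if this power type is whitelisted for the action."""
        return action in self.supported_commands
```

Under this sketch, a power type that implements "interrupt" would be registered with extra_commands=["interrupt"], while every other type would refuse the command up front. The simpler flag-based option corresponds to a single boolean in place of the set.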

Comment 5 Jeff Burke 2015-09-10 14:50:23 UTC
Running the interrupt option still results in broken machines. Is there any way to move this from pm-hss: Beaker‑backlog to the next release?

Thanks,
Jeff

Comment 6 Roman Joost 2015-09-14 05:05:31 UTC
Dear Jeff,

We thought about a quick fix for this bug, which might be all it takes and could be out with the next bug fix release (weeks, not months). Supporting a generic list of the commands each power type supports is a considerably bigger task to implement and might not be necessary here.

Comment 7 Roman Joost 2015-09-15 01:45:56 UTC
Patch available on gerrit:

https://gerrit.beaker-project.org/#/c/4394/

Comment 9 Dan Callaghan 2015-09-25 06:19:21 UTC
Steps to reproduce:
1. Find a system which uses some power type *other than* ipmitool (for example lpar, virsh, zvm, ...)
2. Send "interrupt" command from the power tab on the system page
or: bkr system-power --action=interrupt $fqdn

Actual results:
The interrupt command fails because the power script does not support it, and then the system is marked Broken.

Expected results:
The interrupt command fails, but the system should not be marked Broken; it should stay Automated.
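
The expected result can be modelled roughly as follows. This is only a sketch of the behaviour described here, not the patch on gerrit: System, NON_FATAL_ACTIONS and handle_power_failure are hypothetical names, and treating "interrupt" as the only non-fatal action is an assumption.

```python
# Assumption: a failure of these actions says nothing about whether the
# system itself is healthy, so it must not change the system's status.
NON_FATAL_ACTIONS = frozenset({"interrupt"})


class System:
    """Illustrative stand-in for a Beaker system record."""

    def __init__(self, fqdn, status="Automated"):
        self.fqdn = fqdn
        self.status = status


def handle_power_failure(system, action, error):
    """Handle a failed power command and return the resulting status.

    The failure would still be reported to the user in either case; only
    failures of "fatal" actions mark the system Broken.
    """
    if action not in NON_FATAL_ACTIONS:
        system.status = "Broken"
    return system.status
```

For example, a failed "interrupt" on an Automated system leaves it Automated, whereas a failed "reboot" still marks it Broken.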

Comment 12 Dan Callaghan 2015-10-21 03:25:42 UTC
Beaker 21.1 has been released.

