Bug 1384093 - RFE: API call to unschedule (possible via webUI) or fail scheduled action for the specific systems
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Satellite 5
Classification: Red Hat
Component: API
Version: 570
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Ondrej Gajdusek
QA Contact: Jan Hutař
URL:
Whiteboard:
Duplicates: 1418737
Depends On:
Blocks: sat580-low
Reported: 2016-10-12 14:01 UTC by Michal Dekan
Modified: 2020-03-11 15:18 UTC (History)

Fixed In Version: spacewalk-java-2.5.14-61
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-21 12:08:09 UTC
Target Upstream Version:
Embargoed:



Description Michal Dekan 2016-10-12 14:01:50 UTC
Description of problem:

Our use case is to run some jobs every night on our whole set of clients (more than 4000), but some of the RHN agents are not working as expected (filesystem full, read-only, server stopped...).
The goal is to unschedule actions only on these systems.

Version-Release number of selected component (if applicable):

Satellite 5.7

How reproducible:


Steps to Reproduce:
1. Schedule actions on clients.
2. Some of them will fail.
3. Unschedule them using the API.

Actual results:

There is currently no API call to unschedule a failed action for specific systems.

Expected results:

An API call to unschedule a failed action for specific systems is available.


Additional info:

Following the API documentation, I can't find any way to unschedule an action on a specific system.
In fact, I'm looking for a way to remove actions from pending tasks, move them to the completed/failed actions list, and archive them when they are too old.
Currently, I'm doing this by hand, but it is very time-consuming and should be automated.

Comment 1 Jan Hutař 2016-10-12 19:32:53 UTC
Did some investigation here:

client.schedule.listInProgressActions(key)
  ... returns the schedules shown in the webUI in Schedule -> Pending Actions
      It is a list of dicts like this one:
      {'failedSystems': 0, 'name': 'System reboot', 'completedSystems': 0, 'inProgressSystems': 1, 'scheduler': 'admin', 'earliest': <DateTime '20161012T08:28:00' at 7ffa21bfee18>, 'type': 'System reboot', 'id': 713846}

client.schedule.listInProgressSystems(key, aid)
  ... lists the systems shown in the webUI in Schedule -> Pending Actions -> <action> -> In Progress Systems
      It is a list of dicts like this:
      {'timestamp': <DateTime '20161007T05:41:46' at 7ffa20120d40>, 'server_id': 1000091791, 'server_name': 'testrhnmanagerclient_2016_10_06_12_55_04_3683', 'base_channel': 'Name chann_testrhnmanagerclient_2016_10_06_12_55_04_3683'}

client.schedule.cancelActions(key, [aid])
  ... cancels (looks like it deletes) the whole action, no matter how many other
      pending/failed/completed systems are also involved in the action
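
The two listing calls above can be combined to see which systems are still pending for each action. This is a minimal Python sketch, assuming `client` is an XML-RPC proxy to the Satellite API like the one used above and `key` is a valid session key; the helper name `pending_systems_by_action` is illustrative and not part of the API. It only reads data and cancels nothing.

```python
def pending_systems_by_action(client, key):
    """Map each in-progress action id to the ids of systems still pending.

    Assumption: `client` behaves like the XML-RPC client used in this
    comment, and the returned dicts carry the 'id' and 'server_id' keys
    shown in the example output above.
    """
    pending = {}
    for action in client.schedule.listInProgressActions(key):
        systems = client.schedule.listInProgressSystems(key, action['id'])
        pending[action['id']] = [s['server_id'] for s in systems]
    return pending
```

The result is a plain dict, so it can be filtered (for example, by known-broken system ids) before deciding what to do with each system.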

This bug is to provide an API for something you can do via the webUI: Schedule -> Pending Actions -> <action> -> In Progress Systems -> select one or more -> [Unschedule Action] button, OR Schedule -> Pending Actions -> <action> -> In Progress Systems -> <system> -> Events -> Pending -> tick the event/action you want to cancel for the given system -> [Cancel Selected Events]. This way the resulting action looks as if it had never been scheduled on the given system. That's not ideal from an auditing point of view, but should suffice.

OR:

IMO it could also help to create an API call like `system.failEvent(key, system_id, action_id, optional_message)` which would fail the action on a given system. This way an administrator can fail the action on certain systems and it would stay part of the action. The optionally provided message could be displayed in Systems -> <system> -> Events -> History -> <action> -> Details and could be used by the administrator to express the reason for failing the event, e.g. "This was not picked for more than 2 days and is not relevant now".
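
The proposal could be used roughly as follows. This is a sketch only: `system.failEvent` is the call name suggested in this comment and is hypothetical, not an existing API call; the helper name `fail_pending_systems` and the default message are likewise illustrative.

```python
def fail_pending_systems(client, key, action_id,
                         message="Failed by administrator"):
    """Fail `action_id` on every system where it is still in progress.

    Assumption: the PROPOSED call
    system.failEvent(key, system_id, action_id, optional_message)
    exists. It is hypothetical and taken from this comment.
    Returns the ids of the systems the call was issued for.
    """
    failed = []
    for system in client.schedule.listInProgressSystems(key, action_id):
        client.system.failEvent(key, system['server_id'], action_id, message)
        failed.append(system['server_id'])
    return failed
```

Unlike `schedule.cancelActions`, this would leave the action and its passed/failed records in place and touch only the systems that are still pending.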

Comment 2 Tomas Lestach 2016-10-17 11:51:09 UTC
The question is what is meant by 'unscheduling' an action. Satellite 5.7 currently allows you to cancel (read: delete) any action using the schedule.cancelActions API, as JanH describes in Comment 1.

In case the customer wants to 'fail' a pending action, there's no such functionality in Satellite 5.


Michal, can you, please, clarify, whether the customer is ok to delete the actions or if he really wants to fail them?

Comment 3 Jan Hutař 2016-10-18 05:08:10 UTC
Tomas, the problem with the schedule.cancelActions API is that it deletes the whole action. Say you schedule an action on 3 systems: on the 1st it passes, on the 2nd it fails, and because the 3rd system is down or something, you want to "unschedule" it (whatever that means) there via the API. The schedule.cancelActions API call deletes the whole action (not just on the third system). This is a problem when you want to keep some record of passed/failed actions somewhere.

Comment 5 Michal Dekan 2016-10-19 13:40:01 UTC
(In reply to Tomas Lestach from comment #2)

> 
> Michal, can you, please, clarify, whether the customer is ok to delete the
> actions or if he really wants to fail them?

Tomas, the customer wants to put these pending actions into a completed or failed state (without any pending systems left).

In fact, they tried to cancel actions, but that cancels the whole action, not only on the pending systems, and some customers lost some jobs. I can see Jan has already clarified this.

Deleting the pending action on the specific system would work for them as well; however, putting pending actions into the failed or completed status for the specific system would be preferable.

Comment 6 Tomas Lestach 2016-10-24 09:13:42 UTC
Thanks, Michal.
(Comment 4 is still valid.)

Comment 7 Ondrej Gajdusek 2016-12-20 14:05:10 UTC
Implemented upstream as f952fd3adde83cd958b4bff5571228aec89130cc

Comment 9 Ondrej Gajdusek 2016-12-21 10:05:55 UTC
I did some minor updates for this BZ.
spacewalk/master:
286f09137c8395a9da1bfe3aa7e20ce5430aa703
fc973cd65f3c57890b738e76e51247203673854c

Comment 13 Tomas Lestach 2017-02-03 09:10:55 UTC
*** Bug 1418737 has been marked as a duplicate of this bug. ***

Comment 14 Ondrej Gajdusek 2017-02-22 15:18:20 UTC
The fix is available at spacewalk/master: aca685aff44fc63ba70e82449494b58106a368ae
An action will now store its completion time when it fails.

