Bug 1543636 - Remote Execution SSH-based Power Action remains pending despite having successfully rebooted the host
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Remote Execution
Version: 6.2.14
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: 6.4.0
Assignee: satellite6-bugs
QA Contact: Jameer Pathan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-02-08 21:24 UTC by Pablo Hess
Modified: 2021-12-10 15:39 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-16 18:53:28 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Foreman Issue Tracker | 22679 | 0 | Normal | Closed | REX task using job template for reboot might hang despite reboot succeeded | 2020-02-26 16:47:13 UTC
Red Hat Knowledge Base (Solution) | 3350801 | 0 | None | None | None | 2018-02-12 13:43:09 UTC

Description Pablo Hess 2018-02-08 21:24:12 UTC
Description of problem:
After issuing a host reboot job through SSH-based REX, the job may remain in Pending state on Satellite.  The host reboots just fine as expected but even after 18+ hours the task still shows Pending.


Version-Release number of selected component (if applicable):
Verified on Satellite 6.2.14 (package versions below). Not checked in earlier versions.

## Sat 6.2.14
foreman-proxy-1.11.0.7-1.el7sat.noarch
rubygem-smart_proxy_dynflow-0.1.3.1-1.el7sat.noarch
rubygem-smart_proxy_remote_execution_ssh-0.1.2.6-1.el7sat.noarch
tfm-rubygem-dynflow-0.8.13.6-1.el7sat.noarch
tfm-rubygem-foreman_remote_execution-0.3.0.19-1.el7sat.noarch
tfm-rubygem-smart_proxy_dynflow_core-0.1.3.1-1.el7sat.noarch
tfm-rubygem-smart_proxy_remote_execution_ssh_core-0.1.2.6-1.el7sat.noarch



How reproducible:
Most of the time. Tested on VMs only.

Steps to Reproduce:
1. On the webUI, run a remote execution job.
2. Select "Power Action - SSH Default" as Job Template, then insert target host(s) and set "restart" for action.
3. Execute it immediately.


Actual results:
Most of the time the action enters the Pending state and stays there. Less than 20% of the time the job transitions to the stopped-success state. The remote host reboots successfully 100% of the time. No errors are logged to foreman-tasks or the dynflow console.



Expected results:
If the host reboots successfully, the task should enter the stopped-success state.


Additional info:
Tests were performed using Satellite's internal capsule.

Example of one such job, still pending over 30 minutes after the host was successfully and completely rebooted:

irb(main):003:0> ForemanTasks::Task.find( "136e7e93-a2fc-466c-a916-9ffa880e0fb0")                                                                          
=> #<ForemanTasks::Task::DynflowTask id: "136e7e93-a2fc-466c-a916-9ffa880e0fb0", type: "ForemanTasks::Task::DynflowTask", label: "Actions::RemoteExecution::RunHostsJob", started_at: "2018-02-08 20:46:37", ended_at: nil, state: "running", result: "pending", external_id: "95ecf03e-1ffb-4847-a44f-dfc4f5f93913", parent_task_id: nil, start_at: "2018-02-08 20:46:37", start_before: nil>

Comment 3 Mark Watts 2018-02-14 09:50:32 UTC
I see the same thing when trying to restart RHEL 7 clients; the reboot works but the job never succeeds.

As a workaround, I've cloned the "Power Action - SSH Default" template and modified it to do this:

echo <%= input('action') %> host && sleep 3
<%= case input('action')
      when 'restart'
        'shutdown -r +1'
      else
        'shutdown -h now'
      end %>


This seems to work fine, albeit with a 1 minute delay.

Comment 4 Pavel Moravec 2018-02-20 08:06:59 UTC
The technical cause of the never-completed task is simply that the "reboot" command might not return a success exit status before the network goes down during the already-initiated shutdown/reboot. Therefore "shutdown -r +1", or any other modification _ensuring_ that the last command _always_ returns success, is a valid workaround/solution.
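
To illustrate the point, here is a minimal shell sketch of the delayed approach from the workaround in Comment 3. It is not the upstream fix. The `power_action` function and the `DRY_RUN` switch are hypothetical names introduced only for this example; `DRY_RUN` defaults to printing the command so the sketch can be tried without rebooting anything. The assumption is that on systemd-based hosts such as RHEL 7, `shutdown` with a `+1` time argument schedules the action and returns right away, so the script can report exit status 0 while the SSH session is still up.

```shell
# Hypothetical dry-run switch (not part of Satellite): when set to 1,
# the command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

power_action() {
    # $1 mirrors the template's 'action' input from Comment 3.
    case "$1" in
        restart) cmd="shutdown -r +1" ;;   # delayed reboot, scheduled one minute ahead
        *)       cmd="shutdown -h now" ;;  # same fallback as the cloned template
    esac

    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $cmd"
    else
        # Assumed to return before the network goes down when a delay is given.
        $cmd
    fi
}

power_action restart
echo "exit status: $?"
```

With `DRY_RUN=0` the function runs the real command, so only try that on a host you intend to reboot.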

Comment 6 Pavel Moravec 2018-02-20 09:51:07 UTC
I created a Foreman issue for it and will open a PR - let's see what the upstream community feedback will be.

IMHO every solution has its own limitations.

Comment 7 Satellite Program 2018-03-05 09:05:41 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/22679 has been resolved.

Comment 9 Jameer Pathan 2018-08-10 11:32:46 UTC
Verified:
 
@satellite 6.4.0 snap 16

Steps:

1. On the webUI, run a remote execution job.
2. Select "Power Action - SSH Default" as Job Template, then insert target host(s) and set "restart" for action.
3. Execute it immediately.

Observation

- The host reboots successfully and the task enters the stopped-success state.

Comment 16 Bryan Kearney 2018-10-16 18:53:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2927

