Bug 1691453

Summary: [RFE] Ability to restart the machine while the remote execution job is still acting as running - backend
Product: Red Hat Satellite
Reporter: Bryan Kearney <bkearney>
Component: Remote Execution
Assignee: Ivan Necas <inecas>
Status: CLOSED ERRATA
QA Contact: Peter Ondrejka <pondrejk>
Severity: unspecified
Priority: unspecified
Version: 6.4
CC: aruzicka, egolov, inecas
Target Milestone: 6.7.0
Keywords: FutureFeature
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Fixed In Version: tfm-rubygem-foreman_remote_execution_core-1.3.0
Last Closed: 2020-04-14 13:24:10 UTC

Description Bryan Kearney 2019-03-21 15:56:59 UTC
Some remote execution (rex) scenarios require the host to disconnect from the network and continue working offline. Although `async` is supported, the job is still marked as failed when the managed host goes offline.

With this feature, the remote host should be able to take control of the job status, so that the job stays marked as running even when the host goes down temporarily as part of the job's execution.

Example of such a template:

<pre><code>$CONTROL_SCRIPT manual-mode
cat <<HELP | $CONTROL_SCRIPT update >/dev/null 
The script has switched to manual-mode. It will be acting
as running after this script finishes.

The control script is available in \$CONTROL_SCRIPT
env variable.

To send output data to the job, one can do something like this:

    echo Hello world | $CONTROL_SCRIPT update

To mark the script as finished, one can do

    $CONTROL_SCRIPT finish 0

where the second argument should be the exit code the
script ended with.
HELP
</code></pre>

After running this, one should be able to go to the remote host and reboot it, and the job should keep running until @$CONTROL_SCRIPT finish 0@ is called. Additional output can be sent to the job with @echo Hello world | $CONTROL_SCRIPT update@.
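
The flow described above can be sketched as a plain shell script. This is only an illustration: when @$CONTROL_SCRIPT@ is not set (that is, outside a real job), a hypothetical @control_stub@ function stands in for it and merely logs the calls it receives to a temporary file, so the real control script's behaviour is not reproduced. How the last two calls get run after a reboot (for example from a boot-time service) is also an assumption; they are shown inline here.

```shell
#!/bin/sh
# Sketch only. Outside a real remote execution job $CONTROL_SCRIPT is unset,
# so a hypothetical stand-in (control_stub) records the calls in a log file.
if [ -z "${CONTROL_SCRIPT:-}" ]; then
    CONTROL_LOG=$(mktemp)
    control_stub() {
        case "$1" in
            # "update" reads job output from stdin, as in the template above.
            update) sed 's/^/update: /' >> "$CONTROL_LOG" ;;
            # Every other subcommand is logged together with its arguments.
            *)      echo "$*" >> "$CONTROL_LOG" ;;
        esac
    }
    CONTROL_SCRIPT=control_stub
fi

# 1. Take over the job status: the job keeps acting as running
#    after this script finishes.
$CONTROL_SCRIPT manual-mode

# 2. Send some output to the job before the host goes offline.
echo "Preparing to reboot" | $CONTROL_SCRIPT update

# 3. (Assumption) After the reboot, something that survived it would
#    run the remaining calls to report back and close the job.
echo "Host is back online" | $CONTROL_SCRIPT update
$CONTROL_SCRIPT finish 0   # 0 is the exit code reported for the job

# Show what the stand-in recorded (only when the stub was used).
if [ -n "${CONTROL_LOG:-}" ]; then
    cat "$CONTROL_LOG"
fi
```

In a real job the first branch is skipped, because @$CONTROL_SCRIPT@ is already provided in the environment, and only the numbered steps matter.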

Additional note:

The Satellite needs to be installed with @--foreman-proxy-plugin-remote-execution-ssh-async-ssh=true@ for this to work.
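
As a rough sketch, the option could be passed to the installer like this. The option name is taken from the note above; re-running @satellite-installer@ on an existing installation, and the @INSTALLER_OPTS@ variable, are assumptions made for this example. The guard only prints the command when the installer is not available on the machine.

```shell
#!/bin/sh
# Option name comes from the note above; INSTALLER_OPTS is a name chosen
# for this sketch only.
INSTALLER_OPTS="--foreman-proxy-plugin-remote-execution-ssh-async-ssh=true"

if command -v satellite-installer >/dev/null 2>&1; then
    # Assumption: the option can be applied by re-running the installer.
    satellite-installer "$INSTALLER_OPTS"
else
    # Not on a Satellite server: just show what would be run.
    echo "would run: satellite-installer $INSTALLER_OPTS"
fi
```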

Comment 1 Bryan Kearney 2019-03-21 15:57:01 UTC
Created from redmine issue https://projects.theforeman.org/issues/26428

Comment 2 Bryan Kearney 2019-03-21 15:57:02 UTC
Upstream bug assigned to None

Comment 4 Bryan Kearney 2019-03-21 15:58:29 UTC
*** Bug 1691418 has been marked as a duplicate of this bug. ***

Comment 5 Bryan Kearney 2019-04-26 16:00:27 UTC
Upstream bug assigned to inecas

Comment 6 Bryan Kearney 2019-04-26 16:00:29 UTC
Upstream bug assigned to inecas

Comment 7 Bryan Kearney 2019-09-30 12:00:29 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/26428 has been resolved.

Comment 8 Peter Ondrejka 2020-01-28 13:55:04 UTC
Verified in Satellite 6.7 snap 10 using the template '$CONTROL_SCRIPT manual-mode; <%= input("command") %>; $CONTROL_SCRIPT finish 0', passing various sleep and echo commands as the user input. Also verified that other rex functions are not affected by having --foreman-proxy-plugin-remote-execution-ssh-async-ssh enabled by default.

Comment 11 errata-xmlrpc 2020-04-14 13:24:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1454