Bug 1386270
Summary: | [RFE] Job invocations should happen asynchronously | ||
---|---|---|---|
Product: | Red Hat Satellite | Reporter: | Daniel Lobato Garcia <dlobatog> |
Component: | Remote Execution | Assignee: | Adam Ruzicka <aruzicka> |
Status: | CLOSED ERRATA | QA Contact: | jcallaha |
Severity: | high | Docs Contact: | satellite6-bugs <satellite6-bugs> |
Priority: | high | ||
Version: | 6.1.9 | CC: | aruzicka, bkearney, dcaplan, ealcaniz, fgarciad, inecas, jcallaha, molasaga, m.r.watts, satellite6-bugs, tbrisker |
Target Milestone: | Unspecified | Keywords: | FieldEngineering, FutureFeature, PrioBumpGSS, PrioBumpPM, Triaged |
Target Release: | Unused | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-02-21 16:54:17 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Daniel Lobato Garcia
2016-10-18 14:08:07 UTC
Created redmine issue http://projects.theforeman.org/issues/17514 from this bug

Upstream bug assigned to aruzicka

Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/17514 has been resolved.

Has Foreman issue 17514 been ported to Satellite yet?
We're on Satellite 6.2.14 and the "--async" option doesn't seem to have any effect as per:

# hammer job-invocation create --job-template 'Run Command - SSH Default' --inputs 'command=ls' --search-query name=play01273.example.com --async
Job invocation 31 created
[...................................................................................................................................................................................................................................] [100%]
1 task(s), 1 success, 0 fail

It would be really useful to have the same type of async operation as when using:

# hammer host errata apply --errata-ids $errataList --host $host --async

@Mark: Please note this feature has nothing to do with hammer's --async flag. Hammer's --async flag tells hammer not to wait for the job invocation to finish.

Preliminary steps:
This feature can be enabled on a per-proxy basis by setting :async_ssh to true in /etc/smart_proxy_dynflow_core/settings.d/remote_execution_ssh.yml. The interval for checking on the remote jobs can be set in the same file under the runner_refresh_interval key. Apparently it is not exposed in the installer and needs to be uncommented and toggled in the file by hand.

Steps to reproduce:
1) Complete the preliminary steps
2) Run a remote execution job which will take some time (sleep 600)
3) Log in to the server and use ss or netstat to look for opened SSH connections
4) (note) it may take up to a minute (iirc) for the kernel to completely "forget" the tcp connection

Expected results:
There should NOT be a persistent connection opened to the remote host

Verified in Satellite 6.3 Snap 35.

Negative Test:
Kicked off a job that executed the command `sleep 600`
Satellite immediately started connection.
While the command was running (sleeping), the connection was maintained.
Finally Satellite killed the connection once the job was complete.

Every 2.0s: ss | grep ssh    Wed Feb 14 15:29:49 2018
tcp ESTAB 0 0 <host>:ssh <satellite>:37704
tcp ESTAB 0 0 <host>:ssh <self>:44772

Every 2.0s: ss | grep ssh    Wed Feb 14 15:36:22 2018
tcp ESTAB 0 0 <host>:ssh <satellite>:37704
tcp ESTAB 0 0 <host>:ssh <self>:44772

Every 2.0s: ss | grep ssh    Wed Feb 14 15:40:26 2018
tcp ESTAB 0 0 <host>:ssh <self>:44772

Positive Test:
Added `:async_ssh: true` to /etc/smart_proxy_dynflow_core/settings.d/remote_execution_ssh.yml
Restarted satellite.
Kicked off the job from before (sleep 600).
Satellite checked in on the host, then exited.
Satellite then periodically checked in until the job completed.

# for i in {1..60}; do ss | grep ssh >> connections.txt && sleep 2; done
# cat connections.txt
...
tcp ESTAB 0 0 <host>:ssh <self>:44772
tcp ESTAB 0 0 <host>:ssh <self>:44772
tcp ESTAB 0 0 <host>:ssh <self>:44772
tcp ESTAB 0 52 <host>:ssh <satellite>:40490
tcp ESTAB 0 0 <host>:ssh <self>:44772
tcp ESTAB 0 0 <host>:ssh <self>:44772
tcp ESTAB 0 0 <host>:ssh <self>:44772
...

In both cases the job completed successfully. However, only after making the settings change, did the job run asynchronously as expected.

Ok, I'm misunderstanding what the --async flag for "hammer job-invocation" is doing here then.
My expectation was that:

# time hammer job-invocation create --job-template 'Run Command - SSH Default' --inputs 'command=ls' --search-query name=play01273.example.com --async
Job invocation 32 created
[...................................................................................................................................................................................................................................] [100%]
1 task(s), 1 success, 0 fail

real 3m17.119s
user 0m1.752s
sys 0m0.644s

Would return to the console immediately, which it does not.

(In reply to Mark Watts from comment #17)
> My expectation was that:
>
> Would return to the console immediately, which it does not.

Your expectation was right, however this feature was broken for quite some time and should be fixed in 6.3. Please see the BZ[1] for it.

[1] - https://bugzilla.redhat.com/show_bug.cgi?id=1440962

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336
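
For anyone repeating the verification described in this bug, the check on the captured `ss` output can be sketched as a small shell function. This is only a minimal sketch, not part of Satellite or hammer: the function name `count_peer_samples` is hypothetical, and it assumes the capture file has the column layout shown in the comments above (Netid, State, Recv-Q, Send-Q, local address:port, peer address:port), as produced by something like `ss | grep ssh >> connections.txt`.

```shell
#!/bin/sh
# Hypothetical helper (not part of Satellite or hammer): count the lines in a
# captured `ss` log that show an established connection from a given peer.
# Assumed column layout, as in the output quoted in this bug:
#   Netid State Recv-Q Send-Q Local:Port Peer:Port
count_peer_samples() {
    capture=$1   # file with captured ss lines, e.g. connections.txt
    peer=$2      # peer address as printed by ss, without the port
    awk -v peer="$peer" '
        # Keep established connections whose peer column starts with "<peer>:".
        $2 == "ESTAB" && index($6, peer ":") == 1 { n++ }
        # n + 0 prints 0 instead of an empty string when nothing matched.
        END { print n + 0 }
    ' "$capture"
}
```

Calling `count_peer_samples connections.txt <satellite-address>` after sampling during a long-running job gives the number of captured lines in which the Satellite held an SSH connection. Going by the tests above, one would expect that number to be close to the number of samples with the default settings, and much smaller once `:async_ssh: true` is set, since the Satellite then only checks in periodically.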