Bug 2158738

Summary: time to pickup kills long running pull jobs, timeout to kill doesn't work in the same scenario
Product: Red Hat Satellite
Component: Remote Execution
Version: 6.13.0
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Target Milestone: 6.13.0
Fixed In Version: smart_proxy_remote_execution_ssh-0.10.1
Reporter: Peter Ondrejka <pondrejk>
Assignee: Adam Ruzicka <aruzicka>
QA Contact: Peter Ondrejka <pondrejk>
CC: aruzicka, pcreech
Keywords: Triaged
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2023-05-03 13:24:04 UTC

Description Peter Ondrejka 2023-01-06 12:31:59 UTC
Description of problem:
For pull-mqtt REX jobs, the "time to pickup" setting kills a long-running job that has already been picked up and is executing, so it behaves like the "timeout to kill" setting. In the same scenario, "timeout to kill" does not kill a long-running pull-mqtt job.

Version-Release number of selected component (if applicable):
Satellite 6.13 snap 5

How reproducible:
always

Steps to Reproduce:

1. Have a Satellite with pull-mqtt REX mode and a registered host with the MQTT client.
2. Prepare a command job that will execute for some time, e.g. "echo start; sleep 60s; echo done".
3. In the REX job wizard, set "time to pickup" to 20s.
4. Run the job.

Actual results:
The job fails after 20s with Proxy error: RuntimeError - The job was not picked up in time. 

Furthermore, setting "timeout to kill" instead of "time to pickup" in step 3 above does not stop the job.

Expected results:
Time to pickup should distinguish between the "not picked up yet" and "already running" states. Timeout to kill should work on pull jobs as expected.
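
The expected behavior can be illustrated with a small model. This is only a sketch of the semantics requested above, not the code in smart_proxy_remote_execution_ssh: the `PullJob` class, its fields, and the verdict strings are hypothetical names made up for illustration, and it assumes that "timeout to kill" is measured from the moment the job starts on the host.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    PENDING = "not picked up yet"
    RUNNING = "already running"
    FINISHED = "finished"


@dataclass
class PullJob:
    """Hypothetical model of a pull-mqtt job; times are seconds since dispatch."""
    time_to_pickup: Optional[float] = None   # limit for the host to pick the job up
    timeout_to_kill: Optional[float] = None  # assumed to count from job start
    picked_up_at: Optional[float] = None     # None means never picked up
    finished_at: Optional[float] = None      # None means not finished

    def state(self, now: float) -> State:
        if self.finished_at is not None and now >= self.finished_at:
            return State.FINISHED
        if self.picked_up_at is not None and now >= self.picked_up_at:
            return State.RUNNING
        return State.PENDING

    def verdict(self, now: float) -> str:
        state = self.state(now)
        # Time to pickup only applies while the job is still waiting for the host.
        if state is State.PENDING:
            if self.time_to_pickup is not None and now >= self.time_to_pickup:
                return "fail: not picked up in time"
        # Timeout to kill only applies once the job is actually running.
        elif state is State.RUNNING:
            running_for = now - self.picked_up_at
            if self.timeout_to_kill is not None and running_for >= self.timeout_to_kill:
                return "kill: timeout to kill exceeded"
        return "ok"


# Scenario from the steps to reproduce: picked up after 2s, runs for 60s.
picked = PullJob(time_to_pickup=20, picked_up_at=2, finished_at=62)
never_picked = PullJob(time_to_pickup=20)
with_kill = PullJob(timeout_to_kill=20, picked_up_at=2, finished_at=62)

print(picked.verdict(25))        # running job is not affected by time to pickup
print(never_picked.verdict(25))  # job that was never picked up fails
print(with_kill.verdict(25))     # running job exceeds timeout to kill
```

In this model the job from the steps to reproduce survives the 20s pickup limit because it is already running, while the same limit still fails a job that no host has picked up.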

Additional info:
Confirmed that timeout to kill works correctly in non-pull scenarios

Comment 1 Adam Ruzicka 2023-01-17 08:47:59 UTC
The fix was merged upstream; moving to POST.

Comment 2 Peter Ondrejka 2023-01-23 11:11:55 UTC
Verified on Satellite 6.13 snap 7: the "time to pickup" and "timeout to kill" settings now work as expected in the pull-mqtt scenario.

Comment 5 errata-xmlrpc 2023-05-03 13:24:04 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: Satellite 6.13 Release), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2097