Bug 1833039

Summary: Introduce error code to playbook_run_finished response type
Product: Red Hat Satellite
Reporter: Adam Ruzicka <aruzicka>
Component: RH Cloud - Cloud Connector
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA
QA Contact: Lukáš Hellebrandt <lhellebr>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 6.7.0
CC: aruzicka, egolov
Target Milestone: 6.8.0
Keywords: Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: python-receptor-satellite-1.2.0
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-10-27 13:02:12 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
journal.log (flags: none)
0-failure.log (flags: none)

Description Adam Ruzicka 2020-05-07 17:15:26 UTC
There are certain failure situations that we know may happen (e.g. "This host is not known by Satellite"). Currently sat-receptor has no way to indicate these other than putting the error message into the "console" text field.

We should introduce a code/error field with well-defined values so that Remediations can understand what the problem was without having to parse the text field.

 

https://docs.google.com/document/d/1n8-MVjCc1X6eOczQccndEOvX7knQU323OlI_bUnBVNE/edit#

Copied from RHCLOUD-5370

Comment 3 Adam Ruzicka 2020-08-24 08:53:15 UTC
This was done as part of https://bugzilla.redhat.com/show_bug.cgi?id=1833035, moving to ON_QA.

Maybe we could even move it to verified based on https://bugzilla.redhat.com/show_bug.cgi?id=1833035#c11

Comment 4 Lukáš Hellebrandt 2020-09-17 13:37:39 UTC
FailedQA with Sat 6.8 snap 15.

With debug=true, applied remediation on a client that didn't have satellite's ssh key installed => job failed. Result log attached, note all codes are 0.

Comment 5 Lukáš Hellebrandt 2020-09-17 13:40:02 UTC
Created attachment 1715225 [details]
journal.log

Comment 6 Adam Ruzicka 2020-09-18 12:21:11 UTC
Fix was merged in upstream, moving to POST

Comment 9 Lukáš Hellebrandt 2020-09-29 15:04:09 UTC
Using Sat 6.8 snap 17, tested multiple codes that are returned by receptor:

satellite_connection_code (set wrong password in receptor.conf): 0 if connection between Receptor and Satellite is successful, non-zero otherwise
connection_code (delete foreman-proxy's ssh key from host's authorized_keys): 0 if connection between Satellite and the host is successful, non-zero otherwise
execution_code (change binary used in remediation to 'exit 1'): 0 if playbook finished successfully, non-zero otherwise
satellite_infrastructure_code (not tested): should return 0 when connection between satellite and connection is successful, non-zero otherwise
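
For illustration only, a minimal Python sketch of how a consumer might read the codes listed above instead of parsing the console text. It assumes the response is available as a flat mapping keyed by those field names and that a missing field counts as 0; the helper name `first_failure` and the order in which the fields are checked are hypothetical and not taken from receptor-satellite.

```python
from typing import Mapping, Optional

# Field names as listed in this comment. The order is an assumption:
# it follows the path a job takes from Receptor to Satellite to the host.
CODE_FIELDS = (
    "satellite_connection_code",
    "satellite_infrastructure_code",
    "connection_code",
    "execution_code",
)


def first_failure(response: Mapping[str, int]) -> Optional[str]:
    """Return the name of the first non-zero code field, or None if all are 0."""
    for field in CODE_FIELDS:
        # Assumption: a field that is absent from the response means "no error".
        if response.get(field, 0) != 0:
            return field
    return None
```

With this, a response where only connection_code is non-zero would be reported as "connection_code", and a response with all codes at 0 would yield None.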

The only weird thing I encountered during testing is that if connection between Receptor and Satellite fails (I actually set wrong password in receptor.conf), then satellite_connection_code!=0 and connection_code==0. Connection between Satellite and host could, however, never succeed because Satellite wasn't instructed to do anything in the first place. Log of this case attached.
Waiting for feedback from c.rh.c team on whether this is actually what they expect since it doesn't make much sense to me, in combination with status==failure.
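
The combination described here can be expressed as a small check. This is only a sketch under assumptions: the response is treated as a flat mapping with the field names used in this comment, and `is_contradictory` is a hypothetical helper, not something receptor-satellite provides.

```python
from typing import Mapping


def is_contradictory(response: Mapping[str, int]) -> bool:
    """True when Receptor could not reach Satellite, yet the response still
    reports the Satellite-to-host connection as successful."""
    receptor_to_satellite_failed = response.get("satellite_connection_code", 0) != 0
    # Only an explicit 0 counts here; a missing connection_code is not flagged.
    host_connection_reported_ok = response.get("connection_code") == 0
    return receptor_to_satellite_failed and host_connection_reported_ok
```

The case from the attached log (satellite_connection_code != 0 together with connection_code == 0) would return True, while a response that simply omits connection_code would not be flagged.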

Comment 10 Lukáš Hellebrandt 2020-09-29 15:05:28 UTC
Created attachment 1717568 [details]
0-failure.log

Comment 11 Lukáš Hellebrandt 2020-09-30 13:21:47 UTC
Verified with Sat 6.8 snap 17 as per comment 9.

The issue described in the last paragraph of comment 9 doesn't technically break the FiFi Satellite contract and I agreed with Adam to verify this BZ and file a new one about the issue.

Comment 14 errata-xmlrpc 2020-10-27 13:02:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.8 release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4366