Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
Red Hat Satellite engineering is moving the tracking of its product development work on Satellite to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "Satellite project" in Red Hat Jira and file new tickets there. Individual Bugzilla bugs will be migrated starting at the end of May. If you cannot log in to RH Jira, please consult article #7032570. Failing that, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "Satellite project" in Red Hat Jira (issue links are of the form "https://issues.redhat.com/browse/SAT-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that the bug has been migrated.

Bug 1965023

Summary: Expired manifest leads to 403 during finalizing cloud connector setup
Product: Red Hat Satellite
Reporter: Paul Dudley <pdudley>
Component: RH Cloud - Cloud Connector
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED WONTFIX
QA Contact: Lukáš Hellebrandt <lhellebr>
Severity: low
Docs Contact:
Priority: unspecified
Version: 6.9.0
CC: aruzicka, ehelms, pcreech, sshtein
Target Milestone: Unspecified
Keywords: Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: foreman_rh_cloud_5.0.32
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-10-28 18:04:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Paul Dudley 2021-05-26 14:51:20 UTC
Cloud connector job fails with the following error:
~~~
TASK [project-receptor.satellite_receptor_installer : Identify Satellite source type] ***
fatal: [hostname.example.com]: FAILED! => {"changed": false, "connection": "close", "content": "<HTML><HEAD><TITLE>Error</TITLE></HEAD><BODY>\nAn error occurred while processing your request.<p>\nReference&#32;&#35;52&#46;af5dda17&#46;1621963418&#46;6b0fa56a\n</BODY></HTML>\n", "content_length": "176", "content_type": "text/html", "date": "Tue, 25 May 2021 17:23:38 GMT", "elapsed": 0, "expires": "Tue, 25 May 2021 17:23:38 GMT", "mime_version": "1.0", "msg": "Status code was 403 and not [200]: HTTP Error 403: Forbidden", "redirected": false, "server": "AkamaiGHost", "status": 403, "url": "https://cert.cloud.redhat.com/api/sources/v2.0/source_types?name=satellite"}
~~~

Version-Release number of selected component (if applicable):
Satellite 6.9

Steps to Reproduce:
1. Create two organizations
2. Upload non-expired manifest to Org 1
3. Upload expired manifest to Org 2
4. Run cloud connector job from Org 1

Actual results:
Cloud connector fails completely

Expected results:
Since Org 1 has a valid manifest and Org 2 does not, the two orgs should be isolated from one another so that each succeeds or fails independently of the other.

Additional info:
This error is due to the nature of the receptor setup. A cert and key are created for each organization's upload based on the Red Hat account rather than the Satellite org, so two orgs can end up sharing a single `/etc/receptor` location. If the org processed later in the chain (Org 2, for example) has an expired manifest, its cert and key are the ones left in place, and the cloud connector fails as seen above.
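The overwrite described above can be sketched as a tiny shell model (the account number, paths under a temp dir, and cert contents are illustrative, not the actual installer code):

```shell
#!/bin/sh
# Sketch only: both orgs belong to the same RH account, so they share
# one account-scoped cert path (modeled here under a temp dir).
workdir=$(mktemp -d)
cert_path="$workdir/etc/receptor/rh_0000001/cert.pem"
mkdir -p "$(dirname "$cert_path")"

# Orgs are processed in order; each writes its cert to the same
# account-scoped location, so the later one overwrites the earlier one.
echo "cert-from-org1-valid-manifest"   > "$cert_path"  # Org 1: valid manifest
echo "cert-from-org2-expired-manifest" > "$cert_path"  # Org 2: expired manifest

# The cert left in place belongs to the org with the expired manifest,
# which is what the connector then presents -- hence the 403.
final_cert=$(cat "$cert_path")
echo "$final_cert"
rm -rf "$workdir"
```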

Comment 1 Adam Ruzicka 2021-05-27 07:47:09 UTC
As a workaround, you should be able to set the skip_satellite_org_id_list parameter on the Satellite host to a list of the organization ids that have expired manifests; those orgs will then be skipped.
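Applied via hammer, the workaround above might look like the sketch below. The parameter name comes from the comment; the exact value format (a JSON-style list of organization ids) and the hostname are assumptions, so verify against your own environment before running anything:

```shell
# Sketch only -- the value format for skip_satellite_org_id_list is an
# assumption, and satellite.example.com is a placeholder hostname.
# Find the id of the organization with the expired manifest:
#   hammer organization list
# Set the skip list as a host parameter on the Satellite host itself:
#   hammer host set-parameter \
#     --host satellite.example.com \
#     --name skip_satellite_org_id_list \
#     --value '[2]'
```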

Comment 2 Paul Dudley 2021-05-27 15:38:18 UTC
Thanks for the idea, Adam. I'll include that as a possible workaround for the issue in a KCS article.

I noticed as well that even after the Org 2 manifest has been corrected, a subsequent run does not appear to update receptor. Running the cloud connector playbook again corrects the certs in /etc/receptor/rh_accountnumber/, but the service still shows failures:
~~~
● receptor - Receptor Node for rh_accountnumber                                                                                                                                    
   Loaded: loaded (/etc/systemd/system/receptor@.service; enabled; vendor preset: disabled)                                                                                                   
   Active: active (running) since Mon 2021-04-05 06:11:07 EDT; 1 months 21 days ago                                                                                                           
 Main PID: 1287 (receptor)                                                                                                                                                                    
   CGroup: /system.slice/system-receptor.slice/receptor                                                                                                                     
           └─1287 /usr/bin/python3 /usr/bin/receptor -c /etc/receptor/rh_accountnumber/receptor.conf -d /var/data/receptor/rh_accountnumber node                                                            
                                                                                                                                                                                              
May 27 11:19:52 hostname.example.com receptor[1287]: aiohttp.client_exceptions.WSServerHandshakeError: 403, message='Invalid response status', url=URL('wss://cert.cloud.redhat.com/wss/receptor-controller/gateway')
May 27 11:19:57 hostname.example.com receptor[1287]: ERROR 2021-05-27 11:19:57,122 aa4c5445-c8a0-4170-874b-45ed889d7a40 ws ws.connect                                                   
May 27 11:19:57 hostname.example.com receptor[1287]: Traceback (most recent call last):                                                                                                 
May 27 11:19:57 hostname.example.com receptor[1287]: File "/usr/lib/python3.6/site-packages/receptor/connection/ws.py", line 53, in connect                                             
May 27 11:19:57 hostname.example.com receptor[1287]: proxy=proxy, proxy_auth=proxy_auth
May 27 11:19:57 hostname.example.com receptor[1287]: File "/usr/lib64/python3.6/site-packages/aiohttp/client.py", line 1012, in __aenter__
May 27 11:19:57 hostname.example.com receptor[1287]: self._resp = await self._coro
May 27 11:19:57 hostname.example.com receptor[1287]: File "/usr/lib64/python3.6/site-packages/aiohttp/client.py", line 738, in _ws_connect
May 27 11:19:57 hostname.example.com receptor[1287]: headers=resp.headers)
May 27 11:19:57 hostname.example.com receptor[1287]: aiohttp.client_exceptions.WSServerHandshakeError: 403, message='Invalid response status', url=URL('wss://cert.cloud.redhat.com/wss/receptor-controller/gateway')
~~~

First I tried removing all content in /var/data/receptor and then in /etc/receptor, but the service still failed after each attempt. Disabling the service with `systemctl disable --now receptor` and then running the playbook allows the service to run without error:
~~~
● receptor - Receptor Node for rh_accountnumber                                                                                                                                    
   Loaded: loaded (/etc/systemd/system/receptor@.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2021-05-27 11:23:20 EDT; 26s ago
 Main PID: 12049 (receptor)
   CGroup: /system.slice/system-receptor.slice/receptor
           └─12049 /usr/bin/python3 /usr/bin/receptor -c /etc/receptor/rh_accountnumber/receptor.conf -d /var/data/receptor/rh_accountnumber node

May 27 11:23:20 hostname.example.com systemd[1]: Started Receptor Node for rh_accountnumber.
~~~

Please let me know if these behaviors require a different BZ to look into, or if they are covered under the workings of this one.

Thanks!

Comment 3 Adam Ruzicka 2021-05-31 07:24:03 UTC
Please file a new one. Currently the installer just dumps the certs where they should be and ensures the service is running. From systemd's point of view the service is running even if the certs are invalid, so subsequent runs correct the certs, but the service never gets restarted.
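The "ensure running" behavior described above can be sketched as a small shell model (the variables and values are illustrative, not the actual playbook logic): the check passes because the process is alive, so it keeps using the cert it loaded at startup.

```shell
#!/bin/sh
# Hypothetical model: the playbook only checks "is the service active?",
# so a process started with a bad cert keeps running with it.
service_active=yes        # systemd's view after the first (failing) start
loaded_cert=expired       # what the running receptor read at startup
on_disk_cert=expired

# A later playbook run corrects the cert on disk...
on_disk_cert=valid

# ...but only starts the service if it is not already running; no
# restart is issued, so the process never re-reads the cert.
if [ "$service_active" != "yes" ]; then
    loaded_cert=$on_disk_cert   # would only happen on a fresh (re)start
fi

echo "on disk: $on_disk_cert / in use: $loaded_cert"
```

This is why only a stop (or disable) followed by a playbook run, as in comment 2, picks up the corrected cert.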

Comment 4 Paul Dudley 2021-07-28 12:54:59 UTC
Created BZ 1986467 regarding the receptor restart.

Comment 5 Lukáš Hellebrandt 2022-04-06 14:43:40 UTC
I think this shouldn't be ON_QA. Where is it fixed? RHC is going to replace Receptor, but that hasn't happened yet in snap 15.0.
Is the issue fixed in RHC? If yes, RHC is not in the snap yet => no reason for ON_QA.
Is the issue fixed in Receptor? If yes, it doesn't matter for 6.11, since Receptor is going to be dropped => no reason for ON_QA.

Failing this on snap 15.0 since there's not even anything to verify yet.

Comment 6 Adam Ruzicka 2022-04-07 08:02:36 UTC
I believe the simplified flow through the RH cloud plugin now has a pre-flight check which should be able to discover this issue in advance, but I'll defer to Shim for details.

Comment 10 Brad Buckingham 2022-09-02 20:25:18 UTC
Upon review of our valid but aging backlog, the Satellite Team has concluded that this Bugzilla does not meet the criteria for a resolution in the near term, and is planning to close it in a month. This message may be a repeat of a previous update, and the bug is again being considered for closure. If you have any concerns about this, please contact your Red Hat Account team. Thank you.

Comment 11 Brad Buckingham 2022-09-02 20:30:51 UTC
Upon review of our valid but aging backlog, the Satellite Team has concluded that this Bugzilla does not meet the criteria for a resolution in the near term, and is planning to close it in a month. This message may be a repeat of a previous update, and the bug is again being considered for closure. If you have any concerns about this, please contact your Red Hat Account team. Thank you.

Comment 12 Brad Buckingham 2022-10-28 18:04:20 UTC
Thank you for your interest in Red Hat Satellite. We have evaluated this request, and while we recognize that it is a valid request, we do not expect this to be implemented in the product in the foreseeable future. This is due to other priorities for the product, and not a reflection on the request itself. We are therefore closing this out as WONTFIX. If you have any concerns about this feel free to contact your Red Hat Account Team. Thank you.

Comment 13 Red Hat Bugzilla 2023-09-18 00:26:59 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days