Bug 2042480 - Configure Cloud Connector fails after hostname change; potentially hits all templates
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Settings
Version: 6.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: 6.11.0
Assignee: Ondřej Ezr
QA Contact: Lukáš Hellebrandt
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-19 15:30 UTC by Lukáš Hellebrandt
Modified: 2023-09-18 04:30 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-07-05 14:32:15 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Foreman Issue Tracker 34323 0 Normal New The Setting defaults are never updated 2022-01-27 10:40:03 UTC
Red Hat Product Errata RHSA-2022:5498 0 None None None 2022-07-05 14:32:32 UTC

Description Lukáš Hellebrandt 2022-01-19 15:30:52 UTC
Feel free to move this to the Job Templates or satellite-change-hostname component if either fits better. I discovered this while testing Cloud Connector installation, so I'm using that component.

Description of problem:
The Cloud Connector installation job fails because it connects to the old hostname (the one in use before the hostname was changed with satellite-change-hostname) instead of the current one:

```
[WARNING]: Callback disabled by environment. Disabling the Foreman callback
plugin.

PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************
ok: [<NEW_FQDN>]

TASK [project-receptor.satellite_receptor_installer : Can connect to the Satellite with the given username/password] ***
fatal: [<NEW_FQDN>]: FAILED! => {"changed": false, "content": "", "elapsed": 3, "msg": "Status code was -1 and not [200]: Request failed: <urlopen error [Errno 113] No route to host>", "redirected": false, "status": -1, "url": "https://<OLD_FQDN>/api/status"}
PLAY RECAP *********************************************************************
<NEW_FQDN> : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0
Exit status: 2
```

The URL comes from the job template, and the Preview feature shows that `foreman_server_url` indeed expands to the old hostname:

```
---
- hosts: all
  vars:
    satellite_url: "<%= foreman_server_url %>"
  roles:
    - project-receptor.satellite_receptor_installer
```

Version-Release number of selected component (if applicable):
Reproduced on Sat 7.0 snap 5.0.

How reproducible:
Deterministic

Steps to Reproduce:
1. Have a Satellite whose hostname has been changed and which allows ReX (remote execution) against itself
2. Run Configure Cloud Connector from Configure -> Inventory Upload page
3. After it fails, see the results in Monitor -> Jobs -> <sat_host>

Actual results:
Job failed, above result

Expected results:
Cloud connector configured

Additional info:
Since `foreman_server_url` expands to the old hostname, this is probably an issue with all job templates or with satellite-change-hostname.

Comment 1 Adam Ruzicka 2022-01-19 16:07:47 UTC
The foreman_server_url macro just reads the foreman_url setting. If I'm reading things right, the installer derives the value for it from the networking configuration on its first run and then caches it in the answer file. The change hostname script should take care of this.

Comment 2 Evgeni Golov 2022-01-27 09:44:38 UTC
I can confirm that the `foreman_url` setting isn't updated:

[root@sat70 ~]# hammer setting show --name foreman_url
Id:            foreman_url
Name:          foreman_url
Description:   URL where your Foreman instance is reachable (see also Provisioning > unattended_url)
Category:      General
Settings type: string
Value:         https://sat-7-0-qa-rhel7.tanso.example.com

[root@sat70 ~]# hammer setting show --name unattended_url
Id:            unattended_url
Name:          unattended_url
Description:   URL hosts will retrieve templates from during build, when it starts with https unattended/userdata controllers cannot be accessed via HTTP
Category:      Provisioning
Settings type: string
Value:         http://sat-7-0-qa-rhel7.tanso.example.com

However, we never updated it in the past and it worked…
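
To spot this kind of stale setting, one option is to compare the host part of the stored URL with the machine's current FQDN. The following is a minimal sketch, not Satellite code: `url_points_at` is a hypothetical helper, and the URL is a placeholder for the value that `hammer setting show --name foreman_url` reports.

```python
import socket
from urllib.parse import urlparse


def url_points_at(url: str, fqdn: str) -> bool:
    """Return True if the host part of `url` equals `fqdn` (case-insensitive)."""
    host = urlparse(url).hostname or ""
    return host.lower() == fqdn.lower().rstrip(".")


if __name__ == "__main__":
    stored_url = "https://old.example.com"  # placeholder for the foreman_url value
    current_fqdn = socket.getfqdn()         # what the system reports right now
    if not url_points_at(stored_url, current_fqdn):
        print(f"foreman_url host does not match current FQDN {current_fqdn}")
```

The same comparison applies to `unattended_url`, since only the host part is checked and the scheme is ignored.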

Comment 3 Evgeni Golov 2022-01-27 09:49:58 UTC
Aha, it seems before we moved to Settings DSL, `foreman_url` (and `unattended_url`) would derive from SETTINGS[:fqdn] which was dynamically determined.
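
The difference between the two behaviours can be illustrated with a small sketch. This is not the Foreman implementation; both class names and the `fqdn_source` callable are hypothetical, and the sketch only assumes what is described above: the old default was derived from the FQDN at read time, while the new one is computed once and kept.

```python
class DynamicDefaultSetting:
    """Default is recomputed from the current FQDN on every read."""

    def __init__(self, fqdn_source):
        self._fqdn_source = fqdn_source

    def value(self):
        return f"https://{self._fqdn_source()}"


class FrozenDefaultSetting:
    """Default is computed once at creation; later hostname changes go unseen."""

    def __init__(self, fqdn_source):
        self._stored = f"https://{fqdn_source()}"

    def value(self):
        return self._stored


if __name__ == "__main__":
    system = {"fqdn": "old.example.com"}  # stand-in for the machine's hostname
    dynamic = DynamicDefaultSetting(lambda: system["fqdn"])
    frozen = FrozenDefaultSetting(lambda: system["fqdn"])

    system["fqdn"] = "new.example.com"    # simulate a hostname change
    print(dynamic.value())
    print(frozen.value())
```

With the frozen variant, templates that expand the setting keep rendering the old hostname until the stored value is refreshed explicitly.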

Comment 4 Evgeni Golov 2022-01-27 10:16:57 UTC
After talking to oezr, this is a bug in the new Settings, so moving there.

Comment 5 Ondřej Ezr 2022-01-27 10:40:02 UTC
Created redmine issue https://projects.theforeman.org/issues/34323 from this bug

Comment 6 Bryan Kearney 2022-01-27 12:04:59 UTC
Upstream bug assigned to oezr

Comment 7 Bryan Kearney 2022-01-27 12:05:00 UTC
Upstream bug assigned to oezr

Comment 8 Bryan Kearney 2022-01-31 16:05:59 UTC
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/34323 has been resolved.

Comment 9 tstrych 2022-02-04 14:20:45 UTC
Hi, I'm hitting the same problem with Cockpit.
I think the devs are definitely on the right track.

The Satellite had its hostname changed and the settings are still pointing to the old one.

ReX is enabled, but Cockpit does not load. After I changed the URL in settings, Cockpit works fine.
https://community.theforeman.org/t/remote-execution-cockpit-shows-black-screen-with-message-sorry-try-again/27183?u=aruzicka

Comment 10 Lukáš Hellebrandt 2022-02-09 08:54:50 UTC
Verified with Sat 7.0 snap 8.0.

Cloud Connector installation works successfully => This is fixed, otherwise the installation would fail.
Also, foreman_server_url now expands to the actual Satellite URL.

On the Satellite after running the Cloud Connector installation playbook:
# rpm -q receptor
receptor-0.6.4-2.el7ar.noarch
# systemctl status 'receptor@*' | grep active
   Active: active (running) since Tue 2022-02-08 08:29:24 EST; 19h ago

Comment 11 Lukáš Hellebrandt 2022-02-21 15:21:41 UTC
This is broken again in snap 9.0. Everything happens as described in the OP.

Comment 12 Evgeni Golov 2022-02-21 15:29:02 UTC
There was a bad RPM shipped in Snap 9, fixed for next snap.

Comment 14 Lukáš Hellebrandt 2022-04-05 09:52:44 UTC
Verified again with Sat 7.0 snap 14.0.

Used the same steps as in comment 10.

Comment 15 Lukáš Hellebrandt 2022-05-31 15:49:33 UTC
Failed again with Sat 6.11 snap 22.0. The Configure Cloud Connector workflow once again attempts to run against an old FQDN and fails. I suspect this happens for all templates again. This is the second snap-to-snap regression of this BZ.

Comment 16 Lukáš Hellebrandt 2022-06-02 08:43:13 UTC
This time, it seems to be different. foreman_server_url expands to correct value. When trying Preview in the job template, I get the correct FQDN. When I run the job manually, it passes. When I run it using the CCC button, not only does the playbook receive a wrong value but the whole job is shown to be executed against the old FQDN. This seems like somewhere underneath the CCC button, a wrong host is selected.
Can it have something to do with changes in bug 1979092? Shim?
Should I create a different BZ for this, Ondrej?

Comment 17 Shimon Shtein 2022-06-06 11:23:56 UTC
Did you restart the service after the name change? The algorithm for selecting the hostname is here: https://github.com/theforeman/foreman_rh_cloud/blob/22eee66d8312186d8541892d183a3e009c471263/lib/foreman_rh_cloud.rb#L95
It's defined by either an ENV variable, infrastructure facet or default capsule name in this specific order. The value is then cached until the next restart (Satellite assumes the hostname is fairly static and should not be changed without a restart).
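
The selection order and the caching can be sketched as follows. This is an illustration only, not the linked foreman_rh_cloud code: `make_hostname_resolver` and its three source callables are hypothetical stand-ins for the ENV variable, the infrastructure facet and the default capsule name.

```python
def make_hostname_resolver(env_value, facet_host, default_capsule):
    """Build a resolver that tries the three sources in order and caches the result.

    Each argument is a zero-argument callable returning a hostname or None.
    Creating a new resolver stands in for a service restart.
    """
    cached = None

    def resolve():
        nonlocal cached
        if cached is None:
            # First non-empty source wins: ENV, then facet, then default capsule.
            cached = env_value() or facet_host() or default_capsule()
        return cached

    return resolve


if __name__ == "__main__":
    state = {"env": None, "facet": "old.example.com", "capsule": "old.example.com"}
    resolve = make_hostname_resolver(
        lambda: state["env"], lambda: state["facet"], lambda: state["capsule"]
    )
    print(resolve())                       # resolved from the facet
    state["facet"] = "new.example.com"     # hostname changes while running
    print(resolve())                       # still the cached value
```

In this sketch a stale result after a restart would mean that one of the sources itself still holds the old hostname, since the cache starts empty again.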

Comment 18 Adam Ruzicka 2022-06-06 11:29:26 UTC
> It's defined by either an ENV variable, infrastructure facet or default capsule name in this specific order.

But that only selects on which host to run the job, not the value that gets rendered inside the template, no?

Comment 20 Lukáš Hellebrandt 2022-06-06 13:14:18 UTC
I've just verified that after running `foreman-maintain services restart`, the job is still being run against an incorrect host.

Comment 26 Lukáš Hellebrandt 2022-06-07 09:51:15 UTC
Reported as a new regression, bug 2094255. Verifying this one.

Comment 29 errata-xmlrpc 2022-07-05 14:32:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5498

Comment 30 Jameer Pathan 2022-09-26 11:45:29 UTC
I am observing this issue on 6.12.
Can we cherry-pick the fix to 6.12 as well?

Refs: https://github.com/theforeman/foreman-packaging/commit/e76bc4d43aa1f6dfbca8feb5799573a1fa36c8b4

Comment 31 Evgeni Golov 2022-09-27 08:15:04 UTC
(In reply to Jameer Pathan from comment #30)
> I am observing this issue on 6.12.
> Can we cherry-pick the fix to 6.12 as well?
> 
> Refs: https://github.com/theforeman/foreman-packaging/commit/e76bc4d43aa1f6dfbca8feb5799573a1fa36c8b4

This sounds like a regression and yes someone needs to pick that if it's not there.

Comment 32 Jameer Pathan 2022-09-27 12:14:55 UTC
(In reply to Evgeni Golov from comment #31)
> This sounds like a regression and yes someone needs to pick that if it's not
> there.

Filed BZ#2130173

Comment 33 Red Hat Bugzilla 2023-09-18 04:30:30 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

