When scaling a MachineSet to deploy hosts with virtual media over the external network, the image is still served from the provisioning network, so provisioning fails if the host cannot reach the provisioning network. If you manually set the BareMetalHost image to use the host IP, then it works.

2021-07-29 15:03:16.468 673 ERROR root [-] Command failed: prepare_image, error: HTTPConnectionPool(host='172.22.0.3', port=6181): Max retries exceeded with url: /images/rhcos-49.84.202107010027-0-openstack.x86_64.qcow2/cached-rhcos-49.84.202107010027-0-openstack.x86_64.qcow2.md5sum (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f432cebbc88>: Failed to establish a new connection: [Errno 111] ECONNREFUSED',)): requests.exceptions.ConnectionError: HTTPConnectionPool(host='172.22.0.3', port=6181): Max retries exceeded with url: /images/rhcos-49.84.202107010027-0-openstack.x86_64.qcow2/cached-rhcos-49.84.202107010027-0-openstack.x86_64.qcow2.md5sum (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f432cebbc88>: Failed to establish a new connection: [Errno 111] ECONNREFUSED',))
2021-07-29 15:03:16.468 673 ERROR root Traceback (most recent call last):
2021-07-29 15:03:16.468 673 ERROR root File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 162, in _new_conn
2021-07-29 15:03:16.468 673 ERROR root (self._dns_host, self.port), self.timeout, **extra_kw)
2021-07-29 15:03:16.468 673 ERROR root File "/usr/lib/python3.6/site-packages/urllib3/util/connection.py", line 80, in create_connection
2021-07-29 15:03:16.468 673 ERROR root raise err
2021-07-29 15:03:16.468 673 ERROR root File "/usr/lib/python3.6/site-packages/urllib3/util/connection.py", line 70, in create_connection
2021-07-29 15:03:16.468 673 ERROR root sock.connect(sa)
2021-07-29 15:03:16.468 673 ERROR root File "/usr/lib/python3.6/site-packages/eventlet/greenio/base.py", line 267, in connect
2021-07-29 15:03:16.468 673 ERROR root socket_checkerr(fd)
2021-07-29 15:03:16.468 673 ERROR root File "/usr/lib/python3.6/site-packages/eventlet/greenio/base.py", line 51, in socket_checkerr
2021-07-29 15:03:16.468 673 ERROR root raise socket.error(err, errno.errorcode[err])
2021-07-29 15:03:16.468 673 ERROR root ConnectionRefusedError: [Errno 111] ECONNREFUSED
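
The traceback above ends in a refused TCP connection to 172.22.0.3 on port 6181. As a rough diagnostic, the same check can be reproduced from the affected host with a small probe. This is only a sketch: `can_connect` is a hypothetical helper name, not part of any OpenShift or Ironic tooling, and the 3-second timeout is an arbitrary choice.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration. It only tests TCP
    reachability; it does not verify that an image is actually served.
    """
    try:
        # create_connection resolves the address and performs the connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and unreachable networks.
        return False
```

For example, `can_connect("172.22.0.3", 6181)` would be expected to return False on a host that has no route to the provisioning network, matching the ECONNREFUSED seen in the log.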
If the installer always used the API VIP in the image URL, that should resolve this issue for new clusters, since the image is served from the image cache, which runs on every control plane node. The downside is that IPA (the Ironic Python Agent) would then download the qcow2 image it writes to disk over the external network. That may be undesirable in some environments, so the change may not make sense, given that the option to use the external network while a provisioning network is also enabled is not exposed in the installer and can only be configured on Day 2. Nothing currently modifies existing MachineSets, so for existing clusters, or for clusters installed with the provisioning network enabled, the user has to take an extra step to make sure that new Machines are created with the API VIP in the image URL.
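
The manual step described above amounts to swapping the host part of the image URL for the API VIP. A minimal sketch of that rewrite follows, under these assumptions: `replace_url_host` is a hypothetical helper, the address 192.0.2.5 stands in for a real API VIP, the `http` scheme is inferred from the HTTPConnectionPool error, and the original port is kept unchanged. How the resulting URL is applied to a MachineSet is covered by the documentation PR, not by this snippet.

```python
from urllib.parse import urlsplit, urlunsplit


def replace_url_host(url: str, new_host: str) -> str:
    """Return url with its host replaced by new_host.

    Hypothetical helper for illustration. The scheme, port, path,
    query, and fragment are preserved; any userinfo is dropped.
    """
    parts = urlsplit(url)
    # Bare IPv6 literals must be bracketed inside a URL.
    if ":" in new_host and not new_host.startswith("["):
        new_host = f"[{new_host}]"
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


# 192.0.2.5 is a placeholder for the cluster's API VIP (assumption).
old_url = (
    "http://172.22.0.3:6181/images/"
    "rhcos-49.84.202107010027-0-openstack.x86_64.qcow2/"
    "cached-rhcos-49.84.202107010027-0-openstack.x86_64.qcow2.md5sum"
)
new_url = replace_url_host(old_url, "192.0.2.5")
```

Only the host changes, so both the image URL and the checksum URL can be rewritten the same way.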
We have documentation in a PR to explain how to make this change to the machineset https://github.com/openshift/openshift-docs/pull/35304
(In reply to Caleb Boylan from comment #2)
> We have documentation in a PR to explain how to make this change to the
> machineset https://github.com/openshift/openshift-docs/pull/35304

Hi, I see that the PR above is closed, but the bug has been moved to ON_QA. Could you point to the correct PR for this bug?
It looks like it's probably https://github.com/openshift/openshift-docs/pull/36089
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.9.8 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:4712