Description of problem:
If a CDS or HAProxy node breaks, it's impossible to delete it from RHUI. That's because rhui-manager tries to connect to that node to stop the relevant services, and if the connection is impossible, the deletion fails. There should be a way to handle this situation so that rhui-manager just forgets about the node internally.

Version-Release number of selected component (if applicable):
RHUI 3

How reproducible:
Always.

Steps to Reproduce:
1. Add a CDS node.
2. Make the node unavailable. Break it, turn it off, whatever.
3. Try to delete the node in rhui-manager.

Actual results:
An unexpected error has occurred during the last operation.
More information can be found in /root/.rhui/rhui.log.

The log file reads:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/rhui/tools/shell.py", line 88, in safe_listen
    self.listen(clear=first_run)
  File "/usr/lib/python2.7/site-packages/rhui/tools/shell.py", line 127, in listen
    Shell.listen(self)
  File "/usr/lib/python2.7/site-packages/rhui/common/shell.py", line 186, in listen
    item.func(*args, **item.kwargs)
  File "/usr/lib/python2.7/site-packages/rhui/tools/screens/cds.py", line 47, in unregister
    InstanceScreen.unregister(self, cleanup_script, force=force)
  File "/usr/lib/python2.7/site-packages/rhui/tools/screens/instances.py", line 298, in unregister
    self._unregister(unregistered_instances, bash_script=bash_script, force=True)
  File "/usr/lib/python2.7/site-packages/rhui/tools/screens/instances.py", line 313, in _unregister
    self.run_sudo(instance, 'rm -rf /var/lib/puppet/ssl')
  File "/usr/lib/python2.7/site-packages/rhui/tools/screens/instances.py", line 547, in run_sudo
    return sudo(command, **sudo_kwargs)
  File "/usr/lib/python2.7/site-packages/fabric/network.py", line 639, in host_prompting_wrapper
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/fabric/operations.py", line 1095, in sudo
    stderr=stderr, timeout=timeout, shell_escape=shell_escape,
  File "/usr/lib/python2.7/site-packages/fabric/operations.py", line 909, in _run_command
    channel=default_channel(), command=wrapped_command, pty=pty,
  File "/usr/lib/python2.7/site-packages/fabric/state.py", line 393, in default_channel
    chan = _open_session()
  File "/usr/lib/python2.7/site-packages/fabric/state.py", line 380, in _open_session
    return connections[env.host_string].get_transport().open_session()
  File "/usr/lib/python2.7/site-packages/fabric/network.py", line 151, in __getitem__
    self.connect(key)
  File "/usr/lib/python2.7/site-packages/fabric/network.py", line 143, in connect
    self[key] = connect(user, host, port, cache=self)
  File "/usr/lib/python2.7/site-packages/fabric/network.py", line 565, in connect
    raise NetworkError(msg, e)
NetworkError: Timed out trying to connect to cds01.example.com (tried 1 time)

The node is still tracked in RHUI.

Expected results:
It should be possible to delete the node from RHUI without touching it.

Additional info:
Apparent workaround: remove the information about the node from /etc/rhui/cds.json / /etc/rhui/haproxy.json.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0149
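
The workaround noted under "Additional info" in the report (removing the node's entry from /etc/rhui/cds.json or /etc/rhui/haproxy.json by hand) could be scripted roughly as follows. This is only a sketch: the report does not show the layout of those files, so the function name forget_node, the "hostname" field name, and both layouts handled here (a mapping keyed by hostname, or a list of objects) are assumptions that should be checked against the real files before use. A backup copy is written before anything is changed.

```python
import json
import shutil


def forget_node(path, hostname, key="hostname"):
    """Remove entries for `hostname` from the JSON file at `path`.

    Hypothetical helper; the file layout is assumed, not documented.
    Returns the number of entries removed.
    """
    with open(path) as f:
        data = json.load(f)

    removed = 0
    if isinstance(data, dict):
        # Assumed layout A: a mapping keyed by the node's hostname.
        if hostname in data:
            del data[hostname]
            removed = 1
    elif isinstance(data, list):
        # Assumed layout B: a list of objects, each carrying a `key` field.
        kept = [entry for entry in data
                if not (isinstance(entry, dict) and entry.get(key) == hostname)]
        removed = len(data) - len(kept)
        data = kept

    if removed:
        # Keep a copy of the original file before rewriting it.
        shutil.copy2(path, path + ".bak")
        with open(path, "w") as f:
            json.dump(data, f, indent=2)
    return removed
```

For example, forget_node("/etc/rhui/cds.json", "cds01.example.com") would drop the unreachable CDS from the file if it matches one of the assumed layouts, and return 0 without touching the file otherwise.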