Bug 1888580 - [16.2] OC deployment fails with Could not establish a connection to the Zaqar websocket.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: shreshtha joshi
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-10-15 09:10 UTC by Michele Baldessari
Modified: 2021-09-15 07:10 UTC (History)

Fixed In Version: puppet-tripleo-11.5.0-2.20201012005919.c49d8de.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-09-15 07:09:53 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2021:3483 0 None None None 2021-09-15 07:10:12 UTC

Description Michele Baldessari 2020-10-15 09:10:11 UTC
Description of problem:
On RHOS-16.2-RHEL-8-20201014.n.3

(undercloud) [stack@undercloud-0 ~]$ openstack overcloud node import --instance-boot-option=local /home/stack/instackenv.json
Could not establish a connection to the Zaqar websocket. The command was sent but the answer could not be read.
Exception occured while running the command
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tripleoclient/command.py", line 32, in run 
    super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/osc_lib/command/command.py", line 41, in run 
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/cliff/command.py", line 185, in run 
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python3.6/site-packages/tripleoclient/v1/overcloud_node.py", line 412, in take_action
    instance_boot_option=parsed_args.instance_boot_option
  File "/usr/lib/python3.6/site-packages/tripleoclient/workflows/baremetal.py", line 62, in register_or_update
    with tripleoclients.messaging_websocket() as ws: 
  File "/usr/lib/python3.6/site-packages/tripleoclient/plugin.py", line 223, in messaging_websocket
    cacert=self._instance.cacert)
  File "/usr/lib/python3.6/site-packages/tripleoclient/plugin.py", line 91, in __init__
    self._ws = websocket.create_connection(endpoint)
  File "/usr/lib/python3.6/site-packages/websocket/_core.py", line 511, in create_connection
    websock.connect(url, **options)
  File "/usr/lib/python3.6/site-packages/websocket/_core.py", line 220, in connect
    options.pop('socket', None))
  File "/usr/lib/python3.6/site-packages/websocket/_http.py", line 120, in connect
    sock = _open_socket(addrinfo_list, options.sockopt, options.timeout)
  File "/usr/lib/python3.6/site-packages/websocket/_http.py", line 190, in _open_socket
    raise err 
  File "/usr/lib/python3.6/site-packages/websocket/_http.py", line 170, in _open_socket
    sock.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused
[Errno 111] Connection refused
sys:1: ResourceWarning: unclosed <socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 43246)>
sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.24.2', 40540), raddr=('192.168.24.2', 13000)>
 
 
The exact address it wants to connect to is ('192.168.24.2', 3000). Indeed, I see no process listening on port 3000.
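
As a rough way to reproduce the "connection refused" check without going through tripleoclient, the probe can be sketched in Python. This is only an illustrative helper, not part of tripleoclient; the function name port_is_listening and the 2-second timeout are assumptions made for this sketch.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration only. A refused connection
    (Errno 111, as in the traceback above) or a timeout returns False.
    """
    try:
        # create_connection performs the same TCP connect that the
        # websocket library attempts before the websocket handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError and socket.timeout are OSError subclasses.
        return False
```

On the affected undercloud, port_is_listening("192.168.24.2", 3000) would be expected to return False, matching the traceback.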
 
On an older working osp16.1 I see the following:
 [root@undercloud-0 stdouts]# ss -ntaulpen |grep :3000
tcp   LISTEN 0      128                       192.168.24.2:3000         0.0.0.0:* users:(("haproxy",pid=49434,fd=32)) ino:272791 sk:1d <-> 
 
So haproxy must be misconfigured here and in fact we're missing the following stanza on 16.2:
listen zaqar_ws
  bind 192.168.24.2:3000 ssl crt /etc/pki/tls/private/overcloud_endpoint.pem
  bind 192.168.24.3:9000
  mode http
  http-request set-header Host %[dst]:9000
  option forwardfor
  redirect scheme https code 301 if { hdr(host) -i 192.168.24.2 } !{ ssl_fc }
  rsprep ^Location:\ http://(.*) Location:\ https://\1
  timeout connect 5s
  timeout client 25s 
  timeout server 25s 
  timeout tunnel 14400s
  server undercloud-0 192.168.24.1:9000 check fall 5 inter 2000 rise 2
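
To check whether a generated haproxy.cfg contains the stanza above, a small parser along these lines may help. It is a minimal sketch that only recognizes section keywords and bind lines; the function name listen_binds is hypothetical, and the list of section keywords is a simplification rather than the full HAProxy grammar.

```python
# Section keywords that start a new block in haproxy.cfg (simplified list,
# assumed sufficient for this check).
SECTION_KEYWORDS = {
    "global", "defaults", "frontend", "backend", "listen",
    "userlist", "peers", "resolvers",
}


def listen_binds(config_text: str) -> dict:
    """Map each `listen` section name to the addresses on its `bind` lines."""
    sections = {}
    current = None  # name of the listen section being read, if any
    for raw in config_text.splitlines():
        words = raw.strip().split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comments
        keyword = words[0]
        if keyword in SECTION_KEYWORDS:
            # Any section keyword ends the previous block.
            current = None
            if keyword == "listen" and len(words) > 1:
                current = words[1]
                sections[current] = []
        elif keyword == "bind" and current is not None and len(words) > 1:
            # Keep only the address:port, not options such as `ssl crt ...`.
            sections[current].append(words[1])
    return sections
```

Reading the configuration file into a string and testing `"zaqar_ws" in listen_binds(text)` would show whether the stanza is present. The location of the file depends on the deployment; on a containerized undercloud it is assumed to live under /var/lib/config-data/puppet-generated/haproxy/, which should be verified locally.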

Comment 3 Vath Sok 2020-10-15 13:44:23 UTC
I just updated the rhos-16.2-rhel-8 branch to puppet-tripleo-11.5.0-2.20201012005919.c49d8de.el8ost.

Comment 6 David Rosenfeld 2021-06-04 17:18:03 UTC
The command from the description now succeeds:

openstack overcloud node import --instance-boot-option=local /home/stack/instackenv.json
Waiting for messages on queue 'tripleo' with no timeout.


0 node(s) successfully moved to the "manageable" state.
Successfully registered node UUID 893a5f20-b221-4190-95db-70a30c548146
Successfully registered node UUID 0d98fae1-1652-47f1-bd09-5f8f25aaf4d1
Successfully registered node UUID 106b379e-848d-48d2-9cb4-acd94b4bdb17
Successfully registered node UUID 01133752-36f4-46df-9831-730104c91d4b
Successfully registered node UUID a81e3f1e-08f5-48be-b0e5-620f7482ad86

Comment 8 errata-xmlrpc 2021-09-15 07:09:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform (RHOSP) 16.2 enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:3483

