Bug 1738344 - openstack overcloud config download very slow. Also, deployment with config download fails because we bust the 3600 websocket timeout
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z8
Assignee: Emilien Macchi
QA Contact: Victor Voronkov
URL:
Whiteboard:
Duplicates: 1644917
Depends On:
Blocks: 1644920
Reported: 2019-08-06 21:22 UTC by David Vallee Delisle
Modified: 2023-03-24 15:10 UTC (History)

Fixed In Version: openstack-tripleo-common-8.6.8-12.el7ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-03 16:55:57 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1838861 | 0 | None | None | None | 2019-08-06 21:22:02 UTC
OpenStack gerrit | 674584 | 0 | 'None' | MERGED | Fix expensive api call to get deployment_name | 2020-05-05 01:51:25 UTC
Red Hat Product Errata | RHBA-2019:2624 | 0 | None | None | None | 2019-09-03 16:56:17 UTC

Description David Vallee Delisle 2019-08-06 21:22:03 UTC
Description of problem:
Running openstack overcloud config download takes more than 10 minutes.

When running a deployment with the config-download switch, the tripleo client fails [1] because the 1h (3600 s) websocket timeout in /usr/lib/python2.7/site-packages/tripleoclient/workflows/deployment.py is exceeded.
The timeout is exceeded because get_config or GetOvercloudConfig takes too long to complete.
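
To illustrate the failure mode, here is a minimal sketch of a client waiting on a message stream with an overall deadline while a server-side step takes longer than that deadline. This is not the tripleoclient code: a queue.Queue stands in for the websocket, and WaitTimeout, slow_workflow, run and the "SUCCESS"/"FAILED" status values are assumptions made for the example. In the real traceback [1] the failure surfaces as a closed connection rather than a dedicated timeout exception.

```python
import queue
import threading
import time


class WaitTimeout(Exception):
    """Raised when no final message arrives in time (hypothetical name)."""


def wait_for_messages(messages, timeout):
    """Yield payloads from `messages` until a final status arrives.

    `messages` is a queue.Queue standing in for the websocket; `timeout`
    is the total number of seconds the caller is willing to wait.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise WaitTimeout("no final message within %s seconds" % timeout)
        try:
            payload = messages.get(timeout=remaining)
        except queue.Empty:
            raise WaitTimeout("no final message within %s seconds" % timeout)
        yield payload
        # Assumed convention: a terminal status ends the stream.
        if payload.get("status") in ("SUCCESS", "FAILED"):
            return


def slow_workflow(messages, duration):
    """Simulate a server-side step that reports only after `duration` seconds."""
    time.sleep(duration)
    messages.put({"status": "SUCCESS"})


def run(duration, timeout):
    """Start the simulated workflow and collect the statuses the client sees."""
    messages = queue.Queue()
    worker = threading.Thread(target=slow_workflow, args=(messages, duration))
    worker.daemon = True
    worker.start()
    try:
        return [p["status"] for p in wait_for_messages(messages, timeout)]
    except WaitTimeout:
        return ["TIMEOUT"]
```

With a workflow shorter than the deadline, run() returns the final status; once the workflow outlasts the deadline, the client gives up even though the server would eventually have finished. Raising the deadline only hides the problem, which is why speeding up the slow step is the better fix.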

Emilien told me about this patch: https://review.opendev.org/#/c/674584/

Applying it to the code resolves both issues.
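
This report gives only the patch title ("Fix expensive api call to get deployment_name"), not its contents, so the following is not the patch. It is only a generic sketch of one common way to cut down repeated expensive lookups: ask for each distinct id once and reuse the answer. FakeOrchestrationClient, get_deployment_name, names_uncached and names_cached are hypothetical names invented for the example.

```python
class FakeOrchestrationClient:
    """Hypothetical stand-in for a slow API; counts how often it is called."""

    def __init__(self, names):
        self._names = names
        self.calls = 0

    def get_deployment_name(self, deployment_id):
        self.calls += 1  # in practice each call would be a slow round trip
        return self._names[deployment_id]


def names_uncached(client, deployment_ids):
    """One API call per item, even when the same id repeats."""
    return [client.get_deployment_name(d) for d in deployment_ids]


def names_cached(client, deployment_ids):
    """Look each distinct id up once and reuse the result."""
    cache = {}
    result = []
    for d in deployment_ids:
        if d not in cache:
            cache[d] = client.get_deployment_name(d)
        result.append(cache[d])
    return result
```

Both functions return the same list; the difference is how many calls reach the API, which is the cost that matters when each call takes seconds and the loop runs over every deployment in the overcloud.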

Version-Release number of selected component (if applicable):
openstack-tripleo-common-8.6.8-11.el7ost.noarch

How reproducible:
All the time

Additional info:
[1]
~~~
clean_up DeployOvercloud: Connection is already closed.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/osc_lib/shell.py", line 134, in run
    ret_val = super(OpenStackShell, self).run(argv)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 279, in run
    result = self.run_subcommand(remainder)
  File "/usr/lib/python2.7/site-packages/osc_lib/shell.py", line 169, in run_subcommand
    ret_value = super(OpenStackShell, self).run_subcommand(argv)
  File "/usr/lib/python2.7/site-packages/cliff/app.py", line 400, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python2.7/site-packages/tripleoclient/command.py", line 25, in run
    super(Command, self).run(parsed_args)
  File "/usr/lib/python2.7/site-packages/osc_lib/command/command.py", line 41, in run
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python2.7/site-packages/cliff/command.py", line 184, in run
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python2.7/site-packages/tripleoclient/v1/overcloud_deploy.py", line 1027, in take_action
    verbosity=self.app_args.verbose_level)
  File "/usr/lib/python2.7/site-packages/tripleoclient/workflows/deployment.py", line 259, in config_download
    3600):
  File "/usr/lib/python2.7/site-packages/tripleoclient/workflows/base.py", line 61, in wait_for_messages
    for payload in websocket.wait_for_messages(timeout=timeout):
  File "/usr/lib/python2.7/site-packages/tripleoclient/plugin.py", line 158, in wait_for_messages
    yield self.recv()['body']['payload']
  File "/usr/lib/python2.7/site-packages/tripleoclient/plugin.py", line 132, in recv
    return json.loads(self._ws.recv())
  File "/usr/lib/python2.7/site-packages/websocket/_core.py", line 313, in recv
    opcode, data = self.recv_data()
  File "/usr/lib/python2.7/site-packages/websocket/_core.py", line 330, in recv_data
    opcode, frame = self.recv_data_frame(control_frame)
  File "/usr/lib/python2.7/site-packages/websocket/_core.py", line 343, in recv_data_frame
    frame = self.recv_frame()
  File "/usr/lib/python2.7/site-packages/websocket/_core.py", line 377, in recv_frame
    return self.frame_buffer.recv_frame()
  File "/usr/lib/python2.7/site-packages/websocket/_abnf.py", line 361, in recv_frame
    self.recv_header()
  File "/usr/lib/python2.7/site-packages/websocket/_abnf.py", line 309, in recv_header
    header = self.recv_strict(2)
  File "/usr/lib/python2.7/site-packages/websocket/_abnf.py", line 396, in recv_strict
    bytes_ = self.recv(min(16384, shortage))
  File "/usr/lib/python2.7/site-packages/websocket/_core.py", line 452, in _recv
    return recv(self.sock, bufsize)
  File "/usr/lib/python2.7/site-packages/websocket/_socket.py", line 115, in recv
    "Connection is already closed.")
WebSocketConnectionClosedException: Connection is already closed.

~~~

Comment 1 Alex Schultz 2019-08-07 15:36:59 UTC
*** Bug 1644917 has been marked as a duplicate of this bug. ***

Comment 9 Victor Voronkov 2019-08-22 19:22:35 UTC
Verified on compose 2019-08-13.1.
Overcloud deployment using config-download passed without hitting the timeout.

Comment 12 errata-xmlrpc 2019-09-03 16:55:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2624

