Bug 1303154 - Deploy times are longer in 7.2 vs 7.1.
Status: CLOSED CANTFIX
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 10.0 (Newton)
Assigned To: Hugh Brock
QA Contact: Shai Revivo
Depends On:
Blocks:
Reported: 2016-01-29 12:27 EST by Jeremy
Modified: 2016-10-05 15:37 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-10-05 15:37:27 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Jeremy 2016-01-29 12:27:04 EST
Description of problem:

A deploy with no network isolation (using only the --templates parameter and no -e parameters) used to take 25-35 minutes on 7.1. On 7.2 it now takes 50-71 minutes.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Deploy with 7.1.
2. Deploy with 7.2.
3. Compare the deploy times.

Actual results:
Deploy times are nearly 3x longer in some cases.

Expected results:
Similar deploy times on both versions.

Additional info:
Will attach sosreport from undercloud after failed 7.2 deploy.
Customer thinks the deploy fails because of some timeout that may also be related to longer deploy times.
Comment 3 Jeremy 2016-02-01 03:59:57 EST
This is pretty much what I have found in the undercloud sosreport so far. Any suggestions on what else to ask for or look at? Thanks.


/glance/api.log
Multiple:
2016-01-28 16:00:49.155 57522 ERROR glance.registry.client.v1.client [req-4e3ea720-2393-4bcf-80e8-9c7ed3c55f5b 87ee5216df014b1bbb17ceb4057c080f 9df580d8d6904762a0edd0d49d3f9092 - - -] Registry client request GET /images/bm-deploy-ramdisk raised NotFound

/ironic/api.log
Multiple:
2016-01-28 18:12:49.245 60446 ERROR wsme.api [-] Server-side error: "Invalid control character '\n' at: line 1 column 137 (char 136)". Detail:

/ironic/ironic-conductor.log
2016-01-28 18:44:30.290 60467 ERROR oslo_messaging._drivers.common [-] Returning exception Node 77cb10b9-9e21-461b-9297-eb23316bdd73 is associated with instance 9c3811a9-100f-4fc6-96b7-2c466f58e6c2. to caller
2016-01-28 18:44:30.290 60467 ERROR oslo_messaging._drivers.common [-] ['Traceback (most recent call last):\n', '  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply\n    executor_callback))\n', '  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch\n    executor_callback)\n', '  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch\n    result = func(ctxt, **new_args)\n', '  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 142, in inner\n    return func(*args, **kwargs)\n', '  File "/usr/lib/python2.7/site-packages/ironic/conductor/manager.py", line 405, in update_node\n    node_obj.save()\n', '  File "/usr/lib/python2.7/site-packages/ironic/objects/base.py", line 143, in wrapper\n    return fn(self, ctxt, *args, **kwargs)\n', '  File "/usr/lib/python2.7/site-packages/ironic/objects/node.py", line 265, in save\n    self.dbapi.update_node(self.uuid, updates)\n', '  File "/usr/lib/python2.7/site-packages/ironic/db/sqlalchemy/api.py", line 338, in update_node\n    return self._do_update_node(node_id, values)\n', '  File "/usr/lib/python2.7/site-packages/ironic/db/sqlalchemy/api.py", line 364, in _do_update_node\n    instance=ref.instance_uuid)\n', 'NodeAssociated: Node 77cb10b9-9e21-461b-9297-eb23316bdd73 is associated with instance 9c3811a9-100f-4fc6-96b7-2c466f58e6c2.\n']
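The traceback in the ironic-conductor.log entry above is logged as the repr of a Python list of strings, which is hard to read on one line. A minimal sketch for turning such a payload back into a normal traceback, assuming the bracketed part of the line has been copied out on its own (the helper name `render_traceback` is hypothetical, and the sample is only the first and last elements of the entry above):

```python
import ast


def render_traceback(logged_list: str) -> str:
    """Convert the repr of a list of traceback lines into readable text."""
    # literal_eval only parses Python literals, so it is safe on log content.
    parts = ast.literal_eval(logged_list)
    return "".join(parts)


# Shortened excerpt of the logged list (first and last elements only).
sample = (
    r"['Traceback (most recent call last):\n', "
    r"'NodeAssociated: Node 77cb10b9-9e21-461b-9297-eb23316bdd73 is associated "
    r"with instance 9c3811a9-100f-4fc6-96b7-2c466f58e6c2.\n']"
)

print(render_traceback(sample))
```

Read this way, the final line shows the actual exception (NodeAssociated): the update was rejected because the node was already associated with an instance.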
Comment 4 Dan Sneddon 2016-02-03 12:35:40 EST
(In reply to Jeremy from comment #0)

I don't think the deployment at the customer site is timing out because the deploy takes too long. I think it is hanging because their network configuration is somehow invalid.

We have increased the base deployment time due to additional verifications and steps to ensure proper upgrade functionality.

I don't think "3x longer than before" is the right way to look at those increases. The proper way to express them is "30 minutes longer than before".

So, if a deployment with full network isolation used to take 60 minutes, we would now expect it to take ~90 minutes, not 180. We have added to the deployment time, but it's a fixed (additive) increase, not a multiplicative one.

We aren't getting widespread timeouts due to increases in deployment time across the board, so I think there is a misconfiguration in this case.
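
To illustrate the distinction drawn in this comment, here is a small sketch comparing a fixed 30-minute increase with a 3x increase, using only the figures quoted in this bug. The function names are hypothetical, and the 30-minute constant is simply the estimate given above, not a measured value.

```python
FIXED_OVERHEAD_MIN = 30  # estimate quoted in this comment, not a measurement


def additive_estimate(old_minutes, overhead=FIXED_OVERHEAD_MIN):
    """Expected time if 7.2 adds a fixed overhead to every deploy."""
    return old_minutes + overhead


def multiplicative_estimate(old_minutes, factor=3):
    """Expected time if 7.2 scaled every deploy by a constant factor."""
    return old_minutes * factor


# 25 and 35 are the reported 7.1 times; 60 is the example used in this comment.
for old in (25, 35, 60):
    print(old, additive_estimate(old), multiplicative_estimate(old))
```

Under the additive model the reported 25-35 minute deploys would become 55-65 minutes, which falls inside the 50-71 minutes observed on 7.2. A 3x model would predict 75-105 minutes.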
Comment 6 Mike Burns 2016-04-07 17:07:13 EDT
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.
Comment 8 Jaromir Coufal 2016-10-05 15:37:27 EDT
Obsolete, since 7.3 and newer releases are already available. Please re-open if this is still valid.
