Red Hat Bugzilla – Bug 1303154
Deploy times are longer in 7.2 vs 7.1.
Last modified: 2016-10-05 15:37:27 EDT
Description of problem:
A deployment without network isolation (using only the --templates parameter, with no -e parameters) took 25-35 minutes on 7.1 and now takes 50-71 minutes on 7.2.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Deploy with 7.1.
2. Deploy with 7.2.
3. Notice the time difference.
Deploy times are nearly 3x longer in some cases.
Will attach sosreport from undercloud after failed 7.2 deploy.
Customer thinks the deploy fails because of some timeout that may also be related to longer deploy times.
This is pretty much what I have found in the undercloud sosreport so far. Any suggestions on what else to ask for or look at? Thanks.
2016-01-28 16:00:49.155 57522 ERROR glance.registry.client.v1.client [req-4e3ea720-2393-4bcf-80e8-9c7ed3c55f5b 87ee5216df014b1bbb17ceb4057c080f 9df580d8d6904762a0edd0d49d3f9092 - - -] Registry client request GET /images/bm-deploy-ramdisk raised NotFound
2016-01-28 18:12:49.245 60446 ERROR wsme.api [-] Server-side error: "Invalid control character '\n' at: line 1 column 137 (char 136)". Detail:
2016-01-28 18:44:30.290 60467 ERROR oslo_messaging._drivers.common [-] Returning exception Node 77cb10b9-9e21-461b-9297-eb23316bdd73 is associated with instance 9c3811a9-100f-4fc6-96b7-2c466f58e6c2. to caller
2016-01-28 18:44:30.290 60467 ERROR oslo_messaging._drivers.common [-] ['Traceback (most recent call last):\n', ' File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply\n executor_callback))\n', ' File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch\n executor_callback)\n', ' File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch\n result = func(ctxt, **new_args)\n', ' File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 142, in inner\n return func(*args, **kwargs)\n', ' File "/usr/lib/python2.7/site-packages/ironic/conductor/manager.py", line 405, in update_node\n node_obj.save()\n', ' File "/usr/lib/python2.7/site-packages/ironic/objects/base.py", line 143, in wrapper\n return fn(self, ctxt, *args, **kwargs)\n', ' File "/usr/lib/python2.7/site-packages/ironic/objects/node.py", line 265, in save\n self.dbapi.update_node(self.uuid, updates)\n', ' File "/usr/lib/python2.7/site-packages/ironic/db/sqlalchemy/api.py", line 338, in update_node\n return self._do_update_node(node_id, values)\n', ' File "/usr/lib/python2.7/site-packages/ironic/db/sqlalchemy/api.py", line 364, in _do_update_node\n instance=ref.instance_uuid)\n', 'NodeAssociated: Node 77cb10b9-9e21-461b-9297-eb23316bdd73 is associated with instance 9c3811a9-100f-4fc6-96b7-2c466f58e6c2.\n']
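The traceback in the last log line is logged as a Python list of strings on a single line, which is hard to read. A minimal sketch for unpacking such a line into a normal multi-line traceback follows. The helper name format_logged_traceback is hypothetical, not part of any OpenStack tool, and the sample below is an abbreviated copy of the line above (first and last list elements only).

```python
import ast


def format_logged_traceback(log_line):
    """Return the traceback embedded in a log line as readable text.

    Assumes the traceback was logged as the repr of a list of strings
    that runs to the end of the line. Returns None if no such list is found
    or if it cannot be parsed (for example, a truncated line).
    """
    start = log_line.find("['Traceback")
    if start == -1:
        return None
    try:
        # literal_eval only parses Python literals; it does not execute code.
        frames = ast.literal_eval(log_line[start:].strip())
    except (ValueError, SyntaxError):
        return None
    return "".join(frames)


# Abbreviated sample: the middle frames of the original line are omitted.
sample = (
    r"2016-01-28 18:44:30.290 60467 ERROR oslo_messaging._drivers.common [-] "
    r"['Traceback (most recent call last):\n', "
    r"'NodeAssociated: Node 77cb10b9-9e21-461b-9297-eb23316bdd73 is associated "
    r"with instance 9c3811a9-100f-4fc6-96b7-2c466f58e6c2.\n']"
)

print(format_logged_traceback(sample))
```

Running the same function over the full line from the sosreport should print each frame on its own line, ending with the NodeAssociated exception.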
(In reply to Jeremy from comment #0)
I don't think the deployment at the customer site is timing out because the deploy takes too long. I think it is hanging because their network configuration is somehow invalid.
We have increased the base deployment time due to additional verifications and steps to ensure proper upgrade functionality.
I don't think the right way to look at those increases is "3x longer than before". The proper way to express them is "30 minutes longer than before".
So, if a deployment with full network isolation used to take 60 minutes, we would now expect it to take ~90 minutes, not 180. We have added to the deployment time, but it's a fixed additive increase, not a multiplicative one.
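The difference between the two readings can be checked against the numbers quoted in this bug. The sketch below is only illustrative: the function names are made up, the 30-minute overhead and the 3x factor are the figures from the comments, and pairing the low ends and the high ends of the two reported ranges is an assumption.

```python
# Deploy times quoted in the description (minutes, no network isolation).
OLD_RANGE = (25, 35)  # 7.1
NEW_RANGE = (50, 71)  # 7.2


def additive_estimate(old_minutes, overhead_minutes=30):
    """Fixed-overhead model: every deploy gains the same number of minutes."""
    return old_minutes + overhead_minutes


def multiplicative_estimate(old_minutes, factor=3):
    """Scaling model: every deploy takes `factor` times as long."""
    return old_minutes * factor


# Compare low end with low end and high end with high end (an assumption).
deltas = [new - old for old, new in zip(OLD_RANGE, NEW_RANGE)]
ratios = [round(new / old, 2) for old, new in zip(OLD_RANGE, NEW_RANGE)]

print("added minutes:", deltas)
print("ratios:", ratios)
print("60-minute deploy, additive model:", additive_estimate(60))
print("60-minute deploy, 3x model:", multiplicative_estimate(60))
```

With this pairing, the reported ranges differ by 25 and 36 minutes, which is close to a fixed 30-minute overhead, and the ratio is about 2x rather than 3x.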
We aren't getting widespread timeouts due to increases in deployment time across the board, so I think there is a misconfiguration in this case.
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
Obsolete since we already have 7.3 and newer releases. Please re-open if still valid.