Bug 1585656
| Summary: | LM between OSP 12 and OSP 13 computes fail during major upgrade | | |
| --- | --- | --- | --- |
| Product: | Red Hat OpenStack | Reporter: | Lee Yarwood <lyarwood> |
| Component: | openstack-nova | Assignee: | Lee Yarwood <lyarwood> |
| Status: | CLOSED ERRATA | QA Contact: | Archit Modi <amodi> |
| Severity: | urgent | Priority: | urgent |
| Version: | 13.0 (Queens) | Target Release: | 13.0 (Queens) |
| Target Milestone: | rc | Keywords: | Triaged |
| Hardware: | x86_64 | OS: | Linux |
| Fixed In Version: | openstack-nova-17.0.3-0.20180420001140.el7ost | Type: | Bug |
| Last Closed: | 2018-06-27 13:57:12 UTC | | |
| CC: | amodi, berrange, dasmith, egallen, eglynn, jhakimra, kchamart, mbooth, sbauza, sferdjao, sgordon, srevivo, vromanso | | |
Description (Lee Yarwood, 2018-06-04 10:32:25 UTC)
I can also reproduce the same EndpointNotFound error by simply attempting to attach a volume to an instance on the older compute node:

$ . overcloudrc.v3
$ openstack server create --flavor m1.tiny --image cirros --network private --availability-zone nova:compute-0.localdomain lyarwood
$ cinder create 1
$ nova volume-attach lyarwood 657a9e82-4463-41b3-9290-c6b1ae785b7b

2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server [req-3d6e865d-4520-4b49-9dc5-6912b14a896c e16a043a84b14e2b8afbdd1b8677259f cb92ed750eac463faf8935cb137f1e60 - default default] Exception during message handling: EndpointNotFound: internalURL endpoint for volumev2 service named cinderv2 not found
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 160, in _process_incoming
    res = self.dispatcher.dispatch(message)
  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 213, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)
  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _do_dispatch
    result = func(ctxt, **new_args)
  File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 76, in wrapped
    function_name, call_dict, binary)
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 67, in wrapped
    return f(self, context, *args, **kw)
  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 218, in decorated_function
    kwargs['instance'], e, sys.exc_info())
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 206, in decorated_function
    return function(self, context, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4863, in attach_volume
    do_attach_volume(context, instance, driver_bdm)
  File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
    return f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4861, in do_attach_volume
    bdm.destroy()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4858, in do_attach_volume
    return self._attach_volume(context, instance, driver_bdm)
  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4886, in _attach_volume
    self.volume_api.unreserve_volume(context, bdm.volume_id)
  File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 235, in wrapper
    res = method(self, ctx, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 257, in wrapper
    res = method(self, ctx, volume_id, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 360, in unreserve_volume
    cinderclient(context).volumes.unreserve(volume_id)
  File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 116, in cinderclient
    url = _SESSION.get_endpoint(auth, **service_parameters)
  File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 947, in get_endpoint
    return auth.get_endpoint(self, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/context.py", line 78, in get_endpoint
    region_name=region_name)
  File "/usr/lib/python2.7/site-packages/positional/__init__.py", line 101, in inner
    return wrapped(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/keystoneauth1/access/service_catalog.py", line 344, in url_for
    endpoint_id=endpoint_id).url
  File "/usr/lib/python2.7/site-packages/positional/__init__.py", line 101, in inner
    return wrapped(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/keystoneauth1/access/service_catalog.py", line 407, in endpoint_data_for
    raise exceptions.EndpointNotFound(msg)
EndpointNotFound: internalURL endpoint for volumev2 service named cinderv2 not found

2018-06-04 12:40:34.554 1 INFO nova.compute.resource_tracker [req-6b36fe83-3688-421a-88a5-8ca82b3b4aea - - - - -] Final resource view: name=compute-0.localdomain phys_ram=6143MB used_ram=5632MB phys_disk=29GB used_disk=3GB total_vcpus=3 used_vcpus=3 pci_stats=[]

So I think the issue here is that nova.conf on the older compute is still pointing to an old internalURL endpoint that has been removed in 13:

$ sudo docker exec -u root -it nova_compute bash
# grep ^catalog_info /etc/nova/nova.conf
catalog_info=volumev2:cinderv2:internal

(In reply to Lee Yarwood from comment #4)
> catalog_info=volumev2:cinderv2:internal

Oops, I copied the edited version above. The original value, before I manually updated it within the running container, was:

catalog_info=volumev2:cinderv2:internalURL
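For context, a rough sketch of how the compute consumes that option, paraphrasing the nova/volume/cinder.py logic of this era rather than quoting it: catalog_info is split into a service_type:service_name:interface triple that becomes the catalog lookup visible in the traceback above.

# Rough sketch, not a verbatim copy of nova/volume/cinder.py.
catalog_info = 'volumev2:cinderv2:internalURL'

# The option is a service_type:service_name:interface triple.
service_type, service_name, endpoint_type = catalog_info.split(':')

# These become the parameters for the session's get_endpoint() call seen
# in the traceback. If the catalog carried by the request context has no
# volumev2/cinderv2 entry, EndpointNotFound is raised no matter which
# interface is requested.
service_parameters = {
    'service_type': service_type,
    'service_name': service_name,
    'interface': endpoint_type,
}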
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4863, in attach_volume 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server do_attach_volume(context, instance, driver_bdm) 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server return f(*args, **kwargs) 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4861, in do_attach_volume 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server bdm.destroy() 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server self.force_reraise() 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb) 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4858, in do_attach_volume 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server return self._attach_volume(context, instance, driver_bdm) 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4886, in _attach_volume 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server self.volume_api.unreserve_volume(context, bdm.volume_id) 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 235, in wrapper 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server res = method(self, ctx, *args, **kwargs) 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 257, in wrapper 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server res = method(self, ctx, volume_id, *args, **kwargs) 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 360, in unreserve_volume 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server cinderclient(context).volumes.unreserve(volume_id) 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 116, in cinderclient 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server url = _SESSION.get_endpoint(auth, **service_parameters) 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 947, in get_endpoint 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server return auth.get_endpoint(self, **kwargs) 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/nova/context.py", line 78, in get_endpoint 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server region_name=region_name) 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/positional/__init__.py", line 101, in inner 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server return wrapped(*args, **kwargs) 2018-06-04 12:40:22.452 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/keystoneauth1/access/service_catalog.py", line 344, in url_for 2018-06-04 
Fun times. I traced this back to the Queens n-api process that creates the initial request context later used by the Pike n-cpu compute. The following change, introduced during Queens, strips this request context of the catalog entry for the volumev2 endpoint (an illustrative sketch of the effect follows at the end of this report):

Update cinder in RequestContext service catalog
https://review.openstack.org/#/c/510947/

I'll file a bug upstream for this now and look at reverting the change on stable/queens, or at least reintroducing volumev2.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086
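As a footnote to the root cause above, a hypothetical sketch of the filtering behaviour; the whitelist is invented for illustration and is not the actual diff from the review linked above. The Queens API rebuilds the request context's service catalog from a whitelist of service types, so once volumev2 is absent from that whitelist, a Pike compute whose catalog_info still names volumev2 can never find its endpoint.

# Illustrative only: a hypothetical whitelist, not the change from
# review 510947.
ALLOWED_TYPES = ('volumev3', 'placement', 'image', 'key-manager')

def filter_service_catalog(service_catalog):
    # The API keeps only the catalog entries it believes computes need
    # before serializing the request context onto the message bus.
    return [s for s in service_catalog if s.get('type') in ALLOWED_TYPES]

full_catalog = [
    {'type': 'volumev2', 'name': 'cinderv2'},
    {'type': 'volumev3', 'name': 'cinderv3'},
    {'type': 'placement', 'name': 'placement'},
]

# A Pike compute receiving this stripped context has no volumev2 entry
# left to look up, hence the EndpointNotFound seen earlier.
print(filter_service_catalog(full_catalog))
# -> entries for cinderv3 and placement only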