Bug 1847770 - [dcn] instance boot fails on DistributedComputeHCIScaleOut node: Error contacting glance server 'http://172.25.2.182:9292' for 'get', done trying.: glanceclient.exc.HTTPServiceUnavailable: HTTP 503 Service Unavailable
Summary: [dcn] instance boot fails on DistributedComputeHCIScaleOut node: Error contac...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: John Fulton
QA Contact: bkopilov
URL:
Whiteboard:
Depends On:
Blocks: 1802772
Reported: 2020-06-17 01:12 UTC by John Fulton
Modified: 2020-07-29 07:54 UTC
CC List: 8 users

Fixed In Version: openstack-tripleo-heat-templates-11.3.2-0.20200616081526.396affd.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-29 07:53:29 UTC
Target Upstream Version:
Embargoed:


Links
System                 | ID             | Private | Priority | Status | Summary                                                     | Last Updated
Launchpad              | 1883807        | 0       | None     | None   | None                                                        | 2020-06-17 01:17:18 UTC
OpenStack gerrit       | 737188         | 0       | None     | MERGED | [TRAIN ONLY] Fix the glance-api-edge firewall configuration | 2021-01-19 20:52:50 UTC
Red Hat Product Errata | RHBA-2020:3148 | 0       | None     | None   | None                                                        | 2020-07-29 07:54:39 UTC

Description John Fulton 2020-06-17 01:12:35 UTC
When deploying HCI nodes on the edge with the following node count per DCN site:

  DistributedComputeHCICount: 3
  DistributedComputeHCIScaleOutCount: 1

instances booted on DistributedComputeHCI nodes with a command like the following work fine:

openstack server create instance2 --image 42af42c6-c663-49b1-a800-feb6f2547371 --flavor m1.nano --nic net-id=afb1c6dd-b425-4cec-b0d3-c70cc837df01 --availability-zone az-dcn1 --hypervisor-hostname dcn1-computehci1-0.redhat.local

However, instances fail to boot on DistributedComputeHCIScaleOut nodes when running the following:

openstack server create instance3 --image 42af42c6-c663-49b1-a800-feb6f2547371 --flavor m1.nano --nic net-id=afb1c6dd-b425-4cec-b0d3-c70cc837df01 --availability-zone az-dcn1 --hypervisor-hostname dcn1-computehciscaleout1-0.redhat.local 

The following is seen in the nova-compute log:

/var/log/containers/nova/nova-compute.log:60:2020-06-16 11:05:26.822 8 ERROR nova.image.glance [req-05906d7b-ca3e-4977-bbb3-3d400c85f9ea efb7deed85904588a504616e95035276 699ce3c09f914440ae9408375f5af143 - default default] Error contacting glance server 'http://172.25.2.182:9292' for 'get', done trying.: glanceclient.exc.HTTPServiceUnavailable: HTTP 503 Service Unavailable: No server is available to handle this request.
/var/log/containers/nova/nova-compute.log:61:2020-06-16 11:05:26.822 8 ERROR nova.image.glance Traceback (most recent call last):
/var/log/containers/nova/nova-compute.log:62:2020-06-16 11:05:26.822 8 ERROR nova.image.glance   File "/usr/lib/python3.6/site-packages/nova/image/glance.py", line 193, in call
/var/log/containers/nova/nova-compute.log:63:2020-06-16 11:05:26.822 8 ERROR nova.image.glance     result = getattr(controller, method)(*args, **kwargs)
/var/log/containers/nova/nova-compute.log:64:2020-06-16 11:05:26.822 8 ERROR nova.image.glance   File "/usr/lib/python3.6/site-packages/glanceclient/v2/images.py", line 198, in get
/var/log/containers/nova/nova-compute.log:65:2020-06-16 11:05:26.822 8 ERROR nova.image.glance     return self._get(image_id)
/var/log/containers/nova/nova-compute.log:66:2020-06-16 11:05:26.822 8 ERROR nova.image.glance   File "/usr/lib/python3.6/site-packages/glanceclient/common/utils.py", line 598, in inner
/var/log/containers/nova/nova-compute.log:67:2020-06-16 11:05:26.822 8 ERROR nova.image.glance     return RequestIdProxy(wrapped(*args, **kwargs))
/var/log/containers/nova/nova-compute.log:68:2020-06-16 11:05:26.822 8 ERROR nova.image.glance   File "/usr/lib/python3.6/site-packages/glanceclient/v2/images.py", line 191, in _get
/var/log/containers/nova/nova-compute.log:69:2020-06-16 11:05:26.822 8 ERROR nova.image.glance     resp, body = self.http_client.get(url, headers=header)
/var/log/containers/nova/nova-compute.log:70:2020-06-16 11:05:26.822 8 ERROR nova.image.glance   File "/usr/lib/python3.6/site-packages/keystoneauth1/adapter.py", line 386, in get
/var/log/containers/nova/nova-compute.log:71:2020-06-16 11:05:26.822 8 ERROR nova.image.glance     return self.request(url, 'GET', **kwargs)
/var/log/containers/nova/nova-compute.log:72:2020-06-16 11:05:26.822 8 ERROR nova.image.glance   File "/usr/lib/python3.6/site-packages/glanceclient/common/http.py", line 387, in request
/var/log/containers/nova/nova-compute.log:73:2020-06-16 11:05:26.822 8 ERROR nova.image.glance     return self._handle_response(resp)
/var/log/containers/nova/nova-compute.log:74:2020-06-16 11:05:26.822 8 ERROR nova.image.glance   File "/usr/lib/python3.6/site-packages/glanceclient/common/http.py", line 126, in _handle_response
/var/log/containers/nova/nova-compute.log:75:2020-06-16 11:05:26.822 8 ERROR nova.image.glance     raise exc.from_response(resp, resp.content)
/var/log/containers/nova/nova-compute.log:76:2020-06-16 11:05:26.822 8 ERROR nova.image.glance glanceclient.exc.HTTPServiceUnavailable: HTTP 503 Service Unavailable: No server is available to handle this

Comment 1 John Fulton 2020-06-17 01:20:04 UTC
WORKAROUND:

Open port 9292 on the DistributedComputeHCI nodes
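
One way to check whether the workaround took effect is to test whether TCP port 9292 accepts connections on each DistributedComputeHCI node, run from a host that needs to reach Glance there. The following is a minimal Python sketch, not part of the original report: the helper names `port_is_open` and `check_hosts` are hypothetical, and the node addresses must be supplied by the operator. It only tests TCP reachability; it does not verify that Glance itself answers correctly.

```python
import socket

GLANCE_PORT = 9292  # Glance API port named in this bug


def port_is_open(host, port=GLANCE_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts (e.g. packets dropped by a
        # firewall), and unreachable hosts.
        return False


def check_hosts(hosts, port=GLANCE_PORT, timeout=3.0):
    """Map each host to True (port reachable) or False (blocked or unreachable)."""
    return {host: port_is_open(host, port, timeout) for host in hosts}
```

Calling `check_hosts([...])` with the addresses of the three DistributedComputeHCI nodes returns a dictionary; any `False` entry suggests port 9292 is still blocked or the node is unreachable.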

Comment 2 John Fulton 2020-06-17 01:20:48 UTC
Suspected root cause

The GlanceAPI service opens port 9292 because glance-api-container-puppet.yaml tells it to [1].
When the GlanceApiEdge service was created [2], it did not include a directive to open port 9292.
glance-api-edge-container-puppet.yaml [3] probably needs something similar to [1].

[1] https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/glance/glance-api-container-puppet.yaml#L365
[2] https://github.com/openstack/tripleo-heat-templates/commit/30ca49bf611d0c3b443df6e7a636628dee281303
[3] https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/glance/glance-api-edge-container-puppet.yaml

Comment 20 errata-xmlrpc 2020-07-29 07:53:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3148

