Bug 1374971

Summary: rhel-osp-director: os-collect-config[5195]: HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /latest/meta-data/ (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.HTTPConnection object at 0x235b290>
Product: Red Hat OpenStack
Reporter: Alexander Chuzhoy <sasha>
Component: rhosp-director
Assignee: Angus Thomas <athomas>
Status: CLOSED WORKSFORME
QA Contact: Omri Hochman <ohochman>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 10.0 (Newton)
CC: dbecker, jschluet, jslagle, mburns, mcornea, morazi, rhel-osp-director-maint, sasha
Target Milestone: ga
Target Release: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Last Closed: 2016-09-15 14:58:07 UTC
Type: Bug

Description Alexander Chuzhoy 2016-09-11 05:23:18 UTC
rhel-osp-director:   os-collect-config[5195]: HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /latest/meta-data/ (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.HTTPConnection object at 0x235b290>


Environment:
openstack-puppet-modules-9.0.0-0.20160802183056.8c758d6.el7ost.noarch
instack-undercloud-5.0.0-0.20160818065636.41ef775.el7ost.noarch
openstack-tripleo-heat-templates-5.0.0-0.20160823140311.72404b.1.el7ost.noarch



Steps to reproduce:
Attempt an overcloud deployment with:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 2 --neutron-network-type vxlan --neutron-tunnel-types vxlan -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml --ntp-server clock.redhat.com

Result:
After roughly four hours the deployment times out and fails:
2016-09-10 23:11:17 [ControllerDeployment]: CREATE_IN_PROGRESS state changed
2016-09-10 23:11:18 [NetIpMap]: UPDATE_IN_PROGRESS state changed
2016-09-10 23:11:20 [NetworkConfig]: UPDATE_COMPLETE state changed
2016-09-10 23:11:20 [NetIpMap]: UPDATE_COMPLETE state changed
2016-09-10 23:11:21 [ControllerDeployment]: CREATE_IN_PROGRESS state changed
2016-09-10 23:11:22 [NetIpMap]: UPDATE_COMPLETE state changed
2016-09-10 23:11:23 [ControllerDeployment]: CREATE_IN_PROGRESS state changed
2016-09-11 03:07:29 [Compute]: UPDATE_FAILED UPDATE aborted
2016-09-11 03:07:29 [1]: UPDATE_FAILED UPDATE aborted
2016-09-11 03:07:30 [Controller]: UPDATE_FAILED UPDATE aborted
2016-09-11 03:07:30 [0]: UPDATE_FAILED UPDATE aborted
2016-09-11 03:07:30 [overcloud]: UPDATE_FAILED Timed out
2016-09-11 03:07:31 [overcloud-Compute-zdwrvanvvl5h]: UPDATE_FAILED Operation cancelled
2016-09-11 03:07:31 [0]: UPDATE_FAILED UPDATE aborted
2016-09-11 03:07:31 [1]: UPDATE_FAILED UPDATE aborted
2016-09-11 03:07:31 [2]: UPDATE_FAILED UPDATE aborted
2016-09-11 03:07:32 [overcloud-Controller-hhcxmpnnkvgk]: UPDATE_FAILED Operation cancelled
Stack overcloud UPDATE_FAILED
Heat Stack update failed.

Checking the os-collect-config logs on the overcloud (OC) nodes, I see the following error repeating:
os-collect-config[4857]: HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /latest/meta-data/ (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x24c0710>: Failed to establish a new connection: [Errno 101] Network is unreachable',))
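
Note that the summary reports a ConnectTimeoutError, while the log line above reports Errno 101 (Network is unreachable). The two point at different failure modes. A minimal sketch for telling them apart from a node, assuming a Python 3 interpreter is available there; the function names and diagnosis labels are illustrative and not part of os-collect-config:

```python
import errno
import socket

METADATA_HOST = "169.254.169.254"  # address taken from the log line above
METADATA_PORT = 80


def classify(exc):
    """Map a connection exception to a short, illustrative diagnosis label."""
    if isinstance(exc, socket.timeout):
        # A route exists, but nothing answered before the timeout expired.
        return "timeout"
    if isinstance(exc, OSError) and exc.errno == errno.ENETUNREACH:
        # Errno 101 on Linux: the kernel has no route to the address.
        return "no-route"
    if isinstance(exc, ConnectionRefusedError):
        # The host was reached, but nothing is listening on the port.
        return "refused"
    return "other"


def probe(host=METADATA_HOST, port=METADATA_PORT, timeout=3.0):
    """Attempt a TCP connection and return a diagnosis label."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except OSError as exc:
        return classify(exc)
```

Calling `print(probe())` on an affected node would be expected to print `no-route` in the Errno 101 case and `timeout` in the ConnectTimeoutError case.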

Comment 3 Marius Cornea 2016-09-12 10:05:50 UTC
(In reply to Alexander Chuzhoy from comment #0)
> Checking os-collect-config on the OC nodes I see the following error
> repeating:
> os-collect-config[4857]: HTTPConnectionPool(host='169.254.169.254',
> port=80): Max retries exceeded with url: /latest/meta-data/ (Caused by
> NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection
> object at 0x24c0710>: Failed to establish a new connection: [Errno 101]
> Network is unreachable',))

This usually indicates a network configuration issue: the node cannot reach the metadata server. Check whether there is a route to the metadata server with 'ip r'. If there is no route, something probably went wrong when os-net-config applied the network configuration. The route is set as a static route via the EC2MetadataIp parameter.
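
The route check can also be scripted. The sketch below scans 'ip r' output for any entry whose destination covers the metadata address. It is only an illustration: the function name, the sample routing table and its 192.0.2.x addresses are hypothetical, and lines that begin with a route type keyword (such as "unreachable") are simply skipped.

```python
import ipaddress

METADATA_IP = "169.254.169.254"

# Hypothetical 'ip r' output, for illustration only.
SAMPLE = """\
192.0.2.0/24 dev eth0 proto kernel scope link src 192.0.2.10
169.254.169.254 via 192.0.2.1 dev eth0
"""


def routes_covering(ip_route_output, target=METADATA_IP):
    """Return the lines of 'ip r' output whose destination covers target."""
    addr = ipaddress.ip_address(target)
    matches = []
    for line in ip_route_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        dest = fields[0]
        if dest == "default":
            dest = "0.0.0.0/0"  # the default route covers every IPv4 address
        try:
            network = ipaddress.ip_network(dest, strict=False)
        except ValueError:
            # First field is not a destination (e.g. a route type keyword).
            continue
        if addr in network:
            matches.append(line.strip())
    return matches
```

On a node, the input could come from `subprocess.check_output(["ip", "r"], text=True)`. An empty result means no route covers the metadata address, which matches the Errno 101 symptom.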

Comment 4 James Slagle 2016-09-13 20:30:08 UTC
Sasha, please follow the initial troubleshooting steps in comment 3 to see whether the issue lies in the network configuration or in how it was applied.

Comment 5 Alexander Chuzhoy 2016-09-15 14:58:07 UTC
I could not reproduce the issue on the same setup.
Closing for now.
Thanks.