Bug 1400746 - [3.4] Installing on AWS in Ohio (us-east-2c) fails
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Pod
Version: 3.3.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.4.z
Assigned To: Derek Carr
QA Contact: Johnny Liu
Depends On:
Blocks: 1406889
 
Reported: 2016-12-01 20:21 EST by Christian Hernandez
Modified: 2017-01-31 15:19 EST
CC: 6 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The us-east-2, eu-west-2, ap-south-1, and ca-central-1 AWS regions have been added to the product, enabling cloud provider support for those regions.
Story Points: ---
Clone Of:
Clones: 1406889
Environment:
Last Closed: 2017-01-31 15:19:15 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers:
Red Hat Product Errata RHBA-2017:0218 | Priority: normal | Status: SHIPPED_LIVE | Red Hat OpenShift Container Platform 3.4.1.2 bug fix update | Last Updated: 2017-01-31 20:18:20 EST

Description Christian Hernandez 2016-12-01 20:21:38 EST
Description of problem:

Performing the install on AWS in Ohio (us-east-2c) fails at the step where it tries to start the master. It gives the following error:

```
Dec 01 18:03:35 ip-172-31-47-71.us-east-2.compute.internal atomic-openshift-master[22654]: F1201 18:03:35.234574   22654 start_master.go:103] could not init cloud provider "aws": not a valid AWS zone (unknown region): us-east-2c
```

The same Ansible hosts file works in us-west-1.

Version-Release number of selected component (if applicable):

[root@ip-172-31-13-56 ~]# ansible --version
ansible 2.2.0.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = Default w/o overrides

[root@ip-172-31-13-56 ~]# oc version
oc v3.3.1.5
kubernetes v1.3.0+52492b4
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-31-13-56.us-west-1.compute.internal:8443
openshift v3.3.1.5
kubernetes v1.3.0+52492b4

[root@ip-172-31-13-56 ~]# rpm -q atomic-openshift-utils
atomic-openshift-utils-3.3.54-1.git.0.61a1dee.el7.noarch


How reproducible:

All the time

Steps to Reproduce:
1. Spin up instances in us-east-2c (Ohio)
2. Go through the prereq/host preparation steps
3. Install OpenShift using aws cloud provider profile

Actual results:

Install Fails

Expected results:

Install Succeeds

Additional info:

Ansible Host file - https://paste.fedoraproject.org/495276/41658148/
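
For context on the failure mode: in Kubernetes of this vintage, the AWS cloud provider derives the region by stripping the trailing zone letter from the configured zone, then checks the result against a hardcoded list of known regions that predated us-east-2 (Ohio). A minimal sketch, not the upstream source; names like azToRegion/isRegionValid and the exact region list are illustrative assumptions:

```
package main

import (
	"fmt"
	"os"
)

// knownRegions mimics a hardcoded list that predates us-east-2 (Ohio);
// the fix tracked by this bug effectively adds us-east-2, eu-west-2,
// ap-south-1, and ca-central-1 to it.
var knownRegions = []string{
	"us-east-1", "us-west-1", "us-west-2",
	"eu-west-1", "eu-central-1",
	"ap-southeast-1", "ap-southeast-2", "ap-northeast-1",
	"sa-east-1",
}

// azToRegion derives the region by dropping the trailing zone letter,
// e.g. "us-east-2c" -> "us-east-2".
func azToRegion(az string) (string, error) {
	if len(az) < 2 {
		return "", fmt.Errorf("invalid AZ: %q", az)
	}
	return az[:len(az)-1], nil
}

func isRegionValid(region string) bool {
	for _, r := range knownRegions {
		if r == region {
			return true
		}
	}
	return false
}

func main() {
	zone := "us-east-2c" // what the installer wrote into aws.conf
	region, err := azToRegion(zone)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if !isRegionValid(region) {
		// Same shape as the fatal error in the master log above.
		fmt.Printf("could not init cloud provider %q: not a valid AWS zone (unknown region): %s\n", "aws", zone)
		os.Exit(1)
	}
	fmt.Println("region accepted:", region)
}
```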
Comment 1 Brenton Leanhardt 2016-12-02 09:03:00 EST
Is the region actually us-east-2c?  I think that is an availability zone.
Comment 2 Christian Hernandez 2016-12-02 12:04:48 EST
The region is actually us-east-2, but for some reason /etc/origin/cloudconfig/aws.conf says us-east-2c.
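
For reference, the file quoted above is the Kubernetes AWS provider's INI config; a minimal sketch of its shape, with an illustrative value. The Zone key is what the zone-to-region logic (see the sketch under the description) parses, which is why an availability zone rather than a bare region shows up in the error:

```
[Global]
Zone = us-east-2c
```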
Comment 3 Brenton Leanhardt 2016-12-02 13:08:29 EST
Just to be clear are you setting the 'Zone' as in https://docs.openshift.com/container-platform/3.3/install_config/configuring_aws.html ?

I'd be curious to know if us-east-1c works so we could direct this bug accordingly.
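
For reference, in openshift-ansible of this vintage the AWS cloud provider is enabled through inventory variables along these lines (names per the linked docs; key values are placeholders):

```
[OSEv3:vars]
openshift_cloudprovider_kind=aws
openshift_cloudprovider_aws_access_key=AKIAEXAMPLE
openshift_cloudprovider_aws_secret_key=examplesecretkey
```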
Comment 4 Christian Hernandez 2016-12-02 13:20:55 EST
I am setting it up using the Ansible hosts file, so it's whatever the installer populates that file with.

I'm going to test it in us-east-1c and report back whether it works.
Comment 5 Brenton Leanhardt 2016-12-02 13:51:05 EST
Just to make sure there isn't something stale in the openshift ansible facts, can you confirm which zone your host is actually in?  I see from the hostname that it's in the us-east-2 region (so it probably doesn't make sense to override aws.conf with us-east-1c).
Comment 6 Christian Hernandez 2016-12-02 14:39:28 EST
So I got a similar failure in us-east-1 to the one in us-east-2:

```
TASK [openshift_master : Start and enable master] ******************************
FAILED - RETRYING: TASK: openshift_master : Start and enable master (1 retries left).
fatal: [ip-172-31-184-203.ec2.internal]: FAILED! => {
    "attempts": 1, 
    "changed": false, 
    "failed": true
}

MSG:

Unable to start service atomic-openshift-master: Job for atomic-openshift-master.service failed because the control process exited with error code. See "systemctl status atomic-openshift-master.service" and "journalctl -xe" for details.

```

However, starting it manually in us-east-1 gives more info:

```
[root@ip-172-31-184-203 ~]# /usr/bin/openshift start master --config=${CONFIG_FILE} $OPTIONS
I1202 14:37:39.053311   16045 admission.go:99] Admission plugin ProjectRequestLimit is not enabled.  It will not be started.
I1202 14:37:39.053335   16045 admission.go:99] Admission plugin PodNodeConstraints is not enabled.  It will not be started.
I1202 14:37:39.053384   16045 admission.go:99] Admission plugin RunOnceDuration is not enabled.  It will not be started.
I1202 14:37:39.053401   16045 admission.go:99] Admission plugin PodNodeConstraints is not enabled.  It will not be started.
I1202 14:37:39.053408   16045 admission.go:99] Admission plugin ClusterResourceOverride is not enabled.  It will not be started.
I1202 14:37:39.053421   16045 admission.go:99] Admission plugin openshift.io/ImagePolicy is not enabled.  It will not be started.
I1202 14:37:39.053510   16045 admission.go:99] Admission plugin BuildOverrides is not enabled.  It will not be started.
I1202 14:37:39.053517   16045 admission.go:99] Admission plugin AlwaysPullImages is not enabled.  It will not be started.
E1202 14:37:39.059925   16045 cacher.go:220] unexpected ListAndWatch error: pkg/storage/cacher.go:163: Failed to list *api.ClusterPolicy: client: etcd cluster is unavailable or misconfigured
E1202 14:37:39.060093   16045 cacher.go:220] unexpected ListAndWatch error: pkg/storage/cacher.go:163: Failed to list *api.ClusterPolicyBinding: client: etcd cluster is unavailable or misconfigured
E1202 14:37:39.060147   16045 cacher.go:220] unexpected ListAndWatch error: pkg/storage/cacher.go:163: Failed to list *api.Policy: client: etcd cluster is unavailable or misconfigured
E1202 14:37:39.060267   16045 reflector.go:214] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/serviceaccount/admission.go:103: Failed to list *api.ServiceAccount: Get https://ip-172-31-184-203.ec2.internal:8443/api/v1/serviceaccounts?resourceVersion=0: dial tcp: lookup ip-172-31-184-203.ec2.internal on 172.31.0.2:53: no such host
E1202 14:37:39.060273   16045 cacher.go:220] unexpected ListAndWatch error: pkg/storage/cacher.go:163: Failed to list *api.PolicyBinding: client: etcd cluster is unavailable or misconfigured
E1202 14:37:39.060402   16045 reflector.go:203] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/namespace/lifecycle/admission.go:141: Failed to list *api.Namespace: Get https://ip-172-31-184-203.ec2.internal:8443/api/v1/namespaces?resourceVersion=0: dial tcp: lookup ip-172-31-184-203.ec2.internal on 172.31.0.2:53: no such host
E1202 14:37:39.060404   16045 cacher.go:220] unexpected ListAndWatch error: pkg/storage/cacher.go:163: Failed to list *api.Group: client: etcd cluster is unavailable or misconfigured
E1202 14:37:39.060614   16045 reflector.go:203] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/limitranger/admission.go:154: Failed to list *api.LimitRange: Get https://ip-172-31-184-203.ec2.internal:8443/api/v1/limitranges?resourceVersion=0: dial tcp: lookup ip-172-31-184-203.ec2.internal on 172.31.0.2:53: no such host
E1202 14:37:39.060706   16045 reflector.go:214] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/serviceaccount/admission.go:119: Failed to list *api.Secret: Get https://ip-172-31-184-203.ec2.internal:8443/api/v1/secrets?fieldSelector=type%3Dkubernetes.io%2Fservice-account-token&resourceVersion=0: dial tcp: lookup ip-172-31-184-203.ec2.internal on 172.31.0.2:53: no such host
E1202 14:37:39.060737   16045 reflector.go:203] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/limitranger/admission.go:154: Failed to list *api.LimitRange: Get https://ip-172-31-184-203.ec2.internal:8443/api/v1/limitranges?resourceVersion=0: dial tcp: lookup ip-172-31-184-203.ec2.internal on 172.31.0.2:53: no such host
E1202 14:37:39.061369   16045 cacher.go:220] unexpected ListAndWatch error: pkg/storage/cacher.go:163: Failed to list *api.OAuthAccessToken: client: etcd cluster is unavailable or misconfigured
E1202 14:37:39.061405   16045 cacher.go:220] unexpected ListAndWatch error: pkg/storage/cacher.go:163: Failed to list *api.User: client: etcd cluster is unavailable or misconfigured
E1202 14:37:39.061428   16045 reflector.go:214] github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/resourcequota/resource_access.go:83: Failed to list *api.ResourceQuota: Get https://ip-172-31-184-203.ec2.internal:8443/api/v1/resourcequotas?resourceVersion=0: dial tcp: lookup ip-172-31-184-203.ec2.internal on 172.31.0.2:53: no such host
F1202 14:37:39.061672   16045 start_master.go:103] could not init cloud provider "aws": error finding instance i-faa259ed: error listing AWS instances: NoCredentialProviders: no valid providers in chain
```
Comment 7 Brenton Leanhardt 2016-12-02 14:56:03 EST
NoCredentialProviders seems to be relevant.  Do you not see that error when you are using us-east-2c if you start the service manually?  If you do see it I would wonder if this is the root cause.
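
For anyone chasing the NoCredentialProviders error: the provider walks the standard AWS credential chain (environment variables, shared credentials file, IAM instance profile). Two quick checks, sketched on the assumption that credentials should come from the master's environment file or an instance profile:

```
# List IAM roles attached via instance profile (empty/404 means none)
curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/

# Check whether AWS credentials are exported to the master service
grep -i aws /etc/sysconfig/atomic-openshift-master
```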
Comment 8 Brenton Leanhardt 2016-12-02 15:04:00 EST
Also, I'm still curious if your systems are actually in us-east-2c or a different zone in the us-east-2 region.
Comment 9 Christian Hernandez 2016-12-02 15:13:27 EST
Let me see what us-east-2 says...

This is what us-east-1 says:

curl http://169.254.169.254/latest/dynamic/instance-identity/document
{
  "devpayProductCodes" : null,
  "accountId" : "701119495576",
  "availabilityZone" : "us-east-1b",
  "privateIp" : "172.31.184.203",
  "version" : "2010-08-31",
  "instanceId" : "i-faa259ed",
  "billingProducts" : [ "bp-6fa54006" ],
  "instanceType" : "t2.large",
  "pendingTime" : "2016-12-02T18:58:48Z",
  "architecture" : "x86_64",
  "imageId" : "ami-b63769a1",
  "kernelId" : null,
  "ramdiskId" : null,
  "region" : "us-east-1"
}
Comment 10 Christian Hernandez 2016-12-02 15:55:41 EST
Just tried it in us-east-2 again and hit the same issue.

```
TASK [openshift_master : Start and enable master] ******************************
FAILED - RETRYING: TASK: openshift_master : Start and enable master (1 retries left).
fatal: [ip-172-31-33-216.us-east-2.compute.internal]: FAILED! => {
    "attempts": 1, 
    "changed": false, 
    "failed": true
}

MSG:
```

Similar output when starting the master manually:

```
[ec2-user@ip-172-31-33-216 ~]$ sudo /usr/bin/openshift start master --config=${CONFIG_FILE} $OPTIONS
I1202 15:52:44.982303   20002 admission.go:99] Admission plugin ProjectRequestLimit is not enabled.  It will not be started.
I1202 15:52:44.982329   20002 admission.go:99] Admission plugin PodNodeConstraints is not enabled.  It will not be started.
I1202 15:52:44.982378   20002 admission.go:99] Admission plugin RunOnceDuration is not enabled.  It will not be started.
I1202 15:52:44.982398   20002 admission.go:99] Admission plugin PodNodeConstraints is not enabled.  It will not be started.
I1202 15:52:44.982405   20002 admission.go:99] Admission plugin ClusterResourceOverride is not enabled.  It will not be started.
I1202 15:52:44.982419   20002 admission.go:99] Admission plugin openshift.io/ImagePolicy is not enabled.  It will not be started.
I1202 15:52:44.982473   20002 admission.go:99] Admission plugin BuildOverrides is not enabled.  It will not be started.
I1202 15:52:44.982479   20002 admission.go:99] Admission plugin AlwaysPullImages is not enabled.  It will not be started.
F1202 15:52:44.983933   20002 start_master.go:103] could not init cloud provider "aws": not a valid AWS zone (unknown region): us-east-2c
```
Versions again:

```
[ec2-user@ip-172-31-33-216 ~]$ oc version
oc v3.3.1.5
kubernetes v1.3.0+52492b4
features: Basic-Auth GSSAPI Kerberos SPNEGO
[ec2-user@ip-172-31-33-216 ~]$ ansible --version
ansible 2.2.0.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = Default w/o overrides
[ec2-user@ip-172-31-33-216 ~]$ rpm -q atomic-openshift-utils
atomic-openshift-utils-3.3.54-1.git.0.61a1dee.el7.noarch
```

Checked to see if I'm in the right place:

```
[ec2-user@ip-172-31-33-216 ~]$ curl http://169.254.169.254/latest/dynamic/instance-identity//document
{
  "devpayProductCodes" : null,
  "privateIp" : "172.31.33.216",
  "availabilityZone" : "us-east-2c",
  "accountId" : "701119495576",
  "version" : "2010-08-31",
  "instanceId" : "i-0f5caa105f4f0a461",
  "billingProducts" : [ "bp-6fa54006" ],
  "instanceType" : "t2.large",
  "pendingTime" : "2016-12-02T20:15:40Z",
  "imageId" : "ami-0932686c",
  "architecture" : "x86_64",
  "kernelId" : null,
  "ramdiskId" : null,
  "region" : "us-east-2"
}
```
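
As an aside, the placement endpoint returns just the availability-zone string, which is a quicker check than parsing the whole identity document:

```
curl http://169.254.169.254/latest/meta-data/placement/availability-zone
```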
Comment 11 Scott Dodson 2016-12-05 10:33:38 EST
This needs a backport of https://github.com/kubernetes/kubernetes/pull/35013 and possibly others; re-assigning to the kube team.
Comment 18 Johnny Liu 2017-01-16 00:42:46 EST
Re-tested this bug with the latest build, atomic-openshift-3.4.0.39-1.git.0.5f32f06.el7.x86_64; it still failed.


# journalctl -f  -u atomic-openshift-master
<--snip-->
Jan 16 00:38:34 ip-172-31-37-46.us-east-2.compute.internal atomic-openshift-master[23090]: I0116 00:38:34.845004   23090 aws.go:745] Building AWS cloudprovider
Jan 16 00:38:34 ip-172-31-37-46.us-east-2.compute.internal atomic-openshift-master[23090]: F0116 00:38:34.845068   23090 start_master.go:108] could not init cloud provider "aws": not a valid AWS zone (unknown region): us-east-2c
<--snip-->

Seems like the PR is not merged into the 3.4 RPM package.
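
A rough way to check whether an installed build carries the region fix, on the assumption that the recognized-region names are embedded as plain strings in the binary (a heuristic only, since the string could appear for other reasons):

```
strings /usr/bin/openshift | grep -x us-east-2
```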
Comment 20 Johnny Liu 2017-01-22 01:08:14 EST
Verified this bug with atomic-openshift-3.4.1.0-1.git.0.9e8d48b.el7.x86_64, and it passed.

A cluster with the cloud provider enabled was set up successfully in the AWS Ohio region.
Comment 22 errata-xmlrpc 2017-01-31 15:19:15 EST
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0218
