Bug 1497150 - atomic-openshift-node randomly failed on AWS due to AWS credentials not set
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.7.0
Priority: high
Severity: medium
Target Release: 3.7.0
Assigned To: Michael Gugino
QA Contact: Gan Huang
Reported: 2017-09-29 06:25 EDT by Gan Huang
Modified: 2017-11-28 17:13 EST
Last Closed: 2017-11-28 17:13:46 EST
Type: Bug

External Trackers:
Red Hat Product Errata RHSA-2017:3188 (SHIPPED_LIVE): Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update. Last Updated: 2017-11-28 21:34:54 EST
Description Gan Huang 2017-09-29 06:25:10 EDT
Description of problem:
Installation failed on AWS with the cloud provider enabled. Digging deeper, we found the root cause of the atomic-openshift-node failure: `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` were not set in /etc/sysconfig/atomic-openshift-node.

TASK [openshift_node : Abort if node failed to start] **************************
Friday 29 September 2017  09:37:13 +0000 (0:00:01.155)       0:19:38.142 ****** 
fatal: [ec2-54-89-99-146.compute-1.amazonaws.com]: FAILED! => {"changed": false, "failed": true, "msg": "Node failed to start please inspect the logs and try again"}

Check the logs:
Sep 29 05:36:46 ip-172-18-11-50.ec2.internal atomic-openshift-node[30618]: I0929 05:36:46.462338   30627 aws.go:806] Building AWS cloudprovider
Sep 29 05:36:46 ip-172-18-11-50.ec2.internal atomic-openshift-node[30618]: F0929 05:36:46.464669   30627 start_node.go:141] could not init cloud provider "aws": error finding instance i-054489e66654e2cc8: "error listing AWS instances: \"NoCredentialProviders: no valid providers in chain. Deprecated. \\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\""
Sep 29 05:36:46 ip-172-18-11-50.ec2.internal systemd[1]: atomic-openshift-node.service: main process exited, code=exited, status=255/n/a
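The fatal log line is the AWS SDK's credential chain failing: each provider in the chain is tried in turn, and when none yields credentials the caller gets a "NoCredentialProviders: no valid providers in chain" error. A minimal sketch of that lookup, for illustration only (this is not the actual SDK code; the function name and structure here are assumptions):

```python
def resolve_aws_credentials(env):
    """Illustrative credential provider chain: try each provider in order;
    the first that yields a full credential pair wins. If none do, fail with
    a 'no valid providers in chain' style error, as seen in the node log."""
    providers = [
        # Environment-variable provider; real chains also try shared config
        # files and instance metadata, omitted here for brevity.
        lambda: (env.get("AWS_ACCESS_KEY_ID"), env.get("AWS_SECRET_ACCESS_KEY")),
    ]
    for provider in providers:
        key_id, secret = provider()
        if key_id and secret:
            return key_id, secret
    raise RuntimeError("NoCredentialProviders: no valid providers in chain")
```

With an empty environment, as in the broken sysconfig file above, the chain is exhausted and the lookup fails, which is exactly why the node process exited at startup.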
"log" 6048L, 975083C

[root@ip-172-18-11-50 ~]# cat /etc/sysconfig/atomic-openshift-node
OPTIONS=--loglevel=2
CONFIG_FILE=/etc/origin/node/node-config.yaml
IMAGE_VERSION=v3.7.0
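For comparison, once the cloud provider settings task has run, the same file would be expected to also carry the credential variables named in the error above (a sketch; the values are placeholders):

```ini
OPTIONS=--loglevel=2
CONFIG_FILE=/etc/origin/node/node-config.yaml
IMAGE_VERSION=v3.7.0
# Expected to be written by the cloud provider configuration task; placeholder values
AWS_ACCESS_KEY_ID=<access-key-id>
AWS_SECRET_ACCESS_KEY=<secret-access-key>
```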

Version-Release number of the following components:
ansible 2.3
openshift-ansible-3.7.0-0.128.0.git.0.89dcad2.el7.noarch.rpm

How reproducible:
sometimes

Steps to Reproduce:
1. Trigger an installation against AWS with the cloud provider enabled.

Actual results:
See above

Expected results:
Installation succeeds; atomic-openshift-node starts with the AWS credentials set in /etc/sysconfig/atomic-openshift-node.
Additional info:
This issue happens randomly because of https://github.com/ansible/ansible/issues/24450

Introduced by https://github.com/openshift/openshift-ansible/pull/5230, which made the task "Start and enable node" run before `Configure AWS Cloud Provider Settings`.
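In other words, the fix is to restore the ordering so the cloud provider settings land in the sysconfig file before the node service is started. A rough sketch of the intended order (task names are taken from the report; the module arguments and variable names below are assumptions for illustration, not the actual openshift-ansible code):

```yaml
# Cloud provider credentials must be in place first...
- name: Configure AWS Cloud Provider Settings
  lineinfile:
    dest: /etc/sysconfig/atomic-openshift-node
    regexp: "^{{ item.key }}="
    line: "{{ item.key }}={{ item.value }}"
  with_dict:
    AWS_ACCESS_KEY_ID: "{{ openshift_cloudprovider_aws_access_key }}"
    AWS_SECRET_ACCESS_KEY: "{{ openshift_cloudprovider_aws_secret_key }}"
  no_log: true  # keep credentials out of the Ansible output

# ...and only then is the node started.
- name: Start and enable node
  systemd:
    name: atomic-openshift-node
    state: started
    enabled: yes
```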
Comment 1 Michael Gugino 2017-10-02 19:02:13 EDT
PR Submitted: https://github.com/openshift/openshift-ansible/pull/5633
Comment 2 Michael Gugino 2017-10-05 11:59:25 EDT
PR merged.
Comment 4 Gan Huang 2017-10-11 04:57:50 EDT
Verified with openshift-ansible-3.7.0-0.147.0.git.0.2fb41ee.el7.noarch.rpm
Comment 8 errata-xmlrpc 2017-11-28 17:13:46 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188
