Bug 1506537 - Provisioning OCP on AWS failed due to SSLCertificateId missed
Summary: Provisioning OCP on AWS failed due to SSLCertificateId missed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.7.0
Assignee: Kenny Woodson
QA Contact: Wenkai Shi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-10-26 09:41 UTC by Wenkai Shi
Modified: 2017-11-28 22:19 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-28 22:19:38 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:3188 0 normal SHIPPED_LIVE Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update 2017-11-29 02:34:54 UTC

Description Wenkai Shi 2017-10-26 09:41:05 UTC
Description of problem:
Following the steps in [1] to provision OCP fails while creating the ELB.

[1]. /usr/share/ansible/openshift-ansible/playbooks/aws/README.md

Version-Release number of the following components:
atomic-openshift-utils-3.7.0-0.178.0.git.0.27a1039.el7
ansible-2.4.0.0-5.el7

How reproducible:
100%

Steps to Reproduce:
1.
2.
3.

Actual results:
# ansible-playbook -i inventory.yml /usr/share/ansible/openshift-ansible/playbooks/aws/openshift-cluster/provision_install.yml -e @provisioning_vars.yml -vvv
...
TASK [openshift_aws : Create ELB weshi-cluster-master-external] ******************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_aws/tasks/elb.yml:32
Using module file /usr/lib/python2.7/site-packages/ansible/modules/cloud/amazon/_ec2_elb_lb.py
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: root
<127.0.0.1> EXEC /bin/sh -c 'echo ~ && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp/ansible-tmp-1509006543.09-179931132085111 `" && echo ansible-tmp-1509006543.09-179931132085111="` echo /root/.ansible/tmp/ansible-tmp-1509006543.09-179931132085111 `" ) && sleep 0'
<127.0.0.1> PUT /tmp/tmpVsSC3D TO /root/.ansible/tmp/ansible-tmp-1509006543.09-179931132085111/_ec2_elb_lb.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1509006543.09-179931132085111/ /root/.ansible/tmp/ansible-tmp-1509006543.09-179931132085111/_ec2_elb_lb.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '/usr/bin/python2 /root/.ansible/tmp/ansible-tmp-1509006543.09-179931132085111/_ec2_elb_lb.py; rm -rf "/root/.ansible/tmp/ansible-tmp-1509006543.09-179931132085111/" > /dev/null 2>&1 && sleep 0'
The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_oHxxM6/ansible_module_ec2_elb_lb.py", line 1372, in <module>
    main()
  File "/tmp/ansible_oHxxM6/ansible_module_ec2_elb_lb.py", line 1359, in main
    elb_man.ensure_ok()
  File "/tmp/ansible_oHxxM6/ansible_module_ec2_elb_lb.py", line 416, in _do_op
    return op(*args, **kwargs)
  File "/tmp/ansible_oHxxM6/ansible_module_ec2_elb_lb.py", line 491, in ensure_ok
    self._create_elb()
  File "/tmp/ansible_oHxxM6/ansible_module_ec2_elb_lb.py", line 723, in _create_elb
    scheme=self.scheme)
  File "/usr/lib/python2.7/site-packages/boto/ec2/elb/__init__.py", line 242, in create_load_balancer
    params, LoadBalancer)
  File "/usr/lib/python2.7/site-packages/boto/connection.py", line 1208, in get_object
    raise self.ResponseError(response.status, response.reason, body)
boto.exception.BotoServerError: BotoServerError: 400 Bad Request
<ErrorResponse xmlns="http://elasticloadbalancing.amazonaws.com/doc/2012-06-01/">
  <Error>
    <Type>Sender</Type>
    <Code>ValidationError</Code>
    <Message>Secure Listeners need to specify a SSLCertificateId</Message>
  </Error>
  <RequestId>ba403395-ba27-11e7-8b6c-f3b27607c9ef</RequestId>
</ErrorResponse>


fatal: [localhost]: FAILED! => {
    "changed": false, 
    "failed": true, 
    "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_oHxxM6/ansible_module_ec2_elb_lb.py\", line 1372, in <module>\n    main()\n  File \"/tmp/ansible_oHxxM6/ansible_module_ec2_elb_lb.py\", line 1359, in main\n    elb_man.ensure_ok()\n  File \"/tmp/ansible_oHxxM6/ansible_module_ec2_elb_lb.py\", line 416, in _do_op\n    return op(*args, **kwargs)\n  File \"/tmp/ansible_oHxxM6/ansible_module_ec2_elb_lb.py\", line 491, in ensure_ok\n    self._create_elb()\n  File \"/tmp/ansible_oHxxM6/ansible_module_ec2_elb_lb.py\", line 723, in _create_elb\n    scheme=self.scheme)\n  File \"/usr/lib/python2.7/site-packages/boto/ec2/elb/__init__.py\", line 242, in create_load_balancer\n    params, LoadBalancer)\n  File \"/usr/lib/python2.7/site-packages/boto/connection.py\", line 1208, in get_object\n    raise self.ResponseError(response.status, response.reason, body)\nboto.exception.BotoServerError: BotoServerError: 400 Bad Request\n<ErrorResponse xmlns=\"http://elasticloadbalancing.amazonaws.com/doc/2012-06-01/\">\n  <Error>\n    <Type>Sender</Type>\n    <Code>ValidationError</Code>\n    <Message>Secure Listeners need to specify a SSLCertificateId</Message>\n  </Error>\n  <RequestId>ba403395-ba27-11e7-8b6c-f3b27607c9ef</RequestId>\n</ErrorResponse>\n\n", 
    "module_stdout": "", 
    "msg": "MODULE FAILURE", 
    "rc": 0
}
	to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/aws/openshift-cluster/provision_install.retry

PLAY RECAP ***********************************************************************************************************************************************************************************
localhost                  : ok=32   changed=3    unreachable=0    failed=1   

Expected results:
The ELB should be created successfully.

Additional info:
It seems ssl_certificate_id has been set to '' (an empty string):

$ grep -nir "openshift_aws_elb_cert_arn" .
./roles/openshift_aws/defaults/main.yml:54:openshift_aws_elb_cert_arn: ''
./roles/openshift_aws/defaults/main.yml:68:      ssl_certificate_id: "{{ openshift_aws_elb_cert_arn }}"
./roles/openshift_aws/tasks/iam_cert.yml:19:  - openshift_aws_elb_cert_arn == ''
./roles/openshift_aws/tasks/iam_cert.yml:23:- name: set_fact openshift_aws_elb_cert_arn
./roles/openshift_aws/tasks/iam_cert.yml:25:    openshift_aws_elb_cert_arn: "{{ elb_cert_chain.arn }}"
./roles/openshift_aws/tasks/iam_cert.yml:30:  - openshift_aws_elb_cert_arn == ''
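
The grep output above suggests that a secure listener reaches the ELB module with an empty ssl_certificate_id. A minimal sketch of a check for that condition follows. It assumes the listeners are available as plain dicts using the ec2_elb_lb listener option names; SECURE_PROTOCOLS and find_listeners_missing_cert are hypothetical names and are not part of openshift-ansible, and the listener values are illustrative rather than copied from defaults/main.yml.

```python
# Sketch: flag secure ELB listeners that carry no certificate ARN.
# SECURE_PROTOCOLS and find_listeners_missing_cert are hypothetical names.

SECURE_PROTOCOLS = {"https", "ssl"}


def find_listeners_missing_cert(listeners):
    """Return listeners that use a secure protocol but have an empty ssl_certificate_id."""
    missing = []
    for listener in listeners:
        protocol = str(listener.get("protocol", "")).lower()
        cert_id = listener.get("ssl_certificate_id") or ""
        if protocol in SECURE_PROTOCOLS and not cert_id.strip():
            missing.append(listener)
    return missing


if __name__ == "__main__":
    # Illustrative listeners only; the real ones live in the role defaults.
    listeners = [
        {"protocol": "tcp", "load_balancer_port": 80, "instance_port": 80},
        {"protocol": "ssl", "load_balancer_port": 443, "instance_port": 443,
         "ssl_certificate_id": ""},
    ]
    for bad in find_listeners_missing_cert(listeners):
        print("secure listener on port %s has no ssl_certificate_id"
              % bad["load_balancer_port"])
```

Running a check like this before the ELB task would surface the problem locally instead of as a 400 ValidationError from AWS.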

Comment 2 Kenny Woodson 2017-10-26 13:57:18 UTC
Wenkai,

Thanks for reporting this.

To understand what is happening here, let's step back and look at how the infrastructure is created.

When installing in AWS, we use an ELB to front the master API externally and internally.  We also front the router nodes (infra nodes) with an ELB.

The external ELB will take traffic for the console and api.  The defined listeners required on the ELB are defined here: https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_aws/defaults/main.yml#L56-L77

We require that the ELB support SSL traffic. This requires a certificate to be placed on the ELB.

The first step is to create the certificate.  This happens inside of https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_aws/tasks/provision.yml.

The certificate actually gets created here:
https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_aws/tasks/iam_cert.yml

Once the certificate has been created, we store its ARN (Amazon Resource Name) in openshift_aws_elb_cert_arn so that it can later be placed on the master external ELB.

Now that we understand why the external master ELB requires an SSL certificate, we can start to troubleshoot this error.

You are running into this error because the variable openshift_aws_elb_cert_arn still holds its default value, an empty string.  This means that the variable was not set by the set_fact call in iam_cert.yml. That call happens here:
https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_aws/tasks/iam_cert.yml#L25

Are you passing in the following variables?
openshift_aws_iam_cert_path
openshift_aws_iam_cert_key_path
openshift_aws_iam_cert_chain_path

My provisioning_vars.yml includes these variables:
---
openshift_aws_iam_cert_path: '/home/kwoodson/certificates/files/opstest/wildcard.opstest.openshift.com.crt'
openshift_aws_iam_cert_key_path: '/home/kwoodson/certificates/files/opstest/wildcard.opstest.openshift.com.key'
openshift_aws_iam_cert_chain_path: '/home/kwoodson/certificates/files/digicert.ca.crt'
---

These can be self-signed or official certificates.  Once they are created and passed into the provisioning playbook, the certificates will be placed on the ELB properly.
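
As a rough illustration, the three variables could be checked before provisioning starts. This is only a sketch: it assumes the contents of provisioning_vars.yml have already been loaded into a dict, and check_cert_vars and REQUIRED_CERT_VARS are hypothetical names that do not exist in openshift-ansible.

```python
import os

# The three variable names listed in this comment.
REQUIRED_CERT_VARS = (
    "openshift_aws_iam_cert_path",
    "openshift_aws_iam_cert_key_path",
    "openshift_aws_iam_cert_chain_path",
)


def check_cert_vars(provisioning_vars, file_exists=os.path.isfile):
    """Return a list of problems; an empty list means all three variables look usable.

    file_exists is injectable so the check can be exercised without real files.
    """
    problems = []
    for name in REQUIRED_CERT_VARS:
        value = provisioning_vars.get(name)
        if not value:
            problems.append("%s is not set" % name)
        elif not file_exists(value):
            problems.append("%s points to a missing file: %s" % (name, value))
    return problems


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    example_vars = {
        "openshift_aws_iam_cert_path": "/path/to/wildcard.example.com.crt",
        "openshift_aws_iam_cert_chain_path": "/path/to/cert.ca.crt",
    }
    for problem in check_cert_vars(example_vars):
        print(problem)
```

A missing or misspelled key shows up as "is not set", which is the situation that leaves openshift_aws_elb_cert_arn at its empty default.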

Comment 3 Kenny Woodson 2017-10-26 17:29:59 UTC
I also noticed this variable name needs to be fixed.  Here is a PR for the variable name: https://github.com/openshift/openshift-ansible/pull/5900

Comment 4 Wenkai Shi 2017-10-27 02:59:59 UTC
Hi Kenny,

My provisioning_vars.yml includes these variables:
...
openshift_aws_iam_cert_path: '/root/named.cert/wildcard.weshi-cluster.example.com.crt'
openshift_aws_iam_key_path: '/root/named.cert/wildcard.weshi-cluster.example.com.key'
openshift_aws_iam_cert_chain_path: '/root/named.cert/rootCA.pem'
...

It should work once the variable name is changed to "openshift_aws_iam_cert_key_path". I will verify this when the PR is merged.

By the way, if "openshift_aws_iam_cert_chain_path" is necessary, I think it should be mentioned in "playbooks/aws/README.md".
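
A variable-name mismatch like the one above is easy to miss by eye. A minimal sketch of a near-miss detector, using Python's standard difflib module, is shown below. EXPECTED_VARS and suggest_renames are hypothetical names, and the 0.8 cutoff is an arbitrary assumption, not something taken from openshift-ansible.

```python
import difflib

# The variable names the role is expected to read, per the comments above.
EXPECTED_VARS = (
    "openshift_aws_iam_cert_path",
    "openshift_aws_iam_cert_key_path",
    "openshift_aws_iam_cert_chain_path",
)


def suggest_renames(provided_names, expected=EXPECTED_VARS, cutoff=0.8):
    """Map each unrecognized name to the closest expected name, if one is similar enough."""
    suggestions = {}
    for name in provided_names:
        if name in expected:
            continue
        close = difflib.get_close_matches(name, expected, n=1, cutoff=cutoff)
        if close:
            suggestions[name] = close[0]
    return suggestions


if __name__ == "__main__":
    provided = [
        "openshift_aws_iam_cert_path",
        "openshift_aws_iam_key_path",
        "openshift_aws_iam_cert_chain_path",
    ]
    for wrong, right in suggest_renames(provided).items():
        print("%s looks like a misspelling of %s" % (wrong, right))
```

Names that are not close to any expected variable are ignored, so unrelated settings in the vars file do not produce noise.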

Comment 5 Wenkai Shi 2017-10-27 05:57:45 UTC
The code has already been merged into openshift-ansible-3.7.0-0.182.0.git.0.23a42dc.el7.

# cat /usr/share/ansible/openshift-ansible/playbooks/aws/provisioning_vars.yml.example
...
# custom certificates are required for the ELB
openshift_aws_iam_cert_path: # '/path/to/wildcard.<clusterid>.example.com.crt'
openshift_aws_iam_cert_key_path: # '/path/to/wildcard.<clusterid>.example.com.key'
openshift_aws_iam_cert_chain_path: # '/path/to/cert.ca.crt'

Comment 6 Kenny Woodson 2017-10-31 19:11:55 UTC
Wenkai,

https://github.com/openshift/openshift-ansible/pull/5957

Comment 8 Wenkai Shi 2017-11-02 02:34:07 UTC
Verified with version openshift-ansible-3.7.0-0.189.0.git.0.d497c5e.el7; the
variable name has been fixed.

# cat /usr/share/ansible/openshift-ansible/playbooks/aws/provisioning_vars.yml.example
...
# custom certificates are required for the ELB
openshift_aws_iam_cert_path: # '/path/to/wildcard.<clusterid>.example.com.crt'
openshift_aws_iam_cert_key_path: # '/path/to/wildcard.<clusterid>.example.com.key'
openshift_aws_iam_cert_chain_path: # '/path/to/cert.ca.crt'

Comment 11 errata-xmlrpc 2017-11-28 22:19:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

