Bug 1272201

Summary: Installer etcd server.crt is created with wrong X509v3 Subject Alternative Name
Product: OpenShift Container Platform
Reporter: Ryan Howe <rhowe>
Component: Installer
Assignee: Jason DeTiberus <jdetiber>
Status: CLOSED CURRENTRELEASE
QA Contact: Ma xiaoqiang <xiama>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.0.0
CC: aos-bugs, bleanhar, gpei, jialiu, jokerman, mmccomas, rhowe, xtian
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-20 15:41:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1267733
Bug Blocks:

Description Ryan Howe 2015-10-15 18:22:56 UTC
Description of problem:

The installer creates the etcd server.crt with the wrong X509v3 Subject Alternative Name when the host has 2 network interfaces. The X509v3 Subject Alternative Name takes the IP address of the 1st interface listed, even when the IP variables are defined.

Version-Release number of selected component (if applicable):
3.0.x

How reproducible:
Run the OSE install to install a clustered etcd
Define all variables for IP and etcd interface
Make sure the host has 2 network interfaces, and use the 2nd one listed

Actual results:
The SAN identifier shows the IP of the 1st interface


Expected results:
The cert is populated with the right information, as defined in the /etc/ansible/hosts file


Additional info:

Comment 3 Scott Dodson 2015-10-16 14:35:30 UTC
Potential fix is to move to using hostnames

https://github.com/openshift/openshift-ansible/pull/437

This was the way I was going with this but I think there's still some issues to sort out here

https://bugzilla.redhat.com/show_bug.cgi?id=1267740

Comment 4 Jason DeTiberus 2015-10-16 15:32:12 UTC
I've uncovered the issue.  We were using the openshift.common.ip (from openshift_facts) instead of using the ip address derived from etcd_interface when creating the server certificates.  I have a completely untested PR here: https://github.com/openshift/openshift-ansible/pull/702
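
A minimal sketch of the difference described here, written in Python rather than the installer's actual Ansible/Jinja2 code. It assumes host facts are a dict shaped like Ansible's gathered facts (an `ansible_<interface>` entry holding `ipv4.address`). The function name `san_ip_for_host` and the `default_ip` key (a stand-in for openshift.common.ip) are hypothetical, and the addresses are documentation-range examples.

```python
def san_ip_for_host(facts, etcd_interface=None):
    """Return the IP that should go into the etcd server cert SAN.

    `facts` is a hypothetical dict modelled on Ansible host facts.
    """
    if etcd_interface:
        # Use the address of the explicitly configured interface.
        return facts["ansible_" + etcd_interface]["ipv4"]["address"]
    # Otherwise fall back to the host's default IP
    # (hypothetical stand-in for openshift.common.ip).
    return facts["default_ip"]


facts = {
    "default_ip": "192.0.2.10",
    "ansible_eth0": {"ipv4": {"address": "192.0.2.10"}},
    "ansible_eth1": {"ipv4": {"address": "198.51.100.20"}},
}

# Buggy behaviour: the default IP is used regardless of etcd_interface.
print(san_ip_for_host(facts))
# Intended behaviour: the address of the configured interface is used.
print(san_ip_for_host(facts, "eth1"))
```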

Comment 5 Jason DeTiberus 2015-10-16 15:35:22 UTC
Ryan,

Can you test the above PR to see if it solves your issue?

Comment 6 Jason DeTiberus 2015-10-26 15:29:58 UTC
QE, please test with the situation that Ryan mentions above. The PR here: https://github.com/openshift/openshift-ansible/pull/702 should resolve the issue he is seeing.

Comment 7 Gaoyun Pei 2015-10-27 08:49:38 UTC
Hi Jason,

QE could reproduce this issue with the environment Ryan described. The wrong IP address was picked up as the SAN identifier in server.crt, but testing with a non-eth0 network interface also ran into BZ#1267733.

QE also tried with PR https://github.com/openshift/openshift-ansible/pull/702, ansible installer quit when creating the server csr for etcd cluster:

TASK: [etcd_certificates | Create the server csr] ***************************** 
fatal: [10.66.79.107] => Failed to template IP:{{ ansible_devices['ansible_' ~ etcd_interface]ipv4.address }}: template error while templating string: expected token 'end of print statement', got 'ipv4'

So I agree with Scott that using hostnames would finally resolve the two bugs. I'll set BZ#1267733 as blocker of this bug.

Comment 8 Jason DeTiberus 2015-10-27 14:48:30 UTC
I've updated https://github.com/openshift/openshift-ansible/pull/702 to address the issue you hit. I must have missed pushing a fix to the variable reference for creating the SAN. Please re-test.

Comment 9 Jason DeTiberus 2015-10-27 14:55:44 UTC
Arrgh, I also referenced the variable on the local host instead of the variable on the host the certificate is being generated for, should be fixed now.

Comment 10 Gaoyun Pei 2015-10-28 05:50:17 UTC
Test with https://github.com/detiber/openshift-ansible.git -b etcd_cert_fix

Ansible installer aborted at:

TASK: [etcd_certificates | Create the server csr] ***************************** 
...
140190467962784:error:220A4076:X509 V3 routines:A2I_GENERAL_NAME:bad ip address:v3_alt.c:477:value={# item['ansible_' ~ item.etcd_interface].ipv4.address #}
140190467962784:error:22098080:X509 V3 routines:X509V3_EXT_nconf:error in extension:v3_conf.c:93:name=subjectAltName, value=IP:{# item['ansible_' ~ item.etcd_interface].ipv4.address #}

So move this to Assigned.

Comment 11 Jason DeTiberus 2015-10-28 19:38:50 UTC
I updated https://github.com/openshift/openshift-ansible/pull/702 to address some edge cases. We now test to make sure that we can read the variables we expect from the interface specified by etcd_interface for the specified host.

If you are still seeing issues, please update with the inventory file you are using as well as the output of `ip addr` on the hosts.
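
The kind of check described in this comment might look roughly like the following Python sketch. It is not the code from the PR: the function name `check_etcd_interface`, the error messages, and the fact layout (an `ansible_<interface>` entry with `ipv4.address`) are assumptions for illustration.

```python
def check_etcd_interface(hostname, facts, etcd_interface):
    """Fail early if the IPv4 address of etcd_interface cannot be read.

    `facts` is a hypothetical dict modelled on Ansible host facts.
    """
    key = "ansible_" + etcd_interface
    iface = facts.get(key)
    if iface is None:
        raise ValueError(
            "%s: interface %r not found in host facts" % (hostname, etcd_interface)
        )
    address = iface.get("ipv4", {}).get("address")
    if not address:
        raise ValueError(
            "%s: interface %r has no IPv4 address" % (hostname, etcd_interface)
        )
    return address


facts = {
    "ansible_eth0": {"ipv4": {"address": "192.0.2.10"}},
    "ansible_eth1": {},  # interface present but without an IPv4 address
}

print(check_etcd_interface("etcd1.example.com", facts, "eth0"))
```

Failing before the certificate step avoids passing an unrendered or empty value to openssl as the subjectAltName.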

Comment 12 Gaoyun Pei 2015-10-29 05:17:34 UTC
Configure etcd_interface for the etcd hosts in the inventory file and test with https://github.com/detiber/openshift-ansible.git -b etcd_cert_fix

Ansible installer failed at:
 
TASK: [etcd_ca | command openssl req -config openssl.cnf -newkey rsa:4096 -keyout ca.key -new -out ca.crt -x509 -extensions etcd_v3_ca_self -batch -nodes -subj /CN=etcd-signer@{{ ansible_date_time.epoch }} -days 365
] *** 
fatal: [10.66.79.125] => Failed to template {{}}: template error while templating string: Expected an expression, got 'end of print statement'

Comment 14 Jason DeTiberus 2015-10-29 21:21:09 UTC
I updated the PR again with some sanity checking around the setting of etcd_interface. I also fixed the issue where it started to require etcd_interface be set.

It looks like something changed (either in Ansible or RHEL) so that setting an environment variable to an empty string with the command module now causes some tasks to fail under certain circumstances, so I now explicitly set the variable to a value.

Comment 15 Gaoyun Pei 2015-10-30 07:10:00 UTC
Tested with https://github.com/openshift/openshift-ansible/pull/702 again. The etcd HA cluster could be set up correctly with etcd_interface specified for the etcd hosts. The IP address in the SAN of server.crt is the address of the specified etcd_interface.
 
So move this to VERIFIED.
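
For anyone repeating this verification, a rough Python sketch of checking the SAN follows. It assumes the certificate has already been dumped to text with `openssl x509 -in server.crt -noout -text` and that the SAN entries appear on the line after the "X509v3 Subject Alternative Name" heading. The helper name `san_ip_addresses` and the sample text are illustrative only, not output from this bug.

```python
import re


def san_ip_addresses(cert_text):
    """Extract IP SAN entries from `openssl x509 -noout -text` output."""
    lines = cert_text.splitlines()
    for i, line in enumerate(lines):
        # The SAN values are printed on the line following the heading.
        if "X509v3 Subject Alternative Name" in line and i + 1 < len(lines):
            return re.findall(r"IP Address:([0-9A-Fa-f:.]+)", lines[i + 1])
    return []


# Illustrative sample, not taken from a real certificate.
sample = """\
        X509v3 extensions:
            X509v3 Subject Alternative Name:
                IP Address:198.51.100.20
"""

print(san_ip_addresses(sample))
```

The returned list can then be compared with the address that `ip addr` reports for the interface named in etcd_interface.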