Bug 1372469 - multi master deployment places etcdctl certs on master instead of etcd host
Summary: multi master deployment places etcdctl certs on master instead of etcd host
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Documentation
Version: 3.2.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Vikram Goyal
QA Contact: Vikram Goyal
Docs Contact: Vikram Goyal
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-09-01 19:53 UTC by Dave Sullivan
Modified: 2017-06-14 13:48 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-14 13:48:33 UTC
Target Upstream Version:


Attachments
ansible install log (1.32 MB, text/plain)
2016-09-01 20:00 UTC, Dave Sullivan

Description Dave Sullivan 2016-09-01 19:53:04 UTC
Description of problem:

https://access.redhat.com/documentation/en/openshift-enterprise/version-3.2/installation-and-configuration/#multiple-masters

Using the Ansible hosts inventory file from section 2.4 incorrectly puts the etcdctl certs on the master nodes instead of the etcd nodes.

Also, the verification steps in section 2.5.7 tell you to run etcdctl from the master, which doesn't make sense because etcd is installed on the defined etcd nodes, not on the masters.





Version-Release number of selected component (if applicable):


How reproducible:

Create separate systems matching the following host inventory file:


[root@m01-useast1a-c001 ~]# cat /etc/ansible/hosts
# Create an OSEv3 group that contains the master, nodes, etcd, and lb groups.
# The lb group lets Ansible configure HAProxy as the load balancing solution.
# Comment lb out if your load balancer is pre-configured.
[OSEv3:children]
masters
nodes
etcd
lb

# Set variables common for all OSEv3 hosts
[OSEv3:vars]
ansible_ssh_user=root
deployment_type=openshift-enterprise

# Uncomment the following to enable htpasswd authentication; defaults to
# DenyAllPasswordIdentityProvider.
#openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]

# Native high availability cluster method with optional load balancer.
# If no lb group is defined installer assumes that a load balancer has
# been preconfigured. For installation the value of
# openshift_master_cluster_hostname must resolve to the load balancer
# or to one or all of the masters defined in the inventory if no load
# balancer is present.
openshift_master_cluster_method=native
openshift_master_cluster_hostname=c001-useast1a.ose.sullyvon.com
openshift_master_cluster_public_hostname=c001-useast1a.ose.sullyvon.com

# override the default controller lease ttl
#osm_controller_lease_ttl=30

# host group for masters
[masters]
m01-useast1a-c001.ose.sullyvon.com
m02-useast1a-c001.ose.sullyvon.com
m03-useast1a-c001.ose.sullyvon.com

# host group for etcd
[etcd]
e01-useast1a-c001.ose.sullyvon.com
e02-useast1a-c001.ose.sullyvon.com
e03-useast1a-c001.ose.sullyvon.com

# Specify load balancer host
[lb]
lb01-useast1a-c001.ose.sullyvon.com

# host group for nodes, includes region info
[nodes]
m01-useast1a-c001.ose.sullyvon.com openshift_node_labels="{'region': 'infra', 'zone': 'default'}" openshift_schedulable=false
m02-useast1a-c001.ose.sullyvon.com openshift_node_labels="{'region': 'infra', 'zone': 'default'}" openshift_schedulable=false
m03-useast1a-c001.ose.sullyvon.com openshift_node_labels="{'region': 'infra', 'zone': 'default'}" openshift_schedulable=false
n01-useast1a-c001.ose.sullyvon.com openshift_node_labels="{'region': 'primary', 'zone': 'east'}"
n02-useast1a-c001.ose.sullyvon.com openshift_node_labels="{'region': 'primary', 'zone': 'east'}"

Run the advanced Ansible installation (see the attached install log).


Actual results:

[root@m01-useast1a-c001 ~]# which etcdctl
/usr/bin/which: no etcdctl in (/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)

[root@m01-useast1a-c001 ~]# ls /etc/origin/master/ | grep etcd
etcd.server.crt
etcd.server.key
master.etcd-ca.crt
master.etcd-client.crt
master.etcd-client.csr
master.etcd-client.key

There is no /etc/origin directory on the etcd hosts

[root@e02-useast1a-c001 ec2-user]# cd /etc/origin
bash: cd: /etc/origin: No such file or directory

Expected results:

If etcd is separated from the masters, i.e. etcd is not installed on the same systems as the masters, then the certs need to be placed on the etcd nodes accordingly. Also, is /etc/origin really the right location for them?

Additional info:

One can rsync the certs from the master nodes to the etcd nodes and then run etcdctl from the etcd nodes without problems:

[root@e01-useast1a-c001 ~]# hostname
e01-useast1a-c001.ose.sullyvon.com
[root@e01-useast1a-c001 ~]# cat etcd_check.sh 
#!/bin/bash

etcdctl -C \
    https://10.0.2.9:2379 \
    --ca-file=/etc/origin/master/master.etcd-ca.crt \
    --cert-file=/etc/origin/master/master.etcd-client.crt \
    --key-file=/etc/origin/master/master.etcd-client.key cluster-health

etcdctl -C \
    https://10.0.2.9:2379 \
    --ca-file=/etc/origin/master/master.etcd-ca.crt \
    --cert-file=/etc/origin/master/master.etcd-client.crt \
    --key-file=/etc/origin/master/master.etcd-client.key member list
[root@e01-useast1a-c001 ~]# sh etcd_check.sh 
member fd267e59c8a8dbb is healthy: got healthy result from https://10.0.2.32:2379
member 83604d68922c7ccd is healthy: got healthy result from https://10.0.2.9:2379
member b946b00eebc61494 is healthy: got healthy result from https://10.0.2.194:2379
cluster is healthy
fd267e59c8a8dbb: name=10.0.2.32 peerURLs=https://10.0.2.32:2380 clientURLs=https://10.0.2.32:2379 isLeader=true
83604d68922c7ccd: name=10.0.2.9 peerURLs=https://10.0.2.9:2380 clientURLs=https://10.0.2.9:2379 isLeader=false
b946b00eebc61494: name=10.0.2.194 peerURLs=https://10.0.2.194:2380 clientURLs=https://10.0.2.194:2379 isLeader=false
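
The rsync workaround described above could be sketched roughly as follows, run from one of the masters. This is only an illustration, not what the installer does: the function names `run` and `sync_etcd_certs`, the `DRY_RUN` switch, and the `ETCD_HOSTS` variable are hypothetical. The host names are taken from the [etcd] group of the inventory, the root login from `ansible_ssh_user=root`, and the three file names from the `ls` output in this report. By default the script only prints the commands it would run.

```shell
#!/bin/bash
# Sketch: copy the etcd client certs from a master to each etcd host.
# DRY_RUN, ETCD_HOSTS, run and sync_etcd_certs are hypothetical names.

CERT_DIR=/etc/origin/master
ETCD_HOSTS="${ETCD_HOSTS:-e01-useast1a-c001.ose.sullyvon.com e02-useast1a-c001.ose.sullyvon.com e03-useast1a-c001.ose.sullyvon.com}"

# Print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

sync_etcd_certs() {
    local host=$1 cert
    # /etc/origin does not exist on the etcd hosts, so create it first.
    run ssh "root@${host}" mkdir -p "$CERT_DIR"
    for cert in master.etcd-ca.crt master.etcd-client.crt master.etcd-client.key; do
        run rsync -a "${CERT_DIR}/${cert}" "root@${host}:${CERT_DIR}/"
    done
}

for host in $ETCD_HOSTS; do
    sync_etcd_certs "$host"
done
```

Setting `DRY_RUN=0` in the environment would execute the commands instead of printing them. Keeping the same /etc/origin/master path on the etcd hosts means a check script like etcd_check.sh above works unchanged.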

Comment 1 Dave Sullivan 2016-09-01 20:00:18 UTC
Created attachment 1196947
ansible install log

Comment 3 Scott Dodson 2016-09-02 17:51:30 UTC
This will only be a problem when etcd hosts are not the same as the master hosts. A workaround for this is to install etcdctl on the master and use it there.
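
A minimal sketch of that workaround, assuming etcdctl has been made available on the master (on RHEL it is normally shipped in the etcd package, e.g. `yum install -y etcd`; treat that as an assumption to verify for your release). The certificate paths and the etcdctl flags are the ones used in the description; the endpoint address is the one from the reporter's script, and the `ETCDCTL`, `ETCD_ENDPOINT` and `etcd_cluster_health` names are hypothetical.

```shell
#!/bin/bash
# Sketch: run the etcd health check from a master, using the client
# certs the installer already placed under /etc/origin/master.
# ETCDCTL, ETCD_ENDPOINT and etcd_cluster_health are hypothetical names.

ETCDCTL="${ETCDCTL:-etcdctl}"
CERT_DIR=/etc/origin/master
ETCD_ENDPOINT="${ETCD_ENDPOINT:-https://10.0.2.9:2379}"

etcd_cluster_health() {
    "$ETCDCTL" -C "$ETCD_ENDPOINT" \
        --ca-file="${CERT_DIR}/master.etcd-ca.crt" \
        --cert-file="${CERT_DIR}/master.etcd-client.crt" \
        --key-file="${CERT_DIR}/master.etcd-client.key" \
        cluster-health
}

# Only attempt the check when the etcdctl binary is actually present.
if command -v "$ETCDCTL" >/dev/null 2>&1; then
    etcd_cluster_health
else
    echo "etcdctl not found; install it on this master first" >&2
fi
```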

Comment 4 Brenton Leanhardt 2016-09-06 15:13:30 UTC
At a minimum, we need to document how to handle environments where etcd is not running on the masters.

Comment 5 Scott Dodson 2017-02-02 15:35:34 UTC
Moving down to medium; the workaround is in comment 3.

Comment 6 Scott Dodson 2017-06-09 02:24:13 UTC
Docs PR to mention ensuring that etcd is installed:

https://github.com/openshift/openshift-docs/pull/4560

Also, since this bug was filed, two helper functions named `etcdctl2` and `etcdctl3` are now deployed on etcd hosts; they call `etcdctl` with the appropriate flags for the cert locations on the etcd hosts. However, I think it is better for the documentation to have the steps executed on the master.
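
For illustration only, wrappers of this kind might look roughly like the sketch below. These are not the installer's actual definitions: the /etc/etcd certificate paths, the `ETCDCTL_BIN`, `ETCD_CERT_DIR` and `ETCD_URL` variables, and the use of the local host name as the endpoint are all assumptions. The v2 flags match those used in the description; the v3 variant uses the etcdctl v3 API flags (`--cacert`, `--cert`, `--key`, `--endpoints`).

```shell
#!/bin/bash
# Sketch of etcdctl2/etcdctl3-style wrappers for an etcd host.
# ASSUMPTIONS: the cert paths under /etc/etcd and the endpoint URL are
# illustrative guesses, not values confirmed in this bug.

ETCDCTL_BIN="${ETCDCTL_BIN:-/usr/bin/etcdctl}"
ETCD_CERT_DIR="${ETCD_CERT_DIR:-/etc/etcd}"
ETCD_URL="${ETCD_URL:-https://$(hostname):2379}"

# v2 API: same flag names as the etcdctl calls in the description.
etcdctl2() {
    "$ETCDCTL_BIN" \
        --ca-file="${ETCD_CERT_DIR}/ca.crt" \
        --cert-file="${ETCD_CERT_DIR}/peer.crt" \
        --key-file="${ETCD_CERT_DIR}/peer.key" \
        -C "$ETCD_URL" "$@"
}

# v3 API: selected through ETCDCTL_API=3, with the v3 flag names.
etcdctl3() {
    ETCDCTL_API=3 "$ETCDCTL_BIN" \
        --cacert="${ETCD_CERT_DIR}/ca.crt" \
        --cert="${ETCD_CERT_DIR}/peer.crt" \
        --key="${ETCD_CERT_DIR}/peer.key" \
        --endpoints="$ETCD_URL" "$@"
}
```

With such wrappers sourced in a shell on an etcd host, `etcdctl2 cluster-health` or `etcdctl3 member list` would run without spelling out the certificate flags each time.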

Moving to docs component.

Comment 7 Ashley Hardin 2017-06-09 14:30:08 UTC
Docs PR is merged

