Bug 1562322

Summary: HA etcd installation failed when "Validate permissions on certificate files"
Product: OpenShift Container Platform Reporter: Gaoyun Pei <gpei>
Component: Installer Assignee: Vadim Rutkovsky <vrutkovs>
Status: CLOSED ERRATA QA Contact: Gaoyun Pei <gpei>
Severity: high Docs Contact:
Priority: high    
Version: 3.10.0 CC: aos-bugs, haowang, jokerman, mmccomas, vrutkovs, wmeng
Target Milestone: ---   
Target Release: 3.10.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-07-30 19:11:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Gaoyun Pei 2018-03-30 07:20:19 UTC
Description of problem:
Installation of OCP 3.10 with HA etcd failed as follows:

TASK [etcd : Validate permissions on certificate files] ************************
Friday 30 March 2018  01:52:02 -0400 (0:00:00.168)       0:01:47.615 ********** 
failed: [ec2-52-71-255-110.compute-1.amazonaws.com] (item=/etc/etcd/ca.crt) => {"changed": false, "failed": true, "gid": 0, "group": "root", "item": "/etc/etcd/ca.crt", "mode": "0644", "msg": "chown failed: failed to look up user etcd", "owner": "root", "path": "/etc/etcd/ca.crt", "secontext": "unconfined_u:object_r:etc_t:s0", "size": 1895, "state": "file", "uid": 0}
changed: [ec2-54-152-193-176.compute-1.amazonaws.com] => (item=/etc/etcd/ca.crt) => {"changed": true, "failed": false, "gid": 993, "group": "etcd", "item": "/etc/etcd/ca.crt", "mode": "0600", "owner": "etcd", "path": "/etc/etcd/ca.crt", "secontext": "unconfined_u:object_r:etc_t:s0", "size": 1895, "state": "file", "uid": 996}
failed: [ec2-184-73-51-181.compute-1.amazonaws.com] (item=/etc/etcd/ca.crt) => {"changed": false, "failed": true, "gid": 0, "group": "root", "item": "/etc/etcd/ca.crt", "mode": "0644", "msg": "chown failed: failed to look up user etcd", "owner": "root", "path": "/etc/etcd/ca.crt", "secontext": "unconfined_u:object_r:etc_t:s0", "size": 1895, "state": "file", "uid": 0}
failed: [ec2-52-71-255-110.compute-1.amazonaws.com] (item=/etc/etcd/server.crt) => {"changed": false, "failed": true, "gid": 0, "group": "root", "item": "/etc/etcd/server.crt", "mode": "0644", "msg": "chown failed: failed to look up user etcd", "owner": "root", "path": "/etc/etcd/server.crt", "secontext": "unconfined_u:object_r:etc_t:s0", "size": 5933, "state": "file", "uid": 0}
changed: [ec2-54-152-193-176.compute-1.amazonaws.com] => (item=/etc/etcd/server.crt) => {"changed": true, "failed": false, "gid": 993, "group": "etcd", "item": "/etc/etcd/server.crt", "mode": "0600", "owner": "etcd", "path": "/etc/etcd/server.crt", "secontext": "unconfined_u:object_r:etc_t:s0", "size": 5933, "state": "file", "uid": 996}
failed: [ec2-184-73-51-181.compute-1.amazonaws.com] (item=/etc/etcd/server.crt) => {"changed": false, "failed": true, "gid": 0, "group": "root", "item": "/etc/etcd/server.crt", "mode": "0644", "msg": "chown failed: failed to look up user etcd", "owner": "root", "path": "/etc/etcd/server.crt", "secontext": "unconfined_u:object_r:etc_t:s0", "size": 5936, "state": "file", "uid": 0}
failed: [ec2-52-71-255-110.compute-1.amazonaws.com] (item=/etc/etcd/server.key) => {"changed": false, "failed": true, "gid": 0, "group": "root", "item": "/etc/etcd/server.key", "mode": "0644", "msg": "chown failed: failed to look up user etcd", "owner": "root", "path": "/etc/etcd/server.key", "secontext": "unconfined_u:object_r:etc_t:s0", "size": 1704, "state": "file", "uid": 0}
changed: [ec2-54-152-193-176.compute-1.amazonaws.com] => (item=/etc/etcd/server.key) => {"changed": true, "failed": false, "gid": 993, "group": "etcd", "item": "/etc/etcd/server.key", "mode": "0600", "owner": "etcd", "path": "/etc/etcd/server.key", "secontext": "unconfined_u:object_r:etc_t:s0", "size": 1704, "state": "file", "uid": 996}
failed: [ec2-184-73-51-181.compute-1.amazonaws.com] (item=/etc/etcd/server.key) => {"changed": false, "failed": true, "gid": 0, "group": "root", "item": "/etc/etcd/server.key", "mode": "0644", "msg": "chown failed: failed to look up user etcd", "owner": "root", "path": "/etc/etcd/server.key", "secontext": "unconfined_u:object_r:etc_t:s0", "size": 1704, "state": "file", "uid": 0}
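The per-item results above are easier to compare once they are grouped by host. The following is a minimal sketch, not part of the installer, that assumes the log lines have the two shapes seen above ("status: [host] (item=...) => {json}" and "status: [host] => (item=...) => {json}"). The names LINE_RE and summarize are hypothetical, and the sample payloads are abbreviated from the log above.

```python
import json
import re

# Matches both result-line shapes seen in the log above:
#   failed: [host] (item=/path) => {...}
#   changed: [host] => (item=/path) => {...}
LINE_RE = re.compile(
    r"^(?P<status>\w+): \[(?P<host>[^\]]+)\](?: =>)? "
    r"\(item=(?P<item>[^)]*)\) => (?P<payload>\{.*\})\s*$"
)


def summarize(log_text):
    """Group loop results by host: items per status plus distinct messages."""
    summary = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # task banners, timing lines, blank lines
        payload = json.loads(match.group("payload"))
        host = summary.setdefault(
            match.group("host"), {"items": {}, "messages": set()}
        )
        host["items"].setdefault(match.group("status"), []).append(
            match.group("item")
        )
        if "msg" in payload:
            host["messages"].add(payload["msg"])
    return summary


if __name__ == "__main__":
    # Abbreviated sample lines; the real payloads carry more keys.
    sample = "\n".join([
        'failed: [ec2-52-71-255-110.compute-1.amazonaws.com] '
        '(item=/etc/etcd/ca.crt) => {"changed": false, "failed": true, '
        '"msg": "chown failed: failed to look up user etcd"}',
        'changed: [ec2-54-152-193-176.compute-1.amazonaws.com] => '
        '(item=/etc/etcd/ca.crt) => {"changed": true, "failed": false}',
    ])
    for host, info in summarize(sample).items():
        print(host, info["items"], sorted(info["messages"]))
```

Run against the full task output, this should show the same error message ("chown failed: failed to look up user etcd") for every item on the two failing hosts.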



Only the first etcd host has the expected permissions on certificate files:
[root@ip-172-18-14-12 ~]# ls -al /etc/etcd/
total 52
drwx------.  4 root root  172 Mar 30 01:51 .
drwxr-xr-x. 82 root root 8192 Mar 30 01:51 ..
drwx------.  5 root root  212 Mar 30 01:51 ca
-rw-------.  1 etcd etcd 1895 Mar 30 01:51 ca.crt
-rw-r--r--.  1 root root 1686 Jan 29 07:57 etcd.conf
drwx------.  5 root root  285 Mar 30 01:52 generated_certs
-rw-r--r--.  1 root root 5976 Mar 30 01:51 peer.crt
-rw-r--r--.  1 root root 1041 Mar 30 01:51 peer.csr
-rw-r--r--.  1 root root 1704 Mar 30 01:51 peer.key
-rw-------.  1 etcd etcd 5933 Mar 30 01:51 server.crt
-rw-r--r--.  1 root root 1041 Mar 30 01:51 server.csr
-rw-------.  1 etcd etcd 1704 Mar 30 01:51 server.key


On the other two etcd hosts:
[root@ip-172-18-15-205 ~]# ls -al /etc/etcd/
total 48
drwx------.  3 root root  132 Mar 30 01:52 .
drwxr-xr-x. 82 root root 8192 Mar 30 01:51 ..
drwxr-xr-x.  2 root root    6 Mar 30 01:52 ca
-rw-r--r--.  1 root root 1895 Mar 30 01:51 ca.crt
-rw-r--r--.  1 root root 5983 Mar 30 01:51 peer.crt
-rw-r--r--.  1 root root 1041 Mar 30 01:51 peer.csr
-rw-r--r--.  1 root root 1704 Mar 30 01:51 peer.key
-rw-r--r--.  1 root root 5936 Mar 30 01:51 server.crt
-rw-r--r--.  1 root root 1041 Mar 30 01:51 server.csr
-rw-r--r--.  1 root root 1704 Mar 30 01:51 server.key
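The same comparison can be scripted on each etcd host. The sketch below is an assumption-laden illustration, not the installer's own check: it takes the expected owner (etcd) and mode (0600) from the first host's listing above, and the helper names user_exists and check_cert_file are hypothetical. It also reports whether the owner account exists at all, since the failure message is "failed to look up user etcd".

```python
import os
import pwd
import stat


def user_exists(name):
    """Return True if the local passwd database knows this user."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def check_cert_file(path, owner="etcd", mode=0o600):
    """Return a list of human-readable problems for one certificate file."""
    problems = []
    if not user_exists(owner):
        problems.append(f"user {owner!r} does not exist")

    st = os.stat(path)
    try:
        actual_owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        actual_owner = str(st.st_uid)  # uid without a passwd entry
    if actual_owner != owner:
        problems.append(f"{path}: owner is {actual_owner}, expected {owner}")

    actual_mode = stat.S_IMODE(st.st_mode)
    if actual_mode != mode:
        problems.append(
            f"{path}: mode is {actual_mode:04o}, expected {mode:04o}"
        )
    return problems


if __name__ == "__main__":
    # The three files the failing task iterates over in the log above.
    for cert in ("/etc/etcd/ca.crt", "/etc/etcd/server.crt",
                 "/etc/etcd/server.key"):
        if os.path.exists(cert):
            for problem in check_cert_file(cert):
                print(problem)
```

On the two failing hosts this would be expected to report the missing user first, followed by the root ownership and 0644 mode shown in the listing.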



Version-Release number of the following components:
openshift-ansible-3.10.0-0.15.0.git.0.556ddbb.el7.noarch.rpm
ansible 2.4.4-0.2.rc1.el7ae

How reproducible:
Always

Steps to Reproduce:

Actual results:

Expected results:

Additional info:

Comment 3 Vadim Rutkovsky 2018-04-04 12:52:12 UTC
Created https://github.com/openshift/openshift-ansible/pull/7770 to fix this

Comment 4 Vadim Rutkovsky 2018-04-09 07:38:22 UTC
Fix is in openshift-ansible-3.10.0-0.16.0

Comment 5 Gaoyun Pei 2018-04-09 08:03:03 UTC
Verified this bug with openshift-ansible-3.10.0-0.16.0.git.0.8925606.el7.noarch.rpm.

For the HA etcd cluster installation, this step passed, and the etcd cluster is running well.

Comment 7 errata-xmlrpc 2018-07-30 19:11:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816