Bug 1830505

Summary: cluster-etcd-operator: peer cert DNS SAN does not contain domain wildcard.
Product: OpenShift Container Platform Reporter: Sam Batschelet <sbatsche>
Component: Etcd Assignee: Sam Batschelet <sbatsche>
Status: CLOSED ERRATA QA Contact: ge liu <geliu>
Severity: low Docs Contact:
Priority: low    
Version: 4.4 CC: dmace, pasik, skolicha, wking
Target Milestone: ---   
Target Release: 4.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 1830510 Environment:
Last Closed: 2020-07-13 17:34:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Sam Batschelet 2020-05-02 10:43:46 UTC
Description of problem: In upgrade logs I observed the following error:

> 2020-05-01T18:36:37.0791842Z 2020-05-01 18:36:37.079092 E | rafthttp: failed to dial d8027fcd63ed8f3f on stream MsgApp v2 (x509: certificate is valid for localhost, mffaz1.qe.azure.devcluster.openshift.com, 10.0.0.6, not etcd-0.mffaz1.qe.azure.devcluster.openshift.com)

This is a regression: in 4.3, the peer and server certs both included the wildcard.

https://github.com/openshift/machine-config-operator/blob/a8b6ec1b0c6cb544e6160ef2f65a7c2b59e6d199/pkg/controller/template/render.go#L382

In 4.4, by contrast, we only include the domain without the wildcard:

X509v3 Subject Alternative Name: 
   DNS:localhost, DNS:mffaz1.qe.azure.devcluster.openshift.com, DNS:10.0.0.4, IP Address:10.0.0.4

This regression could affect upgrades.

Version-Release number of selected component (if applicable):


How reproducible: 100%


Steps to Reproduce:
1.
2.
3.

Actual results: peer certs are missing the *.etcdDiscoveryDomain wildcard in the SAN


Expected results: etcd peer certs contain the proper SAN


Additional info:

Comment 4 Sam Batschelet 2020-05-07 21:51:16 UTC
Reverting the change for 4.5, as it is not correct there: this change should only be needed in 4.4 to cover upgrades from 4.3 clusters.

Comment 5 Sam Batschelet 2020-05-07 22:15:49 UTC
Lowering severity, as this is being reverted.

Comment 7 ge liu 2020-05-08 09:17:35 UTC
Correcting a typo in comment 6:

Verified in OCP 4.5 with 4.5.0-0.nightly-2020-05-06-003431, and also checked 4.4 (4.4.0-0.nightly-2020-05-08-033144), into which the fix has not been merged.

Comment 8 W. Trevor King 2020-05-14 19:54:25 UTC
Back into POST so we can hang https://github.com/openshift/cluster-etcd-operator/pull/341 on this same bug.  Moving VERIFIED -> POST is cheating a bit, and is not a good idea when we are actively releasing the target branch, but we aren't releasing 4.5 yet, so cheating here is ok.

Comment 12 ge liu 2020-05-20 07:10:56 UTC
Closing it. Please contact me if there are any issues, thanks.

Comment 13 errata-xmlrpc 2020-07-13 17:34:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409