Bug 1810997

Summary: KAS pod logs many: failed to connect to {https://[::1]:2379 0 <nil>}...x509: certificate is valid for 10.0.139.77, 127.0.0.1, not ::1". Reconnecting
Product: OpenShift Container Platform
Reporter: Xingxing Xia <xxia>
Component: Etcd
Assignee: Sam Batschelet <sbatsche>
Status: CLOSED ERRATA
QA Contact: Xingxing Xia <xxia>
Severity: medium
Docs Contact:
Priority: medium
Version: unspecified
CC: geliu, skolicha
Target Milestone: ---
Target Release: 4.5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The etcd peer certificate did not include the IPv6 localhost address (::1).
Consequence: The kube-apiserver logged repeated "failed to connect to {https://[::1]:2379 ...}" messages.
Fix: ::1 is now included as one of the hosts in the peer certificate.
Result: Repeated failed attempts to connect to https://[::1]:2379 are no longer logged.
Story Points: ---
Clone Of:
: 1812277
Environment:
Last Closed: 2020-07-13 17:18:46 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1812277

Description Xingxing Xia 2020-03-06 11:31:46 UTC
Description of problem:
The kube-apiserver (KAS) pod repeatedly logs: failed to connect to {https://[::1]:2379 0  <nil>} ... "x509: certificate is valid for 10.0.139.77, 127.0.0.1, not ::1". Reconnecting...

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-03-05-190442

How reproducible:
Created a 4.5 environment once and hit the issue below.

Steps to Reproduce:
1. Create a fresh 4.5 environment.
2. After it is created successfully, check oc get co, po and node; all are healthy.
3. Check the KAS log:
$ oc logs kube-apiserver-ip-10-0-139-77.us-east-2.compute.internal -n openshift-kube-apiserver -c kube-apiserver -f > kas-45.log

Actual results:
3. Many repeated "failed to connect to ...:2379" log lines, as shown below:
...
E0306 11:27:19.232904       1 controller.go:114] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: OpenAPI spec does not exist
I0306 11:27:19.232937       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
W0306 11:27:23.252802       1 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {https://[::1]:2379 0  <nil>}. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for 10.0.139.77, 127.0.0.1, not ::1". Reconnecting...
W0306 11:27:23.867305       1 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {https://[::1]:2379 0  <nil>}. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for 10.0.139.77, 127.0.0.1, not ::1". Reconnecting...
W0306 11:27:25.241781       1 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {https://[::1]:2379 0  <nil>}. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for 10.0.139.77, 127.0.0.1, not ::1". Reconnecting...
W0306 11:27:31.682445       1 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {https://[::1]:2379 0  <nil>}. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for 10.0.139.77, 127.0.0.1, not ::1". Reconnecting...
...
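
To gauge how often the warning above appears in a saved log such as kas-45.log, a small script can tally the matching lines. This is only a sketch: the function name and the regular expression are illustrative, and the pattern assumes the log lines keep the exact wording shown in the excerpt above.

```python
import re
from collections import Counter

# Matches the grpc warning from the excerpt and captures the endpoint,
# the names the certificate is valid for, and the name that was rejected.
PATTERN = re.compile(
    r'failed to connect to \{(?P<endpoint>\S+).*?'
    r'x509: certificate is valid for (?P<valid>.+?), not (?P<wanted>[^"\s]+)'
)

def tally_handshake_failures(lines):
    """Count x509 handshake failures per (endpoint, rejected name) pair."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group("endpoint"), match.group("wanted"))] += 1
    return counts

# Example use against the file saved in step 3 (path is an assumption):
# with open("kas-45.log", encoding="utf-8", errors="replace") as fh:
#     for (endpoint, wanted), n in tally_handshake_failures(fh).most_common():
#         print(n, endpoint, "rejected name:", wanted)
```

Running the same tally on a log from a build that carries the fix would be one way to confirm that the count for https://[::1]:2379 drops to zero.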

Expected results:
3. No such "failed to connect" log lines.

Additional info:
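
The error text implies that the certificate presented on port 2379 lists only 10.0.139.77 and 127.0.0.1 as IP addresses, so a client dialing [::1] is rejected. The sketch below is a simplified Python model of that comparison, not the Go x509 verifier that kube-apiserver actually uses; the function name ip_san_matches and the "after fix" address list are assumptions made for illustration.

```python
import ipaddress

def ip_san_matches(host, ip_sans):
    """Return True if host (an IP literal, optionally in [brackets])
    equals one of the IP addresses listed in ip_sans."""
    target = ipaddress.ip_address(host.strip("[]"))
    return any(target == ipaddress.ip_address(san) for san in ip_sans)

# Addresses named in the error message from the log.
before_fix = ["10.0.139.77", "127.0.0.1"]
# Assumed address list once ::1 is added to the certificate.
after_fix = before_fix + ["::1"]

print(ip_san_matches("[::1]", before_fix))      # False: ::1 is not listed
print(ip_san_matches("[::1]", after_fix))       # True
print(ip_san_matches("127.0.0.1", before_fix))  # True: IPv4 loopback was already covered
```

The IPv4 loopback entry does not cover the IPv6 loopback address; the two are distinct addresses, which is why ::1 has to be listed explicitly.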

Comment 7 errata-xmlrpc 2020-07-13 17:18:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409