Bug 1843565
Summary: | ETCD fails to approve CSRs after hostname on a master is changed | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Peter Kirkpatrick <pkirkpat> |
Component: | Etcd | Assignee: | Sam Batschelet <sbatsche> |
Status: | CLOSED NOTABUG | QA Contact: | ge liu <geliu> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.3.z | CC: | psundara, zyu |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | s390x | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-06-03 20:04:49 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Peter Kirkpatrick
2020-06-03 14:45:56 UTC
So help me understand what happens here. Here is your failure [1]. ETCD_DNS_NAME is populated by doing an SRV query against the cluster domain [2]. The certs on disk are assumed to follow a fixed naming scheme, for example system:etcd-peer:${ETCD_DNS_NAME}.key. Why does that check fail?

> Message: ing before flag.Parse: E0603 05:42:26.134794 9 agent.go:116] error sending CSR to signer: certificatesigningrequests.certificates.k8s.io "system:etcd-server:etcd-1.qe1-s390x.psi.redhat.com" already exists

This is expected, although not optimal. etcd certs in 4.3 are minted during bootstrap. New nodes, or changes to existing nodes such as a new IP, can invalidate assumptions baked into the TLS SAN. If a CSR request is made, it is because no certs exist, which would happen if the node is new. In 4.3, a new node would require the disaster recovery process for replacing a failed master node.

[1] https://github.com/openshift/machine-config-operator/blob/release-4.3/templates/master/00-master/_base/files/etc-kubernetes-manifests-etcd-member.yaml#L39
[2] https://github.com/openshift/machine-config-operator/blob/release-4.3/cmd/setup-etcd-environment/run.go#L63

Turns out there was an issue with the network; once it was fixed, everything came up fine.
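To illustrate the cert check discussed in this bug, here is a minimal Python sketch of how a change in ETCD_DNS_NAME could make existing certs look missing and trigger a new CSR. It is not the actual manifest logic: the helper names `expected_cert_files` and `needs_csr`, the `.crt` extension, the list of roles, and the example hostnames are all assumptions made for illustration. Only the `system:etcd-peer:${ETCD_DNS_NAME}.key` pattern and the `system:etcd-server:` prefix come from the report.

```python
import os
import tempfile


def expected_cert_files(cert_dir, etcd_dns_name):
    """Return the cert/key paths a name-based check might look for.

    The roles and the ".crt" extension are assumptions; the report only
    shows the "system:etcd-peer:${ETCD_DNS_NAME}.key" pattern.
    """
    paths = []
    for role in ("peer", "server"):
        base = "system:etcd-{}:{}".format(role, etcd_dns_name)
        for ext in (".crt", ".key"):
            paths.append(os.path.join(cert_dir, base + ext))
    return paths


def needs_csr(cert_dir, etcd_dns_name):
    """Return True if any expected file is missing for this DNS name."""
    return not all(
        os.path.isfile(path)
        for path in expected_cert_files(cert_dir, etcd_dns_name)
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cert_dir:
        old_name = "etcd-1.old.example.com"  # hypothetical name at bootstrap
        new_name = "etcd-1.new.example.com"  # hypothetical name after a change

        # Simulate certs minted during bootstrap for the original name.
        for path in expected_cert_files(cert_dir, old_name):
            open(path, "w").close()

        print("CSR needed for old name:", needs_csr(cert_dir, old_name))
        print("CSR needed for new name:", needs_csr(cert_dir, new_name))
```

Under these assumptions, the check passes only while the resolved name matches the name the certs were minted for. If the SRV lookup returns a different name, for example because of a network or DNS problem, the files appear to be missing even though valid certs are on disk.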