Bug 2006714
| Summary: | add retry for etcd errors in kube-apiserver | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Lukasz Szaszkiewicz <lszaszki> |
| Component: | kube-apiserver | Assignee: | Lukasz Szaszkiewicz <lszaszki> |
| Status: | CLOSED ERRATA | QA Contact: | Ke Wang <kewang> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.10 | CC: | aos-bugs, mfojtik, xxia |
| Target Milestone: | --- | | |
| Target Release: | 4.10.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| | 2006717 | Environment: | |
| Last Closed: | 2022-03-10 16:12:32 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2006717 | | |
Description
Lukasz Szaszkiewicz
2021-09-22 09:38:54 UTC
Compared the results between 4.9 and 4.10 with the PR fix.

On 4.9:

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-09-25-094414   True        False         79m     Cluster version is 4.9.0-0.nightly-2021-09-25-094414

$ masters=$(oc get no -l node-role.kubernetes.io/master | sed '1d' | awk '{print $1}')
$ for node in $masters; do echo $node;oc debug no/$node -- chroot /host bash -c "grep -ir 'etcdserver: leader changed' /var/log/ | grep -v debug";done | grep kube-apiserver | wc -l
...
12

On 4.10:

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-09-23-210724   True        False         12m     Cluster version is 4.10.0-0.nightly-2021-09-23-210724

$ masters=$(oc get no -l node-role.kubernetes.io/master | sed '1d' | awk '{print $1}')
$ for node in $masters; do echo $node;oc debug no/$node -- chroot /host bash -c "grep -ir 'etcdserver: leader changed' /var/log/ | grep -v debug";done | grep kube-apiserver | wc -l
...
/var/log/kube-apiserver/termination.log:{"level":"warn","ts":"2021-09-26T11:11:24.026Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00315e380/#initially=[https://10.0.0.2:2379;https://10.0.0.4:2379;https://10.0.0.5:2379;https://localhost:2379]","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: leader changed"}
/var/log/kube-apiserver/termination.log:{"level":"warn","ts":"2021-09-26T11:11:24.026Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0008f3500/#initially=[https://10.0.0.2:2379;https://10.0.0.4:2379;https://10.0.0.5:2379;https://localhost:2379]","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: leader changed"}

The warnings above are logged by the etcd client, not by the API server itself, so on 4.10 the API server no longer reports these etcd errors. Moving the bug to VERIFIED.
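For anyone repeating this check, the distinction between etcd-client warnings and other kube-apiserver log lines can be applied mechanically. The following is only a minimal sketch, not part of the original verification: the function name `count_non_client_leader_changed` is hypothetical, and it assumes that etcd-client entries can be recognized by the `"logger":"etcd-client"` field seen in the log lines above. It reads log lines on stdin and counts the 'etcdserver: leader changed' lines that were not emitted by the etcd client.

```shell
# Hypothetical helper (not from the bug report): count log lines that mention
# 'etcdserver: leader changed' but were NOT written by the etcd client logger.
# Assumption: etcd-client entries carry the JSON field "logger":"etcd-client".
count_non_client_leader_changed() {
  # First grep keeps only the lines with the error text (fixed-string match).
  # Second grep inverts the match on the logger field and prints a count.
  # '|| true' keeps the exit status at 0 when the count is zero.
  grep -F 'etcdserver: leader changed' \
    | grep -vcF '"logger":"etcd-client"' || true
}

# Example with two made-up input lines; the second line is invented purely
# for illustration and is not a real kube-apiserver message.
printf '%s\n' \
  '{"level":"warn","logger":"etcd-client","error":"rpc error: code = Unavailable desc = etcdserver: leader changed"}' \
  'hypothetical apiserver line: etcdserver: leader changed' \
  | count_non_client_leader_changed
```

On a cluster, the output of the `oc debug ... grep -ir` loop could be piped into this function in place of `wc -l`, so that only entries other than the client-side retry warnings are counted.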
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056