Bug 1844288
Summary: | A restarted kube-apiserver container hits crashloop due to 6080 port of kube-apiserver-insecure-readyz | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Ke Wang <kewang> |
Component: | kube-apiserver | Assignee: | Stefan Schimanski <sttts> |
Status: | CLOSED ERRATA | QA Contact: | Xingxing Xia <xxia> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | | |
Version: | 4.5 | CC: | aos-bugs, dblack, mfojtik, mkarg, sttts, wjiang, wking, xxia |
Target Milestone: | --- | Keywords: | Regression, Reopened, TestBlocker |
Target Release: | 4.5.0 | | |
Hardware: | Unspecified | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | 1843752 | Environment: | |
Last Closed: | 2020-07-13 17:43:18 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | 1843752 | | |
Bug Blocks: | 1851831 | | |
Description
Ke Wang
2020-06-05 01:49:00 UTC
Our OCP 4.5 tests on osp16 are blocked by this bug, so I cloned it to 4.5. It affects not only osp16 but GCP as well: https://bugzilla.redhat.com/show_bug.cgi?id=1838421. https://bugzilla.redhat.com/show_bug.cgi?id=1838421#c19 encountered the same error:

State: Waiting
  Reason: CrashLoopBackOff
Last State: Terminated
  Reason: Error
  Message: ...............................................................................timed out waiting for port :6443 and :6080 to be released

*** This bug has been marked as a duplicate of bug 1843752 ***

Reopening to point back to 1843752 as its parent.

So far, we've found this problem on several platforms, including Red Hat OpenStack Platform 16.0, Google Cloud Platform and vSphere.

Verified in 4.5.0-0.nightly-2020-06-09-223121: In the kube-apiserver pod yaml, the kube-apiserver container no longer checks port 6080, per the PR https://github.com/openshift/cluster-kube-apiserver-operator/pull/878/files .

Repeated rollouts:

$ scripts/rollout.sh | tee logs/rollout.log | grep -i -e "checking" -e crash -e "timed out"

Didn't see Crash or "timed out".

$ cat scripts/rollout.sh
#!/bin/bash
i=0;
while true
do
  DATE="$(date)"
  let i+=1;
  echo "$i time rollout $DATE"
  oc patch kubeapiserver/cluster --type=json -p '[ {"op": "replace", "path": "/spec/forceRedeploymentReason", "value": "xxia forced test'"$i time rollout $DATE"'" } ]'
  sleep 60
  while true; do
    echo "checking status $(date)"
    oc get po -n openshift-kube-apiserver --show-labels -l apiserver
    oc get po -n openshift-kube-apiserver -l apiserver -o json | jq '.items[].status'
    if oc get co kube-apiserver | grep "True.*False.*False"; then
      break
    fi
    sleep 10
  done
done

We have a successful installation with the latest OCP 4.5 on Red Hat OpenStack Platform 16.0; this bug is not blocking related tests.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:2409
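
A note on the inner loop of scripts/rollout.sh: its `grep "True.*False.*False"` test matches the pattern anywhere in the output rather than in specific columns. A minimal sketch of a stricter check follows, assuming the `oc get co` columns are NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE. The helper name `co_is_stable` is hypothetical and is not part of the original script.

```shell
#!/bin/bash
# co_is_stable: hypothetical helper, not part of the original rollout.sh.
# Reads `oc get co <name>` output on stdin and succeeds only if some row has
# AVAILABLE=True, PROGRESSING=False, DEGRADED=False.
# Assumption: those are whitespace-separated columns 3, 4 and 5.
co_is_stable() {
  awk '$3 == "True" && $4 == "False" && $5 == "False" { ok = 1 }
       END { exit !ok }'
}

# Possible use in the inner loop (needs a logged-in oc client):
#   if oc get co kube-apiserver | co_is_stable; then break; fi
```

The header row never matches because its third column is the literal word AVAILABLE, so it does not need to be stripped first.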