Bug 1942740
| Field | Value |
|---|---|
| Summary | [sig-arch] Check if alerts are firing during or after upgrade success |
| Product | OpenShift Container Platform |
| Component | kube-apiserver |
| Version | 4.8 |
| Status | CLOSED DUPLICATE |
| Severity | unspecified |
| Priority | unspecified |
| Reporter | Michael Gugino <mgugino> |
| Assignee | Stefan Schimanski <sttts> |
| QA Contact | Ke Wang <kewang> |
| Docs Contact | |
| CC | aos-bugs, mfojtik, wking, xxia |
| Target Milestone | --- |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Whiteboard | |
| Fixed In Version | |
| Doc Type | If docs needed, set a value |
| Doc Text | |
| Story Points | --- |
| Clone Of | |
| Environment | [sig-arch] Check if alerts are firing during or after upgrade success |
| Last Closed | 2021-03-25 08:47:04 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
Description
Michael Gugino, 2021-03-24 20:02:28 UTC
In the linked job [1], the relevant API-server alert was:

    alert AggregatedAPIDown fired for 180 seconds with labels: {name="v1beta1.metrics.k8s.io", namespace="default", severity="warning"}

I'm a bit fuzzy on the details, but this might be a dup of bug 1928946. If so, probably mention the AggregatedAPIDown alert and the

    [sig-arch] Check if alerts are firing during or after upgrade success

test-case in that bug, to help Sippy find it.

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-ovn-upgrade/1374760429821104128

In the same run:

    4 kube-apiserver reports a non-graceful termination. Probably kubelet or CRI-O is not giving the time to cleanly shut down. This can lead to connection refused and network I/O timeout errors in other components.

    ns/openshift-kube-apiserver pod/kube-apiserver-ip-10-0-219-80.ec2.internal node/ip-10-0-219-80 - reason/NonGracefulTermination Previous pod kube-apiserver-ip-10-0-219-80.ec2.internal started at 2021-03-24 17:47:35.168007668 +0000 UTC did not terminate gracefully
    ns/openshift-kube-apiserver pod/kube-apiserver-ip-10-0-129-73.ec2.internal node/ip-10-0-129-73 - reason/NonGracefulTermination Previous pod kube-apiserver-ip-10-0-129-73.ec2.internal started at 2021-03-24 17:51:42.007655855 +0000 UTC did not terminate gracefully
    ns/openshift-kube-apiserver pod/kube-apiserver-ip-10-0-129-73.ec2.internal node/ip-10-0-129-73 - reason/NonGracefulTermination Previous pod kube-apiserver-ip-10-0-129-73.ec2.internal started at 2021-03-24 17:51:42.007655855 +0000 UTC did not terminate gracefully
    ns/openshift-kube-apiserver pod/kube-apiserver-ip-10-0-145-108.ec2.internal node/ip-10-0-145-108 - reason/NonGracefulTermination Previous pod kube-apiserver-ip-10-0-145-108.ec2.internal started at 2021-03-24 17:56:45.743304935 +0000 UTC did not terminate gracefully

*** This bug has been marked as a duplicate of bug 1928946 ***
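For anyone triaging a similar run: the alert above fires when an aggregated APIService is reported unavailable, so a quick check is to read the Available condition of the APIService named in the alert labels (v1beta1.metrics.k8s.io). The following is only a minimal sketch, not part of this bug's reproduction; it assumes kubeconfig access to the affected cluster and the official `kubernetes` Python client.

```python
# Minimal triage sketch (assumptions: kubeconfig access to the cluster and the
# official `kubernetes` Python client, e.g. `pip install kubernetes`).
# It reads the APIService named in the alert labels and prints its Available
# condition, which is roughly what AggregatedAPIDown reports on.
from kubernetes import client, config

config.load_kube_config()
aggregation_api = client.ApiregistrationV1Api()

# Name taken from the alert labels quoted in this bug: name="v1beta1.metrics.k8s.io".
svc = aggregation_api.read_api_service("v1beta1.metrics.k8s.io")
for cond in svc.status.conditions or []:
    if cond.type == "Available":
        print(f"{svc.metadata.name}: Available={cond.status} "
              f"reason={cond.reason} message={cond.message}")
```

An Available=False condition during the upgrade window would be consistent with the non-graceful kube-apiserver terminations quoted above, since the aggregation layer proxies through the API servers that were restarting.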