Bug 1894216
Summary: Improve OpenShift Web Console availability
Product: OpenShift Container Platform
Component: Management Console
Reporter: Richard Theis <rtheis>
Assignee: Jakub Hadvig <jhadvig>
QA Contact: Yadan Pei <yapei>
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Version: 4.5
CC: aballant, aos-bugs, jokerman, nmukherj, spadgett, yapei
Target Milestone: ---
Target Release: 4.7.0
Hardware: All
OS: All
Doc Type: Bug Fix
Doc Text:
Cause: The console pods' 'TopologyKey' was set to 'kubernetes.io/hostname'.
Consequence: The console had availability problems during updates and zone outages.
Fix: Use the 'TopologyKey' 'topology.kubernetes.io/zone' instead of 'kubernetes.io/hostname'.
Result: The OpenShift Web Console has improved availability during updates and zone outages.
Last Closed: 2021-02-24 15:30:47 UTC
Type: Bug
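To see which value the Doc Text refers to, the topology key can be read out of a deployment manifest. The following is a minimal sketch, not the operator's actual code: `topology_keys` is a hypothetical helper, and the manifest fragment fed to it is an illustrative assumption (the bug does not show the real manifest or say which affinity stanza carries the key).

```shell
#!/bin/sh
# Hypothetical helper: print every topologyKey value found in a manifest on stdin.
topology_keys() {
  # Match lines such as "topologyKey: topology.kubernetes.io/zone" (optionally
  # preceded by a YAML list dash), keep only the value, and strip any quotes.
  sed -n 's/^[[:space:]]*-\{0,1\}[[:space:]]*topologyKey:[[:space:]]*//p' | tr -d "\"'"
}

# Illustrative manifest fragment (an assumption, not copied from the bug).
topology_keys <<'EOF'
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              topologyKey: topology.kubernetes.io/zone
EOF

# Against a live cluster (requires a logged-in `oc`), the input could come from:
#   oc get deployment console -n openshift-console -o yaml | topology_keys
```

On a cluster without the fix one would expect this to report 'kubernetes.io/hostname', and 'topology.kubernetes.io/zone' once the fix is in place.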
Description
Richard Theis
2020-11-03 18:21:22 UTC
Initially this looks to be the wrong component; moving to "dev console" as a first best guess. I don't see anything specifically related to edge routing in the description.

Haven't had time to investigate this issue so far. Will get to it next sprint.

1. Install a cluster with nodes in three different zones.

    $ oc get node -o wide
    NAME                                         STATUS   ROLES    AGE     VERSION           INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                CONTAINER-RUNTIME
    ip-10-0-140-219.us-east-2.compute.internal   Ready    worker   8h      v1.20.0+87544c5   10.0.140.219   <none>        Red Hat Enterprise Linux CoreOS 47.83.202012190438-0 (Ootpa)   4.18.0-240.8.1.el8_3.x86_64   cri-o://1.20.0-0.rhaos4.7.gitd388528.el8.39
    ip-10-0-148-60.us-east-2.compute.internal    Ready    master   8h      v1.20.0+87544c5   10.0.148.60    <none>        Red Hat Enterprise Linux CoreOS 47.83.202012190438-0 (Ootpa)   4.18.0-240.8.1.el8_3.x86_64   cri-o://1.20.0-0.rhaos4.7.gitd388528.el8.39
    ip-10-0-175-182.us-east-2.compute.internal   Ready    worker   8h      v1.20.0+87544c5   10.0.175.182   <none>        Red Hat Enterprise Linux CoreOS 47.83.202012190438-0 (Ootpa)   4.18.0-240.8.1.el8_3.x86_64   cri-o://1.20.0-0.rhaos4.7.gitd388528.el8.39
    ip-10-0-177-3.us-east-2.compute.internal     Ready    master   8h      v1.20.0+87544c5   10.0.177.3     <none>        Red Hat Enterprise Linux CoreOS 47.83.202012190438-0 (Ootpa)   4.18.0-240.8.1.el8_3.x86_64   cri-o://1.20.0-0.rhaos4.7.gitd388528.el8.39
    ip-10-0-204-128.us-east-2.compute.internal   Ready    master   8h      v1.20.0+87544c5   10.0.204.128   <none>        Red Hat Enterprise Linux CoreOS 47.83.202012190438-0 (Ootpa)   4.18.0-240.8.1.el8_3.x86_64   cri-o://1.20.0-0.rhaos4.7.gitd388528.el8.39
    ip-10-0-217-234.us-east-2.compute.internal   Ready    worker   7h58m   v1.20.0+87544c5   10.0.217.234   <none>        Red Hat Enterprise Linux CoreOS 47.83.202012190438-0 (Ootpa)   4.18.0-240.8.1.el8_3.x86_64   cri-o://1.20.0-0.rhaos4.7.gitd388528.el8.39

    $ for node in $(oc get node --no-headers | awk -F ' ' '{print $1}'); do echo "getting $node"; oc get node $node -o yaml | grep 'topology.kubernetes.io/zone: us-east-2'; done
    getting ip-10-0-140-219.us-east-2.compute.internal
        topology.kubernetes.io/zone: us-east-2a
    getting ip-10-0-148-60.us-east-2.compute.internal
        topology.kubernetes.io/zone: us-east-2a
    getting ip-10-0-175-182.us-east-2.compute.internal
        topology.kubernetes.io/zone: us-east-2b
    getting ip-10-0-177-3.us-east-2.compute.internal
        topology.kubernetes.io/zone: us-east-2b
    getting ip-10-0-204-128.us-east-2.compute.internal
        topology.kubernetes.io/zone: us-east-2c
    getting ip-10-0-217-234.us-east-2.compute.internal
        topology.kubernetes.io/zone: us-east-2c

2. Check the console pods. They are running in the us-east-2b and us-east-2a zones.

    $ oc get pods -n openshift-console -o wide
    NAME                         READY   STATUS              RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
    console-6685f6b866-tpxmz     1/1     Running             0          6h32m   10.129.0.72   ip-10-0-177-3.us-east-2.compute.internal     <none>           <none>
    console-6685f6b866-x8bfc     1/1     Running             0          6h34m   10.128.0.71   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>

3. Go to the AWS console and stop all nodes in the us-east-2a zone, that is, ip-10-0-140-219.us-east-2.compute.internal (one worker node) and ip-10-0-148-60.us-east-2.compute.internal (one master node). At the same time, watch the console pods and check console accessibility.

In one terminal, we watch the console pods:

    $ while true; do oc get pods -n openshift-console -o wide; sleep 5; done
    NAME                         READY   STATUS              RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
    console-6685f6b866-tpxmz     1/1     Running             0          7h51m   10.129.0.72   ip-10-0-177-3.us-east-2.compute.internal     <none>           <none>
    console-6685f6b866-x8bfc     1/1     Running             0          7h53m   10.128.0.71   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-674gr   1/1     Running             0          7h49m   10.129.2.21   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
    downloads-5468b9795f-cksws   1/1     Running             0          7h53m   10.128.0.62   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    NAME                         READY   STATUS              RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
    console-6685f6b866-tpxmz     1/1     Running             0          7h52m   10.129.0.72   ip-10-0-177-3.us-east-2.compute.internal     <none>           <none>
    console-6685f6b866-x8bfc     1/1     Running             0          7h54m   10.128.0.71   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-674gr   1/1     Running             0          7h49m   10.129.2.21   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
    downloads-5468b9795f-cksws   1/1     Running             0          7h54m   10.128.0.62   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    NAME                         READY   STATUS              RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
    console-6685f6b866-tpxmz     1/1     Running             0          7h52m   10.129.0.72   ip-10-0-177-3.us-east-2.compute.internal     <none>           <none>
    console-6685f6b866-x8bfc     1/1     Running             0          7h54m   10.128.0.71   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-674gr   1/1     Running             0          7h49m   10.129.2.21   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
    downloads-5468b9795f-cksws   1/1     Running             0          7h54m   10.128.0.62   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    NAME                         READY   STATUS              RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
    console-6685f6b866-tpxmz     1/1     Running             0          7h52m   10.129.0.72   ip-10-0-177-3.us-east-2.compute.internal     <none>           <none>
    console-6685f6b866-x8bfc     1/1     Running             0          7h54m   10.128.0.71   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-674gr   1/1     Running             0          7h49m   10.129.2.21   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
    downloads-5468b9795f-cksws   1/1     Running             0          7h54m   10.128.0.62   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    NAME                         READY   STATUS              RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
    console-6685f6b866-tpxmz     1/1     Running             0          7h52m   10.129.0.72   ip-10-0-177-3.us-east-2.compute.internal     <none>           <none>
    console-6685f6b866-x8bfc     1/1     Running             0          7h54m   10.128.0.71   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-674gr   1/1     Running             0          7h49m   10.129.2.21   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
    downloads-5468b9795f-cksws   1/1     Running             0          7h54m   10.128.0.62   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    NAME                         READY   STATUS              RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
    console-6685f6b866-tpxmz     1/1     Running             0          7h52m   10.129.0.72   ip-10-0-177-3.us-east-2.compute.internal     <none>           <none>
    console-6685f6b866-x8bfc     1/1     Running             0          7h54m   10.128.0.71   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-674gr   1/1     Running             0          7h49m   10.129.2.21   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
    downloads-5468b9795f-cksws   1/1     Running             0          7h54m   10.128.0.62   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    NAME                         READY   STATUS              RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
    console-6685f6b866-tpxmz     1/1     Running             0          7h52m   10.129.0.72   ip-10-0-177-3.us-east-2.compute.internal     <none>           <none>
    console-6685f6b866-x8bfc     1/1     Running             0          7h54m   10.128.0.71   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-674gr   1/1     Running             0          7h50m   10.129.2.21   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
    downloads-5468b9795f-cksws   1/1     Running             0          7h54m   10.128.0.62   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    NAME                         READY   STATUS              RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
    console-6685f6b866-mtzp4     0/1     ContainerCreating   0          3s      <none>        ip-10-0-204-128.us-east-2.compute.internal   <none>           <none>
    console-6685f6b866-tpxmz     1/1     Running             0          7h52m   10.129.0.72   ip-10-0-177-3.us-east-2.compute.internal     <none>           <none>
    console-6685f6b866-x8bfc     1/1     Terminating         0          7h54m   10.128.0.71   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-674gr   1/1     Running             0          7h50m   10.129.2.21   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
    downloads-5468b9795f-cksws   1/1     Terminating         0          7h54m   10.128.0.62   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-sxnk4   0/1     ContainerCreating   0          3s      <none>        ip-10-0-175-182.us-east-2.compute.internal   <none>           <none>
    NAME                         READY   STATUS              RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
    console-6685f6b866-mtzp4     1/1     Running             0          11s     10.130.0.81   ip-10-0-204-128.us-east-2.compute.internal   <none>           <none>
    console-6685f6b866-tpxmz     1/1     Running             0          7h53m   10.129.0.72   ip-10-0-177-3.us-east-2.compute.internal     <none>           <none>
    console-6685f6b866-x8bfc     1/1     Terminating         0          7h54m   10.128.0.71   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-674gr   1/1     Running             0          7h50m   10.129.2.21   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
    downloads-5468b9795f-cksws   1/1     Terminating         0          7h54m   10.128.0.62   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-sxnk4   1/1     Running             0          11s     10.131.1.9    ip-10-0-175-182.us-east-2.compute.internal   <none>           <none>
    NAME                         READY   STATUS              RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
    console-6685f6b866-mtzp4     1/1     Running             0          20s     10.130.0.81   ip-10-0-204-128.us-east-2.compute.internal   <none>           <none>
    console-6685f6b866-tpxmz     1/1     Running             0          7h53m   10.129.0.72   ip-10-0-177-3.us-east-2.compute.internal     <none>           <none>
    console-6685f6b866-x8bfc     1/1     Terminating         0          7h55m   10.128.0.71   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-674gr   1/1     Running             0          7h50m   10.129.2.21   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
    downloads-5468b9795f-cksws   1/1     Terminating         0          7h55m   10.128.0.62   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-sxnk4   1/1     Running             0          20s     10.131.1.9    ip-10-0-175-182.us-east-2.compute.internal   <none>           <none>
    NAME                         READY   STATUS              RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
    console-6685f6b866-mtzp4     1/1     Running             0          28s     10.130.0.81   ip-10-0-204-128.us-east-2.compute.internal   <none>           <none>
    console-6685f6b866-tpxmz     1/1     Running             0          7h53m   10.129.0.72   ip-10-0-177-3.us-east-2.compute.internal     <none>           <none>
    console-6685f6b866-x8bfc     1/1     Terminating         0          7h55m   10.128.0.71   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-674gr   1/1     Running             0          7h50m   10.129.2.21   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
    downloads-5468b9795f-cksws   1/1     Terminating         0          7h55m   10.128.0.62   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-sxnk4   1/1     Running             0          28s     10.131.1.9    ip-10-0-175-182.us-east-2.compute.internal   <none>           <none>
    NAME                         READY   STATUS              RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
    console-6685f6b866-mtzp4     1/1     Running             0          36s     10.130.0.81   ip-10-0-204-128.us-east-2.compute.internal   <none>           <none>
    console-6685f6b866-tpxmz     1/1     Running             0          7h53m   10.129.0.72   ip-10-0-177-3.us-east-2.compute.internal     <none>           <none>
    console-6685f6b866-x8bfc     1/1     Terminating         0          7h55m   10.128.0.71   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-674gr   1/1     Running             0          7h50m   10.129.2.21   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
    downloads-5468b9795f-cksws   1/1     Terminating         0          7h55m   10.128.0.62   ip-10-0-148-60.us-east-2.compute.internal    <none>           <none>
    downloads-5468b9795f-sxnk4   1/1     Running             0          36s     10.131.1.9    ip-10-0-175-182.us-east-2.compute.internal   <none>           <none>

In another terminal, we check console accessibility:

    $ while true; do curl -sk https://console-openshift-console.apps.qe-ui47-1223.qe.devcluster.openshift.com | grep -i 'Application is not available' && date -u ; sleep 1; done

During this period, the console was always accessible.

Moving to VERIFIED on 4.7.0-0.nightly-2020-12-21-131655

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633
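The zone check done by eye in steps 1 and 2 of the verification can be scripted. The following is a minimal sketch under stated assumptions: `pods_per_zone` is a hypothetical helper that joins a "node zone" list with a "pod node" list, and the sample data is taken from the node and pod listings in this report (state before the zone was stopped).

```shell
#!/bin/sh
# Hypothetical helper: count pods per zone.
#   $1: file with lines "<node> <zone>"
#   $2: file with lines "<pod> <node>"
# Prints "<zone> <count>" lines sorted by zone; pods on unlisted nodes count as "unknown".
pods_per_zone() {
  awk 'NR == FNR { zone[$1] = $2; next }
       { z = ($2 in zone) ? zone[$2] : "unknown"; count[z]++ }
       END { for (z in count) print z, count[z] }' "$1" "$2" | sort
}

nodes=$(mktemp)
pods=$(mktemp)

# Master nodes and their zones, as reported in step 1.
cat > "$nodes" <<'EOF'
ip-10-0-148-60.us-east-2.compute.internal us-east-2a
ip-10-0-177-3.us-east-2.compute.internal us-east-2b
ip-10-0-204-128.us-east-2.compute.internal us-east-2c
EOF

# Console pods and their nodes, as reported in step 2.
cat > "$pods" <<'EOF'
console-6685f6b866-tpxmz ip-10-0-177-3.us-east-2.compute.internal
console-6685f6b866-x8bfc ip-10-0-148-60.us-east-2.compute.internal
EOF

pods_per_zone "$nodes" "$pods"
rm -f "$nodes" "$pods"

# On a live cluster the two inputs could be produced with, for example:
#   oc get nodes --no-headers -L topology.kubernetes.io/zone | awk '{print $1, $6}'
#   oc get pods -n openshift-console -o wide --no-headers | grep '^console-' | awk '{print $1, $7}'
# The column positions ($6, $7) assume the output layout shown in this report.
```

With the sample data this reports one console pod in us-east-2a and one in us-east-2b, matching step 2. A zone with a count above one while other zones have none would suggest the pods are not spread across zones.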