Bug 2066560
| Summary: | two router pods are in ContainerCreating status when trying to patch ingress-operator with custom error code pages directly | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Shudi Li <shudili> |
| Component: | Networking | Assignee: | Grant Spence <gspence> |
| Networking sub component: | router | QA Contact: | Shudi Li <shudili> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | amcdermo, aos-bugs, gspence, hongli, jaldinge, mmasters |
| Version: | 4.11 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.12.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Previously, if a `configmap` that the router deployment depended on was not created, the router deployment did not progress. With this update, the `ingress` cluster Operator reports `Progressing=True` while the default ingress controller deployment is progressing, so users can debug issues with the ingress controller by using the `oc get co` command. (link:https://bugzilla.redhat.com/show_bug.cgi?id=2066560[*BZ#2066560*]) | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-01-17 19:47:48 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
If the configmap doesn't exist, the deployment's pods won't start until the configmap is created, which is the intended behavior. In your example, the deployment already had pods, so the deployment controller will leave those pods running until the new pods become ready. I think this is the appropriate behavior.

What might need to be improved is the status reporting: instead of reporting Progressing=False, the operator could report Progressing=True while the deployment is in the middle of rolling out the new pods.
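For reference, the rollout unblocks once the referenced configmap exists in the `openshift-config` namespace. A minimal sketch of creating it; the configmap name comes from the patch in this report, while the error-page file names follow the documented custom error code pages convention and are assumptions here, not taken from this report:

```
% oc -n openshift-config create configmap my-custom-error-code-pages \
    --from-file=error-page-503.http \
    --from-file=error-page-404.http
```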
Verified with 4.12.0-0.nightly-2022-09-18-141547: Progressing=True while the new pods were rolling out.

1.
```
% oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.12.0-0.nightly-2022-09-18-141547   True        False         46m     Cluster version is 4.12.0-0.nightly-2022-09-18-141547
%
```
2.
```
% oc patch -n openshift-ingress-operator ingresscontroller/default --patch '{"spec":{"httpErrorCodePages":{"name":"my-custom-error-code-pages"}}}' --type=merge
ingresscontroller.operator.openshift.io/default patched
%
```
3.
```
% oc -n openshift-ingress get pods
NAME                               READY   STATUS              RESTARTS   AGE
router-default-5c4c557b74-rczcq    1/1     Running             0          39m
router-default-6f6d9f454f-cfspr    0/1     ContainerCreating   0          96s
router-default-6f6d9f454f-rbvlg    0/1     ContainerCreating   0          96s
router-ocp50074-65c58cfc69-dccwv   1/1     Running             0          33m
%
```
4.
```
% oc -n openshift-ingress-operator get ingresscontroller/default -o yaml | grep Progressing -B2
  - lastTransitionTime: "2022-09-19T01:41:41Z"
    message: LoadBalancer is not progressing
    reason: LoadBalancerNotProgressing
    status: "False"
    type: LoadBalancerProgressing
--
      One or more status conditions indicate progressing: DeploymentRollingOut=True (DeploymentRollingOut: Waiting for router deployment rollout to finish: 1 old replica(s) are pending termination...
      )
    reason: IngressControllerProgressing
    status: "True"
    type: Progressing
%
```
5.
```
% oc -n openshift-ingress get deployment/router-default -o yaml | grep Progressing -B4
    lastUpdateTime: "2022-09-19T02:47:55Z"
    message: ReplicaSet "router-default-6f6d9f454f" is progressing.
    reason: ReplicaSetUpdated
    status: "True"
    type: Progressing
%
```
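The same condition also surfaces on the cluster Operator, which is what the doc text's `oc get co` guidance refers to. A quick check, not part of the original verification output:

```
% oc get co ingress
# While the router deployment is rolling out, the PROGRESSING column
# for the ingress cluster Operator should now show True.
```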
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7399
Description of problem: Without preparing the custom error page (that is, without creating a customized error page configmap in the openshift-config namespace), patching the ingress operator directly leaves two router pods stuck in ContainerCreating status.

OpenShift release version: 4.11.0-0.nightly-2022-03-20-160505 (4.9.0-0.nightly-2022-03-21-144414 has the same issue)

Cluster Platform:

How reproducible:

Steps to Reproduce (in detail):

1.
```
% oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-03-20-160505   True        False         149m    Cluster version is 4.11.0-0.nightly-2022-03-20-160505
%
```

2.
```
% oc patch -n openshift-ingress-operator ingresscontroller/default --patch '{"spec":{"httpErrorCodePages":{"name":"my-custom-error-code-pages"}}}' --type=merge
ingresscontroller.operator.openshift.io/default patched
%
```

3.
```
% oc -n openshift-ingress get pods
NAME                              READY   STATUS              RESTARTS   AGE
router-default-687b85889f-brxjf   0/1     ContainerCreating   0          54m
router-default-687b85889f-qdxrc   0/1     ContainerCreating   0          54m
router-default-84c787c96b-rsmvf   1/1     Running             0          58m
%
```

Actual results: The two new router pods stay in ContainerCreating status.

Expected results: Two new router pods are created and the two old router pods are deleted.

Impact of the problem:

Additional info:

** Please do not disregard the report template; filling the template out as much as possible will allow us to help you. Please consider attaching a must-gather archive (via `oc adm must-gather`). Please review must-gather contents for sensitive information before attaching any must-gathers to a bugzilla report. You may also mark the bug private if you wish.
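One way to confirm the root cause on a stuck pod (pod name taken from step 3 above; the exact event wording may vary by version, so treat the expected output as an assumption):

```
% oc -n openshift-ingress describe pod router-default-687b85889f-brxjf
# The Events section should show a FailedMount warning indicating that the
# configmap referenced by the deployment's error-pages volume was not found.
```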