Bug 1687940
Summary: Creating an IngressController Never Achieves Desired Deployment AvailableReplicas

| Field | Value | Field | Value |
|---|---|---|---|
| Product | OpenShift Container Platform | Reporter | Daneyon Hansen <dhansen> |
| Component | Networking | Assignee | Dan Mace <dmace> |
| Networking sub component | router | QA Contact | Hongan Li <hongli> |
| Status | CLOSED ERRATA | Severity | medium |
| Priority | medium | CC | aos-bugs, dmace |
| Version | 4.1.0 | Target Release | 4.1.0 |
| Hardware | Unspecified | OS | Unspecified |
| Last Closed | 2019-06-04 10:45:33 UTC | Type | Bug |
Description
Daneyon Hansen
2019-03-12 17:23:43 UTC
After looking at a `describe` for the pod in question, it does not get scheduled due to anti-affinity rules:

```
$ oc describe po/router-test0-566cfb6db8-zjfsf -n openshift-ingress
<SNIP>
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  2h (x751 over 3h)  default-scheduler  0/6 nodes are available: 3 node(s) didn't match node selector, 3 node(s) didn't match pod affinity/anti-affinity, 3 node(s) didn't satisfy existing pods anti-affinity rules.
```

Is it required that router pods from different ingress controllers NOT be scheduled to the same nodes?

Yeah, the anti-affinity rule is incomplete. It needs an additional selector to ensure anti-affinity is scoped to a particular IngressController. We changed the anti-affinity rule to be preferred rather than required, which should enable horizontal scaling while also allowing surge pods to be scheduled on nodes during a deployment.

Will verify with the next nightly build, which contains the fix.

Verified with 4.0.0-0.nightly-2019-03-23-222829; the issue has been fixed.

```
$ oc get ingresscontrollers.operator.openshift.io test0 -n openshift-ingress-operator -o yaml
---
status:
  availableReplicas: 2
---
$ oc get pod -n openshift-ingress -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP            NODE                             NOMINATED NODE
router-default-65dc774d97-6hw5z   1/1     Running   0          13m   10.129.2.12   ip-172-31-134-125.ec2.internal   <none>
router-default-65dc774d97-b8wh2   1/1     Running   0          13m   10.131.0.12   ip-172-31-151-75.ec2.internal    <none>
router-default-65dc774d97-mkvmm   1/1     Running   0          12m   10.128.2.10   ip-172-31-162-21.ec2.internal    <none>
router-test0-649fd8d759-rtgj8     1/1     Running   0          98s   10.131.0.13   ip-172-31-151-75.ec2.internal    <none>
router-test0-649fd8d759-zcpqs     1/1     Running   0          98s   10.128.2.11   ip-172-31-162-21.ec2.internal    <none>
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
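The fix described above (scoping anti-affinity to a single IngressController and relaxing it from required to preferred) can be sketched as a Kubernetes pod-anti-affinity stanza. This is an illustrative sketch, not the operator's actual manifest: the label key and the `test0` value are assumptions used to show how the selector would limit the rule to pods owned by one IngressController.

```yaml
# Hypothetical sketch of the corrected rule. Two changes vs. the buggy version:
#  1. "preferred" instead of "required", so surge pods can still land on an
#     occupied node during a rolling deployment.
#  2. A labelSelector so the rule only repels pods of the SAME
#     IngressController, not router pods from every IngressController.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        topologyKey: kubernetes.io/hostname
        labelSelector:
          matchLabels:
            # Assumed label identifying the owning IngressController.
            ingresscontroller.operator.openshift.io/deployment-ingresscontroller: test0
```

With a required rule and no scoping selector, any existing router pod on a node (including one from a different IngressController, such as `router-default`) would block scheduling, which matches the `FailedScheduling` events seen in the description.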