Bug 1631182

Summary: kuryr-controller keeps restarting after the networks of the k8s tenant exceed the quota
Product: Red Hat OpenStack Reporter: Itzik Brown <itbrown>
Component: openstack-kuryr-kubernetes Assignee: Maysa Macedo <mdemaced>
Status: CLOSED ERRATA QA Contact: GenadiC <gcheresh>
Severity: medium Docs Contact:
Priority: medium    
Version: 15.0 (Stein) CC: asegurap, itbrown, jamsmith, ltomasbo, mdemaced, mdulko, rheslop, tsedovic
Target Milestone: --- Keywords: Triaged, ZStream
Target Release: 14.0 (Rocky)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-kuryr-kubernetes-0.6.2-0.20190305141049.a019712.el8ost Doc Type: Bug Fix
Doc Text:
Previously, when the number of networks in the Kubernetes project exceeded the quota, the kuryr-controller pod would restart indefinitely due to having been marked as unhealthy. Now, a new readiness check validates the tenant's quota against the available Neutron resources. If the quota is reached, the controller pod is marked as 'Not Ready' and an action is required from the tenant side to increase the quota value or delete resources.
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-04-30 17:47:41 UTC Type: Bug

Description Itzik Brown 2018-09-20 07:38:56 UTC
Description of problem:
When the number of networks in the k8s project exceeds the quota, the kuryr-controller pod restarts indefinitely.

While this is happening, a pod can't be created in a non-default namespace, but a pod can still be created in the default namespace.

When networks are deleted so that the number of networks is within the quota again, the kuryr-controller pod stops restarting.

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. As described in the description of the problem above.

Actual results:


Expected results:


Additional info:

Comment 1 Antoni Segura Puimedon 2018-09-20 09:02:13 UTC
What would be your expected behavior?

Comment 2 Michał Dulko 2018-09-26 09:56:54 UTC
IMO the correct behavior should be as follows:

1. Liveness probe shouldn't be failing. Service is up, though unhealthy.
2. Readiness probe should check the quotas and signal "not ready" if the quota is exhausted for any resource we need.

My only concern is that it seems to be impossible to give any message through the healthcheck mechanism, which is suboptimal.
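
For illustration only, the probe split proposed above could be sketched roughly as follows in Python. This is not the kuryr-kubernetes implementation: the function names, the resource names in the example, and the 200/503 status codes are all hypothetical, and the quota and usage figures are assumed to have been fetched from Neutron elsewhere. The one Neutron convention relied on is that a quota of -1 means "unlimited".

```python
# Hypothetical sketch of the liveness/readiness split described above.
# None of these names come from kuryr-kubernetes; they are illustrative only.
from typing import Dict, List

UNLIMITED = -1  # Neutron reports -1 for a quota that has no limit


def exhausted_resources(quota: Dict[str, int], used: Dict[str, int]) -> List[str]:
    """Return the names of resources that have no free quota left."""
    exhausted = []
    for resource, limit in sorted(quota.items()):
        if limit == UNLIMITED:
            continue  # an unlimited quota can never be exhausted
        if limit - used.get(resource, 0) <= 0:
            exhausted.append(resource)
    return exhausted


def liveness_status() -> int:
    """The service is up, so liveness does not depend on quota at all."""
    return 200


def readiness_status(quota: Dict[str, int], used: Dict[str, int]) -> int:
    """Report 'not ready' (503) when any needed resource has no quota left."""
    return 503 if exhausted_resources(quota, used) else 200


if __name__ == "__main__":
    # Assumed example figures: the network quota is fully consumed.
    quota = {"network": 10, "subnet": 10, "port": 500, "security_group": -1}
    used = {"network": 10, "subnet": 4, "port": 37, "security_group": 12}
    print(exhausted_resources(quota, used))  # ['network']
    print(readiness_status(quota, used))     # 503
    print(liveness_status())                 # 200
```

With this split, an exhausted quota only takes the pod out of the "Ready" state instead of triggering a restart. The list returned by exhausted_resources could be written to the controller log, since the probe result itself carries no message.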

Comment 3 Itzik Brown 2018-10-22 11:08:35 UTC
I think that Michal's comment should satisfy the requirement.

Comment 21 errata-xmlrpc 2019-04-30 17:47:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0944