Bug 1631182

Summary:	kuryr-controller keeps restarting after the number of networks in the k8s tenant exceeds the quota
Product:	Red Hat OpenStack	Reporter:	Itzik Brown <itbrown>
Component:	openstack-kuryr-kubernetes	Assignee:	Maysa Macedo <mdemaced>
Status:	CLOSED ERRATA	QA Contact:	GenadiC <gcheresh>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	15.0 (Stein)	CC:	asegurap, itbrown, jamsmith, ltomasbo, mdemaced, mdulko, rheslop, tsedovic
Target Milestone:	---	Keywords:	Triaged, ZStream
Target Release:	14.0 (Rocky)
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	openstack-kuryr-kubernetes-0.6.2-0.20190305141049.a019712.el8ost	Doc Type:	Bug Fix
Doc Text:	Previously, when the number of networks in the Kubernetes project exceeded the quota, the kuryr-controller pod would restart indefinitely due to having been marked as unhealthy. Now, a new readiness check validates the tenant's quota against the available Neutron resources. If the quota is reached, the controller pod is marked as 'Not Ready' and an action is required from the tenant side to increase the quota value or delete resources.	Story Points:	---
Clone Of:		Environment:
Last Closed:	2019-04-30 17:47:41 UTC	Type:	Bug

Description Itzik Brown 2018-09-20 07:38:56 UTC

Description of problem:
When the number of networks in the k8s project exceeds the quota, the kuryr-controller pod restarts indefinitely.

While the quota is exceeded, a pod can't be created in a non-default namespace, but a pod in the default namespace can.

Once enough networks are deleted that the number of networks is back within the quota limit, the kuryr-controller pod stops restarting.

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. As described in the description of the problem above.

Actual results:


Expected results:


Additional info:

Comment 1 Antoni Segura Puimedon 2018-09-20 09:02:13 UTC

What would be your expected behavior?

Comment 2 Michał Dulko 2018-09-26 09:56:54 UTC

IMO the correct behavior should be as follows:

1. The liveness probe shouldn't fail. The service is up, though unhealthy.
2. The readiness probe should check the quotas and signal "not ready" when any resource we need has no quota left.

My only concern is that it seems to be impossible to give any message through the healthcheck mechanism, which is suboptimal.
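
A minimal sketch of the probe split proposed above, not the actual Kuryr implementation. It assumes the quota data is already available as a dict with per-resource `limit` and `used` counts; the names `REQUIRED_RESOURCES`, `exhausted_resources`, `liveness` and `readiness` are hypothetical, and the list of required resources is a guess for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical list of Neutron resources the controller depends on
# (an assumption for this sketch, not taken from the Kuryr source).
REQUIRED_RESOURCES = ("network", "subnet", "port", "security_group")


def exhausted_resources(quota: Dict[str, Dict[str, int]]) -> List[str]:
    """Return the required resources that have no quota left."""
    exhausted = []
    for name in REQUIRED_RESOURCES:
        entry = quota.get(name)
        if entry is None:
            continue  # resource not reported, nothing to check
        limit, used = entry["limit"], entry["used"]
        # Neutron uses a negative limit (-1) to mean "unlimited".
        if limit >= 0 and used >= limit:
            exhausted.append(name)
    return exhausted


def liveness() -> int:
    """Liveness ignores quota: the process is up, so it must not be restarted."""
    return 200


def readiness(quota: Dict[str, Dict[str, int]]) -> Tuple[int, str]:
    """Readiness fails while any required resource is out of quota."""
    exhausted = exhausted_resources(quota)
    if exhausted:
        return 500, "quota exhausted for: " + ", ".join(exhausted)
    return 200, "ok"


if __name__ == "__main__":
    sample = {
        "network": {"limit": 10, "used": 10},  # at the limit
        "port": {"limit": -1, "used": 500},    # unlimited
        "subnet": {"limit": 10, "used": 3},
    }
    print(liveness(), readiness(sample))
```

With this split, an exhausted quota only takes the pod out of the "Ready" state; it recovers on its own once the tenant raises the quota or deletes resources. The message string returned here is for logging only, since the probe result itself carries just the status code.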

Comment 3 Itzik Brown 2018-10-22 11:08:35 UTC

I think that Michal's comment should satisfy the requirement.

Comment 21 errata-xmlrpc 2019-04-30 17:47:41 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0944