Bug 1937396

Summary: when kuryr quotas are unlimited, we should not send alerts
Product: OpenShift Container Platform
Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Networking
Networking sub component: kuryr
Assignee: Emilien Macchi <emacchi>
QA Contact: Jon Uriarte <juriarte>
CC: dcbw, mdulko
Status: CLOSED ERRATA
Severity: medium
Priority: low
Version: 4.8
Keywords: Triaged
Target Release: 4.7.z
Hardware: All
OS: All
Doc Type: No Doc Update
Last Closed: 2021-06-15 09:26:44 UTC
Bug Depends On: 1937005
Bug Blocks: 1937400

Description OpenShift BugZilla Robot 2021-03-10 14:52:44 UTC
+++ This bug was initially created as a clone of Bug #1937005 +++

Description of problem:
If an OpenStack networking resource type (e.g. ports, networks, subnets) has an unlimited quota (which OpenStack represents as -1), an alert is raised saying that we are running low on resources, because the condition that checks the free resources doesn't take -1 into account.
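
The failure mode can be illustrated with a minimal Python sketch. This is not Kuryr's actual code: the names `naive_free`, `free_quota` and `UNLIMITED` are hypothetical, and the only facts taken from this report are that OpenStack represents an unlimited quota as -1 and that the free-resource check did not account for it.

```python
from typing import Optional

UNLIMITED = -1  # OpenStack's marker for an unlimited quota (from this report)


def naive_free(limit: int, used: int) -> int:
    """Plain subtraction; the result is negative when the quota is unlimited."""
    return limit - used


def free_quota(limit: int, used: int) -> Optional[int]:
    """Return the free amount, or None when the quota is unlimited (hypothetical helper)."""
    if limit == UNLIMITED:
        return None
    return max(limit - used, 0)


print(naive_free(UNLIMITED, 538))   # negative number, not a real "free" count
print(free_quota(UNLIMITED, 538))   # None: nothing to alert on
print(free_quota(1500, 538))        # ordinary finite quota
```

A check that only compares the naive value against a low-water threshold would presumably treat the unlimited case as "low", which matches the symptom described here; returning a distinct value for "unlimited" keeps it out of any threshold comparison.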

Version-Release number of selected component (if applicable):
4.8

How reproducible:
Deploy OpenShift on top of an OpenStack cloud, with Kuryr as a networkType.
Set your OpenStack network quotas to -1.

Steps to Reproduce:
1. As an OpenStack admin, set the OpenStack network quotas to -1 (e.g. for ports), for the tenant that will be used to create the OCP cluster.
2. Deploy OpenShift on top of the OpenStack cloud, using Kuryr as networkType in the install-config.yaml.

Actual results:
Alerts of the type: "Running out of quota for ports"


Expected results:
No alerts, since we have unlimited quotas!

Comment 4 Jon Uriarte 2021-06-02 10:50:47 UTC
Verified in OCP 4.7.0-0.nightly-2021-06-01-051359 on top of OSP 16.1.5 (RHOS-16.1-RHEL-8-20210323.n.0).

Verification steps:

To check the alerts in Prometheus, make sure /etc/hosts contains the following entry:
<APPS_FIP> prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com

How to check the alerts from the CLI:
$ token=`oc sa get-token prometheus-k8s -n openshift-monitoring`

List all the alerts:
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname'

Get a specific alert (e.g. LimitedResourceQuota):
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname == "LimitedResourceQuota")'
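
For scripting the same check, the jq filter above can be approximated in Python. The sketch below works on an already-downloaded response body so that it runs without a cluster; the sample payload is made up for illustration (only the `data.alerts[].labels.alertname` path and the alert message come from this report), and `alerts_named` is a hypothetical helper name.

```python
import json

# Illustrative payload shaped like the /api/v1/alerts response; the values are invented.
SAMPLE_BODY = """
{
  "status": "success",
  "data": {
    "alerts": [
      {"labels": {"alertname": "Watchdog"}, "state": "firing"},
      {"labels": {"alertname": "LimitedResourceQuota"},
       "annotations": {"message": "Running out of quota for ports."},
       "state": "firing"}
    ]
  }
}
"""


def alerts_named(payload: dict, name: str) -> list:
    """Return the alerts whose labels.alertname equals `name`."""
    return [
        alert
        for alert in payload["data"]["alerts"]
        if alert.get("labels", {}).get("alertname") == name
    ]


payload = json.loads(SAMPLE_BODY)
for alert in alerts_named(payload, "LimitedResourceQuota"):
    print(alert["annotations"]["message"])
```

An empty list from `alerts_named` corresponds to the "alert is not raised" outcome in the steps below.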

The alert LimitedResourceQuota is raised when an OpenStack resource is running out of quota, that is, when 0 < free quota < 10.
It is raised for the resources: 'ports', 'subnets', 'networks', 'security_groups' and 'security_group_rules'.
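
The expected behaviour exercised by the steps below can be summarised as a small decision function. This is a sketch of the rule as stated in this comment, not the actual alert rule definition; `expected_alert` and `LIMITED_THRESHOLD` are hypothetical names, `None` stands for an unlimited quota, and the alert names are spelled as they appear in this report.

```python
from typing import Optional

LIMITED_THRESHOLD = 10  # "free quota < 10" from the description above
RESOURCES = ("ports", "subnets", "networks", "security_groups", "security_group_rules")


def expected_alert(free: Optional[int]) -> Optional[str]:
    """Map a free-quota value to the alert expected for it (None = unlimited quota)."""
    if free is None:
        return None  # unlimited quota: no alert expected
    if free == 0:
        return "InsuficientResourceQuota"
    if 0 < free < LIMITED_THRESHOLD:
        return "LimitedResourceQuota"
    return None  # plenty of quota left


for free in (9, 0, None, 962):
    print(free, "->", expected_alert(free))
```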

+++++++++++++++++++++++++++
LimitedResourceQuota: ports
+++++++++++++++++++++++++++

1. Make sure the alert LimitedResourceQuota for ports is raised: 
$ source overcloudrc 
$ openstack quota show shiftstack | grep ports 
| ports                | 1500 
$ openstack port list --project shiftstack -f value | wc -l
538

Change the quota so that fewer than 10 ports are free (547 - 538 = 9 free):
$ openstack quota set --ports 547 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for ports." is raised

2. Check the alert LimitedResourceQuota is cleared when there are 0 free ports (InsuficientResourceQuota will be raised instead).
$ openstack quota set --ports 538 shiftstack

3. Check the alert LimitedResourceQuota is not raised with unlimited ports quota
$ openstack quota set --ports -1 shiftstack


++++++++++++++++++++++++++++++
LimitedResourceQuota: networks
++++++++++++++++++++++++++++++

4. Make sure the alert LimitedResourceQuota for networks is raised: 
$ source overcloudrc 
$ openstack quota show shiftstack | grep networks
| networks                | 250
$ openstack network list --project shiftstack -f value | wc -l 
64

Change the quota so that fewer than 10 networks are free (73 - 64 = 9 free):
$ openstack quota set --networks 73 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for networks." is raised

5. Check the alert LimitedResourceQuota is cleared when there are 0 free networks (InsuficientResourceQuota will be raised instead).
$ openstack quota set --networks 64 shiftstack

6. Check the alert LimitedResourceQuota is not raised with unlimited networks quota
$ openstack quota set --networks -1 shiftstack


+++++++++++++++++++++++++++++
LimitedResourceQuota: subnets
+++++++++++++++++++++++++++++

7. Make sure the alert LimitedResourceQuota for subnets is raised: 
$ source overcloudrc 
$ openstack quota show shiftstack | grep subnets
| subnets                | 250
$ openstack subnet list --project shiftstack -f value | wc -l
64

Change the quota so that fewer than 10 subnets are free (73 - 64 = 9 free):
$ openstack quota set --subnets 73 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for subnets." is raised

8. Check the alert LimitedResourceQuota is cleared when there are 0 free subnets (InsuficientResourceQuota will be raised instead).
$ openstack quota set --subnets 64 shiftstack

9. Check the alert LimitedResourceQuota is not raised with unlimited subnets quota
$ openstack quota set --subnets -1 shiftstack


+++++++++++++++++++++++++++++++++++++
LimitedResourceQuota: security groups
+++++++++++++++++++++++++++++++++++++

10. Make sure the alert LimitedResourceQuota for secgroups is raised: 
$ source overcloudrc 
$ openstack quota show shiftstack | grep secgroups
| secgroups                | 250
$ openstack security group list --project shiftstack -f value | wc -l
4 

Change the quota so that fewer than 10 secgroups are free (13 - 4 = 9 free):
$ openstack quota set --secgroups 13 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for security_groups." is raised

11. Check the alert LimitedResourceQuota is cleared when there are 0 free secgroups (InsuficientResourceQuota will be raised instead).
$ openstack quota set --secgroups 4 shiftstack

12. Check the alert LimitedResourceQuota is not raised with unlimited secgroups quota
$ openstack quota set --secgroups -1 shiftstack

++++++++++++++++++++++++++++++++++++++++++
LimitedResourceQuota: security group rules
++++++++++++++++++++++++++++++++++++++++++

13. Make sure the alert LimitedResourceQuota for secgroup-rules is raised: 
$ source overcloudrc
$ openstack quota show shiftstack | grep secgroup-rules
| secgroup-rules                | 1000
$ source shiftstackrc 
$ openstack security group rule list -f value | wc -l
67

Change the quota so that fewer than 10 secgroup-rules are free (76 - 67 = 9 free):
$ source overcloudrc
$ openstack quota set --secgroup-rules 76 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for security_group_rules." is raised

14. Check the alert LimitedResourceQuota is cleared when there are 0 free secgroup-rules (InsuficientResourceQuota will be raised instead).
$ openstack quota set --secgroup-rules 67 shiftstack

15. Check the alert LimitedResourceQuota is not raised with unlimited secgroup-rules quota
$ openstack quota set --secgroup-rules -1 shiftstack
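
The quota values used across the fifteen steps all follow the same arithmetic: current usage plus 9 to leave fewer than 10 free, current usage to leave 0 free, and -1 for unlimited. A short Python sketch that reproduces them from the usage counts reported above (`quota_steps` is a hypothetical helper; the usage numbers are specific to this test environment):

```python
# Usage counts observed in this verification run, keyed by the quota flag name.
USED = {
    "ports": 538,
    "networks": 64,
    "subnets": 64,
    "secgroups": 4,
    "secgroup-rules": 67,
}


def quota_steps(used: int, free: int = 9) -> dict:
    """Quota values for the three checks: limited, exhausted, unlimited."""
    return {"limited": used + free, "exhausted": used, "unlimited": -1}


for resource, used in USED.items():
    for value in quota_steps(used).values():
        # Prints the same commands that are run manually in the steps above.
        print(f"openstack quota set --{resource} {value} shiftstack")
```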

Comment 5 Siddharth Sharma 2021-06-04 18:38:29 UTC
This bug will be shipped as part of the next z-stream release, 4.7.15, on June 14th, because 4.7.14 was dropped due to a regression: https://bugzilla.redhat.com/show_bug.cgi?id=1967614

Comment 9 errata-xmlrpc 2021-06-15 09:26:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.16 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2286