Bug 1937400 - when kuryr quotas are unlimited, we should not send alerts
Summary: when kuryr quotas are unlimited, we should not send alerts
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.8
Hardware: All
OS: All
Priority: low
Severity: medium
Target Milestone: ---
Target Release: 4.6.z
Assignee: Emilien Macchi
QA Contact: Jon Uriarte
Duplicates: 1939357
Depends On: 1937396
Reported: 2021-03-10 15:02 UTC by OpenShift BugZilla Robot
Modified: 2021-07-14 07:16 UTC (History)
Last Closed: 2021-07-14 07:16:34 UTC

Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 1012 | 0 | None | open | [release-4.6] Bug 1937400: kuryr/alerts: change the rule for free count | 2021-04-09 13:19:40 UTC
Red Hat Product Errata | RHBA-2021:2641 | 0 | None | None | None | 2021-07-14 07:16:53 UTC

Description OpenShift BugZilla Robot 2021-03-10 15:02:33 UTC
+++ This bug was initially created as a clone of Bug #1937005 +++

Description of problem:
If an OpenStack networking resource type (e.g. ports, networks, subnets) has an unlimited quota (represented by -1 in OpenStack), an alert fires saying that we are running low on resources, because the condition that checks the free resources doesn't take -1 into account.

Version-Release number of selected component (if applicable):
4.8

How reproducible:
Deploy OpenShift on top of an OpenStack cloud, with Kuryr as the networkType.
Set your OpenStack network quotas to -1.

Steps to Reproduce:
1. As an OpenStack admin, set the OpenStack network quotas to -1 (e.g. for ports), for the tenant that will be used to create the OCP cluster.
2. Deploy OpenShift on top of the OpenStack cloud, using Kuryr as networkType in the install-config.yaml.

Actual results:
Alerts of the type: "Running out of quota for ports"


Expected results:
No alerts, since we have unlimited quotas!

Comment 1 Michał Dulko 2021-03-22 15:39:24 UTC
*** Bug 1939357 has been marked as a duplicate of this bug. ***

Comment 6 Michał Dulko 2021-05-27 13:06:04 UTC
Bumping the severity per Gabriel's comment.

Comment 13 Jon Uriarte 2021-06-29 09:43:41 UTC
Verified in OCP 4.6.0-0.nightly-2021-06-25-031210 on top of OSP 13.0.15 (2021-03-24.1).

Verification steps:

To check the alerts in Prometheus, make sure you have the following entry in /etc/hosts:
<APPS_FIP> prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com

How to check the alerts from CLI:
$ token=`oc sa get-token prometheus-k8s -n openshift-monitoring`

List all the alerts:
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname'

Get a specific alert (e.g. LimitedResourceQuota):
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname == "LimitedResourceQuota")'
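
To narrow the output to a single resource, the jq filter can be wrapped in a small shell function. This is only a sketch: the function name filter_quota_alerts is made up here, and it assumes the alert text is stored in an annotation called "message", which this report does not confirm.

```shell
# filter_quota_alerts RESOURCE
# Reads a Prometheus /api/v1/alerts response on stdin and prints the message
# of every LimitedResourceQuota alert that mentions RESOURCE.
# Assumption: the alert text is in .annotations.message.
filter_quota_alerts() {
    jq -r --arg res "$1" '
        .data.alerts[]
        | select(.labels.alertname == "LimitedResourceQuota")
        | (.annotations.message // "")
        | select(contains("quota for " + $res + "."))
    '
}
```

It could be used by piping the curl command above into it, e.g. `... /api/v1/alerts' | filter_quota_alerts ports`. Empty output would mean that no LimitedResourceQuota alert is active for that resource.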

The alert LimitedResourceQuota is raised when an OpenStack resource is running out of quota (free quota > 0 and free quota < 10).
It is raised for the resources: 'ports', 'subnets', 'networks', 'security_groups' and 'security_group_rules'.
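
The behaviour exercised in the steps below can be summarised as a small shell function. This is a sketch of the expected outcome as described in this comment, not the actual Prometheus rule. The function name quota_alert_state is hypothetical, free quota is assumed to be the limit minus the resources in use, and treating a negative free count the same as 0 is also an assumption.

```shell
# quota_alert_state LIMIT USED
#   LIMIT: quota limit (-1 means unlimited in OpenStack)
#   USED:  number of resources currently in use
# Prints the alert expected for that resource, or "none".
quota_alert_state() {
    local limit="$1" used="$2" free
    if [ "$limit" -eq -1 ]; then
        echo "none"                      # unlimited quota: no alert expected
        return 0
    fi
    free=$(( limit - used ))             # assumption: free = limit - used
    if [ "$free" -le 0 ]; then
        echo "InsuficientResourceQuota"  # alert name spelled as in this report
    elif [ "$free" -lt 10 ]; then
        echo "LimitedResourceQuota"      # 0 < free < 10
    else
        echo "none"
    fi
}
```

For instance, `quota_alert_state 459 450` corresponds to the first ports step below (9 free ports) and `quota_alert_state -1 450` to the unlimited case.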

+++++++++++++++++++++++++++
LimitedResourceQuota: ports
+++++++++++++++++++++++++++

1. Make sure the alert LimitedResourceQuota for ports is raised: 
$ source overcloudrc 
$ openstack quota show shiftstack | grep ports
| ports                | 1500 
$ openstack port list --project shiftstack -f value | wc -l
450

Change the quota so there are < 10 free ports. 
$ openstack quota set --ports 459 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for ports." is raised

2. Check the alert LimitedResourceQuota is cleared when there are 0 free ports (InsuficientResourceQuota will be raised instead).
$ openstack quota set --ports 450 shiftstack

3. Check the alert LimitedResourceQuota is not raised with unlimited ports quota
$ openstack quota set --ports -1 shiftstack


++++++++++++++++++++++++++++++
LimitedResourceQuota: networks
++++++++++++++++++++++++++++++

4. Make sure the alert LimitedResourceQuota for networks is raised: 
$ source overcloudrc 
$ openstack quota show shiftstack | grep networks
| networks                | 250
$ openstack network list --project shiftstack -f value | wc -l
61

Change the quota so there are < 10 free networks. 
$ openstack quota set --networks 70 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for networks." is raised

5. Check the alert LimitedResourceQuota is cleared when there are 0 free networks (InsuficientResourceQuota will be raised instead).
$ openstack quota set --networks 61 shiftstack

6. Check the alert LimitedResourceQuota is not raised with unlimited networks quota
$ openstack quota set --networks -1 shiftstack


+++++++++++++++++++++++++++++
LimitedResourceQuota: subnets
+++++++++++++++++++++++++++++

7. Make sure the alert LimitedResourceQuota for subnets is raised: 
$ source overcloudrc 
$ openstack quota show shiftstack | grep subnets
| subnets                | 250
$ openstack subnet list --project shiftstack -f value | wc -l
61

Change the quota so there are < 10 free subnets. 
$ openstack quota set --subnets 70 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for subnets." is raised

8. Check the alert LimitedResourceQuota is cleared when there are 0 free subnets (InsuficientResourceQuota will be raised instead).
$ openstack quota set --subnets 61 shiftstack

9. Check the alert LimitedResourceQuota is not raised with unlimited subnets quota
$ openstack quota set --subnets -1 shiftstack


+++++++++++++++++++++++++++++++++++++
LimitedResourceQuota: security groups
+++++++++++++++++++++++++++++++++++++

10. Make sure the alert LimitedResourceQuota for secgroups is raised: 
$ source overcloudrc 
$ openstack quota show shiftstack | grep secgroups
| secgroups                | 250
$ openstack security group list --project shiftstack -f value | wc -l
5

Change the quota so there are < 10 free secgroups. 
$ openstack quota set --secgroups 14 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for security_groups." is raised

11. Check the alert LimitedResourceQuota is cleared when there are 0 free secgroups (InsuficientResourceQuota will be raised instead).
$ openstack quota set --secgroups 5 shiftstack

12. Check the alert LimitedResourceQuota is not raised with unlimited secgroups quota
$ openstack quota set --secgroups -1 shiftstack

++++++++++++++++++++++++++++++++++++++++++
LimitedResourceQuota: security group rules
++++++++++++++++++++++++++++++++++++++++++

13. Make sure the alert LimitedResourceQuota for secgroup-rules is raised: 
$ source overcloudrc
$ openstack quota show shiftstack | grep secgroup-rules
| secgroup-rules                | 1000
$ source shiftstackrc 
$ openstack security group rule list -f value | wc -l
67

Change the quota so there are < 10 free secgroup-rules.
$ source overcloudrc
$ openstack quota set --secgroup-rules 76 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for security_group_rules." is raised

14. Check the alert LimitedResourceQuota is cleared when there are 0 free secgroup-rules (InsuficientResourceQuota will be raised instead).
$ openstack quota set --secgroup-rules 67 shiftstack

15. Check the alert LimitedResourceQuota is not raised with unlimited secgroup-rules quota
$ openstack quota set --secgroup-rules -1 shiftstack
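
Each of the five sections above uses the same three quota values, derived from the number of resources currently in use. As a sketch (the function name quota_test_values is hypothetical), they can be computed like this:

```shell
# quota_test_values USED
# Prints the three quota values used in the steps above for a resource with
# USED items currently in use:
#   USED+9 -> 9 free, LimitedResourceQuota expected
#   USED   -> 0 free, InsuficientResourceQuota expected instead
#   -1     -> unlimited, no quota alert expected
quota_test_values() {
    local used="$1"
    echo "$(( used + 9 )) ${used} -1"
}
```

With 450 ports in use, `quota_test_values 450` prints `459 450 -1`, which matches the values passed to `openstack quota set --ports` above.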

Comment 18 errata-xmlrpc 2021-07-14 07:16:34 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.38 bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2641

