Bug 1937005 - when kuryr quotas are unlimited, we should not send alerts
Summary: when kuryr quotas are unlimited, we should not send alerts
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.8
Hardware: All
OS: All
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.8.0
Assignee: Emilien Macchi
QA Contact: Jon Uriarte
URL:
Whiteboard:
Depends On:
Blocks: 1937396
 
Reported: 2021-03-09 16:30 UTC by Emilien Macchi
Modified: 2024-10-01 17:39 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 22:52:25 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 1009 | 0 | None | open | Bug 1937005: kuryr/alerts: change the rule for free count | 2021-03-09 16:50:44 UTC
Red Hat Product Errata | RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 22:52:54 UTC

Description Emilien Macchi 2021-03-09 16:30:37 UTC
Description of problem:
If an OpenStack networking resource type (e.g. ports, networks, subnets) has an unlimited quota (represented by -1 in OpenStack), an alert fires saying that we are running low on resources, because the condition that checks the free resources does not take -1 into account.

Version-Release number of selected component (if applicable):
4.8

How reproducible:
Deploy OpenShift on top of an OpenStack cloud, with Kuryr as a networkType.
Set your OpenStack network quotas to -1.

Steps to Reproduce:
1. As an OpenStack admin, set the OpenStack network quotas to -1 (e.g. for ports), for the tenant that will be used to create the OCP cluster.
2. Deploy OpenShift on top of the OpenStack cloud, using Kuryr as networkType in the install-config.yaml.

Actual results:
Alerts of the type: "Running out of quota for ports"


Expected results:
No alerts, since we have unlimited quotas!

Comment 4 Jon Uriarte 2021-04-16 08:28:50 UTC
Verified in OCP 4.8.0-0.nightly-2021-04-15-141744 on top of OSP 16.1.5 (RHOS-16.1-RHEL-8-20210323.n.0).

Verification steps:

To check the alerts in Prometheus, make sure the following entry is present in /etc/hosts:
<APPS_FIP> prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com

How to check the alerts from CLI:
$ token=`oc sa get-token prometheus-k8s -n openshift-monitoring`

List all the alerts:
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname'

Get a specific alert (e.g. LimitedResourceQuota):
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname == "LimitedResourceQuota")'
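
The same selection can be scripted in Python when the verification needs to be automated. This is a minimal sketch and was not part of the original verification: the function name `select_alerts`, the `SAMPLE` payload and its `resource` label are hypothetical, and the payload only mimics the `data.alerts[].labels.alertname` layout that the jq filter above relies on.

```python
import json


def select_alerts(payload: str, alertname: str) -> list:
    """Return the alerts whose labels.alertname equals `alertname`.

    `payload` is the JSON text returned by the /api/v1/alerts endpoint.
    Missing keys are treated as "no match" rather than raising.
    """
    doc = json.loads(payload)
    alerts = doc.get("data", {}).get("alerts", [])
    return [a for a in alerts if a.get("labels", {}).get("alertname") == alertname]


# Hypothetical sample payload, for illustration only.
SAMPLE = json.dumps({
    "status": "success",
    "data": {"alerts": [
        {"labels": {"alertname": "LimitedResourceQuota", "resource": "ports"},
         "state": "firing"},
        {"labels": {"alertname": "Watchdog"}, "state": "firing"},
    ]},
})

if __name__ == "__main__":
    for alert in select_alerts(SAMPLE, "LimitedResourceQuota"):
        print(alert["labels"])
```

In practice the payload would be the body of the curl response shown above instead of `SAMPLE`.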

The alert LimitedResourceQuota is raised when an OpenStack resource is running out of quota (free quota > 0 and free quota < 10).
It is raised for the following resources: 'ports', 'subnets', 'networks', 'security_groups' and 'security_group_rules'.
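
The condition can be modelled as a small predicate. This is only a sketch of the behaviour verified in this comment, not the Prometheus rule shipped by the cluster-network-operator. It assumes that free quota is computed as quota minus usage and that a negative free count is treated like zero; the names `UNLIMITED`, `LOW_THRESHOLD` and `quota_alert` are hypothetical.

```python
from typing import Optional

UNLIMITED = -1       # OpenStack represents an unlimited quota as -1
LOW_THRESHOLD = 10   # LimitedResourceQuota fires below this many free resources


def quota_alert(quota: int, used: int) -> Optional[str]:
    """Return the name of the alert expected for a resource, or None."""
    if quota == UNLIMITED:
        # Unlimited quota: no alert, which is the behaviour this bug asks for.
        return None
    free = quota - used  # assumption: free quota = quota - usage
    if free <= 0:
        # Verified for free == 0; free < 0 is an assumption of this sketch.
        return "InsuficientResourceQuota"
    if free < LOW_THRESHOLD:
        return "LimitedResourceQuota"
    return None


if __name__ == "__main__":
    # Values taken from the ports verification steps (600 ports in use).
    for quota in (1500, 609, 600, -1):
        print(quota, quota_alert(quota, used=600))
```

With 600 ports in use, a quota of 609 leaves 9 free ports and should raise LimitedResourceQuota, a quota of 600 leaves none and should raise InsuficientResourceQuota instead, and a quota of -1 should raise nothing.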

+++++++++++++++++++++++++++
LimitedResourceQuota: ports
+++++++++++++++++++++++++++

1. Make sure the alert LimitedResourceQuota for ports is raised: 
$ source overcloudrc 
$ openstack quota show shiftstack | grep ports 
| ports                | 1500 
$ openstack port list --project shiftstack -f value | wc -l 
600 

Change the quota so there are < 10 free ports. 
$ openstack quota set --ports 609 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for ports." is raised

2. Check the alert LimitedResourceQuota is cleared when there are 0 free ports (InsuficientResourceQuota will be raised instead).
$ openstack quota set --ports 600 shiftstack

3. Check the alert LimitedResourceQuota is not raised with unlimited ports quota
$ openstack quota set --ports -1 shiftstack

++++++++++++++++++++++++++++++
LimitedResourceQuota: networks
++++++++++++++++++++++++++++++

4. Make sure the alert LimitedResourceQuota for networks is raised: 
$ source overcloudrc 
$ openstack quota show shiftstack | grep networks 
| networks                | 150
$ openstack network list --project shiftstack -f value | wc -l 
67 

Change the quota so there are < 10 free networks. 
$ openstack quota set --networks 76 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for networks." is raised

5. Check the alert LimitedResourceQuota is cleared when there are 0 free networks (InsuficientResourceQuota will be raised instead).
$ openstack quota set --networks 67 shiftstack

6. Check the alert LimitedResourceQuota is not raised with unlimited networks quota
$ openstack quota set --networks -1 shiftstack

+++++++++++++++++++++++++++++
LimitedResourceQuota: subnets
+++++++++++++++++++++++++++++

7. Make sure the alert LimitedResourceQuota for subnets is raised: 
$ source overcloudrc 
$ openstack quota show shiftstack | grep subnets 
| subnets                | 250
$ openstack subnet list --project shiftstack -f value | wc -l 
66 

Change the quota so there are < 10 free subnets. 
$ openstack quota set --subnets 75 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for subnets." is raised

8. Check the alert LimitedResourceQuota is cleared when there are 0 free subnets (InsuficientResourceQuota will be raised instead).
$ openstack quota set --subnets 66 shiftstack

9. Check the alert LimitedResourceQuota is not raised with unlimited subnets quota
$ openstack quota set --subnets -1 shiftstack


+++++++++++++++++++++++++++++++++++++
LimitedResourceQuota: security groups
+++++++++++++++++++++++++++++++++++++

10. Make sure the alert LimitedResourceQuota for secgroups is raised:
$ source overcloudrc
$ openstack quota show shiftstack | grep secgroups
| secgroups                | 250
$ openstack security group list --project shiftstack -f value | wc -l
4

Change the quota so there are < 10 free secgroups.
$ openstack quota set --secgroups 13 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for security_groups." is raised

11. Check the alert LimitedResourceQuota is cleared when there are 0 free secgroups (InsuficientResourceQuota will be raised instead).
$ openstack quota set --secgroups 4 shiftstack

12. Check the alert LimitedResourceQuota is not raised with unlimited secgroups quota
$ openstack quota set --secgroups -1 shiftstack

++++++++++++++++++++++++++++++++++++++++++
LimitedResourceQuota: security group rules
++++++++++++++++++++++++++++++++++++++++++

13. Make sure the alert LimitedResourceQuota for secgroup-rules is raised:
$ source overcloudrc
$ openstack quota show shiftstack | grep secgroup-rules
| secgroup-rules                | 1000
$ source shiftstackrc
$ openstack security group rule list -f value | wc -l
67

Change the quota so there are < 10 free secgroup-rules.
$ source overcloudrc
$ openstack quota set --secgroup-rules 76 shiftstack

Check the alert LimitedResourceQuota with message "Running out of quota for security_group_rules." is raised

14. Check the alert LimitedResourceQuota is cleared when there are 0 free secgroup-rules (InsuficientResourceQuota will be raised instead).
$ openstack quota set --secgroup-rules 67 shiftstack

15. Check the alert LimitedResourceQuota is not raised with unlimited secgroup-rules quota
$ openstack quota set --secgroup-rules -1 shiftstack

Comment 8 errata-xmlrpc 2021-07-27 22:52:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

Comment 9 Red Hat Bugzilla 2023-09-15 01:03:05 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

