Description of problem:
Kuryr alerts LimitedPortsOnNetwork and InsuficientPortsOnNetwork are triggered based on the free port count in a given namespace subnet.
LimitedPortsOnNetwork is triggered when there are fewer than 11 available ports (it should be triggered when there are fewer than 10).
InsuficientPortsOnNetwork is triggered when there is 1 available port (it should be triggered when there are no available ports).

Version-Release number of selected component (if applicable):
OCP 4.6.0-0.nightly-2020-10-20-101225
OSP13 2020-10-06.2

How reproducible: always

Steps to Reproduce:
1. Create new project (will create a /23 subnet -> room for 510 ports)
   $ oc new-project test
2. Create pods until ~490 ports are created
   $ for i in `seq 1 490`; do oc run --image kuryr/demo demo-$i; sleep 4; done
3. Create ports manually in the subnet until there are 11 available ports (499 ports in use)
   $ openstack port create --network <network-id> <port-name>
   The alert LimitedPortsOnNetwork is raised, while it should be raised when there are 10 available ports.
4. Create ports manually in the subnet until there is 1 available port (509 ports in use)
   $ openstack port create --network <network-id> <port-name>
   The alert InsuficientPortsOnNetwork is raised, while it should be raised when there are 0 available ports.
Additional info:

    def _record_ports_quota_per_subnet_metric(self):
        """Records the ports quota per subnet to the registry"""
        subnets = self._os_net.subnets(project_id=self._project_id)
        namespace_prefix = 'ns/'
        for subnet in subnets:
            if namespace_prefix not in subnet.name:
                continue
            total_num_addresses = 0
            ports_availability = 0
            for allocation in subnet.allocation_pools:
                total_num_addresses += netaddr.IPRange(
                    netaddr.IPAddress(allocation['start']),
                    netaddr.IPAddress(allocation['end'])).size
            ports_count = len(list(self._os_net.ports(
                network_id=subnet.network_id,
                project_id=self._project_id)))
            labels = {'subnet_id': subnet.id, 'subnet_name': subnet.name}
            ports_availability = total_num_addresses-ports_count
            self.port_quota_per_subnet.labels(**labels).set(ports_availability)

The total_num_addresses is calculated based on the allocation pool:

allocation_pools | 10.128.116.2-10.128.117.254

which doesn't contain the address for .1 (so it's 509).
The ports_count (ports in use), on the other hand, does count the port on .1 (so it can increase up to 510). As a result, the reported availability is always one lower than the real number of free ports.

The calculation could be fixed by removing 1 port from ports_count (the one belonging to .1), or adding 1 port to total_num_addresses.
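The off-by-one described above can be checked with a little arithmetic. The following is a minimal sketch, not the Kuryr code: it uses Python's standard ipaddress module in place of netaddr, the helper names (pool_size, reported_availability, corrected_availability) are hypothetical, and it assumes exactly one port sits on the .1 address, as described in this report.

```python
import ipaddress


def pool_size(start, end):
    """Number of addresses in an inclusive allocation pool range."""
    return int(ipaddress.ip_address(end)) - int(ipaddress.ip_address(start)) + 1


def reported_availability(pools, ports_in_use):
    """Same arithmetic as the metric in this report: pool size minus all ports."""
    total = sum(pool_size(p["start"], p["end"]) for p in pools)
    return total - ports_in_use


def corrected_availability(pools, ports_in_use, ports_outside_pools=1):
    """Assumed fix: do not count the port on .1, which is not part of the pool."""
    total = sum(pool_size(p["start"], p["end"]) for p in pools)
    return total - (ports_in_use - ports_outside_pools)


pools = [{"start": "10.128.116.2", "end": "10.128.117.254"}]
for in_use in (500, 509, 510):
    # Prints: ports in use, reported availability, corrected availability
    print(in_use, reported_availability(pools, in_use),
          corrected_availability(pools, in_use))
```

With 510 ports in use the reported value is -1 while the corrected value is 0, which matches the behaviour described in this report.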
Adding an easier reproducer:

1. Create new project (will create a /23 subnet -> room for 510 ports)
   $ oc new-project test
   It will create the port for .1
2. Create 499 ports
   $ for i in `seq 2 500`; do openstack port create --network <network_id> port--$i; sleep 3; done
   This will create 499 ports and the alert LimitedPortsOnNetwork will be raised, but there are still 10 available ports.
3. Create 9 more ports
   $ for i in `seq 501 509`; do openstack port create --network <network_id> port--$i; sleep 3; done
   This will create 9 additional ports and the alert InsuficientPortsOnNetwork will be raised, but there is still 1 available port.
4. Create an additional port to check there is still one available port
   $ openstack port create --network <network_id> port--510
   The InsuficientPortsOnNetwork alert is cleared because the reported number of available ports is now -1 (no longer equal to 0).
5. Try creating an additional port to check it's not possible
   $ openstack port create --network <network_id> port--511
   HttpException: 409: Client Error for url: https://10.46.44.10:13696/v2.0/ports, {"NeutronError": {"message": "No more IP addresses available on network <network_id>.", "type": "IpAddressGenerationFailure", "detail": ""}}
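The sequence in this reproducer can be replayed as a small simulation. This is only a sketch: the alert conditions (value < 10 for LimitedPortsOnNetwork, value == 0 for InsuficientPortsOnNetwork) are assumptions inferred from the behaviour observed in this report, not copied from the actual alerting rules, and the function names are hypothetical.

```python
POOL_SIZE = 509      # addresses in the pool 10.128.116.2-10.128.117.254
SUBNET_PORTS = 510   # room for ports in the /23, including the one on .1


def firing_alerts(metric_value):
    """Alerts firing for a metric value, under the assumed conditions."""
    alerts = []
    if metric_value < 10:        # assumed LimitedPortsOnNetwork condition
        alerts.append("LimitedPortsOnNetwork")
    if metric_value == 0:        # assumed InsuficientPortsOnNetwork condition
        alerts.append("InsuficientPortsOnNetwork")
    return alerts


def replay(ports_in_use):
    """Return (really free ports, reported metric, firing alerts)."""
    really_free = SUBNET_PORTS - ports_in_use
    metric = POOL_SIZE - ports_in_use   # calculation described in this report
    return really_free, metric, firing_alerts(metric)


for ports_in_use in (499, 500, 509, 510):
    print(ports_in_use, *replay(ports_in_use))
```

Under these assumptions, LimitedPortsOnNetwork fires with 10 ports really free, InsuficientPortsOnNetwork fires with 1 port really free, and it clears again once the subnet is actually full.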
Failed on OCP4.7.0-0.nightly-2020-11-18-203317 over OSP16.1 with OVN-Octavia (RHOS-16.1-RHEL-8-20201110.n.1)

InsuficientPortsOnNetwork is raised when 1 port is available and it is cleared when 0 ports are available.

# Creating 499 ports on the ns/test2-subnet:
$ oc new-project test2
$ for i in `seq 2 499`; do openstack port create --network e0d9964f-b3d5-464d-81ac-b4bc49fcb75b port--$i; sleep 3; done
$ openstack subnet list | grep test2
| eddc9862-1f09-4fd7-992d-dc01a76f9ef9 | ns/test2-subnet | e0d9964f-b3d5-464d-81ac-b4bc49fcb75b | 10.128.120.0/23 |
$ openstack port list -f value | grep eddc9862-1f09-4fd7-992d-dc01a76f9ef9 | wc -l
499

# No alarms raised:
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname' | grep -v -e AlertmanagerReceiversNotConfigured -e CannotRetrieveUpdates -e Watchdog

# Alarm LimitedPortsOnNetwork raised when there are 10 available ports:
$ openstack port create --network e0d9964f-b3d5-464d-81ac-b4bc49fcb75b port--500
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname' | grep -v -e AlertmanagerReceiversNotConfigured -e CannotRetrieveUpdates -e Watchdog
"LimitedPortsOnNetwork"

# Creating 9 ports so 1 port available only. Alarm InsuficientPortsOnNetwork is wrongly raised.
$ for i in `seq 501 509`; do openstack port create --network e0d9964f-b3d5-464d-81ac-b4bc49fcb75b port--$i; sleep 3; done
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname' | grep -v -e AlertmanagerReceiversNotConfigured -e CannotRetrieveUpdates -e Watchdog
"LimitedPortsOnNetwork"
"InsuficientPortsOnNetwork"

# When creating the remaining port (0 ports available), the InsuficientPortsOnNetwork is cleared.
$ openstack port create --network e0d9964f-b3d5-464d-81ac-b4bc49fcb75b port--510
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname' | grep -v -e AlertmanagerReceiversNotConfigured -e CannotRetrieveUpdates -e Watchdog
"LimitedPortsOnNetwork"

$ openstack port create --network e0d9964f-b3d5-464d-81ac-b4bc49fcb75b port--511
ConflictException: 409: Client Error for url: https://overcloud.redhat.local:13696/v2.0/ports, No more IP addresses available on network e0d9964f-b3d5-464d-81ac-b4bc49fcb75b.
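The curl | jq | grep check used in this comment can also be expressed in Python. This is a minimal sketch, assuming the usual Prometheus /api/v1/alerts response shape (data.alerts[].labels.alertname); the function name and the sample payload are made up for illustration and are not output captured from this cluster.

```python
import json

IGNORED = ("AlertmanagerReceiversNotConfigured", "CannotRetrieveUpdates", "Watchdog")


def relevant_alert_names(response, ignored=IGNORED):
    """Alert names from a parsed /api/v1/alerts response, minus ignored ones.

    Like `grep -v -e`, a name is dropped if it contains any ignored string.
    """
    names = []
    for alert in response.get("data", {}).get("alerts", []):
        name = alert.get("labels", {}).get("alertname")
        if name and not any(pattern in name for pattern in ignored):
            names.append(name)
    return names


# Illustrative payload only, not captured from the cluster in this report.
sample = json.loads("""
{"status": "success", "data": {"alerts": [
  {"labels": {"alertname": "Watchdog"}, "state": "firing"},
  {"labels": {"alertname": "LimitedPortsOnNetwork"}, "state": "firing"}
]}}
""")
print(relevant_alert_names(sample))
```

Keeping the filter as a pure function over the parsed response makes it easy to reuse after each port creation step, whatever client is used to fetch the JSON.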
Verified on 4.7.0-0.nightly-2020-12-09-112139 on top of OSP16.1 with OVN-Octavia (RHOS-16.1-RHEL-8-20201124.n.0).

# Creating 499 ports on the ns/test2-subnet:
$ oc new-project test2
$ for i in `seq 2 500`; do openstack port create --network 9bf10b51-1014-4629-b55b-cc39b43c1544 port--$i; sleep 3; done
$ openstack subnet list | grep test2
| eddc9862-1f09-4fd7-992d-dc01a76f9ef9 | ns/test2-subnet | e0d9964f-b3d5-464d-81ac-b4bc49fcb75b | 10.128.120.0/23 |
$ openstack port list -f value | grep eddc9862-1f09-4fd7-992d-dc01a76f9ef9 | wc -l
500

# No alarms raised:
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname' | grep -v -e AlertmanagerReceiversNotConfigured -e CannotRetrieveUpdates -e Watchdog

# Alarm LimitedPortsOnNetwork raised when there are 9 available ports:
$ openstack port create --network e0d9964f-b3d5-464d-81ac-b4bc49fcb75b port--501
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname' | grep -v -e AlertmanagerReceiversNotConfigured -e CannotRetrieveUpdates -e Watchdog
"LimitedPortsOnNetwork"

# Creating 8 ports so 1 port available only. Alarm InsuficientPortsOnNetwork not raised.
$ for i in `seq 501 509`; do openstack port create --network e0d9964f-b3d5-464d-81ac-b4bc49fcb75b port--$i; sleep 3; done
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname' | grep -v -e AlertmanagerReceiversNotConfigured -e CannotRetrieveUpdates -e Watchdog
"LimitedPortsOnNetwork"

# When creating the remaining port (0 ports available), the InsuficientPortsOnNetwork is raised.
$ openstack port create --network e0d9964f-b3d5-464d-81ac-b4bc49fcb75b port--510
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname' | grep -v -e AlertmanagerReceiversNotConfigured -e CannotRetrieveUpdates -e Watchdog
"InsuficientPortsOnNetwork"

$ openstack port create --network e0d9964f-b3d5-464d-81ac-b4bc49fcb75b port--511
ConflictException: 409: Client Error for url: https://overcloud.redhat.local:13696/v2.0/ports, No more IP addresses available on network e0d9964f-b3d5-464d-81ac-b4bc49fcb75b.

Removing the ports will clear the alarms as expected.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633