Bug 2028966 - Whereabouts should reconcile stranded IP addresses
Summary: Whereabouts should reconcile stranded IP addresses
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.z
Assignee: Douglas Smith
QA Contact: Weibin Liang
URL:
Whiteboard:
Depends On: 2028964
Blocks: 2028967
Reported: 2021-12-03 20:27 UTC by Douglas Smith
Modified: 2022-02-16 06:52 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: Implements an IP reconciliation job for the Whereabouts IPAM CNI, called "ip-reconciler", which runs as a Kubernetes cronjob.
Reason: Occasionally the CNI DEL action does not complete for a given pod (for example, when a node is forcefully powered off). In such cases, stored IP address allocations can be left stranded and cannot be reallocated without manual intervention.
Result: Stranded IP address allocations are garbage collected automatically on a periodic basis to free unused IP addresses.
Clone Of: 2028964
Clones: 2028967
Environment:
Last Closed: 2022-02-16 06:51:40 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift whereabouts-cni pull 78 | 0 | None | open | Bug 2028966: Whereabouts should reconcile IP addresses [backport 4.8] | 2022-01-25 20:58:07 UTC
Red Hat Product Errata | RHBA-2022:0484 | 0 | None | None | None | 2022-02-16 06:51:59 UTC

Description Douglas Smith 2021-12-03 20:27:18 UTC
+++ This bug was initially created as a clone of Bug #2028964 +++

+++ This bug was initially created as a clone of Bug #2028963 +++

Description of problem: IP reconciliation is a feature in the latest Whereabouts. Because of reports of stranded IP addresses, this feature should be backported all the way to 4.6.z. The feature takes the form of a cron job that reconciles the IP addresses.


Version-Release number of selected component (if applicable): 4.6-4.10


How reproducible: Specialized. Customers often experience this when nodes are rebooted or pods are force deleted, so that CNI DEL calls aren't processed in their entirety by Whereabouts.


Steps to Reproduce: (We will provide a procedure that produces orphaned IP addresses.)

Actual results: IP addresses remain stranded and are never used again.


Expected results: IP addresses that were stranded become available for use again.


Additional info: 4.10 has the reconciliation code but still requires a bug fix from upstream.
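
For illustration only, the reconciliation idea described above can be sketched as follows. This is a minimal conceptual sketch in Python, not the Whereabouts implementation: the function name reconcile, the live_podrefs set, and the sample ids are hypothetical, and the allocation shape (an offset key mapping to "id" and "podref") is assumed from the IPPool objects that Whereabouts stores.

```python
from typing import Dict, Set, Tuple

# Assumed shape of one stored allocation: {"id": <container id>, "podref": "<namespace>/<pod>"}
Allocation = Dict[str, str]


def reconcile(
    allocations: Dict[str, Allocation],
    live_podrefs: Set[str],
) -> Tuple[Dict[str, Allocation], Dict[str, Allocation]]:
    """Split allocations into those to keep and those that are stranded.

    An allocation is treated as stranded when its podref does not match
    any pod that currently exists (hypothetical rule for this sketch).
    """
    kept: Dict[str, Allocation] = {}
    stranded: Dict[str, Allocation] = {}
    for offset, alloc in allocations.items():
        if alloc.get("podref") in live_podrefs:
            kept[offset] = alloc
        else:
            stranded[offset] = alloc
    return kept, stranded


if __name__ == "__main__":
    # Hypothetical pool contents: one pod still exists, one was force deleted.
    pool = {
        "1": {"id": "container-a", "podref": "default/wbsamplepod"},
        "2": {"id": "container-b", "podref": "default/livepod"},
    }
    kept, stranded = reconcile(pool, live_podrefs={"default/livepod"})
    print("kept:", kept)
    print("stranded (to be freed):", stranded)
```

Running such a check periodically is what allows stranded allocations to be released without manual intervention.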

Comment 1 Weibin Liang 2022-01-25 22:13:34 UTC
@dosmith

Used cluster-bot to create a test cluster from PR #78: launch https://github.com/openshift/whereabouts-cni/pull/78

However, oc get cronjobs -n openshift-multus returned nothing:

[weliang@weliang ~]$ oc get clusterversion
NAME      VERSION                                                  AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.ci.test-2022-01-25-212450-ci-ln-d20n1p2-latest   True        False         8m27s   Cluster version is 4.8.0-0.ci.test-2022-01-25-212450-ci-ln-d20n1p2-latest
[weliang@weliang ~]$ oc get cronjobs -n openshift-multus
No resources found in openshift-multus namespace.
[weliang@weliang ~]$ oc version
Client Version: 4.8.0-0.nightly-2022-01-24-131630
Server Version: 4.8.0-0.ci.test-2022-01-25-212450-ci-ln-d20n1p2-latest
Kubernetes Version: v1.21.2-1639+bb8d50ab72040c-dirty

Comment 4 Weibin Liang 2022-01-27 21:11:08 UTC
Tested and verified in 4.8.0-0.nightly-2022-01-27-134250


[weliang@weliang Test]$ oc create -f ippool.yml -n openshift-multus
ippool.whereabouts.cni.cncf.io/192.168.2.224-28 created
[weliang@weliang Test]$ oc get ippools 192.168.2.224-28 -o yaml -n openshift-multus
apiVersion: whereabouts.cni.cncf.io/v1alpha1
kind: IPPool
metadata:
  creationTimestamp: "2022-01-27T21:09:42Z"
  generation: 1
  name: 192.168.2.224-28
  namespace: openshift-multus
  resourceVersion: "37863"
  uid: 3a5dbb15-1c27-4ee6-9ad6-e629e6d128e2
spec:
  allocations:
    "1":
      id: f7559e44472d139ce9333d7f6094c81866eb65a0d62cce5576a8b89990011cd9
      podref: default/wbsamplepod
  range: 192.168.2.224/28
[weliang@weliang Test]$ oc create job --from=cronjob/ip-reconciler -n openshift-multus testrun-ip-reconciler
job.batch/testrun-ip-reconciler created
[weliang@weliang Test]$ oc get ippools 192.168.2.224-28 -o yaml -n openshift-multus
apiVersion: whereabouts.cni.cncf.io/v1alpha1
kind: IPPool
metadata:
  creationTimestamp: "2022-01-27T21:09:42Z"
  generation: 2
  name: 192.168.2.224-28
  namespace: openshift-multus
  resourceVersion: "38032"
  uid: 3a5dbb15-1c27-4ee6-9ad6-e629e6d128e2
spec:
  allocations: {}
  range: 192.168.2.224/28
[weliang@weliang Test]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2022-01-27-134250   True        False         7m16s   Cluster version is 4.8.0-0.nightly-2022-01-27-134250
[weliang@weliang Test]$
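
As a reading aid for the IPPool output above: the allocation key "1" is assumed here to be an offset from the first address of the pool's range, which is how Whereabouts is understood to index allocations. Under that assumption, a small Python sketch (the helper name offset_to_ip is hypothetical) maps a key back to the address it reserves:

```python
import ipaddress


def offset_to_ip(pool_range: str, offset: str) -> str:
    """Return the address at the given offset from the start of the range.

    Assumes allocation keys are decimal offsets from the network address.
    """
    network = ipaddress.ip_network(pool_range)
    address = network.network_address + int(offset)
    if address not in network:
        raise ValueError(f"offset {offset} is outside {pool_range}")
    return str(address)


if __name__ == "__main__":
    # Range and key taken from the IPPool shown in this comment.
    print(offset_to_ip("192.168.2.224/28", "1"))
```

With the range and key from this comment, the stranded allocation would correspond to 192.168.2.225, the address that is free again once the reconciler job has emptied the allocations map.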

Comment 7 errata-xmlrpc 2022-02-16 06:51:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.31 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0484

