1898672 – Pod gets stuck in ContainerCreating state with exhausted Whereabouts IPAM range with a daemonset

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	urgent
Target Milestone:	---
Target Release:	4.6.z
Assignee:	Douglas Smith
QA Contact:	Weibin Liang
Docs Contact:
URL:
Whiteboard:
Depends On:	1898613
Blocks:	1898675

Reported:	2020-11-17 19:09 UTC by Douglas Smith
Modified:	2024-10-01 17:05 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	1898670
Clones:	1898675
Environment:
Last Closed:	2021-02-08 13:50:51 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift whereabouts-cni pull 40	0	None	closed	Bug 1898672: Removes error when deallocating IP errors out, instead just warns [backport 4.6]	2021-02-16 07:46:16 UTC
Red Hat Product Errata	RHSA-2021:0308	0	None	None	None	2021-02-08 13:51:05 UTC

Description Douglas Smith 2020-11-17 19:09:49 UTC

+++ This bug was initially created as a clone of Bug #1898670 +++

Description of problem: When using a daemonset and a Whereabouts range gets exhausted, a pod can get stuck in the ContainerCreating state indefinitely.

See: https://github.com/intel/multus-cni/issues/578


How reproducible: Always.


Steps to Reproduce:

Create a net-attach-def:

```
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
 name: whereabouts-nad2
 namespace: whereabouts
spec:
 config: '{
  "cniVersion": "0.3.1",
  "type": "macvlan",
  "master": "eth0",
  "ipam": {
   "type": "whereabouts",
   "range": "10.10.10.0/24",
   "range_start": "10.10.10.100",
   "range_end": "10.10.10.128",
   "exclude": ["10.10.10.0/25"]
  }
 }'
```
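Why this range exhausts so quickly can be checked with a little arithmetic. The sketch below is not Whereabouts code; it assumes that `range_start` and `range_end` are both inclusive and that every address inside an `exclude` CIDR is skipped. The helper name `usable_addresses` is hypothetical.

```python
import ipaddress


def usable_addresses(range_start, range_end, exclude):
    """Return the addresses between range_start and range_end (inclusive,
    by assumption) that do not fall inside any excluded CIDR."""
    start = int(ipaddress.ip_address(range_start))
    end = int(ipaddress.ip_address(range_end))
    excluded = [ipaddress.ip_network(cidr) for cidr in exclude]

    pool = []
    for value in range(start, end + 1):
        addr = ipaddress.ip_address(value)
        # Skip any address covered by an excluded network.
        if not any(addr in net for net in excluded):
            pool.append(addr)
    return pool


# Values copied from the net-attach-def above.
pool = usable_addresses("10.10.10.100", "10.10.10.128", ["10.10.10.0/25"])
print([str(a) for a in pool])  # ['10.10.10.128']
```

Under those assumptions, `10.10.10.0/25` covers `.0` through `.127`, so only `10.10.10.128` remains and a single pod is enough to exhaust the range.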

Create a deployment:

```
apiVersion: apps/v1
kind: Deployment
metadata:
 labels:
  run: multustest2
 name: multustest2
 namespace: whereabouts
spec:
 replicas: 1
 selector:
  matchLabels:
   run: multustest2
 template:
  metadata:
   namespace: whereabouts
   labels:
    run: multustest2
   annotations:
    k8s.v1.cni.cncf.io/networks: whereabouts-nad2
  spec:
   containers:
   - image: busybox:latest
     name: multustest2
     command: [ "/bin/sh", "-c", "while true; do date; sleep 10; done"]
     securityContext:
       runAsNonRoot: true
       runAsUser: 1000
```

List the pods (oc get pods -w); you should see one pod running.

Delete that pod, for example: oc delete pod multustest2-594975985d-clp6z

Actual results:

The replacement pod is stuck in the ContainerCreating status indefinitely.
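One way to spot the stuck pod without watching the listing by hand is to scan the JSON form of the pod list. This is a minimal sketch, assuming the input is the output of `oc get pods -n whereabouts -o json` and that a stuck pod reports a container in the `waiting` state with reason `ContainerCreating`. The function name `stuck_pods` and the second pod name in the sample are made up for illustration.

```python
import json


def stuck_pods(pod_list_json):
    """Return names of pods that have a container waiting with
    reason ContainerCreating."""
    items = json.loads(pod_list_json)["items"]
    stuck = []
    for pod in items:
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for cs in statuses:
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "ContainerCreating":
                stuck.append(pod["metadata"]["name"])
                break  # one waiting container is enough to flag the pod
    return stuck


# Hand-written sample standing in for real `oc get pods -o json` output.
sample = json.dumps({
    "items": [
        {
            "metadata": {"name": "multustest2-594975985d-clp6z"},
            "status": {"containerStatuses": [{"state": {"running": {}}}]},
        },
        {
            "metadata": {"name": "multustest2-example-stuck"},  # hypothetical name
            "status": {"containerStatuses": [
                {"state": {"waiting": {"reason": "ContainerCreating"}}}
            ]},
        },
    ]
})
print(stuck_pods(sample))  # ['multustest2-example-stuck']
```

Against a live cluster, the same function could be fed the captured output of the `oc` command instead of the sample string.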


Expected results:

Pod is recreated normally after it's deleted.

Comment 6 Weibin Liang 2021-02-01 21:03:31 UTC

Tested and verified in 4.6.0-0.nightly-2021-01-30-211400

Comment 8 errata-xmlrpc 2021-02-08 13:50:51 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.6.16 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0308
