Bug 1898672

Summary: Pod gets stuck in ContainerCreating state with exhausted Whereabouts IPAM range with a daemonset
Product: OpenShift Container Platform
Component: Networking
Networking sub component: multus
Version: 4.6
Target Release: 4.6.z
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: urgent
Priority: high
Keywords: UpcomingSprint
Reporter: Douglas Smith <dosmith>
Assignee: Douglas Smith <dosmith>
QA Contact: Weibin Liang <weliang>
CC: weliang, zzhao
Doc Type: If docs needed, set a value
Clone Of: 1898670
Bug Depends On: 1898613
Bug Blocks: 1898675
Last Closed: 2021-02-08 13:50:51 UTC

Description Douglas Smith 2020-11-17 19:09:49 UTC
+++ This bug was initially created as a clone of Bug #1898670 +++

Description of problem: When a daemonset is in use and a Whereabouts range is exhausted, a pod can get stuck in the ContainerCreating state indefinitely.

See: https://github.com/intel/multus-cni/issues/578


How reproducible: Always.


Steps to Reproduce:

Create a net-attach-def:

```
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
 name: whereabouts-nad2
 namespace: whereabouts
spec:
 config: '{
  "cniVersion": "0.3.1",
  "type": "macvlan",
  "master": "eth0",
  "ipam": {
   "type": "whereabouts",
   "range": "10.10.10.0/24",
   "range_start": "10.10.10.100",
   "range_end": "10.10.10.128",
   "exclude": ["10.10.10.0/25"]
  }
 }'
```
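To see why this range is exhausted so easily, the following minimal Python sketch counts the addresses the config above leaves available. It is not Whereabouts code; the helper name `allocatable` is hypothetical, and it assumes `range_end` is inclusive and that every address inside an `exclude` CIDR is skipped.

```python
import ipaddress


def allocatable(range_start, range_end, excludes):
    """Return the addresses between range_start and range_end (inclusive,
    by assumption) that do not fall inside any excluded CIDR."""
    start = int(ipaddress.ip_address(range_start))
    end = int(ipaddress.ip_address(range_end))
    excluded = [ipaddress.ip_network(cidr) for cidr in excludes]
    free = []
    for value in range(start, end + 1):
        addr = ipaddress.ip_address(value)
        # Skip any address covered by an exclude entry.
        if not any(addr in net for net in excluded):
            free.append(str(addr))
    return free


# Values copied from the net-attach-def above.
free = allocatable("10.10.10.100", "10.10.10.128", ["10.10.10.0/25"])
print(free)
```

Under these assumptions the exclude block 10.10.10.0/25 covers 10.10.10.0 through 10.10.10.127, so only 10.10.10.128 remains, and a single running pod is enough to exhaust the range.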

Create a deployment:

```
apiVersion: apps/v1
kind: Deployment
metadata:
 labels:
  run: multustest2
 name: multustest2
 namespace: whereabouts
spec:
 replicas: 1
 selector:
  matchLabels:
   run: multustest2
 template:
  metadata:
   namespace: whereabouts
   labels:
    run: multustest2
   annotations:
    k8s.v1.cni.cncf.io/networks: whereabouts-nad2
  spec:
   containers:
   - image: busybox:latest
     name: multustest2
     command: [ "/bin/sh", "-c", "while true; do date; sleep 10; done"]
     securityContext:
       runAsNonRoot: true
       runAsUser: 1000
```

List the pods (oc get pods -w); you'll see one pod running.

Delete that pod, for example: oc delete pod multustest2-594975985d-clp6z

Actual results:

The replacement pod is stuck in ContainerCreating status indefinitely.


Expected results:

Pod is recreated normally after it's deleted.

Comment 6 Weibin Liang 2021-02-01 21:03:31 UTC
Tested and verified in 4.6.0-0.nightly-2021-01-30-211400

Comment 8 errata-xmlrpc 2021-02-08 13:50:51 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.6.16 security and bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0308