Bug 1297671

Summary: Container keeps being re-created after it fails to start due to a conflict with a node resource
Product: OpenShift Container Platform Reporter: Meng Bo <bmeng>
Component: Node    Assignee: Derek Carr <decarr>
Status: CLOSED WONTFIX QA Contact: Meng Bo <bmeng>
Severity: medium Docs Contact:
Priority: low    
Version: 3.1.0    CC: aos-bugs, jokerman, mmccomas
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:    Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-06-02 17:54:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Meng Bo 2016-01-12 07:48:26 UTC
Description of problem:
Create a pod that maps a host port which is already bound on the node, e.g. UDP 4789.
When describing the pod, you can see that it keeps trying to create the container after the bind-address failure.
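The underlying failure can be reproduced outside Docker with two plain UDP sockets (a minimal sketch; the OS picks a free port here rather than 4789, since that VXLAN port is typically already held by the SDN, but the errno is the same):

```python
import errno
import socket

# Bind a UDP socket, then try to bind a second one to the same port.
# This reproduces the error from the events below:
#   "listen udp 0.0.0.0:4789: bind: address already in use"
first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
first.bind(("0.0.0.0", 0))           # let the OS pick a free port
port = first.getsockname()[1]

second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
try:
    second.bind(("0.0.0.0", port))   # conflicts with the first socket
    bind_errno = 0
except OSError as e:
    bind_errno = e.errno
    print("listen udp 0.0.0.0:%d: bind: address already in use" % port)
finally:
    second.close()
    first.close()
```

Docker's userland proxy hits this same `EADDRINUSE` when it tries to claim the host port for the container.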


Version-Release number of selected component (if applicable):
openshift v3.1.1.1
kubernetes v1.1.0-origin-1107-g4c8e6f4


How reproducible:
always

Steps to Reproduce:
1. Modify the SCC so that the user can use the host network
2. Create a pod that maps a node port which is already bound
{
  "kind": "Pod",
  "apiVersion":"v1",
  "metadata": {
        "name": "hello-pod",
        "labels": {
                "name": "hello-pod"
        }
  },
  "spec": {
      "containers": [{
        "name": "hello-pod",
        "image": "bmeng/hello-openshift",
        "ports": [
          {
            "hostPort": 4789,
            "containerPort": 8080,
            "protocol": "UDP"
          }
        ]
      }],
        "restartPolicy": "Never"
  }
}

3. Check the pod status
$ oc get po
$ oc describe po

Actual results:
The pod stays in ContainerCreating status, and the kubelet keeps trying to create the container after each failure.

$ oc get po
NAME        READY     STATUS              RESTARTS   AGE
hello-pod   0/1       ContainerCreating   0          11m

$ oc describe po hello-pod
.....
Events:
  FirstSeen     LastSeen        Count   From                            SubobjectPath                           Reason  Message
  ─────────     ────────        ─────   ────                            ─────────────                           ──────  ───────
  6m            6m              1       {kubelet node2.bmeng.local}     implicitly required container POD       Created Created with docker id 23a16732a558
  6m            6m              1       {kubelet node2.bmeng.local}     implicitly required container POD       Failed  Failed to start with docker id 23a16732a558 with error: API error (500): Cannot start container 23a16732a558e92937039e4b578acd01e33d715cb99f93d3eb9c420dc49e2db8: Error starting userland proxy: listen udp 0.0.0.0:4789: bind: address already in use

  6m    6m      1       {kubelet node2.bmeng.local}             FailedSync      Error syncing pod, skipping: API error (500): Cannot start container 23a16732a558e92937039e4b578acd01e33d715cb99f93d3eb9c420dc49e2db8: Error starting userland proxy: listen udp 0.0.0.0:4789: bind: address already in use

  6m    6m      1       {kubelet node2.bmeng.local}     implicitly required container POD       Created Created with docker id 06b973a99093
  6m    6m      1       {kubelet node2.bmeng.local}     implicitly required container POD       Failed  Failed to start with docker id 06b973a99093 with error: API error (500): Cannot start container 06b973a990931b8c3667ae5c0aa39f654584f1ae85b3b758e795f2bac87e2106: Error starting userland proxy: listen udp 0.0.0.0:4789: bind: address already in use

  6m    6m      1       {kubelet node2.bmeng.local}             FailedSync      Error syncing pod, skipping: API error (500): Cannot start container 06b973a990931b8c3667ae5c0aa39f654584f1ae85b3b758e795f2bac87e2106: Error starting userland proxy: listen udp 0.0.0.0:4789: bind: address already in use
.....
.....

$ oc describe po hello-pod | wc -l
393


Expected results:
It should not retry infinitely.

Additional info:

Comment 1 Andy Goldstein 2016-01-12 14:28:05 UTC
Not a 3.1.1 blocker. The user can describe the pod and see why it's failing.

We have discussed the possibility of trying to create containers a finite number of times, but that hasn't been implemented yet.
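The "finite number of times" idea mentioned above could look like the following capped-retry loop with exponential backoff (a hypothetical sketch, not the kubelet's actual implementation; the names `start_with_retries`, `start_container`, `max_attempts`, and `base_delay` are illustrative):

```python
import time

def start_with_retries(start_container, max_attempts=5, base_delay=1.0):
    """Try to start a container a finite number of times, backing off
    exponentially between attempts, then give up instead of retrying forever."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return start_container()
        except OSError as e:
            if attempt == max_attempts:
                raise RuntimeError(
                    "giving up after %d attempts: %s" % (max_attempts, e))
            time.sleep(delay)
            delay *= 2          # exponential backoff: 1s, 2s, 4s, ...

# Example: a start function that always fails, like the bound-port case.
def always_busy():
    raise OSError("bind: address already in use")

try:
    start_with_retries(always_busy, max_attempts=3, base_delay=0.01)
except RuntimeError as e:
    outcome = str(e)
```

With a cap like this, the pod would surface a terminal failure instead of accumulating hundreds of Created/Failed event pairs as shown in the description.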

Comment 2 Derek Carr 2016-02-03 14:59:33 UTC
There is no ability to evict a pod when its containers crash loop at this time.

There is a future card discussing eviction of crash looped containers here:

https://trello.com/c/yRAkX3rW/247-5-pod-eviction-of-crash-looped-containers

Marking this for an upcoming release.

Comment 3 Derek Carr 2016-06-02 17:54:45 UTC
This is working as designed, and we do not plan to fix this.