Bug 1262753

Summary: Pod keeps restarting when its container fails to start, even with restartPolicy set to Never
Product: OKD Reporter: DeShuai Ma <dma>
Component: Containers Assignee: Andy Goldstein <agoldste>
Status: CLOSED CURRENTRELEASE QA Contact: Chao Yang <chaoyang>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.x CC: aos-bugs, dmace, dmcphers, mmccomas
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-11-23 21:13:56 UTC Type: Bug

Description DeShuai Ma 2015-09-14 09:26:47 UTC
Description of problem:
Create a pod whose container command is invalid and set restartPolicy to "Never". When the container fails to start, the pod is still restarted repeatedly.

Version-Release number of selected component (if applicable):
openshift v1.0.5-400-g9b11c48
kubernetes v1.1.0-alpha.0-1605-g44c91b1

How reproducible:
Always

Steps to Reproduce:
1. Create a pod
$ oc create -f pod.json
$ cat pod.json
{
  "kind": "Pod",
  "apiVersion": "v1",
  "metadata": {
    "name": "hello-openshift",
    "creationTimestamp": null,
    "labels": {
      "name": "hello-openshift"
    }
  },
  "spec": {
    "containers": [
      {
        "name": "hello-openshift",
        "image": "openshift/hello-openshift",
        "command": [
            "start",
            "hello-openshift"
        ],
        "ports": [
          {
            "containerPort": 8080,
            "protocol": "TCP"
          }
        ],
        "resources": {},
        "volumeMounts": [
          {
            "name":"tmp",
            "mountPath":"/tmp"
          }
        ],
        "terminationMessagePath": "/dev/termination-log",
        "imagePullPolicy": "IfNotPresent",
        "capabilities": {},
        "securityContext": {
          "capabilities": {},
          "privileged": false
        }
      }
    ],
    "volumes": [
      {
        "name":"tmp",
        "emptyDir": {}
      }
    ],
    "restartPolicy": "Never",
    "dnsPolicy": "ClusterFirst",
    "serviceAccount": ""
  },
  "status": {}
}

Actual results:
1. When the pod fails, it is always restarted.

Expected results:
1. The pod should not be restarted.

Additional info:
http://fpaste.org/266841/22087814/

Comment 1 Dan Mace 2015-09-14 14:40:34 UTC
Reassigning the component since this isn't related to OpenShift deployments.

Comment 2 Andy Goldstein 2015-09-14 15:26:39 UTC
Running a Docker container is a 2-part process:

1) create container
2) start container

The problem in this case is that with 2), Docker is unable to start the container and it gives an error message like so:

Cannot start container ebd101cfcd0b1aab572a184ece01a17ce081658114977bd6490cd6c78aff06e3: [8] System error: exec: "start": executable file not found in $PATH

In this situation, because the container never started, it doesn't get an exit code, which is what the Kubelet is looking for to determine if it should restart things. The restart policy is about what happens after the container starts and then exits with a non-zero exit code; it's not currently about what happens if the container can't start successfully.

This is not a release blocker.
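
The decision described in this comment can be modelled with a short Python sketch. This is only an illustration of the reported behavior, not Kubelet code: the names `RestartPolicy` and `should_retry` are hypothetical, and representing "the container never started" as an exit code of `None` is an assumption made for the example.

```python
from enum import Enum
from typing import Optional


class RestartPolicy(Enum):
    """The three pod restart policies accepted by the pod spec."""
    ALWAYS = "Always"
    ON_FAILURE = "OnFailure"
    NEVER = "Never"


def should_retry(policy: RestartPolicy, exit_code: Optional[int]) -> bool:
    """Simplified model of the behavior reported in this bug.

    exit_code is None when the container was created but could not be
    started, so it never produced an exit code (assumed representation).
    """
    if exit_code is None:
        # No exit code to inspect: the restart policy is not consulted,
        # so the start is attempted again regardless of the policy.
        return True
    if policy is RestartPolicy.ALWAYS:
        return True
    if policy is RestartPolicy.ON_FAILURE:
        return exit_code != 0
    # RestartPolicy.NEVER: a container that ran and exited stays stopped.
    return False


# The case from this report: policy Never, container could not start.
print(should_retry(RestartPolicy.NEVER, None))
# For comparison: policy Never, container ran and exited with an error.
print(should_retry(RestartPolicy.NEVER, 1))
```

In this model the first branch is the one the report complains about: a start failure bypasses the policy check entirely, which is why "Never" had no effect.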

Comment 3 Andy Goldstein 2015-09-30 17:16:41 UTC
Upstream issue: https://github.com/kubernetes/kubernetes/issues/14491

Comment 4 Andy Goldstein 2015-10-09 18:00:15 UTC
Upstream PR: https://github.com/kubernetes/kubernetes/pull/15082

Comment 5 Andy Goldstein 2015-10-14 15:28:25 UTC
Merged upstream. Pending rebase to Origin.

Comment 6 Andy Goldstein 2015-10-20 00:17:38 UTC
In origin after the most recent rebase. Ready for QE.

Comment 7 DeShuai Ma 2015-10-20 08:41:02 UTC
Thanks, Andy, for solving this bug. It has been fixed in Origin.

[fedora@ip-172-18-15-158 sample-app]$ openshift version
openshift v1.0.6-759-g403de38
kubernetes v1.2.0-alpha.1-1107-g4c8e6f4
etcd 2.1.2

[fedora@ip-172-18-15-158 sample-app]$ oadm new-project dma
Created project dma
[fedora@ip-172-18-15-158 sample-app]$ oc create -f pod.json -n dma
pod "hello-openshift" created
[fedora@ip-172-18-15-158 sample-app]$ oc get pod -n dma
NAME              READY     STATUS               RESTARTS   AGE
hello-openshift   0/1       ContainerCannotRun   0          2m
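
A check like the one above could be scripted. The following is a minimal Python sketch that parses the tabular `oc get pod` output and reads the RESTARTS column. The helper name `parse_pod_table` is hypothetical, and the parser assumes that no column value contains whitespace, which holds for the output shown in this comment but is not guaranteed in general.

```python
from typing import Dict, List

# Output copied from the verification run above.
SAMPLE = """\
NAME              READY     STATUS               RESTARTS   AGE
hello-openshift   0/1       ContainerCannotRun   0          2m
"""


def parse_pod_table(text: str) -> List[Dict[str, str]]:
    """Turn whitespace-aligned `oc get pod` output into one dict per pod.

    Assumes the first non-empty line is the header and that no cell
    contains spaces.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


pods = parse_pod_table(SAMPLE)
for pod in pods:
    # With the fix in place, the pod is expected to report zero restarts.
    print(pod["NAME"], pod["STATUS"], int(pod["RESTARTS"]))
```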