Description of problem:
With the liveness probe configuration below, the first container is killed only after 5 consecutive unhealthy probe attempts, as expected. However, after that first kill, each restarted container is killed after a single unhealthy attempt instead of waiting for failureThreshold failures again.

The liveness check is as follows:

livenessProbe:
  httpGet:
    path: /dancertestt
    port: 3000
    scheme: HTTP
  initialDelaySeconds: 30
  timeoutSeconds: 1
  periodSeconds: 10
  successThreshold: 1
  failureThreshold: 5

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
failureThreshold is not respected after the first container restart.

Expected results:
failureThreshold should be honored every time before an existing container is killed.

Additional info:
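The expected kubelet behavior can be sketched as a consecutive-failure counter that is reset whenever a new container instance starts; if the counter is not reset, every restarted container inherits a saturated counter and is killed on its first failed probe, which is the symptom reported here. A minimal Python sketch (hypothetical names, not the actual kubelet probe-worker code):

```python
def probes_until_kill(failure_threshold, reset_on_restart, restarts=2):
    """For a container whose probe always fails, count how many probe
    attempts each container instance survives before being killed."""
    kills = []
    consecutive_failures = 0
    for _ in range(restarts):
        if reset_on_restart:
            consecutive_failures = 0  # fresh counter for the new container
        probes = 0
        while True:
            probes += 1
            consecutive_failures += 1
            if consecutive_failures >= failure_threshold:
                kills.append(probes)  # killed after this many probes
                break
    return kills

# Expected behavior: every instance gets failureThreshold attempts.
print(probes_until_kill(5, reset_on_restart=True))   # [5, 5]
# Buggy behavior: successive instances are killed after one attempt.
print(probes_until_kill(5, reset_on_restart=False))  # [5, 1]
```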
Attempting to reproduce in kubernetes/master.
Kube PR: https://github.com/kubernetes/kubernetes/pull/46371
Origin PR: https://github.com/openshift/origin/pull/14332
*** Bug 1457399 has been marked as a duplicate of this bug. ***
verified on openshift v3.6.126. Fixed.

pod-probe-fail.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
  - name: busybox
    image: busybox
    command:
    - sleep
    - "3600"
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 3
      timeoutSeconds: 1
      periodSeconds: 3
      successThreshold: 1
      failureThreshold: 10
  terminationGracePeriodSeconds: 0

# oc create -f pod-probe-fail.yaml
pod "busybox" created

# oc get pods -w
NAME      READY     STATUS             RESTARTS   AGE
busybox   1/1       Running            0          8s
busybox   0/1       Error              0          39s
busybox   1/1       Running            1          43s
busybox   0/1       Error              1          1m
busybox   0/1       CrashLoopBackOff   1          1m
busybox   1/1       Running            2          1m
busybox   0/1       Error              2          1m
busybox   0/1       CrashLoopBackOff   2          2m
busybox   1/1       Running            3          2m
busybox   0/1       Error              3          2m
busybox   0/1       CrashLoopBackOff   3          3m
busybox   1/1       Running            4          3m
busybox   0/1       Error              4          4m
busybox   0/1       CrashLoopBackOff   4          4m
busybox   1/1       Running            5          5m
busybox   0/1       Error              5          6m
busybox   0/1       CrashLoopBackOff   5          6m
busybox   1/1       Running            6          9m
busybox   0/1       Error              6          9m
busybox   0/1       CrashLoopBackOff   6          9m

# oc describe pod
Events:
  FirstSeen  LastSeen  Count  From                                  SubObjectPath             Type     Reason            Message
  ---------  --------  -----  ----                                  -------------             ----     ------            -------
  4m         4m        1      default-scheduler                                               Normal   Scheduled         Successfully assigned busybox to jialiu-node-zone1-primary-1
  4m         57s       5      kubelet, jialiu-node-zone1-primary-1  spec.containers{busybox}  Normal   Pulling           pulling image "busybox"
  4m         55s       5      kubelet, jialiu-node-zone1-primary-1  spec.containers{busybox}  Normal   Pulled            Successfully pulled image "busybox"
  4m         54s       5      kubelet, jialiu-node-zone1-primary-1  spec.containers{busybox}  Normal   Created           Created container
  4m         54s       5      kubelet, jialiu-node-zone1-primary-1  spec.containers{busybox}  Normal   Started           Started container
  4m         26s       46     kubelet, jialiu-node-zone1-primary-1  spec.containers{busybox}  Warning  Unhealthy         Liveness probe failed: Get http://10.2.6.83:8080/healthz: dial tcp 10.2.6.83:8080: getsockopt: connection refused
  4m         24s       5      kubelet, jialiu-node-zone1-primary-1  spec.containers{busybox}  Normal   Killing           Killing container with id docker://busybox:pod "busybox_wmeng1(3165a7dc-5be3-11e7-8638-42010af00004)" container "busybox" is unhealthy, it will be killed and re-created.
  4m         12s       26     kubelet, jialiu-node-zone1-primary-1                            Warning  DNSSearchForming  Found and omitted duplicated dns domain in host search line: 'cluster.local' during merging with cluster dns domains
  4m         12s       14     kubelet, jialiu-node-zone1-primary-1                            Warning  FailedSync        Error syncing pod
  3m         12s       9      kubelet, jialiu-node-zone1-primary-1  spec.containers{busybox}  Warning  BackOff           Back-off restarting failed container
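As a rough sanity check on the verification, the restart cadence in the watch output is consistent with the probe settings in pod-probe-fail.yaml: each container instance should survive on the order of initialDelaySeconds + failureThreshold * periodSeconds before being killed (back-off delays lengthen later cycles), which lines up with the ~30s gap between the first Running and Error transitions. A quick back-of-the-envelope calculation (approximate, ignoring probe timing jitter):

```python
# Rough lower bound on a container instance's lifetime before a
# liveness kill, using the values from pod-probe-fail.yaml above.
initial_delay = 3      # initialDelaySeconds
period = 3             # periodSeconds
failure_threshold = 10

min_lifetime = initial_delay + failure_threshold * period
print(min_lifetime)    # ~33 seconds, vs. the ~31s observed (8s -> 39s)
```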
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716