The openshift-controller-manager-operator pod does not provide a termination message, which hinders debugging when the pods are crash-looping. At a minimum, the pod's terminationMessagePolicy should be "FallbackToLogsOnError". See https://kubernetes.io/docs/tasks/debug-application-cluster/determine-reason-pod-failure/#customizing-the-termination-message

Expected Results:
The termination message should appear in the pod container's .status.lastState.terminated.message field.
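For context, a minimal sketch of what the requested setting looks like in a container spec (the container name and image below are placeholders, not the operator's actual values):

    spec:
      containers:
      - name: operator                       # illustrative name only
        image: example.io/operator:latest    # placeholder image
        # With FallbackToLogsOnError, the kubelet uses the tail of the container
        # log as the termination message when the container exits with an error
        # and no termination message file was written.
        terminationMessagePolicy: FallbackToLogsOnError

The default policy is "File", which only reads the termination message path (/dev/termination-log by default), so a crash that never writes that file leaves .status.lastState.terminated.message empty.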
Fixed by https://bugzilla.redhat.com/show_bug.cgi?id=1707061
(In reply to Luis Sanchez from comment #1)
> Fixed by https://bugzilla.redhat.com/show_bug.cgi?id=1707061

That link is wrong; the correct one is https://github.com/openshift/cluster-openshift-controller-manager-operator/pull/98
Tested in version:

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-05-07-201043   True        False         21m     Cluster version is 4.1.0-0.nightly-2019-05-07-201043

payload: registry.svc.ci.openshift.org/ocp/release@sha256:41319d522be6f0c739c38f6699320360732ee94d7f669d0646459cfd867c9963

1. The deployment now has the FallbackToLogsOnError policy:

$ oc get deployment -o yaml -n openshift-controller-manager-operator | grep -i "terminationMessagePolicy"
        terminationMessagePolicy: FallbackToLogsOnError

2. Checked the pod: the status.lastState.terminated.message field does contain the message, but it begins with "er -n openshift-controller-manager because it changed". What is "er"? Perhaps it should be the word "error". @Luis Sanchez, could you help confirm? Thanks.

$ oc get pods -o yaml -n openshift-controller-manager-operator
    lastState:
      terminated:
        containerID: cri-o://acf9ade72f4486756a181783061d6856ae68603d2ccd85cd21819bdb6e843a4e
        exitCode: 255
        finishedAt: "2019-05-08T01:47:46Z"
        message: |
          er -n openshift-controller-manager because it changed
          I0508 01:47:28.779571       1 status_controller.go:160] clusteroperator/openshift-controller-manager diff {"status":{"conditions":[{"lastTransitionTime":"2019-05-08T01:46:28Z","reason":"AsExpected","status":"False","type":"Degraded"},{"lastTransitionTime":"2019-05-08T01:46:33Z","message":"Progressing: daemonset/controller-manager: observed generation is 6, desired generation is 7.","reason":"ProgressingDesiredStateNotYetAchieved","status":"True","type":"Progressing"},{"lastTransitionTime":"2019-05-08T01:46:33Z","message":"Available: no daemon pods available on any node.","reason":"AvailableNoPodsAvailable","status":"False","type":"Available"},{"lastTransitionTime":"2019-05-08T01:46:28Z","reason":"NoData","status":"Unknown","type":"Upgradeable"}]}}
          I0508 01:47:28.794513       1 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-controller-manager-operator", Name:"openshift-controller-manager-operator", UID:"fda5aabd-7132-11e9-bab6-025511648778", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for operator openshift-controller-manager changed: Progressing message changed from "Progressing: daemonset/controller-manager: observed generation is 5, desired generation is 6." to "Progressing: daemonset/controller-manager: observed generation is 6, desired generation is 7."
          I0508 01:47:46.398872       1 observer_polling.go:78] Observed change: file:/var/run/secrets/serving-cert/tls.key (current: "9aa132c772c9c7c3049c798e2e975534c716d5dcb2f759f0a457f114fa31222c", lastKnown: "")
          W0508 01:47:46.398907       1 builder.go:108] Restart triggered because of file /var/run/secrets/serving-cert/tls.key was created
          F0508 01:47:46.398965       1 leaderelection.go:65] leaderelection lost
          I0508 01:47:46.400265       1 observer_polling.go:78] Observed change: file:/var/run/secrets/serving-cert/tls.crt (current: "3014a8d059742d8afc5be688ce0d2f2a5b563770adcbf41db36fb8a09e40f13e", lastKnown: "")
        reason: Error
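Note that with FallbackToLogsOnError the kubelet captures only the tail of the container log, truncated to a small size limit, so the first captured line can start mid-word; that would explain "er" being the cut-off end of a longer log line rather than a literal message. To read the field directly instead of scanning the full YAML, something like the following should work (the pod selection here is illustrative, it just grabs the first pod in the namespace):

$ POD=$(oc get pods -n openshift-controller-manager-operator -o name | head -1)
$ oc get "$POD" -n openshift-controller-manager-operator \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.message}'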
The error above does not seem related to this bug (there is no such error in a newer version). Since the policy is present as shown below, verifying this bug now:

$ oc get deployment -o yaml -n openshift-controller-manager-operator | grep -i "terminationMessagePolicy"
        terminationMessagePolicy: FallbackToLogsOnError

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-05-07-233329   True        False         5h18m   Cluster version is 4.1.0-0.nightly-2019-05-07-233329
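An equivalent check that avoids grepping the full YAML is to read the policy straight from the deployment spec (the deployment name matches the one referenced in the event output above):

$ oc get deployment openshift-controller-manager-operator \
    -n openshift-controller-manager-operator \
    -o jsonpath='{.spec.template.spec.containers[*].terminationMessagePolicy}'

On a fixed build this should print FallbackToLogsOnError for the operator container.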
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758