Description of problem:
When a container in a catalog pod terminates, its logs are not reported back to the catalog operator.

Version-Release number of selected component (if applicable):

How reproducible: Always

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/registry/reconciler/reconciler.go#L105
The pod created here has terminationMessagePolicy set to File (TerminationMessageReadFile, the default policy). It should be set to FallbackToLogsOnError (TerminationMessageFallbackToLogsOnError) instead.
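For illustration, the requested change could look roughly like the sketch below. This is not the actual reconciler code. The types here are local stand-ins that mirror TerminationMessagePolicy and its two constants from k8s.io/api/core/v1, so the example compiles without the Kubernetes modules. The Container struct, the registryContainer helper, the container name, and the image reference are all hypothetical.

```go
package main

import "fmt"

// TerminationMessagePolicy mirrors the string type of the same name in
// k8s.io/api/core/v1. It is redeclared here only to keep the sketch
// self-contained (assumption: real code would import corev1 instead).
type TerminationMessagePolicy string

const (
	// Default policy: the termination message is read only from the
	// container's termination message file.
	TerminationMessageReadFile TerminationMessagePolicy = "File"
	// If the message file is empty and the container exited with an
	// error, the tail of the container log is used as the message.
	TerminationMessageFallbackToLogsOnError TerminationMessagePolicy = "FallbackToLogsOnError"
)

// Container is a hypothetical, trimmed-down stand-in for corev1.Container.
type Container struct {
	Name                     string
	Image                    string
	TerminationMessagePolicy TerminationMessagePolicy
}

// registryContainer is a hypothetical helper showing where the policy
// would be set when the catalog pod's container is built.
func registryContainer(name, image string) Container {
	return Container{
		Name:  name,
		Image: image,
		// The change requested in this bug: fall back to the container
		// logs on error instead of relying on the default "File" policy.
		TerminationMessagePolicy: TerminationMessageFallbackToLogsOnError,
	}
}

func main() {
	// Name and image are placeholders, not values taken from OLM.
	c := registryContainer("registry-server", "example.com/catalog:latest")
	fmt.Printf("%s: terminationMessagePolicy=%s\n", c.Name, c.TerminationMessagePolicy)
}
```

With this policy set, the kubelet fills the container's termination message from the end of its log output when the container exits with an error and wrote nothing to the termination message file, so the message is visible in the pod status.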
Setting the priority to urgent, since the logs are needed to investigate why catalog pods in the openshift-marketplace namespace are crashlooping around 20% of the time: https://bugzilla.redhat.com/show_bug.cgi?id=1949991#c6
Moving back to ASSIGNED as the PR this BZ is tracking was merged against the upstream repository so QE has no way of validating these changes.
Cluster version is 4.8.0-0.nightly-2021-04-25-195440

[jzhang@dhcp-140-36 ~]$ oc -n openshift-operator-lifecycle-manager exec catalog-operator-7b6d5b8c8f-cxscr -- olm --version
OLM version: 0.17.0
git commit: 9fa1f1249e3acc15b1f628d5f96e7b7047e9f176

[jzhang@dhcp-140-36 ~]$ oc project
Using project "openshift-marketplace" on server "https://api.huirwang-0426a.qe.devcluster.openshift.com:6443".

[jzhang@dhcp-140-36 ~]$ oc get pods
NAME                                   READY   STATUS    RESTARTS   AGE
certified-operators-fwcxv              1/1     Running   0          31m
community-operators-mb2tz              1/1     Running   0          5h
marketplace-operator-5d97446c8-wlv5z   1/1     Running   0          5h7m
qe-app-registry-jnjk5                  1/1     Running   0          5h5m
redhat-marketplace-k9xts               1/1     Running   0          5h
redhat-operators-x969k                 1/1     Running   0          5h9m

[jzhang@dhcp-140-36 ~]$ for l in `oc get pod|awk 'NR == 1 {next} {print $1}'`; do oc get pod $l -o=jsonpath={.spec.containers[0].terminationMessagePolicy}; echo "-$l"; done
FallbackToLogsOnError-certified-operators-fwcxv
FallbackToLogsOnError-community-operators-mb2tz
File-marketplace-operator-5d97446c8-wlv5z
FallbackToLogsOnError-qe-app-registry-jnjk5
FallbackToLogsOnError-redhat-marketplace-k9xts
FallbackToLogsOnError-redhat-operators-x969k

LGTM, all of the CatalogSources' pods use the "FallbackToLogsOnError" termination message policy now.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438