Description of problem: The Insights operator collects the pod manifests related to a failing operator. It should also collect the current and previous pod logs (at least the latest ~100 lines) of all containers in those pods, so that the failure can be diagnosed without requiring a must-gather.
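To make the request concrete, here is a minimal sketch of the intended behaviour: keep only the last ~100 lines of each container's current and previous log, and store them per pod. It is not the Insights operator's implementation. The function names `tail_lines` and `plan_log_files` are hypothetical, and the `<namespace>/logs/<pod>/<container>_<current|previous>.log` layout is an assumption taken from the relative paths in the verification listing later in this report.

```python
from typing import Dict, Optional


def tail_lines(text: str, limit: int = 100) -> str:
    """Return at most the last `limit` lines of `text`, keeping line endings."""
    if limit <= 0:
        return ""
    return "".join(text.splitlines(keepends=True)[-limit:])


def plan_log_files(
    namespace: str,
    pod: str,
    container_logs: Dict[str, Dict[str, Optional[str]]],
    limit: int = 100,
) -> Dict[str, str]:
    """Map hypothetical archive paths to truncated log text.

    `container_logs` maps a container name to a dict with the keys
    "current" and "previous"; a value of None means that log does not
    exist (for example, the container has never restarted).
    """
    files: Dict[str, str] = {}
    for container, logs in sorted(container_logs.items()):
        for kind in ("current", "previous"):
            text = logs.get(kind)
            if text is None:
                continue  # nothing to store for this container/kind
            # Assumed layout, mirroring the listing in the verification comment.
            path = f"{namespace}/logs/{pod}/{container}_{kind}.log"
            files[path] = tail_lines(text, limit)
    return files
```

Skipping a missing previous log, rather than writing an empty file, matches the listing in the verification comment, where only the `prometheus` container has a `_previous.log` file.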
Fixed and verified in 4.6.0-0.nightly-2020-06-04-232426.

Verification steps:

1. Degrade some operator

   oc -n openshift-monitoring create configmap cluster-monitoring-config
   oc -n openshift-monitoring edit configmap cluster-monitoring-config

   Add the following data to the config (with invalid value):

   apiVersion: v1
   data:
     config.yaml: |
       telemeterClient:
         enabled: NOT_BOOELAN
   kind: ConfigMap
   metadata:
     ...

   Delete cluster-monitoring-operator* pod

   oc get pods -n openshift-monitoring
   oc delete pod cluster-monitoring-operator-7b8665747f-w2fwv -n openshift-monitoring

2. Check operator is degraded

   $ oc get co monitoring
   NAME         VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
   monitoring   4.6.0-0.nightly-2020-06-04-232426   False       False         True       51s

3. Download fresh archive from AWS S3

4. Check the content of the archive

   $ ll -R openshift-monitoring/logs/
   openshift-monitoring/logs/:
   total 8
   drwxr-xr-x. 2 anikifor anikifor 4096 Jun 5 13:58 prometheus-k8s-0
   drwxr-xr-x. 2 anikifor anikifor 4096 Jun 5 13:58 prometheus-k8s-1

   openshift-monitoring/logs/prometheus-k8s-0:
   total 52
   -rw-r-----. 1 anikifor anikifor   512 Jun 5 13:56 kube-rbac-proxy_current.log
   -rw-r-----. 1 anikifor anikifor  1036 Jun 5 13:56 prometheus-config-reloader_current.log
   -rw-r-----. 1 anikifor anikifor 16734 Jun 5 13:56 prometheus_current.log
   -rw-r-----. 1 anikifor anikifor  2788 Jun 5 13:56 prometheus_previous.log
   -rw-r-----. 1 anikifor anikifor  4549 Jun 5 13:56 prometheus-proxy_current.log
   -rw-r-----. 1 anikifor anikifor    59 Jun 5 13:56 prom-label-proxy_current.log
   -rw-r-----. 1 anikifor anikifor   180 Jun 5 13:56 rules-configmap-reloader_current.log
   -rw-r-----. 1 anikifor anikifor  2392 Jun 5 13:56 thanos-sidecar_current.log

   openshift-monitoring/logs/prometheus-k8s-1:
   total 68
   -rw-r-----. 1 anikifor anikifor   512 Jun 5 13:56 kube-rbac-proxy_current.log
   -rw-r-----. 1 anikifor anikifor  1035 Jun 5 13:56 prometheus-config-reloader_current.log
   -rw-r-----. 1 anikifor anikifor 38134 Jun 5 13:56 prometheus_current.log
   -rw-r-----. 1 anikifor anikifor  2788 Jun 5 13:56 prometheus_previous.log
   -rw-r-----. 1 anikifor anikifor  1587 Jun 5 13:56 prometheus-proxy_current.log
   -rw-r-----. 1 anikifor anikifor    59 Jun 5 13:56 prom-label-proxy_current.log
   -rw-r-----. 1 anikifor anikifor   180 Jun 5 13:56 rules-configmap-reloader_current.log
   -rw-r-----. 1 anikifor anikifor  2392 Jun 5 13:56 thanos-sidecar_current.log
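The manual check in step 4 could also be scripted. The sketch below walks an extracted `logs/` directory and reports, per pod, which containers have a current log and which have a previous log. It assumes the `<pod>/<container>_current.log` and `<pod>/<container>_previous.log` naming seen in the listing above. The function name `summarize_pod_logs` is hypothetical and not part of any OpenShift tooling.

```python
from pathlib import Path
from typing import Dict, List


def summarize_pod_logs(logs_dir: str) -> Dict[str, Dict[str, List[str]]]:
    """Return {pod: {"current": [containers], "previous": [containers]}}.

    Only directories directly under `logs_dir` are treated as pods, and
    only files named <container>_current.log or <container>_previous.log
    are counted; anything else is ignored.
    """
    summary: Dict[str, Dict[str, List[str]]] = {}
    for pod_dir in sorted(Path(logs_dir).iterdir()):
        if not pod_dir.is_dir():
            continue
        entry: Dict[str, List[str]] = {"current": [], "previous": []}
        for log_file in sorted(pod_dir.glob("*.log")):
            # Split on the last underscore so container names that contain
            # hyphens or underscores are kept intact.
            container, sep, kind = log_file.stem.rpartition("_")
            if sep and kind in entry:
                entry[kind].append(container)
        summary[pod_dir.name] = entry
    return summary


if __name__ == "__main__":
    # Example path taken from the listing above; adjust to the extracted archive.
    for pod, kinds in summarize_pod_logs("openshift-monitoring/logs").items():
        print(pod, "current:", kinds["current"], "previous:", kinds["previous"])
```

For the archive shown above, this would be expected to list eight containers with current logs per pod and `prometheus` as the only container with a previous log.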
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196