Description of problem:

Events provide a timeline when failures happen and help with debugging problems. A concrete example is the CNI "network not ready" condition, which leaves containers stuck in Pending when the container network (SDN) is down. In the case of the kube-apiserver-operator this can leave the installer pods stuck, and the operator will be effectively blocked.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
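For illustration, the kind of per-namespace event timeline this report asks the operator to capture can already be pulled by hand with oc; a minimal sketch, with the namespace chosen purely as an example:

$ oc get events -n openshift-kube-apiserver --sort-by=.lastTimestamp
$ oc get events -n openshift-kube-apiserver -o json   # roughly the data one would expect the archive to carry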
I cannot verify this on 4.2.0-0.nightly-2019-11-25-200935. I have one degraded operator:

$ oc get clusteroperators
NAME             VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication   4.2.0-0.nightly-2019-11-25-200935   True        False         True       31m
...

But the insights operator does not collect events for it:

$ oc rsh insights-operator-5f8db86747-lttn4
$ ls /var/lib/insights-operator/
$ tar -xzvf /var/lib/insights-operator/insights-2019-11-26-084959.tar.gz
config/authentication
config/clusteroperator/authentication
config/clusteroperator/cloud-credential
config/clusteroperator/cluster-autoscaler
config/clusteroperator/console
config/clusteroperator/dns
config/clusteroperator/image-registry
config/clusteroperator/ingress
config/clusteroperator/insights
config/clusteroperator/kube-apiserver
config/clusteroperator/kube-controller-manager
config/clusteroperator/kube-scheduler
config/clusteroperator/machine-api
config/clusteroperator/machine-config
config/clusteroperator/marketplace
config/clusteroperator/monitoring
config/clusteroperator/network
config/clusteroperator/node-tuning
config/clusteroperator/openshift-apiserver
config/clusteroperator/openshift-controller-manager
config/clusteroperator/openshift-samples
config/clusteroperator/operator-lifecycle-manager
config/clusteroperator/operator-lifecycle-manager-catalog
config/clusteroperator/operator-lifecycle-manager-packageserver
config/clusteroperator/service-ca
config/clusteroperator/service-catalog-apiserver
config/clusteroperator/service-catalog-controller-manager
config/clusteroperator/storage
config/featuregate
config/id
config/infrastructure
config/ingress
config/network
config/oauth
config/version
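(Not part of the original check, just a convenience: a quick way to see whether an archive contains any events data without extracting it; the archive name below is a placeholder.)

$ tar -tzf /var/lib/insights-operator/insights-<timestamp>.tar.gz | grep '^events/'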
Verified in 4.2.0-0.nightly-2019-12-02-165545

Verification steps:

1. Create patch.yaml locally with the following contents:

- op: add
  path: /spec/overrides
  value:
  - group: apps/v1
    kind: Deployment
    name: ingress-operator
    namespace: openshift-ingress-operator
    unmanaged: true

2. oc patch clusterversion version --type json -p "$(cat patch.yaml)"
3. Scale the ingress operator to 0 in the web console.
4. Scale the openshift-ingress router to 0 in the web console.
   (A CLI alternative to steps 3-4 is sketched after the output below.)
5. oc delete pods --all -n openshift-authentication-operator
   (that will re-kick the auth operator so you don't have to wait)

$ oc get clusteroperators
NAME             VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication   4.2.0-0.nightly-2019-12-02-165545   True        False         True       82m

6. oc delete pods --all -n openshift-insights
   (that will kick the insights operator)
7. Check the insights archive; there is a folder with the recorded events:

$ oc project openshift-insights
$ oc get pods -n openshift-insights
NAME                                 READY   STATUS    RESTARTS   AGE
insights-operator-5dbd8b898d-pdgw7   1/1     Running   0          27s
$ oc rsh insights-operator-5dbd8b898d-pdgw7
# tar -xzvf /var/lib/insights-operator/insights-2019-12-03-100819.tar.gz
config/authentication
config/clusteroperator/authentication
config/clusteroperator/cloud-credential
config/clusteroperator/cluster-autoscaler
config/clusteroperator/console
config/clusteroperator/dns
config/clusteroperator/image-registry
config/clusteroperator/ingress
config/clusteroperator/insights
config/clusteroperator/kube-apiserver
config/clusteroperator/kube-controller-manager
config/clusteroperator/kube-scheduler
config/clusteroperator/machine-api
config/clusteroperator/machine-config
config/clusteroperator/marketplace
config/clusteroperator/monitoring
config/clusteroperator/network
config/clusteroperator/node-tuning
config/clusteroperator/openshift-apiserver
config/clusteroperator/openshift-controller-manager
config/clusteroperator/openshift-samples
config/clusteroperator/operator-lifecycle-manager
config/clusteroperator/operator-lifecycle-manager-catalog
config/clusteroperator/operator-lifecycle-manager-packageserver
config/clusteroperator/service-ca
config/clusteroperator/service-catalog-apiserver
config/clusteroperator/service-catalog-controller-manager
config/clusteroperator/storage
config/featuregate
config/id
config/infrastructure
config/ingress
config/network
config/oauth
config/version
events/openshift-authentication
events/openshift-authentication-operator
events/openshift-config
events/openshift-config-managed
events/openshift-ingress

# cat events/openshift-ingress
{"items":[
 {"namespace":"openshift-ingress","lastTimestamp":"2019-12-03T10:02:08Z","reason":"Killing","message":"Stopping container router"},
 {"namespace":"openshift-ingress","lastTimestamp":"2019-12-03T10:02:08Z","reason":"SuccessfulDelete","message":"Deleted pod: router-default-7dbd7cbb94-f287d"},
 {"namespace":"openshift-ingress","lastTimestamp":"2019-12-03T10:02:08Z","reason":"ScalingReplicaSet","message":"Scaled down replica set router-default-7dbd7cbb94 to 1"},
 {"namespace":"openshift-ingress","lastTimestamp":"2019-12-03T10:02:10Z","reason":"Killing","message":"Stopping container router"},
 {"namespace":"openshift-ingress","lastTimestamp":"2019-12-03T10:02:10Z","reason":"SuccessfulDelete","message":"Deleted pod: router-default-7dbd7cbb94-n98gx"},
 {"namespace":"openshift-ingress","lastTimestamp":"2019-12-03T10:02:10Z","reason":"ScalingReplicaSet","message":"Scaled down replica set router-default-7dbd7cbb94 to 0"},
 {"namespace":"openshift-ingress","lastTimestamp":"2019-12-03T10:07:51Z","reason":"NoPods","message":"No matching pods found"}
]}

# cat events/openshift-authentication
{"items":[
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:05:51Z","reason":"SuccessfulCreate","message":"Created pod: oauth-openshift-c4d5fdf98-j7p8k"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:05:51Z","reason":"Scheduled","message":"Successfully assigned openshift-authentication/oauth-openshift-c4d5fdf98-j7p8k to ip-10-0-128-110.ec2.internal"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:05:51Z","reason":"ScalingReplicaSet","message":"Scaled up replica set oauth-openshift-c4d5fdf98 to 1"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:05:59Z","reason":"Pulled","message":"Container image \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9eb33aaf5c732e0967454e6861cf10d0a3323fbbf2962da7e1d450b15b59a364\" already present on machine"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:05:59Z","reason":"Created","message":"Created container oauth-openshift"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:06:00Z","reason":"Started","message":"Started container oauth-openshift"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:06:04Z","reason":"SuccessfulCreate","message":"Created pod: oauth-openshift-c4d5fdf98-bj568"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:06:04Z","reason":"ScalingReplicaSet","message":"Scaled up replica set oauth-openshift-c4d5fdf98 to 2"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:06:04Z","reason":"SuccessfulDelete","message":"Deleted pod: oauth-openshift-6cf8b94896-cbdz2"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:06:04Z","reason":"ScalingReplicaSet","message":"Scaled down replica set oauth-openshift-6cf8b94896 to 1"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:06:04Z","reason":"Scheduled","message":"Successfully assigned openshift-authentication/oauth-openshift-c4d5fdf98-bj568 to ip-10-0-155-216.ec2.internal"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:06:04Z","reason":"Killing","message":"Stopping container oauth-openshift"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:06:12Z","reason":"Started","message":"Started container oauth-openshift"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:06:12Z","reason":"Created","message":"Created container oauth-openshift"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:06:12Z","reason":"Pulled","message":"Container image \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9eb33aaf5c732e0967454e6861cf10d0a3323fbbf2962da7e1d450b15b59a364\" already present on machine"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:06:19Z","reason":"SuccessfulDelete","message":"Deleted pod: oauth-openshift-6cf8b94896-mlxzl"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:06:19Z","reason":"Killing","message":"Stopping container oauth-openshift"},
 {"namespace":"openshift-authentication","lastTimestamp":"2019-12-03T10:06:19Z","reason":"ScalingReplicaSet","message":"Scaled down replica set oauth-openshift-6cf8b94896 to 0"}
]}
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:4093