Description of problem:
Prometheus fails to start after a recent upgrade. The pod status and node logs below show the failure:

oc get pods -n openshift-monitoring -l app=prometheus -o wide
NAME               READY   STATUS                 RESTARTS   AGE     IP              NODE                       NOMINATED NODE   READINESS GATES
prometheus-k8s-0   6/7     CreateContainerError   0          2d19h   192.171.2.22    ocp03infra03.example.com   <none>           <none>
prometheus-k8s-1   7/7     Running                0          22h     192.168.2.160   ocp03infra02.example.com   <none>           <none>

Jan 19 14:21:09 ocp03infra03.example.com crio[276176]: time="2021-01-19 14:21:09.351368767Z" level=info msg="CreateCtr: releasing container name k8s_prometheus_prometheus-k8s-0_openshift-monitoring_0e545ec3-3569-49b1-8ca1-ca765896d77c_0" file="server/container_create.go:565" id=4f0ac1e1-ccbd-425c-869b-0253757358dd name=/runtime.v1alpha2.RuntimeService/CreateContainer
Jan 19 14:21:09 ocp03infra03.ocp03.smbcgroup.com crio[276176]: time="2021-01-19 14:21:09.351494781Z" level=debug msg="Response error: container create failed: time=\"2021-01-19T14:21:09Z\" level=error msg=\"container_linux.go:348: starting container process caused \\\"exec: \\\\\\\"/bin/prometheus\\\\\\\": stat /bin/prometheus: no such file or directory\\\"\"\ncontainer_linux.go:348: starting container process caused \"exec: \\\"/bin/prometheus\\\": stat /bin/prometheus: no such file or directory\"\n" file="go-grpc-middleware/chain.go:25" id=4f0ac1e1-ccbd-425c-869b-0253757358dd name=/runtime.v1alpha2.RuntimeService/CreateContainer
Jan 19 14:21:09 ocp03infra03.example.com hyperkube[3730250]: E0119 14:21:09.351894 3730250 remote_runtime.go:200] CreateContainer in sandbox "83dd41b197ab281618e6a34b052a7ea689c74c60193e2959f988b7751b110220" from runtime service failed: rpc error: code = Unknown desc = container create failed: time="2021-01-19T14:21:09Z" level=error msg="container_linux.go:348: starting container process caused \"exec: \\\"/bin/prometheus\\\": stat /bin/prometheus: no such file or directory\""
Jan 19 14:21:09 ocp03infra03.example.com hyperkube[3730250]: container_linux.go:348: starting container process caused "exec: \"/bin/prometheus\": stat /bin/prometheus: no such file or directory"
Jan 19 14:21:09 ocp03infra03.ocp03.smbcgroup.com hyperkube[3730250]: E0119 14:21:09.352016 3730250 kuberuntime_manager.go:801] container start failed: CreateContainerError: container create failed: time="2021-01-19T14:21:09Z" level=error msg="container_linux.go:348: starting container process caused \"exec: \\\"/bin/prometheus\\\": stat /bin/prometheus: no such file or directory\""
Jan 19 14:21:09 ocp03infra03.example.com hyperkube[3730250]: container_linux.go:348: starting container process caused "exec: \"/bin/prometheus\": stat /bin/prometheus: no such file or directory"
Jan 19 14:21:09 ocp03infra03.example.com hyperkube[3730250]: E0119 14:21:09.352075 3730250 pod_workers.go:191] Error syncing pod 0e545ec3-3569-49b1-8ca1-ca765896d77c ("prometheus-k8s-0_openshift-monitoring(0e545ec3-3569-49b1-8ca1-ca765896d77c)"), skipping: failed to "StartContainer" for "prometheus" with CreateContainerError: "container create failed: time=\"2021-01-19T14:21:09Z\" level=error msg=\"container_linux.go:348: starting container process caused \\\"exec: \\\\\\\"/bin/prometheus\\\\\\\": stat /bin/prometheus: no such file or directory\\\"\"\ncontainer_linux.go:348: starting container process caused \"exec: \\\"/bin/prometheus\\\": stat /bin/prometheus: no such file or directory\"\n"

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
Prometheus fails to start.

Expected results:
Prometheus should be running after the upgrade.

Additional info:
As per https://bugzilla.redhat.com/show_bug.cgi?id=1918126#c12, I am closing this bug.
For posterity, I am updating this because I actually suspect it's due to the attached bug.

*** This bug has been marked as a duplicate of bug 1942536 ***
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days