Bug 1918126 - Prometheus in CreateContainerError state after upgrade to 4.5.11
Summary: Prometheus in CreateContainerError state after upgrade to 4.5.11
Keywords:
Status: CLOSED DUPLICATE of bug 1942536
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.8.0
Assignee: Peter Hunt
QA Contact: MinLi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-01-20 06:42 UTC by Jaspreet Kaur
Modified: 2024-03-25 17:54 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-03-03 20:31:06 UTC
Target Upstream Version:
Embargoed:


Description Jaspreet Kaur 2021-01-20 06:42:58 UTC
Description of problem: Prometheus fails to start after the recent upgrade. The pod status and node logs below show the failure:


oc get pods -n openshift-monitoring -l app=prometheus -o wide
NAME               READY   STATUS                 RESTARTS   AGE     IP              NODE                               NOMINATED NODE   READINESS GATES
prometheus-k8s-0   6/7     CreateContainerError   0          2d19h   192.171.2.22    ocp03infra03.example.com   <none>           <none>
prometheus-k8s-1   7/7     Running                0          22h     192.168.2.160   ocp03infra02.example.com   <none>           <none>

Jan 19 14:21:09 ocp03infra03.example.com crio[276176]: time="2021-01-19 14:21:09.351368767Z" level=info msg="CreateCtr: releasing container name k8s_prometheus_prometheus-k8s-0_openshift-monitoring_0e545ec3-3569-49b1-8ca1-ca765896d77c_0" file="server/container_create.go:565" id=4f0ac1e1-ccbd-425c-869b-0253757358dd name=/runtime.v1alpha2.RuntimeService/CreateContainer
Jan 19 14:21:09 ocp03infra03.ocp03.smbcgroup.com crio[276176]: time="2021-01-19 14:21:09.351494781Z" level=debug msg="Response error: container create failed: time=\"2021-01-19T14:21:09Z\" level=error msg=\"container_linux.go:348: starting container process caused \\\"exec: \\\\\\\"/bin/prometheus\\\\\\\": stat /bin/prometheus: no such file or directory\\\"\"\ncontainer_linux.go:348: starting container process caused \"exec: \\\"/bin/prometheus\\\": stat /bin/prometheus: no such file or directory\"\n" file="go-grpc-middleware/chain.go:25" id=4f0ac1e1-ccbd-425c-869b-0253757358dd name=/runtime.v1alpha2.RuntimeService/CreateContainer
Jan 19 14:21:09 ocp03infra03.example.com hyperkube[3730250]: E0119 14:21:09.351894 3730250 remote_runtime.go:200] CreateContainer in sandbox "83dd41b197ab281618e6a34b052a7ea689c74c60193e2959f988b7751b110220" from runtime service failed: rpc error: code = Unknown desc = container create failed: time="2021-01-19T14:21:09Z" level=error msg="container_linux.go:348: starting container process caused \"exec: \\\"/bin/prometheus\\\": stat /bin/prometheus: no such file or directory\""
Jan 19 14:21:09 ocp03infra03.example.com hyperkube[3730250]: container_linux.go:348: starting container process caused "exec: \"/bin/prometheus\": stat /bin/prometheus: no such file or directory"
Jan 19 14:21:09 ocp03infra03.ocp03.smbcgroup.com hyperkube[3730250]: E0119 14:21:09.352016 3730250 kuberuntime_manager.go:801] container start failed: CreateContainerError: container create failed: time="2021-01-19T14:21:09Z" level=error msg="container_linux.go:348: starting container process caused \"exec: \\\"/bin/prometheus\\\": stat /bin/prometheus: no such file or directory\""
Jan 19 14:21:09 ocp03infra03.example.com hyperkube[3730250]: container_linux.go:348: starting container process caused "exec: \"/bin/prometheus\": stat /bin/prometheus: no such file or directory"
Jan 19 14:21:09 ocp03infra03.example.com hyperkube[3730250]: E0119 14:21:09.352075 3730250 pod_workers.go:191] Error syncing pod 0e545ec3-3569-49b1-8ca1-ca765896d77c ("prometheus-k8s-0_openshift-monitoring(0e545ec3-3569-49b1-8ca1-ca765896d77c)"), skipping: failed to "StartContainer" for "prometheus" with CreateContainerError: "container create failed: time=\"2021-01-19T14:21:09Z\" level=error msg=\"container_linux.go:348: starting container process caused \\\"exec: \\\\\\\"/bin/prometheus\\\\\\\": stat /bin/prometheus: no such file or directory\\\"\"\ncontainer_linux.go:348: starting container process caused \"exec: \\\"/bin/prometheus\\\": stat /bin/prometheus: no such file or directory\"\n"
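The crio and hyperkube entries above all wrap the same runtime error at different quoting depths, which makes the actual cause hard to spot. A minimal sketch of how the failing binary and reason could be pulled out of any of these lines is shown here; `root_cause` and `EXEC_ERROR` are hypothetical names introduced for illustration and are not part of CRI-O, the kubelet, or `oc`. It assumes the error always has the `exec: "<path>": <reason>` shape seen in these logs.

```python
import re

# Hypothetical helper, not part of CRI-O or the kubelet.
# Matches the exec failure once all quoting backslashes are gone.
EXEC_ERROR = re.compile(r'exec: "(?P<path>[^"]+)": (?P<reason>[^"\n]+)')


def root_cause(log_line: str):
    """Return (path, reason) for the first exec failure in log_line, or None."""
    # Dropping every backslash flattens all levels of nested quoting at once.
    unescaped = log_line.replace("\\", "")
    match = EXEC_ERROR.search(unescaped)
    if match is None:
        return None
    return match.group("path"), match.group("reason").strip()


# Fragment copied from the hyperkube entry above.
sample = r'''msg="container_linux.go:348: starting container process caused \"exec: \\\"/bin/prometheus\\\": stat /bin/prometheus: no such file or directory\""'''
print(root_cause(sample))
```

For the sample fragment this yields the path `/bin/prometheus` and the reason `stat /bin/prometheus: no such file or directory`, meaning the container's entrypoint binary was not found in its root filesystem when the container was created.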


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results: Prometheus fails to start


Expected results: Prometheus should be running after upgrade


Additional info:

Comment 14 Peter Hunt 2021-03-03 20:31:06 UTC
as per https://bugzilla.redhat.com/show_bug.cgi?id=1918126#c12 I am closing this

Comment 15 Peter Hunt 2021-04-16 20:13:59 UTC
for posterity, I am updating this because I actually suspect it's due to the attached bug

*** This bug has been marked as a duplicate of bug 1942536 ***

Comment 16 Red Hat Bugzilla 2023-09-15 00:58:39 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

