Bug 1809375 - [3.11] - prometheus pods keep restarting without any error
Summary: [3.11] - prometheus pods keep restarting without any error
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.11.z
Assignee: Pawel Krupa
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-03-03 00:02 UTC by Vladislav Walek
Modified: 2023-09-07 22:09 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-17 12:40:44 UTC
Target Upstream Version:
Embargoed:



Description Vladislav Walek 2020-03-03 00:02:27 UTC
Description of problem:

The prometheus-k8s pods keep restarting even though there is no error or apparent reason for them to be restarted.
The controller logs and events only show that the pod was deleted and recreated.

6m          71d          57273     prometheus-k8s.15e1cf19e3ae72b2     StatefulSet                                                 Normal    SuccessfulDelete   statefulset-controller   delete Pod prometheus-k8s-1 in StatefulSet prometheus-k8s successful


Controller logs:

I0302 10:13:41.387247       1 event.go:221] Event(v1.ObjectReference{Kind:"StatefulSet", Namespace:"openshift-monitoring", Name:"prometheus-k8s", UID:"c01954d3-0d5f-11ea-9202-42010ac3700a", APIVersion:"apps/v1", ResourceVersion:"31762692", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' create Pod prometheus-k8s-1 in StatefulSet prometheus-k8s successful

I0302 10:14:39.193354       1 stateful_set_control.go:509] StatefulSet openshift-monitoring/prometheus-k8s terminating Pod prometheus-k8s-1 for update

I0302 10:14:39.221017       1 event.go:221] Event(v1.ObjectReference{Kind:"StatefulSet", Namespace:"openshift-monitoring", Name:"prometheus-k8s", UID:"c01954d3-0d5f-11ea-9202-42010ac3700a", APIVersion:"apps/v1", ResourceVersion:"31762752", FieldPath:""}): type: 'Normal' reason: 'SuccessfulDelete' delete Pod prometheus-k8s-1 in StatefulSet prometheus-k8s successful
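To quantify how quickly the pod is being recycled, the create and delete events in the controller log can be paired up. The following is a minimal sketch, not part of the original report: the function names are hypothetical, the sample lines are abbreviated copies of the log lines above (the `ObjectReference` body is replaced by `...`), and the year is an assumption because the klog prefix only carries month and day.

```python
import re
from datetime import datetime

# klog-style prefix: severity letter, MMDD, then HH:MM:SS.microseconds
KLOG_PREFIX = re.compile(r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})")
# Event text as emitted by the statefulset controller in the lines quoted above
POD_EVENT = re.compile(
    r"reason: '(SuccessfulCreate|SuccessfulDelete)' (?:create|delete) Pod (\S+)"
)


def pod_events(lines, year=2020):
    """Yield (timestamp, reason, pod) for each create/delete event line.

    The year is an assumption: klog timestamps do not include it.
    """
    for line in lines:
        prefix = KLOG_PREFIX.match(line)
        event = POD_EVENT.search(line)
        if not (prefix and event):
            continue  # e.g. the "terminating Pod ... for update" line
        stamp = datetime.strptime(
            f"{year}{prefix.group(1)} {prefix.group(2)}", "%Y%m%d %H:%M:%S.%f"
        )
        yield stamp, event.group(1), event.group(2)


def lifetimes(lines):
    """Return [(pod, seconds)] for every create that is followed by a delete."""
    created = {}
    result = []
    for stamp, reason, pod in pod_events(lines):
        if reason == "SuccessfulCreate":
            created[pod] = stamp
        elif pod in created:
            result.append((pod, (stamp - created.pop(pod)).total_seconds()))
    return result


# Abbreviated copies of the controller log lines from this report.
SAMPLE = [
    "I0302 10:13:41.387247       1 event.go:221] Event(...): type: 'Normal' "
    "reason: 'SuccessfulCreate' create Pod prometheus-k8s-1 in StatefulSet prometheus-k8s successful",
    "I0302 10:14:39.193354       1 stateful_set_control.go:509] StatefulSet "
    "openshift-monitoring/prometheus-k8s terminating Pod prometheus-k8s-1 for update",
    "I0302 10:14:39.221017       1 event.go:221] Event(...): type: 'Normal' "
    "reason: 'SuccessfulDelete' delete Pod prometheus-k8s-1 in StatefulSet prometheus-k8s successful",
]

if __name__ == "__main__":
    for pod, seconds in lifetimes(SAMPLE):
        print(f"{pod} lived {seconds:.1f}s between create and delete")
```

On the three lines quoted here, the pod is deleted roughly 58 seconds after it was created, and the intermediate line states that the deletion was "for update", which suggests the StatefulSet controller saw a changed spec rather than a crashing container.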


# oc get controllerrevision -n openshift-monitoring

NAME                          CONTROLLER                           REVISION   AGE
alertmanager-main-f7b649b6d   statefulset.apps/alertmanager-main   1          101d
node-exporter-7d547c4887      daemonset.apps/node-exporter         1          101d
prometheus-k8s-5b98789d78     statefulset.apps/prometheus-k8s      2          74d
prometheus-k8s-5b98789d79     statefulset.apps/prometheus-k8s      2          74d
prometheus-k8s-7577785cd5     statefulset.apps/prometheus-k8s      1          101d
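The listing above shows two ControllerRevisions for prometheus-k8s that share revision number 2, which looks unusual. The sketch below, with a hypothetical function name, parses the same table text and flags any controller that has more than one object at the same revision number. It only highlights the pattern; this report does not establish whether it is related to the restarts.

```python
from collections import defaultdict

# Output of `oc get controllerrevision -n openshift-monitoring` as quoted in this report.
LISTING = """\
NAME                          CONTROLLER                           REVISION   AGE
alertmanager-main-f7b649b6d   statefulset.apps/alertmanager-main   1          101d
node-exporter-7d547c4887      daemonset.apps/node-exporter         1          101d
prometheus-k8s-5b98789d78     statefulset.apps/prometheus-k8s      2          74d
prometheus-k8s-5b98789d79     statefulset.apps/prometheus-k8s      2          74d
prometheus-k8s-7577785cd5     statefulset.apps/prometheus-k8s      1          101d
"""


def duplicate_revisions(listing):
    """Return {(controller, revision): [names]} for revision numbers used more than once."""
    by_key = defaultdict(list)
    rows = listing.strip().splitlines()[1:]  # skip the header row
    for row in rows:
        if not row.strip():
            continue
        name, controller, revision, _age = row.split()
        by_key[(controller, int(revision))].append(name)
    return {key: names for key, names in by_key.items() if len(names) > 1}


if __name__ == "__main__":
    for (controller, revision), names in duplicate_revisions(LISTING).items():
        print(f"{controller} revision {revision}: {', '.join(names)}")
```

For the table in this report, the only entry flagged is `statefulset.apps/prometheus-k8s` at revision 2, with the two names `prometheus-k8s-5b98789d78` and `prometheus-k8s-5b98789d79`.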

# oc describe pod/prometheus-k8s-0  -n openshift-monitoring

Name:               prometheus-k8s-0
Namespace:          openshift-monitoring
Priority:           0
PriorityClassName:  <none>
Node:              <REDACTED>
Start Time:         Mon, 02 Mar 2020 14:24:51 -0500
Labels:             app=prometheus
                    controller-revision-hash=prometheus-k8s-5b98789d79
                    prometheus=k8s
                    statefulset.kubernetes.io/pod-name=prometheus-k8s-0
Annotations:        openshift.io/scc=restricted
Status:             Running
IP:                 <REDACTED>
Controlled By:      StatefulSet/prometheus-k8s
......

# oc describe pod/prometheus-k8s-1  -n openshift-monitoring
Name:               prometheus-k8s-1
Namespace:          openshift-monitoring
Priority:           0
PriorityClassName:  <none>
Node:               <REDACTED>
Start Time:         Mon, 02 Mar 2020 14:49:03 -0500
Labels:             app=prometheus
                    controller-revision-hash=prometheus-k8s-5b98789d79
                    prometheus=k8s
                    statefulset.kubernetes.io/pod-name=prometheus-k8s-1
Annotations:        openshift.io/scc=restricted
Status:             Running
IP:                 <REDACTED>
Controlled By:      StatefulSet/prometheus-k8s


Version-Release number of selected component (if applicable):
OpenShift Container Platform 3.11

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 14 Pawel Krupa 2020-05-26 08:20:32 UTC
@Vladislav Could you share the logs for the prometheus pod (for the `prometheus` and `configmap-reloader` containers)?

Comment 16 Pawel Krupa 2020-06-17 12:40:44 UTC
Closing due to lack of response from the reporter. Please reopen if this is still valid.

