Bug 1641814 - [free-int] prometheus operator repeatedly logging: updating statefulset failed
Summary: [free-int] prometheus operator repeatedly logging: updating statefulset failed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.1.0
Assignee: Frederic Branczyk
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-10-22 20:48 UTC by Justin Pierce
Modified: 2019-06-04 10:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:40:48 UTC
Target Upstream Version:


Attachments
statefulset content (8.97 KB, text/plain)
2018-10-22 20:51 UTC, Justin Pierce
prometheus-operator deployment file (2.41 KB, text/plain)
2018-10-23 02:41 UTC, Junqi Zhao


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0758 None None None 2019-06-04 10:40:56 UTC

Description Justin Pierce 2018-10-22 20:48:00 UTC
Description of problem:

$ oc logs prometheus-operator-579779cd5c-n8jqt
E1022 20:36:04.786726       1 operator.go:278] Sync "openshift-monitoring/main" failed: updating statefulset failed: StatefulSet.apps "alertmanager-main" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden.
level=info ts=2018-10-22T20:36:21.018246178Z caller=operator.go:402 component=alertmanageroperator msg="sync alertmanager" key=openshift-monitoring/main
E1022 20:36:21.041111       1 operator.go:278] Sync "openshift-monitoring/main" failed: updating statefulset failed: StatefulSet.apps "alertmanager-main" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden.
level=info ts=2018-10-22T20:36:47.988566504Z caller=operator.go:402 component=alertmanageroperator msg="sync alertmanager" key=openshift-monitoring/main
E1022 20:36:48.008604       1 operator.go:278] Sync "openshift-monitoring/main" failed: updating statefulset failed: StatefulSet.apps "alertmanager-main" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden.
level=info ts=2018-10-22T20:37:16.90989681Z caller=operator.go:402 component=alertmanageroperator msg="sync alertmanager" key=openshift-monitoring/main
E1022 20:37:16.924442       1 operator.go:278] Sync "openshift-monitoring/main" failed: updating statefulset failed: StatefulSet.apps "alertmanager-main" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden.
level=info ts=2018-10-22T20:38:29.030985292Z caller=operator.go:402 component=alertmanageroperator msg="sync alertmanager" key=openshift-monitoring/main
E1022 20:38:29.049614       1 operator.go:278] Sync "openshift-monitoring/main" failed: updating statefulset failed: StatefulSet.apps "alertmanager-main" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden.


Version-Release number of selected component (if applicable):
v3.11.16

Comment 1 Justin Pierce 2018-10-22 20:51:21 UTC
Created attachment 1496505 [details]
statefulset content

Comment 2 Junqi Zhao 2018-10-23 01:56:22 UTC
There is another error, "unable to retrieve auth token: invalid username/password":
# oc -n openshift-monitoring describe pod node-exporter-vdj4z
Events:
  Type     Reason   Age              From                                    Message
  ----     ------   ----             ----                                    -------
  Warning  Failed   1m (x2 over 1m)  kubelet, ip-172-31-49-167.ec2.internal  Error: ErrImagePull
  Normal   BackOff  1m (x2 over 1m)  kubelet, ip-172-31-49-167.ec2.internal  Back-off pulling image "registry.reg-aws.openshift.com:443/openshift3/prometheus-node-exporter:v3.11.28"
  Warning  Failed   1m (x2 over 1m)  kubelet, ip-172-31-49-167.ec2.internal  Error: ImagePullBackOff
  Normal   BackOff  1m (x2 over 1m)  kubelet, ip-172-31-49-167.ec2.internal  Back-off pulling image "registry.reg-aws.openshift.com:443/openshift3/ose-kube-rbac-proxy:v3.11.28"
  Warning  Failed   1m (x2 over 1m)  kubelet, ip-172-31-49-167.ec2.internal  Error: ImagePullBackOff
  Warning  Failed   1m (x3 over 1m)  kubelet, ip-172-31-49-167.ec2.internal  Failed to pull image "registry.reg-aws.openshift.com:443/openshift3/ose-kube-rbac-proxy:v3.11.28": rpc error: code = Unknown desc = unable to retrieve auth token: invalid username/password
  Normal   Pulling  1m (x3 over 1m)  kubelet, ip-172-31-49-167.ec2.internal  pulling image "registry.reg-aws.openshift.com:443/openshift3/prometheus-node-exporter:v3.11.28"
  Warning  Failed   1m (x3 over 1m)  kubelet, ip-172-31-49-167.ec2.internal  Failed to pull image "registry.reg-aws.openshift.com:443/openshift3/prometheus-node-exporter:v3.11.28": rpc error: code = Unknown desc = unable to retrieve auth token: invalid username/password
  Warning  Failed   1m (x3 over 1m)  kubelet, ip-172-31-49-167.ec2.internal  Error: ErrImagePull
  Normal   Pulling  1m (x3 over 1m)  kubelet, ip-172-31-49-167.ec2.internal  pulling image "registry.reg-aws.openshift.com:443/openshift3/ose-kube-rbac-proxy:v3.11.28"
**********************************************************************
# oc -n openshift-monitoring get pod -o wide | grep node-exporter-vdj4z
node-exporter-vdj4z                           0/2       ImagePullBackOff   0          4m        172.31.49.167   ip-172-31-49-167.ec2.internal   <none>

Comment 3 Junqi Zhao 2018-10-23 02:10:13 UTC
In the attached "statefulset content", I see:
registry.reg-aws.openshift.com:443/openshift3/ose-configmap-reloader:v3.11.7
registry.reg-aws.openshift.com:443/openshift3/ose-prometheus-config-reloader:v3.11.7

The other version is v3.11.0-0.21.0.

Comment 4 Junqi Zhao 2018-10-23 02:40:53 UTC
From
# oc -n openshift-monitoring get deployment.apps/prometheus-operator -oyaml

        - --config-reloader-image=registry.reg-aws.openshift.com:443/openshift3/ose-configmap-reloader:v3.11.28
        - --prometheus-config-reloader=registry.reg-aws.openshift.com:443/openshift3/ose-prometheus-config-reloader:v3.11.28

Maybe a version mismatch caused the problem; we should make sure the same version is used everywhere.
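One rough way to compare the tags side by side is sketched below. It assumes the StatefulSet is alertmanager-main (the name from the error message); the helper name image_tags is made up for this example, and it does not handle digest-style references (name@sha256:...).

```shell
# Hypothetical helper: read image references on stdin (one per line) and
# print "<image name> <tag>", where the tag is whatever follows the last ':'
# in the final path component.
image_tags() {
  awk -F/ 'NF { n = split($NF, p, ":"); print p[1], (n > 1 ? p[n] : "<none>") }'
}

# Usage against a live cluster (needs oc and access to openshift-monitoring):
#   oc -n openshift-monitoring get statefulset alertmanager-main \
#     -o jsonpath='{range .spec.template.spec.containers[*]}{.image}{"\n"}{end}' | image_tags
#
# The tags the operator is configured with can be read from its arguments:
#   oc -n openshift-monitoring get deployment prometheus-operator -o yaml | grep reloader
```

If the tags printed for the StatefulSet differ from the ones in the operator arguments, the two are out of sync.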

Comment 5 Junqi Zhao 2018-10-23 02:41:31 UTC
Created attachment 1496593 [details]
prometheus-operator deployment file

Comment 6 Junqi Zhao 2018-10-24 05:55:03 UTC
See https://github.com/kubernetes/kubernetes/issues/66137

Comment 7 Frederic Branczyk 2018-10-24 20:04:33 UTC
This has been fixed in a newer version of the Prometheus Operator, so we should probably bump the version in the 3.11 release. For now what you can do is just delete the underlying StatefulSet.
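A minimal sketch of that workaround, assuming the affected StatefulSet is alertmanager-main in openshift-monitoring as in the log from the description. The function name delete_statefulset is made up for this example. The expectation is that the operator recreates the StatefulSet on its next sync, but the pods are gone in the meantime, so treat this as disruptive.

```shell
# Hypothetical wrapper around the workaround: delete the StatefulSet so the
# operator can create a fresh one instead of attempting a forbidden update.
#   $1 = StatefulSet name
#   $2 = namespace (defaults to openshift-monitoring)
delete_statefulset() {
  oc -n "${2:-openshift-monitoring}" delete statefulset "$1"
}

# Example, using the name from the error message:
#   delete_statefulset alertmanager-main
# Then check whether it came back:
#   oc -n openshift-monitoring get statefulset alertmanager-main
```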

Comment 12 Junqi Zhao 2019-01-27 10:32:09 UTC
Changing back to MODIFIED, since the free-int environment is still at v3.11.69.

Comment 15 Junqi Zhao 2019-04-12 09:23:21 UTC
$  oc -n openshift-monitoring logs $(oc -n openshift-monitoring get pod | grep prometheus-operator | awk '{print $1}') | grep Forbidden
(nothing returned)

The issue no longer occurs. Payload: 4.0.0-0.nightly-2019-04-10-182914

Comment 17 errata-xmlrpc 2019-06-04 10:40:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

