Bug 1703032 - etcd monitoring configuration is completely getting reset when performing a minor upgrade from v3.11.88 to v3.11.98
Summary: etcd monitoring configuration is completely getting reset when performing an ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.11.z
Assignee: Simon Pasquier
QA Contact: Junqi Zhao
URL:
Whiteboard: groom
Duplicates: 1748871 1772729 1772948 1839179
Depends On:
Blocks:
Reported: 2019-04-25 10:46 UTC by K Chandra Sekar
Modified: 2023-12-15 16:27 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: the Cluster Monitoring Operator playbook resets the CMO ConfigMap every time it's executed. Consequence: manual changes to the ConfigMap enabling the etcd monitoring are lost. Fix: etcd monitoring can be configured with Ansible. Result: etcd monitoring is persisted when the CMO playbook is executed again.
Clone Of:
Environment:
Last Closed: 2020-03-20 00:12:40 UTC
Target Upstream Version:
Embargoed:


Attachments
etcd is down after upgrade (47.07 KB, image/png)
2020-02-10 16:12 UTC, Junqi Zhao
etcd is still up after upgrade (27.24 KB, image/png)
2020-02-27 13:51 UTC, Junqi Zhao


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift openshift-ansible pull 12101 | 0 | None | closed | Fix cluster monitoring operator config map with etcd | 2020-12-03 10:14:02 UTC
Github | openshift openshift-docs pull 19286 | 0 | None | closed | Update etcd monitoring procedure | 2020-12-03 10:14:27 UTC
Red Hat Knowledge Base (Solution) | 4608021 | 0 | None | None | None | 2019-11-25 16:19:30 UTC
Red Hat Product Errata | RHBA-2020:0793 | 0 | None | None | None | 2020-03-20 00:12:54 UTC

Description K Chandra Sekar 2019-04-25 10:46:29 UTC
Description of problem:

The etcd monitoring configuration is completely reset when performing a minor upgrade from v3.11.88 to v3.11.98.
etcd monitoring is not enabled by default when the OpenShift Monitoring Stack is set up, so we followed the guide [1] to configure it. After etcd monitoring was set up successfully, we upgraded the cluster to a minor version. The whole etcd monitoring setup disappeared and the configuration reverted to the default OpenShift Monitoring Stack; as a result, all the etcd targets are shown as down. Minor upgrades shouldn't do this, as etcd is a major component that requires continuous monitoring.


How reproducible: Always


Steps to Reproduce:
1. Set up the OpenShift Monitoring Stack on OpenShift v3.11.
2. Set up etcd monitoring as described in the guide [1].
3. Upgrade the whole cluster to a minor version. The OpenShift Monitoring Stack reverts to its original state and the etcd configuration goes missing.

Actual results:

The whole etcd monitoring setup disappears and the configuration reverts to the default OpenShift Monitoring Stack after a minor cluster upgrade; as a result, all the etcd targets are shown as down.


Expected results:

Minor upgrades shouldn't do this, as etcd is a major component that requires continuous monitoring. Minor cluster upgrades should persist the configuration when moving to the next version, unless major breaking changes are involved.

Additional info:
[1]- https://docs.openshift.com/container-platform/3.11/install_config/prometheus_cluster_monitoring.html#configuring-etcd-monitoring

Comment 1 Frederic Branczyk 2019-04-25 11:58:19 UTC
Yes, I can see how this happens; this is indeed a bug. As a workaround for now, you can reapply the configuration without issue, and you should get back to the expected state. Of course that's not how it should be, but it is a way for the customer to move forward in the immediate situation until we fix this. This needs a fix in the OpenShift Ansible playbooks.
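
To know when the configuration needs to be reapplied, it can help to check after each upgrade whether the etcd section is still present in the Cluster Monitoring Operator configuration. The following is a minimal sketch, not part of the playbooks or of the eventual fix. It assumes that the guide [1] enables etcd monitoring through a top-level `etcd:` section in the operator's `config.yaml`; the function name `has_etcd_stanza` and the sample strings are hypothetical and exist only for illustration.

```python
import re


def has_etcd_stanza(config_yaml: str) -> bool:
    """Return True if the text has a top-level 'etcd:' key with indented content.

    This is a line-based check, not a full YAML parser. It assumes the etcd
    section is written as a block mapping at column 0 (an assumption about
    how the guide lays out config.yaml).
    """
    lines = config_yaml.splitlines()
    for i, line in enumerate(lines):
        # A top-level key starts at column 0; nested 'etcd:' keys are ignored.
        if re.match(r"^etcd:\s*(#.*)?$", line):
            for nxt in lines[i + 1:]:
                # Skip blank lines and comment-only lines.
                if not nxt.strip() or nxt.lstrip().startswith("#"):
                    continue
                # The section counts only if its first real line is indented.
                return nxt.startswith((" ", "\t"))
            return False
    return False


# Hypothetical sample contents, for illustration only.
DEFAULT_CONFIG = "prometheusOperator:\n  baseImage: example/operator\n"
ETCD_CONFIG = DEFAULT_CONFIG + (
    "etcd:\n"
    "  targets:\n"
    "    selector:\n"
    "      openshift.io/component: etcd\n"
)

if __name__ == "__main__":
    for name, text in (("default", DEFAULT_CONFIG), ("with etcd", ETCD_CONFIG)):
        state = "present" if has_etcd_stanza(text) else "missing"
        print(f"{name}: etcd section {state}")
```

The input would be the `config.yaml` text taken from the monitoring ConfigMap before and after the upgrade. If the section was present before and is missing afterwards, the configuration from the guide [1] has to be reapplied.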

Comment 10 Simon Pasquier 2019-11-18 11:23:47 UTC
*** Bug 1772948 has been marked as a duplicate of this bug. ***

Comment 11 Simon Pasquier 2019-11-18 11:25:10 UTC
*** Bug 1772729 has been marked as a duplicate of this bug. ***

Comment 12 Simon Pasquier 2020-01-13 09:20:17 UTC
*** Bug 1748871 has been marked as a duplicate of this bug. ***

Comment 16 Junqi Zhao 2020-02-10 16:12:47 UTC
Created attachment 1662196
etcd is down after upgrade

Comment 28 Junqi Zhao 2020-02-27 13:51:43 UTC
Created attachment 1666211
etcd is still up after upgrade

Comment 31 errata-xmlrpc 2020-03-20 00:12:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0793

Comment 32 Pawel Krupa 2020-05-28 10:35:06 UTC
*** Bug 1839179 has been marked as a duplicate of this bug. ***

