Bug 1734029

Summary: Prometheus statefulset is reconciled on pod deletion but not when it is scaled down
Product: OpenShift Container Platform
Component: Monitoring
Version: 4.1.z
Target Release: 4.3.0
Hardware: Unspecified
OS: Unspecified
Severity: low
Priority: unspecified
Status: CLOSED ERRATA
Reporter: Robert Sandu <rsandu>
Assignee: Sergiusz Urbaniak <surbania>
QA Contact: Junqi Zhao <juzhao>
CC: alegrand, anpicker, aos-bugs, cvogel, erooth, jokerman, mloibl, pkrupa, surbania
Doc Type: No Doc Update
Type: Bug
Last Closed: 2020-01-23 11:04:29 UTC

Description Robert Sandu 2019-07-29 13:27:33 UTC
Description of problem: Prometheus statefulset is reconciled on pod deletion but not on scale down.

Version-Release number of selected component (if applicable): multiple. I've tested this on 4 clusters with the following versions:

4.1.7
3.11.117
3.11.98
3.11.69

How reproducible: always.

Steps to Reproduce:
1. oc -n openshift-monitoring scale statefulset.apps/prometheus-k8s --replicas=1
2. Observe that the second sts replica is not created again. By contrast, if the prometheus-k8s-1 pod is deleted, a new pod is created in its place (full command sequence sketched below).
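
A minimal reproduction sketch, assuming the openshift-monitoring namespace and the default statefulset/pod names used elsewhere in this report:

$ # at the default scale of 2, a deleted pod is recreated (statefulset controller):
$ oc -n openshift-monitoring delete pod prometheus-k8s-1
$ oc -n openshift-monitoring get pods | grep prometheus-k8s
$ # but after a scale-down the second replica is never restored:
$ oc -n openshift-monitoring scale statefulset.apps/prometheus-k8s --replicas=1
$ oc -n openshift-monitoring get sts prometheus-k8s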

Actual results: the second sts replica is not created again when the sts is scaled down.

Expected results: the statefulset should be reconciled both on pod deletion and on statefulset scale down.
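
On a build where scale-down reconciliation works as expected, watching the statefulset should show the operator restoring the replica count; a hedged sketch (the reconciliation interval is an assumption, not taken from the operator's code):

$ oc -n openshift-monitoring scale statefulset.apps/prometheus-k8s --replicas=1
$ oc -n openshift-monitoring get sts prometheus-k8s -w   # replicas should return to 2 on the operator's next sync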

Additional info:

- Nothing suspicious in the Prometheus operator logs.
- The Prometheus operator expects 2 replicas, but the statefulset only reports one (a jsonpath-based comparison is sketched after the output below):

$ oc -n openshift-monitoring get prometheus k8s -o yaml | grep -i replicas
  replicas: 2
$ oc -n openshift-monitoring describe sts prometheus-k8s | grep -i replicas
Replicas:           1 desired | 1 total
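
The same desired-versus-actual comparison can be made directly with jsonpath; a hedged sketch using the same object names as above (the .spec.replicas field paths are standard for the Prometheus CR and the apps/v1 StatefulSet):

$ oc -n openshift-monitoring get prometheus k8s -o jsonpath='{.spec.replicas}{"\n"}'
2
$ oc -n openshift-monitoring get sts prometheus-k8s -o jsonpath='{.spec.replicas}{"\n"}'
1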

Comment 14 errata-xmlrpc 2020-01-23 11:04:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062