Bug 1935582

| Field | Value |
|---|---|
| Summary | prometheus liveness probes cause issues while replaying WAL |
| Product | OpenShift Container Platform |
| Component | Monitoring |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | unspecified |
| Version | 4.6 |
| Target Release | 4.8.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Sergiusz Urbaniak <surbania> |
| Assignee | Sergiusz Urbaniak <surbania> |
| QA Contact | Junqi Zhao <juzhao> |
| CC | alegrand, anpicker, erooth, kahara, kakkoyun, lcosic, mas-hatada, mfuruta, pkrupa, rh-container, spasquie |
| Doc Type | No Doc Update |
| Type | Bug |
| Last Closed | 2021-07-27 22:51:38 UTC |
| Bug Blocks | 1935585 |

Description
Sergiusz Urbaniak
2021-03-05 08:24:05 UTC
4.8 uses prometheus-operator 0.45

# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-03-06-055252   True        False         148m    Cluster version is 4.8.0-0.nightly-2021-03-06-055252

# oc -n openshift-monitoring logs prometheus-operator-684947f46c-28gxl -c prometheus-operator
level=info ts=2021-03-07T23:43:09.721849275Z caller=main.go:233 msg="Starting Prometheus Operator" version="(version=0.45.0, branch=rhaos-4.8-rhel-8, revision=9d3e9a6)"

Dear Red Hat,

Does Red Hat plan to backport this fix to older versions? Our customer hit the same issue on OCP4.5. 4.8 is not yet GA, so they cannot upgrade to it now.

How can we avoid this issue on older versions? Currently only one Prometheus instance is running in the customer's environment, because the other one restarts repeatedly due to this issue. If the same issue occurred in both instances, cluster monitoring would become completely unavailable. This is very critical.

Best Regards,
Masaki Hatada

The bug fix has been backported to 4.6.22 (bug 1935586) and 4.7.2 (bug 1935585).

Dear Simon,
Thank you for your update.
> The bug fix has been backported to 4.6.22 (bug 1935586) and 4.7.2 (bug 1935585).
Our customer is using OCP4.5.
Of course, we will upgrade their cluster in the future, but it will take time.
Please let us know if there is a workaround for this issue.
Best Regards,
Masaki Hatada

Unfortunately we have no workaround for 4.5.

Clearing needinfo flag.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438
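
The fixed versions named in this thread (4.6.22, 4.7.2, and target release 4.8.0) can be turned into a quick check of whether a given cluster version should already carry the fix. The following is only a sketch: the function name `has_wal_probe_fix` and the `FIXED_IN` table are hypothetical, it assumes releases newer than 4.8 inherit the fix, and it judges pre-release builds (such as nightlies) by their numeric prefix alone.

```python
import re

# Minimum z-stream carrying the fix per minor release, as stated in this thread:
# 4.6.22 (bug 1935586), 4.7.2 (bug 1935585), and target release 4.8.0.
FIXED_IN = {(4, 6): 22, (4, 7): 2, (4, 8): 0}


def has_wal_probe_fix(version: str) -> bool:
    """Return True if `version` is at or past the fixed z-stream for its minor release."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor) in FIXED_IN:
        return patch >= FIXED_IN[(major, minor)]
    # Assumption: anything newer than 4.8 includes the fix; older minors
    # such as 4.5 received no backport according to this thread.
    return (major, minor) > (4, 8)
```

For example, `has_wal_probe_fix("4.5.40")` returns False, which matches the statement above that no fix or workaround exists for 4.5.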