Bug 1999397
| Summary: | Prometheus: data race in the loadWAL function | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jan Fajerski <jfajersk> |
| Component: | Monitoring | Assignee: | Jan Fajerski <jfajersk> |
| Status: | CLOSED ERRATA | QA Contact: | hongyan li <hongyli> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 4.9 | CC: | amuller, anpicker, aos-bugs, arajkuma, erooth |
| Target Milestone: | --- | | |
| Target Release: | 4.9.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1999580 | Environment: | |
| Last Closed: | 2021-10-18 17:49:59 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1999580 | | |
Description
Jan Fajerski
2021-08-31 06:01:48 UTC
*** Bug 1999580 has been marked as a duplicate of this bug. ***

checked with 4.9.0-0.nightly-2021-08-31-123131, prometheus version is 2.29.2
# oc -n openshift-monitoring logs -c prometheus prometheus-k8s-0
level=info ts=2021-09-01T01:16:01.399Z caller=main.go:445 msg="Starting Prometheus" version="(version=2.29.2, branch=rhaos-4.9-rhel-8, revision=99e16e81fcaee8ef609985f306aced9a465304ab)"
level=info ts=2021-09-01T01:16:01.399Z caller=main.go:450 build_context="(go=go1.16.6, user=root@e5c9e3ac803a, date=20210831-09:46:00)"
...
However, the version label on the related resources is still 2.29.1, for example:
# oc -n openshift-monitoring get prometheus k8s -oyaml
...
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: openshift-monitoring
    app.kubernetes.io/version: 2.29.1
    prometheus: k8s
  name: k8s
(In reply to Junqi Zhao from comment #6)
> checked with 4.9.0-0.nightly-2021-08-31-123131, prometheus version is 2.29.2
> # oc -n openshift-monitoring logs -c prometheus prometheus-k8s-0
> level=info ts=2021-09-01T01:16:01.399Z caller=main.go:445 msg="Starting Prometheus" version="(version=2.29.2, branch=rhaos-4.9-rhel-8, revision=99e16e81fcaee8ef609985f306aced9a465304ab)"
> level=info ts=2021-09-01T01:16:01.399Z caller=main.go:450 build_context="(go=go1.16.6, user=root@e5c9e3ac803a, date=20210831-09:46:00)"
> ...
>
> but the related resources version is still 2.29.1, example:
> # oc -n openshift-monitoring get prometheus k8s -oyaml
> ...
> labels:
> app.kubernetes.io/component: prometheus
> app.kubernetes.io/name: prometheus
> app.kubernetes.io/part-of: openshift-monitoring
> app.kubernetes.io/version: 2.29.1
> prometheus: k8s
> name: k8s

Yes that is expected. The CMO asset sync is in PR https://github.com/openshift/cluster-monitoring-operator/pull/1353 and should fix this. It is now also linked with this bug.
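The mismatch discussed in this thread (a running Prometheus binary at 2.29.2 while the `app.kubernetes.io/version` label still reads 2.29.1) can be checked mechanically. The following is only a minimal sketch in Python, not part of the fix or of any OpenShift tooling. The helper names `version_from_log`, `version_from_labels` and `versions_match` are hypothetical. The log line is copied from the output above; the stale label string is an assumption, built by writing the YAML labels from comment #6 in the comma-separated `key=value` form that `--show-labels` prints.

```python
import re
from typing import Optional

# Startup log line as reported in this bug.
LOG_LINE = (
    'level=info ts=2021-09-01T01:16:01.399Z caller=main.go:445 '
    'msg="Starting Prometheus" version="(version=2.29.2, '
    'branch=rhaos-4.9-rhel-8, '
    'revision=99e16e81fcaee8ef609985f306aced9a465304ab)"'
)

# Assumption: the labels from comment #6, rewritten as key=value pairs.
LABELS_STALE = (
    "app.kubernetes.io/component=prometheus,"
    "app.kubernetes.io/name=prometheus,"
    "app.kubernetes.io/part-of=openshift-monitoring,"
    "app.kubernetes.io/version=2.29.1,prometheus=k8s"
)

# Same labels with the version the asset sync is expected to produce.
LABELS_SYNCED = LABELS_STALE.replace("version=2.29.1", "version=2.29.2")


def version_from_log(line: str) -> Optional[str]:
    """Extract the binary version from a 'Starting Prometheus' log line."""
    match = re.search(r'version="\(version=([^,)\s]+)', line)
    return match.group(1) if match else None


def version_from_labels(labels: str) -> Optional[str]:
    """Return the app.kubernetes.io/version value from a label string."""
    for pair in labels.split(","):
        key, _, value = pair.partition("=")
        if key.strip() == "app.kubernetes.io/version":
            return value.strip()
    return None


def versions_match(log_line: str, labels: str) -> bool:
    """True only if both versions are present and identical."""
    running = version_from_log(log_line)
    labelled = version_from_labels(labels)
    return running is not None and running == labelled


if __name__ == "__main__":
    print("stale labels match: ", versions_match(LOG_LINE, LABELS_STALE))
    print("synced labels match:", versions_match(LOG_LINE, LABELS_SYNCED))
```

In practice the two inputs would come from the `oc ... logs` and `oc ... get ... --show-labels` commands shown in this bug; the sketch deliberately works on captured text so it does not depend on a live cluster.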
Test with payload 4.9.0-0.nightly-2021-09-05-192114

$ oc -n openshift-monitoring get prometheus k8s --show-labels
NAME   VERSION   REPLICAS   AGE   LABELS
k8s    2.29.2    2          96m   app.kubernetes.io/component=prometheus,app.kubernetes.io/name=prometheus,app.kubernetes.io/part-of=openshift-monitoring,app.kubernetes.io/version=2.29.2,prometheus=k8s

$ oc -n openshift-monitoring get pod --show-labels | grep app=prometheus
prometheus-k8s-0   7/7   Running   0   84m   ......app.kubernetes.io/version=2.29.2,app=prometheus,......prometheus=k8s,statefulset.kubernetes.io/pod-name=prometheus-k8s-0
prometheus-k8s-1   7/7   Running   0   84m   ......app.kubernetes.io/version=2.29.2,app=prometheus,......prometheus=k8s,statefulset.kubernetes.io/pod-name=prometheus-k8s-1

$ oc get clusterrolebinding prometheus-k8s -n openshift-monitoring --show-labels
NAME             ROLE                         AGE   LABELS
prometheus-k8s   ClusterRole/prometheus-k8s   89m   app.kubernetes.io/component=prometheus,app.kubernetes.io/name=prometheus,app.kubernetes.io/part-of=openshift-monitoring,app.kubernetes.io/version=2.29.2

$ oc get clusterrole prometheus-k8s -n openshift-monitoring --show-labels
NAME             CREATED AT             LABELS
prometheus-k8s   2021-09-05T23:58:08Z   app.kubernetes.io/component=prometheus,app.kubernetes.io/name=prometheus,app.kubernetes.io/part-of=openshift-monitoring,app.kubernetes.io/version=2.29.2

$ oc -n openshift-user-workload-monitoring get prometheus user-workload --show-labels
NAME            VERSION   REPLICAS   AGE     LABELS
user-workload   2.29.2    2          5m29s   app.kubernetes.io/component=prometheus,app.kubernetes.io/name=prometheus,app.kubernetes.io/part-of=openshift-monitoring,app.kubernetes.io/version=2.29.2,prometheus=user-workload

$ oc -n openshift-user-workload-monitoring logs prometheus-user-workload-0
level=info ts=2021-09-06T01:34:00.770Z caller=main.go:445 msg="Starting Prometheus" version="(version=2.29.2, branch=rhaos-4.9-rhel-8, revision=99e16e81fcaee8ef609985f306aced9a465304ab)"

Completed the Prometheus regression test; no issues found.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759