Bug 1950173
Summary: | Non-fatal: prometheus.env.yaml: no such file or directory | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | W. Trevor King <wking> |
Component: | Monitoring | Assignee: | Filip Petkovski <fpetkovs> |
Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> |
Severity: | low | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.6 | CC: | anpicker, erooth, lcosic, spasquie |
Target Milestone: | --- | Keywords: | Upgrades |
Target Release: | 4.9.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | No Doc Update | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2021-10-18 17:30:04 UTC | Type: | Bug |
Description
W. Trevor King
2021-04-15 23:31:54 UTC
> it seems it is the same bug as bug 1777216

Yeah, looks likely. I'm not super-excited about "exit 1 to get a fresh pass at loading a config file"; it seems like it would be easy enough to have the current process reload the file. But if that's the intended behavior, it seems like we should at least log a "prometheus.env.yaml created; exiting so the replacement container will load the new config" line or something so we don't have lots of folks wondering if this is a bug or not.

There is an upstream PR for this issue which has been reviewed and should be merged this week: https://github.com/prometheus-operator/prometheus-operator/pull/3955
This issue would therefore get fixed once we bump the version of prometheus-operator in CMO.

The upstream PR has been merged and should be included in v0.49.0 (planned for end of June).

FYI: same issue for the prometheus-user-workload pods

# oc -n openshift-user-workload-monitoring describe pod prometheus-user-workload-0
...
Containers:
  prometheus:
    Container ID:  cri-o://579bd8fa35f14ef615d32f1ab1ea01e7c026b19ae968d2cf1721190dc24712f4
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:5dec081bc9e08360e810eac16b662b962e95bd98c163ff5f13059790bf9dfe10
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:5dec081bc9e08360e810eac16b662b962e95bd98c163ff5f13059790bf9dfe10
    Port:          <none>
    Host Port:     <none>
    Args:
      --web.console.templates=/etc/prometheus/consoles
      --web.console.libraries=/etc/prometheus/console_libraries
      --config.file=/etc/prometheus/config_out/prometheus.env.yaml
      --storage.tsdb.path=/prometheus
      --storage.tsdb.retention.time=24h
      --web.enable-lifecycle
      --storage.tsdb.no-lockfile
      --web.route-prefix=/
      --web.listen-address=127.0.0.1:9090
    State:          Running
      Started:      Wed, 16 Jun 2021 02:35:33 -0400
    Last State:     Terminated
      Reason:       Error
      Message:      level=error ts=2021-06-16T06:35:31.993Z caller=main.go:347 msg="Error loading config (--config.file=/etc/prometheus/config_out/prometheus.env.yaml)" err="open /etc/prometheus/config_out/prometheus.env.yaml: no such file or directory"
      Exit Code:    2

issue is fixed with 4.9.0-0.nightly-2021-07-18-155939

$ oc -n openshift-monitoring get pod | grep prometheus-k8s
prometheus-k8s-0   7/7   Running   0   176m
prometheus-k8s-1   7/7   Running   0   176m

$ oc -n openshift-user-workload-monitoring get pod | grep prometheus-user-workload
prometheus-user-workload-0   5/5   Running   0   8m19s
prometheus-user-workload-1   5/5   Running   0   8m19s

$ oc -n openshift-monitoring logs $(oc -n openshift-monitoring get po | grep prometheus-operator | awk '{print $1}') -c prometheus-operator | head -n 2
level=info ts=2021-07-18T23:45:16.567948925Z caller=main.go:295 msg="Starting Prometheus Operator" version="(version=0.49.0, branch=rhaos-4.9-rhel-8, revision=c878cd4)"
level=info ts=2021-07-18T23:45:16.56799377Z caller=main.go:296 build_context="(go=go1.16.4, user=root, date=20210709-06:10:25)"

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759
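
For anyone reading this later: the description's complaint was that the container exits and relies on a restart to pick up prometheus.env.yaml once it has been generated. A rough sketch of the alternative idea, waiting for the file to appear before starting, might look like the following. The function name `wait_for_config`, the one-second poll interval, and the default timeout are all assumptions made for illustration; this is not taken from the upstream PR and does not claim to show how the fix was implemented.

```shell
#!/bin/sh
# Hypothetical helper (not from prometheus-operator): poll until the config
# file exists and is non-empty, instead of exiting and waiting for a restart.
# Arguments: $1 = path to the config file, $2 = timeout in seconds (default 30).
wait_for_config() {
    path=$1
    timeout=${2:-30}
    elapsed=0
    while [ ! -s "$path" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            # Say why we are giving up, so the exit is not mistaken for a bug.
            echo "timed out after ${timeout}s waiting for $path" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "found $path after ${elapsed}s"
}
```

A wrapper could then call something like `wait_for_config /etc/prometheus/config_out/prometheus.env.yaml 60` before launching prometheus with the `--config.file` argument shown in the pod description above. Whether a wait or a logged exit is preferable depends on how the pod is supervised; the sketch only shows that the wait is cheap to express.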