Bug 2040694
| Field | Value |
|---|---|
| Summary | Three upstream HTTPClientConfig struct fields missing in the operator |
| Product | OpenShift Container Platform |
| Component | Monitoring |
| Version | 4.9 |
| Target Release | 4.10.0 |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Hardware | Unspecified |
| OS | Unspecified |
| Type | Bug |
| Reporter | Lucas López Montero <llopezmo> |
| Assignee | Philip Gough <pgough> |
| QA Contact | Junqi Zhao <juzhao> |
| CC | amuller, anpicker, aos-bugs, benjamin.alpert, erooth |
| Last Closed | 2022-03-10 16:39:37 UTC |
| Bug Blocks | 2041459 |

Description
Lucas López Montero
2022-01-14 14:28:54 UTC
This issue should be resolved in 4.10 via https://github.com/prometheus-operator/prometheus-operator/pull/4333/

Tested with 4.10.0-0.nightly-2022-01-17-182202, following the steps in https://docs.openshift.com/container-platform/4.9/monitoring/managing-alerts.html#applying-custom-alertmanager-configuration_managing-alerts: output the current Alertmanager configuration into the file alertmanager.yaml and edit it to include http_config.follow_redirects.

1. Output the current configuration:
# oc -n openshift-monitoring get secret alertmanager-main --template='{{ index .data "alertmanager.yaml" }}' | base64 --decode > alertmanager.yaml

2. Edit alertmanager.yaml to include http_config.follow_redirects:
****************************
global:
  resolve_timeout: 5m
  http_config:
    follow_redirects: "false"
inhibit_rules:
- equal:
  - namespace
  - alertname
  source_matchers:
  - severity = critical
  target_matchers:
  - severity =~ warning|info
- equal:
  - namespace
  - alertname
  source_matchers:
  - severity = warning
  target_matchers:
  - severity = info
receivers:
- name: Default
- name: Watchdog
- name: Critical
- name: webhook
  webhook_configs:
  - send_resolved: "true"
    http_config:
      follow_redirects: "true"
    url: http://gems-agent.gemcloud-system:8041/alert
route:
  group_by:
  - namespace
  group_interval: 5m
  group_wait: 30s
  receiver: Default
  repeat_interval: 12h
  routes:
  - matchers:
    - alertname = Watchdog
    receiver: Watchdog
  - matchers:
    - severity = critical
    receiver: Critical
  - receiver: webhook
    match:
      severity: critical
****************************

3. Apply the new configuration; no error is reported:
# oc -n openshift-monitoring create secret generic alertmanager-main --from-file=alertmanager.yaml --dry-run=client -o=yaml | oc -n openshift-monitoring replace secret --filename=-
secret/alertmanager-main replaced

4. The new config file is loaded:
# oc -n openshift-monitoring exec -c alertmanager alertmanager-main-0 -- cat /etc/alertmanager/config/alertmanager.yaml
global:
  resolve_timeout: 5m
  http_config:
    follow_redirects: "false"
inhibit_rules:
- equal:
  - namespace
  - alertname
  source_matchers:
  - severity = critical
  target_matchers:
  - severity =~ warning|info
- equal:
  - namespace
  - alertname
  source_matchers:
  - severity = warning
  target_matchers:
  - severity = info
receivers:
- name: Default
- name: Watchdog
- name: Critical
- name: webhook
  webhook_configs:
  - send_resolved: "true"
    http_config:
      follow_redirects: "true"
    url: http://gems-agent.gemcloud-system:8041/alert
route:
  group_by:
  - namespace
  group_interval: 5m
  group_wait: 30s
  receiver: Default
  repeat_interval: 12h
  routes:
  - matchers:
    - alertname = Watchdog
    receiver: Watchdog
  - matchers:
    - severity = critical
    receiver: Critical
  - receiver: webhook
    match:
      severity: critical
****************************

The same result is returned by:
# oc -n openshift-monitoring get secret alertmanager-main -o jsonpath="{.data.alertmanager\.yaml}" | base64 -d

But there is an error in the alertmanager container:
# oc -n openshift-monitoring logs -c alertmanager alertmanager-main-0 | tail
level=info ts=2022-01-18T13:16:46.567Z caller=coordinator.go:113 component=configuration msg="Loading configuration file" file=/etc/alertmanager/config/alertmanager.yaml
level=error ts=2022-01-18T13:16:46.567Z caller=coordinator.go:118 component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config/alertmanager.yaml err="yaml: unmarshal errors:\n line 4: cannot unmarshal !!str `false` into bool\n line 26: cannot unmarshal !!str `true` into bool\n line 28: cannot unmarshal !!str `true` into bool"
level=info ts=2022-01-18T13:16:51.567Z caller=coordinator.go:113 component=configuration msg="Loading configuration file" file=/etc/alertmanager/config/alertmanager.yaml
level=error ts=2022-01-18T13:16:51.567Z caller=coordinator.go:118 component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config/alertmanager.yaml err="yaml: unmarshal errors:\n line 4: cannot unmarshal !!str `false` into bool\n line 26: cannot unmarshal !!str `true` into bool\n line 28: cannot unmarshal !!str `true` into bool"
level=info ts=2022-01-18T13:16:56.567Z caller=coordinator.go:113 component=configuration msg="Loading configuration file" file=/etc/alertmanager/config/alertmanager.yaml
level=error ts=2022-01-18T13:16:56.567Z caller=coordinator.go:118 component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config/alertmanager.yaml err="yaml: unmarshal errors:\n line 4: cannot unmarshal !!str `false` into bool\n line 26: cannot unmarshal !!str `true` into bool\n line 28: cannot unmarshal !!str `true` into bool"
level=info ts=2022-01-18T13:17:01.568Z caller=coordinator.go:113 component=configuration msg="Loading configuration file" file=/etc/alertmanager/config/alertmanager.yaml
level=error ts=2022-01-18T13:17:01.568Z caller=coordinator.go:118 component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config/alertmanager.yaml err="yaml: unmarshal errors:\n line 4: cannot unmarshal !!str `false` into bool\n line 26: cannot unmarshal !!str `true` into bool\n line 28: cannot unmarshal !!str `true` into bool"
level=info ts=2022-01-18T13:17:06.567Z caller=coordinator.go:113 component=configuration msg="Loading configuration file" file=/etc/alertmanager/config/alertmanager.yaml
level=error ts=2022-01-18T13:17:06.567Z caller=coordinator.go:118 component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config/alertmanager.yaml err="yaml: unmarshal errors:\n line 4: cannot unmarshal !!str `false` into bool\n line 26: cannot unmarshal !!str `true` into bool\n line 28: cannot unmarshal !!str `true` into bool"

Continuing from Comment 4: the change does not appear in ${alertmanager_route}/#/status, which still shows the default configuration.

I think the issue here is that the boolean values are quoted, so they are parsed as strings rather than booleans. I checked the following with amtool and it validates:
global:
  resolve_timeout: 5m
  http_config:
    follow_redirects: false
inhibit_rules:
- equal:
  - namespace
  - alertname
  source_matchers:
  - severity = critical
  target_matchers:
  - severity =~ warning|info
- equal:
  - namespace
  - alertname
  source_matchers:
  - severity = warning
  target_matchers:
  - severity = info
receivers:
- name: Default
- name: Watchdog
- name: Critical
- name: webhook
  webhook_configs:
  - send_resolved: true
    http_config:
      follow_redirects: true
    url: http://gems-agent.gemcloud-system:8041/alert
route:
  group_by:
  - namespace
  group_interval: 5m
  group_wait: 30s
  receiver: Default
  repeat_interval: 12h
  routes:
  - matchers:
    - alertname = Watchdog
    receiver: Watchdog
  - matchers:
    - severity = critical
    receiver: Critical
  - receiver: webhook
    match:
      severity: critical
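
The unmarshal errors reported earlier all come from boolean fields written as quoted strings. As a quick pre-check before replacing the secret, a script along the following lines could flag such values. This is only a minimal sketch and not a substitute for validating with amtool: it scans the text line by line instead of parsing YAML, the names `find_quoted_bools` and `BOOLEAN_KEYS` are hypothetical, and the key list covers only the two boolean fields that appear in this report.

```python
import re

# Assumption: only the boolean fields seen in this bug report are checked.
BOOLEAN_KEYS = ("follow_redirects", "send_resolved")

# Matches lines such as `follow_redirects: "false"` or `- send_resolved: 'true'`.
# Group 1 is the key, group 2 the quote character, group 3 the quoted value.
_QUOTED_BOOL = re.compile(r'''^\s*(?:-\s+)?(\w+):\s*(["'])(true|false)\2\s*$''')


def find_quoted_bools(config_text, keys=BOOLEAN_KEYS):
    """Return (line_number, key, value) for each listed key whose value is a quoted boolean."""
    findings = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        match = _QUOTED_BOOL.match(line)
        if match and match.group(1) in keys:
            findings.append((number, match.group(1), match.group(3)))
    return findings


if __name__ == "__main__":
    # Hypothetical usage: check the file produced in step 1 before applying it.
    with open("alertmanager.yaml", encoding="utf-8") as handle:
        for number, key, value in find_quoted_bools(handle.read()):
            print(f"line {number}: {key} is the string '{value}', expected a boolean")
```

Because the scan is purely textual, it would miss other boolean fields and unusual YAML layouts, so the Alertmanager logs and the status page remain the authoritative check.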
Updated as Comment 6; no error in the alertmanager container, and we could see the configuration loaded in the ${alertmanager_route}/#/status page.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056