Bug 2094068 - No runbook created for NorthboundStale alert
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.12.0
Assignee: Martin Kennelly
QA Contact: Weibin Liang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-06-06 18:05 UTC by Weibin Liang
Modified: 2023-01-17 19:50 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-17 19:49:52 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 1482 | 0 | None | open | Bug 2094068: Add northboundstale alert runbook | 2022-06-10 12:20:18 UTC
Red Hat Product Errata | RHSA-2022:7399 | 0 | None | None | None | 2023-01-17 19:50:00 UTC

Description Weibin Liang 2022-06-06 18:05:09 UTC
Description of problem:
https://issues.redhat.com/browse/SDN-2736: "Create runbook and link SOP for NorthboundStale alert"

No runbook has been created for the NorthboundStale alert in the latest v4.11 build.

Version-Release number of selected component (if applicable):
4.11.0-0.nightly-2022-06-04-014713

How reproducible:
Always

Steps to Reproduce:
Query the master-rules PrometheusRule and check the NorthboundStale alert for a runbook_url annotation; none is present:
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o json
                    {
                        "alert": "NorthboundStale",
                        "annotations": {
                            "description": "Networking control plane is degraded. Networking configuration updates applied to the cluster will not be\nimplemented. Existing workloads should continue to have connectivity. OVN-Kubernetes control plane and/or\nOVN northbound database may not be functional.\n",
                            "summary": "ovn-kubernetes has not written anything to the northbound database for too long."
                        },
                        "expr": "time() - max(ovnkube_master_nb_e2e_timestamp) \u003e 120\n",
                        "for": "10m",
                        "labels": {
                            "severity": "warning"
                        }
                    },
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o jsonpath={.spec.groups[0].rules[3].annotations.runbook_url}

#### Compare NoOvnMasterLeader alert
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o jsonpath={.spec.groups[0].rules[2].annotations.runbook_url}
https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/NoOvnMasterLeader.md

[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o json
                    {
                        "alert": "NoOvnMasterLeader",
                        "annotations": {
                            "description": "Networking control plane is degraded. Networking configuration updates applied to the cluster will not be\nimplemented while there is no OVN Kubernetes leader. Existing workloads should continue to have connectivity.\nOVN-Kubernetes control plane is not functional.\n",
                            "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/NoOvnMasterLeader.md",
                            "summary": "There is no ovn-kubernetes master leader."
                        },
                        "expr": "max(ovnkube_master_leader) == 0\n",
                        "for": "10m",
                        "labels": {
                            "severity": "critical"
                        }
                    },
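
The jsonpath queries above select rules by position (rules[3], rules[2]), which depends on the order of rules in the group and may differ between builds. A minimal sketch of a lookup by alert name instead, assuming the output of `oc ... -o json` has already been parsed into a dict. The helper name `runbook_url_for` and the trimmed `SAMPLE` document are illustrative only and are not part of any OpenShift tooling:

```python
def runbook_url_for(rule_doc, alert_name):
    """Return the runbook_url annotation of the named alert, or None if it is absent."""
    for group in rule_doc.get("spec", {}).get("groups", []):
        for rule in group.get("rules", []):
            if rule.get("alert") == alert_name:
                return rule.get("annotations", {}).get("runbook_url")
    return None  # alert not found in any group


# Trimmed sample that mirrors only the two rules quoted in this report.
SAMPLE = {
    "spec": {
        "groups": [
            {
                "rules": [
                    {
                        "alert": "NoOvnMasterLeader",
                        "annotations": {
                            "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/NoOvnMasterLeader.md",
                            "summary": "There is no ovn-kubernetes master leader.",
                        },
                    },
                    {
                        "alert": "NorthboundStale",
                        "annotations": {
                            "summary": "ovn-kubernetes has not written anything to the northbound database for too long.",
                        },
                    },
                ]
            }
        ]
    }
}

print(runbook_url_for(SAMPLE, "NoOvnMasterLeader"))
print(runbook_url_for(SAMPLE, "NorthboundStale"))
```

On a live cluster the same function could be fed with `json.load(sys.stdin)` from the piped `oc` output; that wiring is left out here to keep the sketch self-contained.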

Actual results:
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o jsonpath={.spec.groups[0].rules[3].annotations.runbook_url}

Nothing is returned.

Expected results:
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o jsonpath={.spec.groups[0].rules[3].annotations.runbook_url}
https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/NorthboundStale.md
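
The same expectation can be checked for every alert in the rule set rather than one rule at a time. A rough sketch, assuming a PrometheusRule document parsed into a dict; `alerts_missing_runbook` is a hypothetical helper and `doc` is a cut-down stand-in for the real master-rules object:

```python
def alerts_missing_runbook(rule_doc):
    """List the names of alerting rules that have no runbook_url annotation."""
    missing = []
    for group in rule_doc.get("spec", {}).get("groups", []):
        for rule in group.get("rules", []):
            # Recording rules carry "record" instead of "alert"; skip them.
            name = rule.get("alert")
            if name and not rule.get("annotations", {}).get("runbook_url"):
                missing.append(name)
    return missing


# Cut-down stand-in for the master-rules PrometheusRule described in this report.
doc = {
    "spec": {
        "groups": [
            {
                "rules": [
                    {
                        "alert": "NoOvnMasterLeader",
                        "annotations": {
                            "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/NoOvnMasterLeader.md"
                        },
                    },
                    {"alert": "NorthboundStale", "annotations": {}},
                ]
            }
        ]
    }
}

print(alerts_missing_runbook(doc))  # ['NorthboundStale']
```

An empty list would correspond to the expected result for this bug, with every alert carrying a runbook link.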

Additional info:

Comment 2 Weibin Liang 2022-08-31 13:51:43 UTC
Test passed in 4.12.0-0.nightly-2022-08-31-064023

Comment 5 errata-xmlrpc 2023-01-17 19:49:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399

