Bug 2094071 - No runbook created for SouthboundStale alert
Summary: No runbook created for SouthboundStale alert
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.11.0
Assignee: Martin Kennelly
QA Contact: Weibin Liang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-06-06 18:17 UTC by Weibin Liang
Modified: 2022-08-10 11:16 UTC (History)

Fixed In Version:
Doc Type: Release Note
Doc Text:
Runbook created and attached to alert 'SouthboundStale' for OVN-Kubernetes CNI.
Clone Of:
Environment:
Last Closed: 2022-08-10 11:16:21 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 1481 | 0 | None | open | Bug 2094071: Add southboundStale alert runbook | 2022-06-10 12:19:38 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 11:16:36 UTC

Description Weibin Liang 2022-06-06 18:17:29 UTC
Description of problem:
https://issues.redhat.com/browse/SDN-2739: Create runbook and link SOP for SouthboundStale alert

No runbook exists for the SouthboundStale alert in the latest v4.11 build.

Version-Release number of selected component (if applicable):
4.11.0-0.nightly-2022-06-04-014713

How reproducible:
Always

Steps to Reproduce:
Inspect the master-rules PrometheusRule; the SouthboundStale entry has no runbook_url annotation:
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o json
                    {
                        "alert": "SouthboundStale",
                        "annotations": {
                            "description": "Networking control plane is degraded. Networking configuration updates may not be applied to the cluster or\ntaking a long time to apply. This usually means there is a large load on OVN component 'northd' or it is not\nfunctioning.\n",
                            "summary": "ovn-northd has not successfully synced any changes to the southbound DB for too long."
                        },
                        "expr": "max(ovnkube_master_nb_e2e_timestamp) - max(ovnkube_master_sb_e2e_timestamp) \u003e 120\n",
                        "for": "10m",
                        "labels": {
                            "severity": "warning"
                        }
                    },
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o jsonpath={.spec.groups[0].rules[4].annotations.runbook_url}

#### Compare with the NoOvnMasterLeader alert, which has a runbook_url
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o jsonpath={.spec.groups[0].rules[2].annotations.runbook_url}
https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/NoOvnMasterLeader.md

[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o json
                    {
                        "alert": "NoOvnMasterLeader",
                        "annotations": {
                            "description": "Networking control plane is degraded. Networking configuration updates applied to the cluster will not be\nimplemented while there is no OVN Kubernetes leader. Existing workloads should continue to have connectivity.\nOVN-Kubernetes control plane is not functional.\n",
                            "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/NoOvnMasterLeader.md",
                            "summary": "There is no ovn-kubernetes master leader."
                        },
                        "expr": "max(ovnkube_master_leader) == 0\n",
                        "for": "10m",
                        "labels": {
                            "severity": "critical"
                        }
                    },
Actual results:
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o jsonpath={.spec.groups[0].rules[4].annotations.runbook_url}

Nothing is returned.
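
The same check can be made without depending on a rule's position in the list. The following is a minimal sketch, not part of the original report: it assumes the `-o json` output of the PrometheusRule has been loaded into a dict, and the helper name `alerts_missing_runbook` and the trimmed `SAMPLE` data are hypothetical, built only from the two rules quoted above.

```python
import json
from typing import List


def alerts_missing_runbook(prometheus_rule: dict) -> List[str]:
    """Return the names of alerting rules that lack a runbook_url annotation."""
    missing = []
    for group in prometheus_rule.get("spec", {}).get("groups", []):
        for rule in group.get("rules", []):
            name = rule.get("alert")
            if name is None:
                # Recording rules carry "record" instead of "alert"; skip them.
                continue
            if not rule.get("annotations", {}).get("runbook_url"):
                missing.append(name)
    return missing


# Hypothetical, trimmed sample that mirrors the two rules quoted in this report.
SAMPLE = json.loads("""
{
  "spec": {
    "groups": [
      {
        "rules": [
          {
            "alert": "NoOvnMasterLeader",
            "annotations": {
              "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/NoOvnMasterLeader.md",
              "summary": "There is no ovn-kubernetes master leader."
            }
          },
          {
            "alert": "SouthboundStale",
            "annotations": {
              "summary": "ovn-northd has not successfully synced any changes to the southbound DB for too long."
            }
          }
        ]
      }
    ]
  }
}
""")

print(alerts_missing_runbook(SAMPLE))  # ['SouthboundStale']
```

On a live cluster the input would presumably come from the `oc ... -o json` command shown in the steps above, saved to a file and read with `json.load`.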

Expected results:
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o jsonpath={.spec.groups[0].rules[3].annotations.runbook_url}
https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/SouthboundStale.md
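
Because the commands in this report address the rule by index (`rules[4]` in one place, `rules[3]` in another), a lookup by alert name may be a more robust way to verify the fix. This is a sketch under assumptions: the function name `runbook_url_for` and the `FIXED_SAMPLE` data are hypothetical, and the sample only illustrates the expected post-fix shape of the rule.

```python
from typing import Optional


def runbook_url_for(prometheus_rule: dict, alert_name: str) -> Optional[str]:
    """Find an alert by name and return its runbook_url, or None if absent."""
    for group in prometheus_rule.get("spec", {}).get("groups", []):
        for rule in group.get("rules", []):
            if rule.get("alert") == alert_name:
                return rule.get("annotations", {}).get("runbook_url")
    return None  # no alert with that name in any group


# Hypothetical sample showing the rule as it is expected to look after the fix.
FIXED_SAMPLE = {
    "spec": {
        "groups": [
            {
                "rules": [
                    {
                        "alert": "SouthboundStale",
                        "annotations": {
                            "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/SouthboundStale.md",
                        },
                    }
                ]
            }
        ]
    }
}

print(runbook_url_for(FIXED_SAMPLE, "SouthboundStale"))
print(runbook_url_for(FIXED_SAMPLE, "NoOvnMasterLeader"))  # None: not in this sample
```

Matching on the `alert` field means the check keeps working if rules are added or reordered between builds.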


Additional info:

Comment 3 Weibin Liang 2022-06-21 14:21:21 UTC
Tested and passed in 4.11.0-0.nightly-2022-06-21-040754

Comment 5 errata-xmlrpc 2022-08-10 11:16:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

