Description of problem:
https://issues.redhat.com/browse/SDN-2739: Create runbook and link SOP for SouthboundStale alert
There is no runbook created for the SouthboundStale alert in the latest v4.11 build.

Version-Release number of selected component (if applicable):
4.11.0-0.nightly-2022-06-04-014713

How reproducible:
Always

Steps to Reproduce:
No runbook created for SouthboundStale alert

[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o json
{
  "alert": "SouthboundStale",
  "annotations": {
    "description": "Networking control plane is degraded. Networking configuration updates may not be applied to the cluster or\ntaking a long time to apply. This usually means there is a large load on OVN component 'northd' or it is not\nfunctioning.\n",
    "summary": "ovn-northd has not successfully synced any changes to the southbound DB for too long."
  },
  "expr": "max(ovnkube_master_nb_e2e_timestamp) - max(ovnkube_master_sb_e2e_timestamp) \u003e 120\n",
  "for": "10m",
  "labels": {
    "severity": "warning"
  }
},
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o jsonpath={.spec.groups[0].rules[4].annotations.runbook_url}

#### Compare NoOvnMasterLeader alert
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o jsonpath={.spec.groups[0].rules[2].annotations.runbook_url}
https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/NoOvnMasterLeader.md
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o json
{
  "alert": "NoOvnMasterLeader",
  "annotations": {
    "description": "Networking control plane is degraded. Networking configuration updates applied to the cluster will not be\nimplemented while there is no OVN Kubernetes leader. Existing workloads should continue to have connectivity.\nOVN-Kubernetes control plane is not functional.\n",
    "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/NoOvnMasterLeader.md",
    "summary": "There is no ovn-kubernetes master leader."
  },
  "expr": "max(ovnkube_master_leader) == 0\n",
  "for": "10m",
  "labels": {
    "severity": "critical"
  }
},

Actual results:
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o jsonpath={.spec.groups[0].rules[4].annotations.runbook_url}
Nothing is returned.

Expected results:
[weliang@weliang ~]$ oc -n openshift-ovn-kubernetes get PrometheusRule master-rules -o jsonpath={.spec.groups[0].rules[3].annotations.runbook_url}
https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/SouthboundStale.md

Additional info:
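The jsonpath checks above address each rule by its position in the group, and that position can shift between builds. As a rough sketch only (not part of the original report), the same check could be done by alert name instead. The function name alerts_missing_runbook and the trimmed SAMPLE object are hypothetical and exist only for illustration; the field names (spec.groups[].rules[].alert, annotations.runbook_url) follow the output shown in this report.

```python
def alerts_missing_runbook(rule):
    """Return the names of alerting rules that have no runbook_url annotation.

    `rule` is a PrometheusRule object parsed from JSON (hypothetical helper,
    not something shipped with OpenShift).
    """
    missing = []
    for group in rule.get("spec", {}).get("groups", []):
        for entry in group.get("rules", []):
            name = entry.get("alert")
            if name is None:
                # Entries without an "alert" key are recording rules; skip them.
                continue
            if not entry.get("annotations", {}).get("runbook_url"):
                missing.append(name)
    return missing


# Trimmed, hand-written sample mirroring the two alerts quoted in this report.
SAMPLE = {
    "spec": {
        "groups": [
            {
                "rules": [
                    {
                        "alert": "NoOvnMasterLeader",
                        "annotations": {
                            "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/NoOvnMasterLeader.md",
                            "summary": "There is no ovn-kubernetes master leader.",
                        },
                    },
                    {
                        "alert": "SouthboundStale",
                        "annotations": {
                            "summary": "ovn-northd has not successfully synced any changes to the southbound DB for too long.",
                        },
                    },
                ]
            }
        ]
    }
}

if __name__ == "__main__":
    print(alerts_missing_runbook(SAMPLE))
```

Against a live cluster, the same function could presumably be fed the output of the "oc ... get PrometheusRule master-rules -o json" command used above, parsed with json.load(sys.stdin), so the result does not depend on the rule index.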
Tested and passed in 4.11.0-0.nightly-2022-06-21-040754
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069