Created attachment 1685701 [details]
openshift-sdn pods' log

Description of problem:
On a fresh cluster, the ClusterIPTablesStale and NodeIPTablesStale alerts are firing.

ALERTS{alertname=~"ClusterIPTablesStale|NodeIPTablesStale"}

Element Value
ALERTS{alertname="ClusterIPTablesStale",alertstate="firing",severity="warning"} 1
ALERTS{alertname="NodeIPTablesStale",alertstate="firing",created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.0.6",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-master-1",pod="sdn-7hbq5",pod_ip="10.0.0.6",priority_class="system-node-critical",service="kube-state-metrics",severity="warning",uid="c23d7c15-bcd0-4d3c-a94b-d9c8b5acc971"} 1
ALERTS{alertname="NodeIPTablesStale",alertstate="firing",created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.0.7",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-master-2",pod="sdn-62xcj",pod_ip="10.0.0.7",priority_class="system-node-critical",service="kube-state-metrics",severity="warning",uid="a334b395-3d09-4c9e-8817-c896d9ba25e7"} 1
ALERTS{alertname="NodeIPTablesStale",alertstate="firing",created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.0.8",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-master-0",pod="sdn-lst42",pod_ip="10.0.0.8",priority_class="system-node-critical",service="kube-state-metrics",severity="warning",uid="d02c2a73-22e3-4824-8acf-f347eeb30813"} 1
ALERTS{alertname="NodeIPTablesStale",alertstate="firing",created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.1.10",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-rhelxy-0",pod="sdn-65n48",pod_ip="10.0.1.10",priority_class="system-node-critical",service="kube-state-metrics",severity="warning",uid="dfbf961d-9bd0-40e7-9502-cfdc2997be41"} 1
ALERTS{alertname="NodeIPTablesStale",alertstate="firing",created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.1.4",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-worker-centralus-1",pod="sdn-7qh4n",pod_ip="10.0.1.4",priority_class="system-node-critical",service="kube-state-metrics",severity="warning",uid="7e4eadab-f08e-492c-a3b0-f8712f6f4fac"} 1
ALERTS{alertname="NodeIPTablesStale",alertstate="firing",created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.1.5",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-worker-centralus-3",pod="sdn-zlnxd",pod_ip="10.0.1.5",priority_class="system-node-critical",service="kube-state-metrics",severity="warning",uid="450d8477-badb-45ed-a981-bd03c175ad8a"} 1
ALERTS{alertname="NodeIPTablesStale",alertstate="firing",created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.1.6",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-worker-centralus-2",pod="sdn-zkn9s",pod_ip="10.0.1.6",priority_class="system-node-critical",service="kube-state-metrics",severity="warning",uid="a1844950-5b90-40f5-8803-9014b357a061"} 1
ALERTS{alertname="NodeIPTablesStale",alertstate="firing",created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.1.9",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-rhelxy-1",pod="sdn-pnphs",pod_ip="10.0.1.9",priority_class="system-node-critical",service="kube-state-metrics",severity="warning",uid="cdae257b-a2cd-4758-ae68-316ce66477fc"} 1

alert details
********************
alert: NodeIPTablesStale
expr: (timestamp(kubeproxy_sync_proxy_rules_last_timestamp_seconds) - on(pod) kubeproxy_sync_proxy_rules_last_timestamp_seconds) * on(pod) group_right() kube_pod_info{namespace="openshift-sdn",pod=~"sdn-[^-]*"} > 120
for: 20m
labels:
  severity: warning
annotations:
  message: SDN pod {{ $labels.pod }} on node {{ $labels.node }} has gone too long without syncing iptables rules.

alert: ClusterIPTablesStale
expr: quantile(0.95, timestamp(kubeproxy_sync_proxy_rules_last_timestamp_seconds) - on(pod) kubeproxy_sync_proxy_rules_last_timestamp_seconds * on(pod) group_right() kube_pod_info{namespace="openshift-sdn",pod=~"sdn-[^-]*"}) > 90
for: 20m
labels:
  severity: warning
annotations:
  message: The average time between iptables resyncs is too high. NOTE - There is some scrape delay and other offsets, 90s isn't exact but it is still too high.
********************

query the expr in prometheus

NodeIPTablesStale
expr: (timestamp(kubeproxy_sync_proxy_rules_last_timestamp_seconds) - on(pod) kubeproxy_sync_proxy_rules_last_timestamp_seconds) * on(pod) group_right() kube_pod_info{namespace="openshift-sdn",pod=~"sdn-[^-]*"} > 120
result:
Element Value
{created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.0.6",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-master-1",pod="sdn-7hbq5",pod_ip="10.0.0.6",priority_class="system-node-critical",service="kube-state-metrics",uid="c23d7c15-bcd0-4d3c-a94b-d9c8b5acc971"} 2278.20618224144
{created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.0.7",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-master-2",pod="sdn-62xcj",pod_ip="10.0.0.7",priority_class="system-node-critical",service="kube-state-metrics",uid="a334b395-3d09-4c9e-8817-c896d9ba25e7"} 2282.0367472171783
{created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.0.8",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-master-0",pod="sdn-lst42",pod_ip="10.0.0.8",priority_class="system-node-critical",service="kube-state-metrics",uid="d02c2a73-22e3-4824-8acf-f347eeb30813"} 2291.8284714221954
{created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.1.10",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-rhelxy-0",pod="sdn-65n48",pod_ip="10.0.1.10",priority_class="system-node-critical",service="kube-state-metrics",uid="dfbf961d-9bd0-40e7-9502-cfdc2997be41"} 2271.0348105430603
{created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.1.4",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-worker-centralus-1",pod="sdn-7qh4n",pod_ip="10.0.1.4",priority_class="system-node-critical",service="kube-state-metrics",uid="7e4eadab-f08e-492c-a3b0-f8712f6f4fac"} 2287.8972160816193
{created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.1.5",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-worker-centralus-3",pod="sdn-zlnxd",pod_ip="10.0.1.5",priority_class="system-node-critical",service="kube-state-metrics",uid="450d8477-badb-45ed-a981-bd03c175ad8a"} 2274.371784210205
{created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.1.6",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-worker-centralus-2",pod="sdn-zkn9s",pod_ip="10.0.1.6",priority_class="system-node-critical",service="kube-state-metrics",uid="a1844950-5b90-40f5-8803-9014b357a061"} 2283.6773200035095
{created_by_kind="DaemonSet",created_by_name="sdn",endpoint="https-main",host_ip="10.0.1.9",instance="10.129.2.6:8443",job="kube-state-metrics",namespace="openshift-sdn",node="yinzhou-share-05060154-rhelxy-1",pod="sdn-pnphs",pod_ip="10.0.1.9",priority_class="system-node-critical",service="kube-state-metrics",uid="cdae257b-a2cd-4758-ae68-316ce66477fc"} 2282.943865060

ClusterIPTablesStale
expr: quantile(0.95, timestamp(kubeproxy_sync_proxy_rules_last_timestamp_seconds) - on(pod) kubeproxy_sync_proxy_rules_last_timestamp_seconds * on(pod) group_right() kube_pod_info{namespace="openshift-sdn",pod=~"sdn-[^-]*"}) > 90
result:
Element Value
{} 1697.8125918507576

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-05-05-205255 cluster, UPI on Azure

How reproducible:
always

Steps to Reproduce:
1. Check the firing alerts in Prometheus.

Actual results:
ClusterIPTablesStale/NodeIPTablesStale alerts triggered

Expected results:
no such alerts

Additional info:
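For anyone reading the numbers above: in PromQL, timestamp(x) returns the time at which each sample of x was scraped, so the NodeIPTablesStale expression appears to compute "scrape time minus last reported sync time" per SDN pod, which is why values around 2270-2290 seconds exceed the 120 second threshold. A minimal Python sketch of that arithmetic follows. The function name, the dictionary inputs, and the sample timestamps are illustrative assumptions rather than anything from openshift-sdn or Prometheus, and the sketch ignores the "for: 20m" clause.

```python
from typing import Dict

# Threshold taken from the alert expression ("> 120"); everything else here is hypothetical.
THRESHOLD_SECONDS = 120.0


def stale_pods(
    scrape_time: Dict[str, float],
    last_sync: Dict[str, float],
    threshold: float = THRESHOLD_SECONDS,
) -> Dict[str, float]:
    """Return {pod: seconds since the last iptables sync} for pods over the threshold.

    scrape_time maps pod name to the sample timestamp (what timestamp(...) yields).
    last_sync maps pod name to the metric value (the last sync time in seconds).
    """
    result: Dict[str, float] = {}
    for pod, scraped_at in scrape_time.items():
        if pod not in last_sync:
            # Mirrors on(pod) matching: a series without a partner is dropped.
            continue
        age = scraped_at - last_sync[pod]
        if age > threshold:
            result[pod] = age
    return result


if __name__ == "__main__":
    # Made-up timestamps: the first pod last synced 2300 s ago, the second 30 s ago.
    scrape = {"sdn-7hbq5": 10_000.0, "sdn-example": 10_000.0}
    synced = {"sdn-7hbq5": 7_700.0, "sdn-example": 9_970.0}
    print(stale_pods(scrape, synced))
```

Under these made-up inputs only the first pod is reported, since its last sync is well past the threshold while the second pod synced recently.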
*** This bug has been marked as a duplicate of bug 1826339 ***