Bug 1778314 - False Alarm - SDN pod has gone too long without syncing iptables rules
Summary: False Alarm - SDN pod has gone too long without syncing iptables rules
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.2.z
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.2.z
Assignee: jtanenba
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On: 1797033 1797041
Blocks:
Reported: 2019-11-29 21:15 UTC by Hugo Cisneiros (Eitch)
Modified: 2020-05-18 13:11 UTC (History)
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1797033
Environment:
Last Closed: 2020-02-24 16:52:45 UTC
Target Upstream Version:



Links
System | ID | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 416 | None | closed | Bug 1778314: False Alarm - SDN pod has gone too long without syncing iptables rule | 2020-07-08 00:56:18 UTC
Red Hat Product Errata | RHBA-2020:0460 | None | None | None | 2020-02-24 16:52:59 UTC

Description Hugo Cisneiros (Eitch) 2019-11-29 21:15:13 UTC
Description of problem:

Cluster dashboard has these alarms:

* SDN pod sdn-9d4hc on node etcd-2.example.com has gone too long without syncing iptables rules. NOTE - There is some scrape delay and other offsets, 120s isn't exact but it is still too high.
* SDN pod sdn-sgrgk on node etcd-0.example.com has gone too long without syncing iptables rules. NOTE - There is some scrape delay and other offsets, 120s isn't exact but it is still too high.

While looking at the must-gather logs, I didn't see any problems in these pods' sync processes:

2019-11-28T18:47:27.446545877Z I1128 18:47:27.446460   59750 proxy.go:331] hybrid proxy: syncProxyRules start
2019-11-28T18:47:27.597400981Z I1128 18:47:27.597347   59750 proxy.go:334] hybrid proxy: mainProxy.syncProxyRules complete
2019-11-28T18:47:27.653302796Z I1128 18:47:27.653260   59750 proxier.go:367] userspace proxy: processing 0 service events
2019-11-28T18:47:27.653409624Z I1128 18:47:27.653389   59750 proxier.go:346] userspace syncProxyRules took 55.912068ms
2019-11-28T18:47:27.653445515Z I1128 18:47:27.653436   59750 proxy.go:337] hybrid proxy: unidlingProxy.syncProxyRules complete

These syncs happen every 30 seconds, and the longest one took ~240ms to complete, which is very far from 120s.
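
The timing claim above can be checked mechanically. The following is a minimal Python sketch, not part of the original report, that pairs each "syncProxyRules start" line with the next "unidlingProxy.syncProxyRules complete" line and reports the elapsed time. The function names (parse_ts_ns, sync_durations) are hypothetical, and the sketch assumes every log line begins with a nanosecond-precision UTC timestamp, as in the excerpt.

```python
import calendar
import re
import time

# Leading timestamp as seen in the excerpt, e.g. 2019-11-28T18:47:27.446545877Z
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d+)Z\s")

# Marker strings copied from the log excerpt.
START = "hybrid proxy: syncProxyRules start"
END = "hybrid proxy: unidlingProxy.syncProxyRules complete"


def parse_ts_ns(line):
    """Return the line's leading timestamp as integer nanoseconds since the epoch, or None."""
    m = TS_RE.match(line)
    if m is None:
        return None
    secs = calendar.timegm(time.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S"))
    nanos = int(m.group(2).ljust(9, "0")[:9])  # normalise the fraction to 9 digits
    return secs * 1_000_000_000 + nanos


def sync_durations(lines):
    """Return the duration in seconds of each start -> unidlingProxy-complete pair."""
    durations = []
    start_ns = None
    for line in lines:
        if START in line:
            start_ns = parse_ts_ns(line)
        elif END in line and start_ns is not None:
            end_ns = parse_ts_ns(line)
            if end_ns is not None:
                durations.append((end_ns - start_ns) / 1e9)
            start_ns = None
    return durations


# The five lines quoted in the description.
EXCERPT = """
2019-11-28T18:47:27.446545877Z I1128 18:47:27.446460   59750 proxy.go:331] hybrid proxy: syncProxyRules start
2019-11-28T18:47:27.597400981Z I1128 18:47:27.597347   59750 proxy.go:334] hybrid proxy: mainProxy.syncProxyRules complete
2019-11-28T18:47:27.653302796Z I1128 18:47:27.653260   59750 proxier.go:367] userspace proxy: processing 0 service events
2019-11-28T18:47:27.653409624Z I1128 18:47:27.653389   59750 proxier.go:346] userspace syncProxyRules took 55.912068ms
2019-11-28T18:47:27.653445515Z I1128 18:47:27.653436   59750 proxy.go:337] hybrid proxy: unidlingProxy.syncProxyRules complete
""".strip().splitlines()

for seconds in sync_durations(EXCERPT):
    print(f"{seconds * 1000:.1f} ms")  # prints: 206.9 ms
```

For the quoted sync this gives roughly 207 ms, in line with the ~240ms upper bound mentioned above.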

Not sure why these alerts are firing.

Version-Release number of selected component (if applicable):

4.2.0

How reproducible:

* The alarms are on the web console dashboard;
* Log files at namespaces/openshift-sdn/sdn*/sdn/sdn/logs/current.log
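
Since the alert concerns the time between syncs rather than the duration of a single sync, the gap between consecutive sync starts is the more relevant figure to extract from those log files. The sketch below is a hedged illustration only: it assumes the must-gather archive has been unpacked into a local directory (the root path is an assumption), that the logs follow the layout given above, and that lines carry the same leading timestamp as the excerpt in the description. The helper names (start_times, max_start_gap, report) are hypothetical.

```python
import calendar
import glob
import os
import re
import time

# Relative path pattern taken from the report; the root directory is supplied by the caller.
LOG_GLOB = "namespaces/openshift-sdn/sdn*/sdn/sdn/logs/current.log"
START = "hybrid proxy: syncProxyRules start"
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d+)Z\s")
ALERT_THRESHOLD_S = 120.0  # the figure quoted in the alert text


def start_times(path):
    """Yield the time (float seconds since the epoch) of each sync start in one log file."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if START not in line:
                continue
            m = TS_RE.match(line)
            if m is None:
                continue
            secs = calendar.timegm(time.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S"))
            yield secs + float("0." + m.group(2))


def max_start_gap(path):
    """Return the longest gap in seconds between consecutive sync starts, or None."""
    times = list(start_times(path))
    if len(times) < 2:
        return None
    return max(later - earlier for earlier, later in zip(times, times[1:]))


def report(must_gather_root):
    """Map each SDN log file under the root to its longest gap between sync starts."""
    pattern = os.path.join(must_gather_root, LOG_GLOB)
    return {path: max_start_gap(path) for path in sorted(glob.glob(pattern))}


if __name__ == "__main__":
    # "." stands in for the unpacked must-gather directory.
    for path, gap in report(".").items():
        if gap is None:
            print(f"{path}: fewer than two sync starts found")
        else:
            flag = "OVER" if gap > ALERT_THRESHOLD_S else "ok"
            print(f"{path}: longest gap {gap:.1f}s ({flag})")
```

If every file reports a longest gap close to 30 seconds, the logs themselves give no support for the "too long without syncing" alert.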

Actual results:

Alarms are shown in the dashboard.

Expected results:

No alarms.

Additional info:

Comment 4 Daniel Del Ciancio 2020-01-31 17:12:24 UTC
Need a status update: is this targeted for 4.2.z or 4.3? Or is a fix available?

Comment 10 errata-xmlrpc 2020-02-24 16:52:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0460

