Bug 1778314

Summary: False Alarm - SDN pod has gone too long without syncing iptables rules
Product: OpenShift Container Platform Reporter: Hugo Cisneiros (Eitch) <hcisneir>
Component: Networking Assignee: jtanenba
Networking sub component: openshift-sdn QA Contact: zhaozhanqi <zzhao>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: cdc, ddelcian, jtanenba
Version: 4.2.z   
Target Milestone: ---   
Target Release: 4.2.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1797033 Environment:
Last Closed: 2020-02-24 16:52:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 1797033, 1797041    
Bug Blocks:    

Description Hugo Cisneiros (Eitch) 2019-11-29 21:15:13 UTC
Description of problem:

Cluster dashboard has these alarms:

* SDN pod sdn-9d4hc on node etcd-2.example.com has gone too long without syncing iptables rules. NOTE - There is some scrape delay and other offsets, 120s isn't exact but it is still too high.
* SDN pod sdn-sgrgk on node etcd-0.example.com has gone too long without syncing iptables rules. NOTE - There is some scrape delay and other offsets, 120s isn't exact but it is still too high.

While looking at the must-gather logs, I didn't see any problems with these pods' sync processes:

2019-11-28T18:47:27.446545877Z I1128 18:47:27.446460   59750 proxy.go:331] hybrid proxy: syncProxyRules start
2019-11-28T18:47:27.597400981Z I1128 18:47:27.597347   59750 proxy.go:334] hybrid proxy: mainProxy.syncProxyRules complete
2019-11-28T18:47:27.653302796Z I1128 18:47:27.653260   59750 proxier.go:367] userspace proxy: processing 0 service events
2019-11-28T18:47:27.653409624Z I1128 18:47:27.653389   59750 proxier.go:346] userspace syncProxyRules took 55.912068ms
2019-11-28T18:47:27.653445515Z I1128 18:47:27.653436   59750 proxy.go:337] hybrid proxy: unidlingProxy.syncProxyRules complete

These syncs happen every 30 seconds, and the longest one took ~240ms to complete, which is very far from 120s.
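The check described above can be reproduced with a small script. This is only a sketch of how the sync duration and the interval between syncs could be measured from current.log, not part of any OpenShift tooling. The function names (klog_time, sync_stats) are hypothetical, and the year is passed in as an assumption because the klog header carries only month and day. The two sample lines are copied from the log excerpt above.

```python
import re
from datetime import datetime

# Matches the klog header inside each line, e.g. "I1128 18:47:27.446460".
KLOG_RE = re.compile(r"[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})")


def klog_time(line, year=2019):
    """Parse the klog timestamp of a log line; the year is an assumption."""
    m = KLOG_RE.search(line)
    if m is None:
        return None
    return datetime.strptime(f"{year}{m.group(1)} {m.group(2)}", "%Y%m%d %H:%M:%S.%f")


def sync_stats(lines, year=2019):
    """Return (durations, intervals) in seconds for hybrid proxy sync rounds.

    A round runs from "hybrid proxy: syncProxyRules start" to
    "unidlingProxy.syncProxyRules complete"; an interval is the gap
    between two consecutive start lines.
    """
    starts, durations = [], []
    current_start = None
    for line in lines:
        ts = klog_time(line, year)
        if ts is None:
            continue
        if "hybrid proxy: syncProxyRules start" in line:
            current_start = ts
            starts.append(ts)
        elif "unidlingProxy.syncProxyRules complete" in line and current_start is not None:
            durations.append((ts - current_start).total_seconds())
            current_start = None
    intervals = [(b - a).total_seconds() for a, b in zip(starts, starts[1:])]
    return durations, intervals


# First and last line of the sync round quoted in this report.
SAMPLE = [
    "2019-11-28T18:47:27.446545877Z I1128 18:47:27.446460   59750 proxy.go:331] hybrid proxy: syncProxyRules start",
    "2019-11-28T18:47:27.653445515Z I1128 18:47:27.653436   59750 proxy.go:337] hybrid proxy: unidlingProxy.syncProxyRules complete",
]

if __name__ == "__main__":
    durations, intervals = sync_stats(SAMPLE)
    print("sync durations (s):", durations)
    print("intervals between syncs (s):", intervals)
```

Running sync_stats over the full current.log of each sdn pod and comparing max(durations) and max(intervals) against 120 would show whether the logs ever come close to the alert threshold. It only inspects the logs and says nothing about how the alert itself is computed.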

Not sure why these alarms are firing.

Version-Release number of selected component (if applicable):

4.2.0

How reproducible:

* The alarms appear on the web console dashboard.
* The log files are at namespaces/openshift-sdn/sdn*/sdn/sdn/logs/current.log

Actual results:

Alarms are shown in the dashboard.

Expected results:

No alarms.

Additional info:

Comment 4 Daniel Del Ciancio 2020-01-31 17:12:24 UTC
Need a status update: is this targeted for 4.2.z or 4.3? Or is a fix available?

Comment 10 errata-xmlrpc 2020-02-24 16:52:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0460