Bug 1848478

Summary: [BUG] Possible regression on checking hostsubnet egress CIDR definition errors
Product: OpenShift Container Platform Reporter: Andre Costa <andcosta>
Component: Networking Assignee: Surya Seetharaman <surya>
Networking sub component: openshift-sdn QA Contact: huirwang
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: bbennett, daniel.kucera, surya
Version: 4.4   
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-27 16:08:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1878624    

Comment 3 Surya Seetharaman 2020-07-29 11:08:16 UTC
Hi Andre,

I have a patch that fixes the problem of sdn pods failing to start back up because of an invalid egressCIDR value. Unlike OCP 3.11, OCP 4.x uses watchers and informers plus kubebuilder validation in the API, so unfortunately we will not reject the incorrect value outright. Instead, it is silently ignored and a warning is emitted in the logs. The patch I've posted ensures that the egressCIDR field is wiped clean if it's invalid. This way the user can tell that an incorrect value was specified, because oc get hostsubnet will still not show the egressCIDR values as set.

Comment 9 Surya Seetharaman 2020-08-24 09:25:55 UTC
Hi Andre,

I have an update on the logic, based on the reviews I received on the patch. Because of the way the sdn interacts with the API (which is the source of truth), the sdn cannot modify fields in the API. In general, the sdn does not undo the user's changes in the API, even when they're wrong. So I won't be able to clear the fields. Instead, invalid values will be logged in the pod logs, so the user should be able to pick this up from the logs if something goes wrong.

However, the sdn pods failing to start because of invalid values is just wrong. The patch will fix this, since the pods shouldn't fail to come up due to invalid user-maintained values. I will try to get this backported to 4.5 and 4.4 as well, since we don't want to bring down the cluster.
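
For illustration only, the "log and skip instead of fail" behaviour described above could look roughly like the Go sketch below. This is not the actual openshift-sdn patch: the function name validEgressCIDRs, the log wording, and the sample node name are all hypothetical, and the only library call relied on is net.ParseCIDR from the Go standard library.

```go
package main

import (
	"fmt"
	"log"
	"net"
)

// validEgressCIDRs is a hypothetical helper (not taken from openshift-sdn).
// It parses each user-supplied egress CIDR string, logs a warning for any
// entry that does not parse, and returns only the valid networks. It never
// returns an error, so a bad value cannot stop the caller from starting up,
// and it never modifies the input, mirroring the point that the sdn does not
// rewrite fields in the API.
func validEgressCIDRs(node string, cidrs []string) []*net.IPNet {
	var valid []*net.IPNet
	for _, c := range cidrs {
		_, ipnet, err := net.ParseCIDR(c)
		if err != nil {
			// Assumed log format; the real message in the sdn pod logs may differ.
			log.Printf("warning: ignoring invalid egressCIDR %q on HostSubnet %q: %v", c, node, err)
			continue
		}
		valid = append(valid, ipnet)
	}
	return valid
}

func main() {
	// One valid entry and two invalid ones (free text, and a prefix length > 32).
	nets := validEgressCIDRs("node-1", []string{"10.0.0.0/24", "not-a-cidr", "192.168.1.0/33"})
	for _, n := range nets {
		fmt.Println(n.String())
	}
}
```

With this shape, the invalid entries only show up as warnings in the log, which matches the expectation that the user picks up mistakes from the pod logs rather than from a crashed pod.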

Comment 15 errata-xmlrpc 2020-10-27 16:08:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196