Bug 1848478 - [BUG] Possible regression on checking hostsubnet egress CIDR definition errors
Summary: [BUG] Possible regression on checking hostsubnet egress CIDR definition errors
Keywords:
Status: VERIFIED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.4
Hardware: All
OS: Linux
Severity: medium
Priority: medium
Target Milestone: ---
: 4.6.0
Assignee: Surya Seetharaman
QA Contact: huirwang
URL:
Whiteboard:
Depends On:
Blocks: 1878624
 
Reported: 2020-06-18 12:42 UTC by Andre Costa
Modified: 2020-09-14 08:06 UTC (History)
3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Github openshift sdn pull 169 None closed Bug 1848478: Invalid egressCIDR value causes sdn pods to fail on startup 2020-09-23 08:43:50 UTC

Comment 3 Surya Seetharaman 2020-07-29 11:08:16 UTC
Hi Andre,

I have a patch that fixes the problem of sdn pods failing to start up due to an invalid egressCIDR value. Unlike OCP 3.11, OCP 4.x uses watchers and informers plus kubebuilder validation in the API, so unfortunately we cannot outright reject the incorrect value; it is silently ignored apart from a warning in the logs. The patch I've posted additionally ensures that the egressCIDR field is wiped clean if it is invalid. That way the user can tell an incorrect value was specified, because oc get hostsubnet will show the egressCIDR unset.

Comment 9 Surya Seetharaman 2020-08-24 09:25:55 UTC
Hi Andre,

I have an update on the logic, based on the reviews I received on the patch: because of the way the sdn interacts with the API (which is the source of truth), the sdn cannot modify fields in the API. In general the sdn does not undo the user's changes in the API, even when they are wrong. So I won't be able to clear the field; invalid values will instead be logged in the pod logs, and the user should be able to pick this up from the logs if something goes wrong.

However, the sdn pods failing to start because of invalid values is simply wrong. The patch fixes this, since the sdn should not fail to come up due to invalid user-maintained values. I will try to get this backported to 4.5 and 4.4 as well, since we don't want to bring down the cluster.
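For readers following along, a minimal Go sketch of the behavior described above: parse each configured egressCIDR, log and skip any invalid entry, and never return an error that would abort startup. The function name and shape are illustrative only, not taken from the actual openshift/sdn patch; net.ParseCIDR is the standard-library validator.

```go
package main

import (
	"fmt"
	"net"
)

// filterValidEgressCIDRs is a hypothetical helper illustrating the fix:
// invalid CIDRs are logged and ignored rather than causing a fatal error,
// so the sdn pod still comes up with whatever valid values remain.
func filterValidEgressCIDRs(cidrs []string) []string {
	valid := make([]string, 0, len(cidrs))
	for _, c := range cidrs {
		if _, _, err := net.ParseCIDR(c); err != nil {
			// Log and skip instead of failing: the API object is
			// user-maintained, and the sdn must not crash over it.
			fmt.Printf("ignoring invalid egressCIDR %q: %v\n", c, err)
			continue
		}
		valid = append(valid, c)
	}
	return valid
}

func main() {
	cidrs := []string{"10.1.0.0/24", "not-a-cidr"}
	fmt.Println(filterValidEgressCIDRs(cidrs)) // prints [10.1.0.0/24]
}
```

The key design point matches comment 9: validation produces a log line for the operator to find, while the field in the API object is left untouched.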

