Bug 1905761
| Field | Value |
|---|---|
| Summary | NetworkPolicy with Egress policyType is resulting in SDN errors and improper communication within Project |
| Product | OpenShift Container Platform |
| Component | Networking |
| Networking sub component | openshift-sdn |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | high |
| Version | 4.6 |
| Keywords | Reopened |
| Target Milestone | --- |
| Target Release | 4.7.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Robert Bost <rbost> |
| Assignee | Victor Pickard <vpickard> |
| QA Contact | zhaozhanqi <zzhao> |
| CC | aconstan, aivaras.laimikis, anbhat, anusaxen, danw, travi, vpickard, zzhao |
| Doc Type | No Doc Update |
| Story Points | --- |
|  | 1969993 |
| Last Closed | 2021-02-24 15:41:14 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| Category | --- |
| oVirt Team | --- |
| Cloudforms Team | --- |
| Bug Blocks | 1969993, 1970046 |
Description
Robert Bost
2020-12-09 01:39:47 UTC
> Additionally, the SDN logs is showing this repeated which I believe is related:
> I1209 01:24:01.738089 4106341 pod.go:508] CNI_ADD hsts/web-0 got IP 10.131.1.97, ofport 1875
To clarify the logs vs oc get pods output in my last comment, the logs always show the IP Address of the Pod. The log I shared above was just saved to a clipboard.
Created attachment 1737777 [details]
YAML containing reproducer project
I have been able to reproduce the error logs with the attached YAML and have posted a PR. Even with the error logs, I did not see a connectivity issue between any of the pods in my local setup with 4.6.4. I consistently got a 404 error (expected) when running curl from the web-0 pod to the other pods. It is possible that the error caused by attempting to program a flow with an empty IP address prevented some other flows from being installed under certain conditions. I'll verify with some other team members.

> I did not see a connectivity issue between any of the pods in my local setup with 4.6.4.

Apologies for the late reply, but please check that the Pods are running on different nodes when you attempt to reproduce the problem.
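
The hypothesis above (a flow programmed with an empty IP address) can be illustrated with a small guard. This is only a sketch in Python, not the openshift-sdn code, which is written in Go. The function names `pod_flow_match` and `build_matches` are hypothetical, and the `ip,nw_src=` match string is used purely as an illustrative OpenFlow match on the source address.

```python
import ipaddress
from typing import List, Optional


def pod_flow_match(pod_ip: str) -> Optional[str]:
    """Return a source-IP match string for a pod, or None if the IP is unusable.

    Hypothetical helper: it only shows the idea of validating the address
    before any rule text is generated.
    """
    try:
        addr = ipaddress.IPv4Address(pod_ip.strip())
    except ValueError:
        # Empty or malformed address (for example, a pod that has not been
        # assigned an IP yet): emit no rule instead of a broken one.
        return None
    return f"ip,nw_src={addr}"


def build_matches(pod_ips: List[str]) -> List[str]:
    """Build match strings for every pod with a valid IP, skipping the rest."""
    matches = []
    for ip in pod_ips:
        match = pod_flow_match(ip)
        if match is not None:
            matches.append(match)
    return matches
```

With a guard of this kind, one pod with an empty address is skipped and the remaining rules are still generated, rather than the whole batch being rejected by the flow parser.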
Checked this issue on 4.7.0-0.nightly-2021-01-07-181010. The `invalid IP address` logs are no longer found, but after creating a NetworkPolicy of type Egress, the pods can no longer reach each other.

Steps:

1. Create namespace z3 and create the test pods.

$ oc get pod -o wide
NAME            READY   STATUS    RESTARTS   AGE   IP           NODE                       NOMINATED NODE   READINESS GATES
test-rc-7bx7s   1/1     Running   0          15m   10.129.3.3   zzhao108-zk7kr-compute-1   <none>           <none>
test-rc-pktzp   1/1     Running   0          15m   10.129.3.2   zzhao108-zk7kr-compute-1   <none>           <none>

2. Access one pod from the other. Both requests work and return 'Hello OpenShift!'.

$ oc exec test-rc-7bx7s -- curl 10.129.3.3:8080
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    17  100    17    0     0  17000      0 --:--:-- --:--:-- --:--:-- 17000
Hello OpenShift!

$ oc exec test-rc-7bx7s -- curl 10.129.3.2:8080
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0Hello OpenShift!
100    17  100    17    0     0  17000      0 --:--:-- --:--:-- --:--:-- 17000

3. Create a NetworkPolicy of type Egress whose podSelector does not match any pods.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bad-np
spec:
  egress:
  - {}
  podSelector:
    matchLabels:
      never-gonna: match
  policyTypes:
  - Egress

4. Access the pod again. This time pod1 cannot reach pod2.

$ oc exec test-rc-7bx7s -- curl --connect-timeout 4 10.129.3.2:8080
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:04 --:--:--     0
curl: (28) Connection timed out after 4001 milliseconds
command terminated with exit code 28

$ oc get netnamespace z3
NAME   NETID     EGRESS IPS
z3     4671645

Created attachment 1745586 [details]
sdn ovs openflow
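
For reference, the behaviour expected in step 4 can be checked against a small model of the upstream Kubernetes NetworkPolicy rules: a policy isolates a pod in a given direction only if its podSelector selects the pod and its policyTypes include that direction. The sketch below is an illustration in Python, not the openshift-sdn implementation. It handles only `matchLabels` (not `matchExpressions`), and the pod label `name: test-pods` is an assumption, since the reproducer's pod labels are not shown in this report.

```python
def selects(selector: dict, labels: dict) -> bool:
    """matchLabels semantics: every selector label must be present on the pod.

    An empty selector selects every pod in the namespace.
    """
    wanted = selector.get("matchLabels", {})
    return all(labels.get(key) == value for key, value in wanted.items())


def policy_types(policy: dict) -> list:
    """Return the effective policyTypes, applying the Kubernetes default."""
    spec = policy["spec"]
    if "policyTypes" in spec:
        return spec["policyTypes"]
    # Default: Ingress is always set; Egress is set only if egress rules exist.
    types = ["Ingress"]
    if spec.get("egress"):
        types.append("Egress")
    return types


def isolated(pod_labels: dict, policies: list, direction: str) -> bool:
    """True if any policy selects the pod and applies to the given direction."""
    return any(
        direction in policy_types(policy)
        and selects(policy["spec"].get("podSelector", {}), pod_labels)
        for policy in policies
    )


# The policy from step 3, expressed as a Python dict.
bad_np = {
    "spec": {
        "egress": [{}],
        "podSelector": {"matchLabels": {"never-gonna": "match"}},
        "policyTypes": ["Egress"],
    }
}

# Assumed label for the test pods (not taken from the report).
test_pod = {"name": "test-pods"}

print(isolated(test_pod, [bad_np], "Ingress"))
print(isolated(test_pod, [bad_np], "Egress"))
```

Under this model the test pods are isolated in neither direction, so the connection timeout in step 4 is not what the policy itself calls for.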
Hi Zhanqi, can you please attach the YAML files that you used to recreate this? I really like the "Hello OpenShift" server you have. Thanks

> Additionally, the SDN logs is showing this repeated which I believe is related:
>
> I1209 01:24:01.763042 4106341 ovs.go:158] Error executing ovs-ofctl: ovs-ofctl: -:2: 0/0: invalid IP address

Yes, there was a bug recently introduced in the NetworkPolicy code. None of the other specifics here are relevant; the buggy NetworkPolicy code would result in bad rules regardless of what you were doing.

*** This bug has been marked as a duplicate of bug 1914284 ***

Sorry, no, there's another bug here.

Verified this bug on 4.7.0-0.nightly-2021-01-14-211319.

@zzhao, this bug has the FailedQA flag set. Perhaps you could clear this flag when you get a chance? Thanks!

Ah, I cleared the FailedQA flag. Thanks.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633