Bug 2024880
| Summary: | Egress IP breaks when network policies are applied | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Mridul Markandey <mmarkand> |
| Component: | Networking | Assignee: | Ben Bennett <bbennett> |
| Networking sub component: | openshift-sdn | QA Contact: | huirwang |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | urgent | | |
| Priority: | urgent | CC: | agabriel, alchan, anbhat, huirwang, jnordell, jwennerberg, lmohanty, nsu, pbertera, sdodson, shujadha, shzhou, vrutkovs, wking |
| Version: | 4.8 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.10.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-03-10 16:29:41 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2026302 | | |
Description
Mridul Markandey 2021-11-19 11:24:42 UTC

Verified in 4.10.0-0.nightly-2021-11-24-030137: EgressIP worked with the network policy configured.
$ oc get hostsubnet
NAME HOST HOST IP SUBNET EGRESS CIDRS EGRESS IPS
qe-huirwang1124b-x2z4m-master-0 qe-huirwang1124b-x2z4m-master-0 172.31.249.55 10.128.0.0/23
qe-huirwang1124b-x2z4m-master-1 qe-huirwang1124b-x2z4m-master-1 172.31.249.160 10.129.0.0/23
qe-huirwang1124b-x2z4m-master-2 qe-huirwang1124b-x2z4m-master-2 172.31.249.121 10.130.0.0/23
qe-huirwang1124b-x2z4m-worker-5rh7p qe-huirwang1124b-x2z4m-worker-5rh7p 172.31.249.32 10.128.2.0/23 ["172.31.249.0/24"] ["172.31.249.201"]
qe-huirwang1124b-x2z4m-worker-8x5c5 qe-huirwang1124b-x2z4m-worker-8x5c5 172.31.249.3 10.131.0.0/23
$ oc get netnamespace test
NAME NETID EGRESS IPS
test 3821487 ["172.31.249.201"]
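(For reference, egress IP assignments like those shown above are created with the documented `oc patch` commands for openshift-sdn automatic egress IP assignment; a sketch using this cluster's values:)

```shell
# Let openshift-sdn host egress IPs from this range on the node
# (automatic assignment); values taken from the output above.
$ oc patch hostsubnet qe-huirwang1124b-x2z4m-worker-5rh7p --type=merge \
    -p '{"egressCIDRs": ["172.31.249.0/24"]}'

# Request an egress IP for the namespace's netnamespace.
$ oc patch netnamespace test --type=merge \
    -p '{"egressIPs": ["172.31.249.201"]}'
```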
$ oc get networkpolicy -n test -oyaml
apiVersion: v1
items:
- apiVersion: networking.k8s.io/v1
  kind: NetworkPolicy
  metadata:
    creationTimestamp: "2021-11-24T08:47:06Z"
    generation: 1
    managedFields:
    - apiVersion: networking.k8s.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          f:ingress: {}
          f:policyTypes: {}
      manager: kubectl-create
      operation: Update
      time: "2021-11-24T08:47:06Z"
    name: test-podselector-and-ipblock
    namespace: test
    resourceVersion: "33366"
    uid: b920fada-395d-4039-94cd-0d777bdc87dd
  spec:
    ingress:
    - from:
      - ipBlock:
          cidr: 10.129.2.32/32
      - ipBlock:
          cidr: 10.131.0.0/24
      - ipBlock:
          cidr: 10.128.2.38/32
    podSelector: {}
    policyTypes:
    - Ingress
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
$ oc rsh -n test test-rc-897bw
~ $ curl 172.31.249.80:9095
172.31.249.201
~ $ curl www.google.com -I
HTTP/1.1 200 OK
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Date: Wed, 24 Nov 2021 08:50:54 GMT
Server: gws
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked
Expires: Wed, 24 Nov 2021 08:50:54 GMT
Cache-Control: private
Set-Cookie: 1P_JAR=2021-11-24-08; expires=Fri, 24-Dec-2021 08:50:54 GMT; path=/; domain=.google.com; Secure
Set-Cookie: NID=511=ZchGIK5lR5eNtv-2BdT8K277sBGv9JhR9wDeAxdHpp77mT78NUzUJ6KGkt0kcwBIGZ5DX4TBqCFpPYOx0-DTX8O5_4zkDhYvzuMuhvinKeh7VV0SYsnj7oiB2bAaKrHIDsUEKAxNlJm0gUxxC8NlXsH__YoK2MUdktyR6Ob2ec8; expires=Thu, 26-May-2022 08:50:54 GMT; path=/; domain=.google.com; HttpOnly
We're asking the following questions to evaluate whether or not this bug warrants blocking an upgrade edge from either the previous X.Y or X.Y.Z. The ultimate goal is to avoid delivering an update which introduces new risk or reduces cluster functionality in any way. Sample answers are provided to give more context, and the UpgradeBlocker flag has been added to this bug. It will be removed if the assessment indicates that this should not block upgrade edges. The expectation is that the assignee answers these questions.

Who is impacted? If we have to block upgrade edges based on this issue, which edges would need blocking?
example: Customers upgrading from 4.y.z to 4.y+1.z running on GCP with thousands of namespaces, approximately 5% of the subscribed fleet
example: All customers upgrading from 4.y.z to 4.y+1.z fail approximately 10% of the time

What is the impact? Is it serious enough to warrant blocking edges?
example: Up to 2 minute disruption in edge routing
example: Up to 90 seconds of API downtime
example: etcd loses quorum and you have to restore from backup

How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?
example: Issue resolves itself after five minutes
example: Admin uses oc to fix things
example: Admin must SSH to hosts, restore from backups, or other non-standard admin activities

Is this a regression (if all previous versions were also vulnerable, updating to the new, vulnerable version does not increase exposure)?
example: No, it's always been like this, we just never noticed
example: Yes, from 4.y.z to 4.y+1.z, or from 4.y.z to 4.y.z+1

> Who is impacted? If we have to block upgrade edges based on this issue, which edges would need blocking?

Customers using egress IPs in namespaces with network policies applied that do not explicitly allow access from the endpoints the pods in those namespaces are trying to connect to.

> What is the impact? Is it serious enough to warrant blocking edges?

The pods matching the egress IP will not have external connectivity unless the network policy is removed or modified to explicitly allow connectivity from those endpoints.

> How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?

The only remediation is removing the network policy or modifying it.

> Is this a regression (if all previous versions were also vulnerable, updating to the new, vulnerable version does not increase exposure)?

All 4.8.z and 4.9.z versions are impacted, and customers upgrading from 4.7 to 4.8 will most likely hit this issue if they have this network policy configuration.

> All 4.8.z and 4.9.z versions are impacted, and customers upgrading from 4.7 to 4.8 will most likely hit this issue if they have this network policy configuration.
Because all 4.8.z releases have this issue and it took this long to surface, only a small proportion of customers seem likely to face it. Also, removing edges to all of 4.8.z would impact customers very negatively, since those edges have been present for so long. So we are not planning to block upgrade edges for this bug. However, if the bug starts impacting more clusters, we will reconsider blocking the edge.
Hi,

I have tested on my OCP 4.8.10 cluster; the egress IP does not work with the network policy below:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: NAMESPACE
spec:
  ingress:
  - from:
    - podSelector: {}
  podSelector: {}
  policyTypes:
  - Ingress
```
Only after modifying the network policy to allow traffic from the default namespace does the egress IP start to work (a sketch of such a modification follows); please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1700431. It looks like that is not yet fixed.
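A minimal sketch of that workaround policy (the name allow-same-namespace-and-default is hypothetical; this assumes the default namespace carries the kubernetes.io/metadata.name=default label, which Kubernetes 1.21+ sets automatically, so older clusters would need the namespace labeled manually):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace-and-default  # hypothetical name for this sketch
  namespace: NAMESPACE
spec:
  ingress:
  - from:
    - podSelector: {}
    # Workaround: also allow traffic from the default namespace, where
    # egress-IP traffic appears to originate (see bug 1700431).
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: default
  podSelector: {}
  policyTypes:
  - Ingress
```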
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056