Bug 1645577 - EgressIP not sending traffic to the hostsubnet
Summary: EgressIP not sending traffic to the hostsubnet
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.9.0
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.9.z
Assignee: Casey Callendrello
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-11-02 14:56 UTC by Juan Luis de Sousa-Valadas
Modified: 2022-03-13 15:56 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-03-26 18:54:06 UTC
Target Upstream Version:
Embargoed:



Description Juan Luis de Sousa-Valadas 2018-11-02 14:56:06 UTC
Description of problem:


Version-Release number of selected component (if applicable):
3.9.40-1

How reproducible:
Frequent but intermittent.

Steps to Reproduce:
1. Configure an egress IP on a netnamespace and on a hostsubnet (example commands below).
2. Create a pod in that netnamespace.
3. Connect from the pod to an external destination.
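For reference, a minimal sketch of step 1 using the addresses from the output below; the project and node names are placeholders:

$ oc patch netnamespace <project> -p '{"egressIPs": ["xxx.yy.zz.195"]}'
$ oc patch hostsubnet <node> -p '{"egressIPs": ["xxx.yy.zz.195"]}'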

Actual results:
Traffic from the pod is sometimes routed through the egress IP and sometimes it is not.
 
$ oc get netnamespace xxxx
NAME                         NETID      EGRESS IPS
xxxx                         3625000    [xxx.yy.zz.195]
[ans@czcholspc004001 ~]$ oc get hostsubnet
NAME             HOST           HOST IP         SUBNET          EGRESS IPS
--snip
<host fqdn>      <host fqdn>           xxx.yy.zz.134   2.176.25.0/24   [xxx.yy.zz.195]

xxx.yy.zz.134 and xxx.yy.zz.195 are in the same /24 network. xxx.yy.zz.195 is not used by any other host and responds as expected.

There is an existing flow for this rule on the node hosting the pod:
 cookie=0x0, duration=4940910.963s, table=100, n_packets=1059147, n_bytes=102466576, priority=200,udp,nw_dst=xxx.yy.zz.134,tp_dst=53 actions=output:2

However, when reviewing the traffic on the eth0 NIC, we see that it is not redirected.

$ grep 375028 56a0364b2332.log
 cookie=0x0, duration=69.010s, table=100, n_packets=0, n_bytes=0, priority=100,ip,reg0=0x375028 actions=set_field:3a:12:2c:4d:13:b5->eth_dst,set_field:0x375028->pkt_mark,goto_table:101
 cookie=0x0, duration=4940910.956s, table=101, n_packets=147214, n_bytes=15075466, priority=0 actions=output:2
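
For reference, the flow and NIC observations above can be gathered with commands along these lines (the bridge and interface names br0 and eth0 are assumed to be the defaults):

$ ovs-ofctl -O OpenFlow13 dump-flows br0 table=100
$ ovs-ofctl -O OpenFlow13 dump-flows br0 table=101
$ tcpdump -ni eth0 host xxx.yy.zz.195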

The flow seems correct, but it looks like an iptables SNAT rule is missing from the OPENSHIFT-MASQUERADE chain:
Chain OPENSHIFT-MASQUERADE (1 references)
 pkts bytes target     prot opt in     out     source               destination
8003K  480M MASQUERADE  all  --  *      *       <removed>.<removed>.0.0/16         0.0.0.0/0            /* masquerade pod-to-service and pod-to-external traffic */
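
For reference, the chain can be dumped on the node hosting the pod as follows; the SNAT rule for the egress IP, if programmed, should appear here alongside the generic MASQUERADE rule:

$ iptables -t nat -L OPENSHIFT-MASQUERADE -n -v
$ iptables -t nat -S OPENSHIFT-MASQUERADE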



Expected results:
The traffic always egresses through the egress IP node.

Additional info:
Sosreport with log level 5 attached to the case.

Comment 10 Dan Winship 2018-12-04 14:59:30 UTC
OK, I believe the problem here is caused by the fact that the "dsc-score-test-dmz" project has been joined with the "dsc-score-infra-test-dmz" namespace (via "oc adm pod-network join-projects"), and the Egress IP code does not handle the case of joined projects. It *should* log an error about this, but unfortunately it does not.
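
For reference, joined projects can be recognized by a shared NETID, so the state described above can be confirmed with (matching NETID values mean the namespaces are joined):

$ oc get netnamespaces dsc-score-test-dmz dsc-score-infra-test-dmz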

There are three workarounds, none of which is necessarily great:

  - kill off the "dsc-score-infra-test-dmz" namespace, and just
    move everything from that namespace into the "dsc-score-test-dmz"
    namespace.

  - unjoin the namespaces
    ("oc adm pod-network isolate-project dsc-score-infra-test-dmz")
    and then figure out a different way to allow communication between
    them where that was needed.

  - switch from the multitenant plugin to the networkpolicy plugin
    (following the directions at
    https://docs.openshift.com/container-platform/3.9/install_config/configuring_sdn.html#migrating-between-sdn-plugins-networkpolicy)
    and then configure cross-namespace communication using
    NetworkPolicy rather than "oc adm pod-network". In this case,
    it is possible both to allow the two namespaces to talk to
    each other and to have dsc-score-test-dmz use an egress IP, and
    the migration script linked from the documentation should deal
    with creating policies that match the previous behavior created
    with "oc adm pod-network". However, the migration will require
    a bit of cluster downtime.
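
For illustration of the third workaround only, a minimal sketch assuming the networkpolicy plugin is already active and using a hypothetical "name" label on the namespace (namespaceSelector matches labels, so one has to exist):

$ oc label namespace dsc-score-infra-test-dmz name=dsc-score-infra-test-dmz
$ oc create -n dsc-score-test-dmz -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-dsc-score-infra-test-dmz
spec:
  podSelector: {}
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: dsc-score-infra-test-dmz
EOF

A matching policy in dsc-score-infra-test-dmz would be needed if traffic in the opposite direction has to be allowed as well.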

We should definitely fix OpenShift to log an error rather than failing silently and mysteriously in this case. I don't think we can commit to actually fixing the bug (making Egress IPs and multitenant joined projects work together correctly) in the short term, but that would be a question for PM not engineering anyway.

Comment 11 Daniel Del Ciancio 2019-09-04 14:18:57 UTC
(In reply to Dan Winship from comment #10)
> We should definitely fix OpenShift to log an error rather than failing
> silently and mysteriously in this case. I don't think we can commit to
> actually fixing the bug (making Egress IPs and multitenant joined projects
> work together correctly) in the short term, but that would be a question for
> PM not engineering anyway.

I see that this issue has affected several customers as well as mine, and I would like to emphasize the need for a fix in OpenShift to handle or log an error when this is attempted.
Also, including a note in the documentation explaining this limitation would definitely be a start.

Comment 12 Dan Winship 2019-10-23 12:04:47 UTC
(In reply to Daniel Del Ciancio from comment #11)
> > We should definitely fix OpenShift to log an error rather than failing
> > silently and mysteriously in this case. I don't think we can commit to
> > actually fixing the bug (making Egress IPs and multitenant joined projects
> > work together correctly) in the short term, but that would be a question for
> > PM not engineering anyway.
> 
> I see that this issue has affected several customers as well mine and would
> like to add emphasis on the need to include a fix in Openshift to handle/log
> an error when attempting to do this.
> Also, including a note in the documentation explaining this limitation would
> definitely be a start.

Bug 1764587 covers updating the docs.

