Bug 1717487 - EgressIP to Ingress IP not working
Summary: EgressIP to Ingress IP not working
Keywords:
Status: CLOSED DUPLICATE of bug 1762580
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.11.z
Assignee: Dan Winship
QA Contact: huirwang
URL:
Whiteboard: SDN-CUST-IMPACT,SDN-STALE
Depends On:
Blocks:
 
Reported: 2019-06-05 14:55 UTC by Juan Luis de Sousa-Valadas
Modified: 2020-06-16 13:18 UTC
CC List: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-16 13:18:10 UTC
Target Upstream Version:
Embargoed:


Attachments

Comment 4 Dan Winship 2019-06-05 17:49:54 UTC
kube-proxy also uses one bit of the packet mark to indicate when a packet will need to be masqueraded. However, it shouldn't be setting that bit on packets that are destined for an egress IP. Could they try to figure out which iptables rule is being hit that results in the packet being sent to "-j KUBE-MARK-MASQ"?
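
One way to check (a sketch, assuming the iptables binary is available on the node and kube-proxy's chains use their default names) is to list every rule that jumps to KUBE-MARK-MASQ together with its packet counters, then watch which counter increments while reproducing the problem:

$ iptables-save -t nat -c | grep -- '-j KUBE-MARK-MASQ'
$ watch -n1 'iptables -t nat -nvxL KUBE-SERVICES | grep KUBE-MARK-MASQ'

The offending rule may also live in a per-endpoint KUBE-SEP-* or per-service KUBE-FW-*/KUBE-XLB-* chain rather than in KUBE-SERVICES.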

Comment 5 Juan Luis de Sousa-Valadas 2019-06-05 21:18:30 UTC
I don't think that rule is marking the packet. I understand you're referring to this rule:
$ head -2 iptables_-t_nat_-nvL ; grep 'MARK or 0x1' iptables_-t_nat_-nvL 
Chain PREROUTING (policy ACCEPT 10M packets, 622M bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK or 0x1

But it has 0 matches. I don't see any rule that could possibly cause this behavior:

$ grep -Pi '0x(.*)?[13579BDE]' iptables_-*
iptables_-t_filter_-nvL:    0     0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding rules */ mark match 0x1/0x1
iptables_-t_nat_-nvL:    0     0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            MARK or 0x1
iptables_-t_nat_-nvL:    0     0 MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x1/0x1
iptables_-vnxL:       0        0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes forwarding rules */ mark match 0x1/0x1

I don't see anywhere in Open vSwitch where the packet is being marked in a way that could set the first bit to 1:
$ cat ovs-ofctl_-O_OpenFlow13_dump-flows_br0 | grep pkt_mark
 cookie=0x0, duration=668061.037s, table=100, n_packets=20084153, n_bytes=6475945226, priority=100,ip,reg0=0xfd592 actions=set_field:12:98:03:84:35:80->eth_dst,set_field:0xfd592->pkt_mark,goto_table:101

Could you please have a look at attachment 1577623? I'm not so familiar with iptables marks, but I don't see a rule that could cause this.

Anyway, it makes sense to me that the 1st bit is set to 1 rather than the mark being increased by one; I will ask the customer to modify the VNID of the netnamespace to an odd number.
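
For reference, a sketch of doing that, with "myproject" standing in for the customer's namespace:

$ oc get netnamespace myproject -o jsonpath='{.netid}{"\n"}'   # current VNID
$ oc edit netnamespace myproject                               # change .netid to an odd value

The namespace name is a placeholder, and see comment 6 below before trying this.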

Comment 6 Dan Winship 2019-06-05 21:36:11 UTC
> I will ask the customer to modify the VNID of the netnamespace to an odd number.

Note that the expected result of doing that is that the OVS flow will set pkt_mark to (VNID - 1) + 0x1000000. E.g., if you set the VNID to 1037715, you'll get a pkt_mark of 0x10fd592, not 0xfd593. (This is because OpenShift wants to avoid using the bit that kube-proxy is using.)
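
The arithmetic, worked through in shell with the numbers from this example:

$ printf '0x%x\n' $(( (1037715 - 1) + 0x1000000 ))
0x10fd592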

Comment 10 Dan Winship 2019-09-09 16:50:27 UTC
Sorry for the delay in getting back to this.

It seems that there are some incompatibilities between the iptables rules created for LoadBalancer/Ingress services and the rules created for egress IPs. This means pods in namespaces using egress IPs may not be able to connect to load balancer IPs.

However, load balancer IPs are intended for use from outside the cluster anyway; pods inside the cluster can just connect to the service by its internal service IP, which also results in the packet bouncing across the network fewer times (the packet goes client-node -> server-node, rather than client-node -> egress-node -> load-balancer -> server-node).
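
As a sketch, with a hypothetical service "myservice" in namespace "myproject", a pod would use the internal cluster IP (or the cluster DNS name) instead of the load balancer address:

$ oc -n myproject get svc myservice -o jsonpath='{.spec.clusterIP}{"\n"}'
$ curl http://myservice.myproject.svc.cluster.local:8080/   # from inside a pod; the port is an assumption

The service name, namespace, and port above are placeholders.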

Comment 18 Dan Winship 2020-06-16 13:18:10 UTC
> please note that this is also being followed via BZ#1762580,

OK, we do not need two bugs actively tracking this. Closing this one as a duplicate; when a fix is available we can open new bugs for backports.

*** This bug has been marked as a duplicate of bug 1762580 ***

