Bug 1861294

Summary: SNAT rule does not provide ARP response
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Alexander Constantinescu <aconstan>
Component: ovn2.13
Assignee: Numan Siddique <nusiddiq>
Status: CLOSED ERRATA
QA Contact: ying xu <yinxu>
Severity: urgent
Priority: urgent
Version: RHEL 8.0
CC: ctrautma, dcbw, huirwang, jishi, ralongi
Target Milestone: ---
Keywords: TestBlocker
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2020-09-16 16:01:23 UTC
Type: Bug
Attachments:
Description    Flags
nbdb           none
sbdb           none

Description Alexander Constantinescu 2020-07-28 08:50:18 UTC
Description of problem:

This is a bug found on OpenShift running OVN-Kubernetes.

When a NAT rule of type "snat" is created, OVN does not answer ARP requests that other nodes send for the external IP specified in the NAT rule.

Version-Release number of selected component (if applicable):

rpm -qa | grep ovn
ovn-central-20.06.1-4.fc31.x86_64
ovn-20.06.1-4.fc31.x86_64
ovn-host-20.06.1-4.fc31.x86_64

How reproducible:

Create a snat rule specifying an external IP and a logical IP. Send a packet from the logical IP to an external component. The external component then sends ARP requests for the external IP so that it can deliver its reply. OVN never answers these ARP requests, and thus the response from the external component never reaches the logical IP.

In the example I am providing we have the following:

Component      Logical IP     External IP
netserver-0    10.244.0.3     172.17.0.126

External Component, with IP:
172.17.0.5
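
For reference, a minimal sketch of how the snat rule for this example could be expressed with `ovn-nbctl lr-nat-add`. The router name `GR_example` is a hypothetical placeholder, since the report does not name the logical router; the helper `build_snat_add_cmd` is also illustrative only and merely builds the argument list without running it.

```python
import ipaddress


def build_snat_add_cmd(router, external_ip, logical_ip):
    """Build argv for: ovn-nbctl lr-nat-add ROUTER snat EXTERNAL_IP LOGICAL_IP."""
    # Fail early on malformed input; raises ValueError if parsing fails.
    ipaddress.ip_address(external_ip)
    # strict=False lets a plain host address such as 10.244.0.3 parse as a /32.
    ipaddress.ip_network(logical_ip, strict=False)
    return ["ovn-nbctl", "lr-nat-add", router, "snat", external_ip, logical_ip]


# "GR_example" is an assumed router name, not taken from the report.
cmd = build_snat_add_cmd("GR_example", "172.17.0.126", "10.244.0.3")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run` on a host where `ovn-nbctl` can reach the northbound database.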

tcpdump logs on the node hosting netserver-0 show the following:

$ tcpdump -i any arp
08:36:45.831053 ARP, Request who-has 172.17.0.126 tell 172.17.0.5, length 28
08:36:45.831200 ARP, Request who-has 172.17.0.126 tell 172.17.0.5, length 28
08:36:46.861088 ARP, Request who-has 172.17.0.126 tell 172.17.0.5, length 28
08:36:46.861170 ARP, Request who-has 172.17.0.126 tell 172.17.0.5, length 28
08:36:47.885146 ARP, Request who-has 172.17.0.126 tell 172.17.0.5, length 28
08:36:47.885188 ARP, Request who-has 172.17.0.126 tell 172.17.0.5, length 28
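
The capture above contains only requests and no replies. A small sketch of how such a capture could be checked automatically is given here; it assumes the usual tcpdump text format for ARP lines ("Request who-has X tell Y" and "Reply X is-at MAC"), and the function name `unanswered_arp_targets` is made up for illustration.

```python
import re
from collections import Counter

# Patterns for the tcpdump ARP text format (assumed, matching the lines above).
REQUEST = re.compile(r"ARP, Request who-has (\S+) tell (\S+),")
REPLY = re.compile(r"ARP, Reply (\S+) is-at (\S+),")


def unanswered_arp_targets(lines):
    """Return {target_ip: request_count} for targets that never got a reply."""
    requests = Counter()
    answered = set()
    for line in lines:
        match = REQUEST.search(line)
        if match:
            requests[match.group(1)] += 1
            continue
        match = REPLY.search(line)
        if match:
            answered.add(match.group(1))
    return {ip: count for ip, count in requests.items() if ip not in answered}


SAMPLE = """\
08:36:45.831053 ARP, Request who-has 172.17.0.126 tell 172.17.0.5, length 28
08:36:45.831200 ARP, Request who-has 172.17.0.126 tell 172.17.0.5, length 28
08:36:46.861088 ARP, Request who-has 172.17.0.126 tell 172.17.0.5, length 28
"""

print(unanswered_arp_targets(SAMPLE.splitlines()))
```

A non-empty result for the external IP of the NAT rule corresponds to the behaviour reported here.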

Additional info:

I have provided the nbdb and sbdb as attachments. Please feel free to ask me for more information if necessary.

Comment 1 Alexander Constantinescu 2020-07-28 08:51:55 UTC
Created attachment 1702625
nbdb

Comment 2 Alexander Constantinescu 2020-07-28 08:52:23 UTC
Created attachment 1702626
sbdb

Comment 3 Alexander Constantinescu 2020-07-28 08:53:07 UTC
I already spoke to Numan about this on Thursday last week, so he should be aware of the details. I am thus assigning directly to him.

Comment 4 Alexander Constantinescu 2020-07-28 08:56:13 UTC
FYI:

Using "dnat_and_snat" as the NAT type does solve the problem. However, we do not want to use it, because external clients should not be able to target the external IP directly and reach the logical IP. We only want to allow egress traffic from the logical IP. Ingress traffic from an external client that targets the external IP directly should be dropped, as is the case when we specify "snat" as the type.

Comment 5 Alexander Constantinescu 2020-07-28 08:58:25 UTC
FYI 2:

netserver-0 is hosted on node: ovn-worker2 in this example.

Comment 9 ying xu 2020-08-25 02:50:39 UTC
I reproduced this issue on version:
ovn2.13-20.06.1-6.el8fdp.x86_64

Set the snat external IP to an address that is not the router IP; the internal instance then cannot ping outside.

# ping 172.16.103.11
PING 172.16.103.11 (172.16.103.11) 56(84) bytes of data.

--- 172.16.103.11 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2003ms



Verified on version:
ovn2.13-20.06.2-1.el8fdp.x86_64

#  ovn-nbctl lr-nat-list r1
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
snat             172.16.102.2                        172.16.102.11


# ping 172.16.103.11
PING 172.16.103.11 (172.16.103.11) 56(84) bytes of data.
64 bytes from 172.16.103.11: icmp_seq=1 ttl=63 time=3.01 ms
64 bytes from 172.16.103.11: icmp_seq=2 ttl=63 time=0.455 ms
64 bytes from 172.16.103.11: icmp_seq=3 ttl=63 time=0.443 ms

--- 172.16.103.11 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 0.443/1.303/3.012/1.208 ms
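
The before/after comparison above rests on the ping summary line. As a rough sketch, that check could be scripted as follows; the regular expression assumes the iputils summary format shown in this comment, and `parse_ping_summary` is an illustrative name, not part of any existing test suite.

```python
import re

# Matches e.g. "3 packets transmitted, 0 received, 100% packet loss, time 2003ms".
SUMMARY = re.compile(
    r"(\d+) packets transmitted, (\d+) received, (\d+(?:\.\d+)?)% packet loss"
)


def parse_ping_summary(output):
    """Return (transmitted, received, loss_percent) from ping output text."""
    match = SUMMARY.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


broken = "3 packets transmitted, 0 received, 100% packet loss, time 2003ms"
fixed = "3 packets transmitted, 3 received, 0% packet loss, time 2003ms"
print(parse_ping_summary(broken))
print(parse_ping_summary(fixed))
```

Other ping implementations may word the summary differently (for example, inserting an error count), so the pattern would need adjusting there.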

Comment 11 errata-xmlrpc 2020-09-16 16:01:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn2.13 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3769