Bug 1600115

Summary: ping loss of first packet with OVN l3 logical router.
Product: Red Hat OpenStack
Reporter: Eran Kuris <ekuris>
Component: openvswitch
Assignee: lorenzo bianconi <lorenzo.bianconi>
Status: CLOSED ERRATA
QA Contact: Ofer Blaut <oblaut>
Severity: medium
Priority: medium
Version: 13.0 (Queens)
CC: apevec, chrisw, jschluet, lhh, lmartins, majopela, mariel, nusiddiq, nyechiel, rhos-maint, shdunne, srevivo, tredaelli
Target Milestone: z4
Keywords: Reopened, Triaged, ZStream
Target Release: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openvswitch-2.9.0-76.el7fdn
Doc Type: Enhancement
Doc Text:
Previously, the first packet of a new connection using an OVN logical router was used to discover the MAC address of the destination. This resulted in the loss of the first packet on the new connection. This enhancement adds the capability to correctly queue the first packet of a new connection, which prevents the loss of that packet.
Clones: 1637466, 1701893
Last Closed: 2019-04-22 07:15:16 UTC
Type: Bug
Bug Blocks: 1637466, 1701893, 1728318, 1728674

Description Eran Kuris 2018-07-11 13:03:49 UTC
Description of problem:
I have configured an OVN L3 logical router for external traffic.
When pinging from a VM to the external network, or from the external network to the VM, for the first time after bringing up the OVN L3 router, the first packet is lost. Afterward everything works fine: no packet loss is observed, even when pinging again after some interval.

Basically, if OVN doesn't know the MAC address of the next hop, it internally sends an ARP request, and when it gets the reply it stores the MAC for that IP in the mac_binding table. The first packet, which triggered the lookup, is dropped.

The issue is just the lack of packet buffering.  To fix it, we need to
implement packet buffering somehow.
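
The difference between dropping and buffering can be illustrated with a toy model. This is only a sketch under stated assumptions, not the OVN or openvswitch implementation: the class ToyRouter, its max_pending limit, and the example addresses are all hypothetical names chosen for illustration.

```python
class ToyRouter:
    """Toy next-hop resolver (hypothetical; not OVN code)."""

    def __init__(self, buffer_packets, max_pending=4):
        self.buffer_packets = buffer_packets  # True = queue while ARP is pending
        self.max_pending = max_pending        # assumed per-next-hop queue limit
        self.mac_binding = {}                 # next-hop IP -> learned MAC
        self.pending = {}                     # next-hop IP -> queued packets
        self.arp_requests = []                # IPs an ARP request was sent for
        self.sent = []                        # (mac, packet) pairs delivered
        self.dropped = []                     # packets lost during resolution

    def forward(self, next_hop_ip, packet):
        mac = self.mac_binding.get(next_hop_ip)
        if mac is not None:
            self.sent.append((mac, packet))
            return
        # MAC unknown: ask for it once, then either queue or drop the packet.
        if next_hop_ip not in self.arp_requests:
            self.arp_requests.append(next_hop_ip)
        queue = self.pending.setdefault(next_hop_ip, [])
        if self.buffer_packets and len(queue) < self.max_pending:
            queue.append(packet)
        else:
            self.dropped.append(packet)

    def arp_reply(self, ip, mac):
        # Learn the binding and flush anything that was waiting for it.
        self.mac_binding[ip] = mac
        for packet in self.pending.pop(ip, []):
            self.sent.append((mac, packet))


def run_ping(buffer_packets):
    """Send two pings around a single ARP resolution; return what got through."""
    router = ToyRouter(buffer_packets)
    router.forward("10.0.0.1", "icmp_seq=1")
    router.arp_reply("10.0.0.1", "fa:16:3e:00:00:01")
    router.forward("10.0.0.1", "icmp_seq=2")
    return [packet for _, packet in router.sent]
```

In this model, run_ping(False) delivers only the second packet, which matches the behaviour reported here, while run_ping(True) delivers both in order.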


Version-Release number of selected component (if applicable):
OSP13 2018-07-06.1

[root@controller-0 ~]# rpm -qa |grep ovn 
openvswitch-ovn-common-2.9.0-19.el7fdp.1.x86_64
openvswitch-ovn-host-2.9.0-19.el7fdp.1.x86_64
python-networking-ovn-metadata-agent-4.0.1-0.20180420150812.c7c16d4.el7ost.noarch
openvswitch-ovn-central-2.9.0-19.el7fdp.1.x86_64
python-networking-ovn-4.0.1-0.20180420150812.c7c16d4.el7ost.noarch
puppet-ovn-12.4.0-0.20180329043503.36ff219.el7ost.noarch


How reproducible:
always

Steps to Reproduce:
1. Boot a new VM on an OVN setup.
2. Assign a floating IP (FIP) to it.
3. Ping the VM from the external network; the first packet is lost.

Actual results:
The first packet is lost.

Expected results:
no packet loss
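
A check for the expected result could be scripted along these lines. This is a rough sketch that assumes Linux iputils-style ping output (reply lines of the form "64 bytes from ...: icmp_seq=N ..."); the function names are hypothetical.

```python
import re

# Matches reply lines only, e.g. "64 bytes from 8.8.8.8: icmp_seq=1 ttl=116 ..."
REPLY_RE = re.compile(r"^\d+ bytes from .*\bicmp_seq=(\d+)", re.MULTILINE)


def received_sequences(ping_output):
    """Return the sorted icmp_seq numbers for which a reply was printed."""
    return sorted(int(seq) for seq in REPLY_RE.findall(ping_output))


def first_packet_lost(ping_output):
    """True if no reply for icmp_seq=1 appears in the captured output."""
    return 1 not in received_sequences(ping_output)
```

Feeding it the captured output of a ping started right after the router is brought up should return False once the fix is in place.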

Additional info:
upstream thread 

https://mail.openvswitch.org/pipermail/ovs-discuss/2018-March/thread.html

https://mail.openvswitch.org/pipermail/ovs-discuss/2018-March/046403.html

Comment 1 Numan Siddique 2018-07-11 13:44:05 UTC
As mentioned in the upstream mailing list discussions, this is a known limitation and it is by design. It has to be fixed in openvswitch.


https://mail.openvswitch.org/pipermail/ovs-discuss/2018-March/046409.html

Comment 2 Lucas Alvares Gomes 2018-07-24 13:22:09 UTC
Since the fix should go to openvswitch first, I'm changing the component for this bz.

Comment 4 Shelley Dunne 2018-10-12 15:34:09 UTC
Updating Target Milestone to z3 for all Modified medium bugs

Comment 19 Eran Kuris 2018-12-11 08:47:08 UTC
fix verified:
(overcloud) [stack@undercloud-0 ~]$ ssh root@10.0.0.222
The authenticity of host '10.0.0.222 (10.0.0.222)' can't be established.
ECDSA key fingerprint is SHA256:RDa93tjACDaE1cZmnNkiPkWGLHYErQDXqwa5ouH1tbk.
ECDSA key fingerprint is MD5:b2:48:bd:78:27:07:a6:ad:1f:b8:ff:91:52:49:6a:ba.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.0.0.222' (ECDSA) to the list of known hosts.
root@10.0.0.222's password: 
[root@net-64-1-vm-1 ~]# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=116 time=59.4 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=116 time=58.7 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=116 time=57.9 ms

OpenStack/13.0-RHEL-7/2018-12-07.1/

openvswitch-2.9.0-81.el7fdp.x86_64

Comment 23 errata-xmlrpc 2019-01-16 17:52:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:0081

Comment 24 Eran Kuris 2019-04-18 12:54:49 UTC
I am hitting the same issue when working with the VLAN tenant network type.
OpenStack/13.0-RHEL-7/2019-04-10.1
openvswitch-2.9.0-103.el7fdp.x86_64

Comment 25 Eran Kuris 2019-04-18 12:57:47 UTC
Following the above, I see the problem with traffic that goes from the external network to the instance.

--- 10.46.21.216 ping statistics ---
388 packets transmitted, 386 received, 0% packet loss, time 387044ms
rtt min/avg/max/mdev = 0.413/0.603/1.430/0.103 ms


2 packets lost
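
The summary line above reports 0% loss even though two packets are missing, apparently because ping prints the loss as a whole-number percentage. The absolute count can be recovered from the same line, as in this small sketch (the function name ping_loss is hypothetical, and the pattern assumes the iputils summary format shown above):

```python
import re


def ping_loss(stats_line):
    """Return (lost_packets, loss_percent) from a ping summary line."""
    match = re.search(r"(\d+) packets transmitted, (\d+) received", stats_line)
    if match is None:
        raise ValueError("not a ping statistics line")
    sent, received = int(match.group(1)), int(match.group(2))
    lost = sent - received
    percent = 100.0 * lost / sent if sent else 0.0
    return lost, percent


lost, percent = ping_loss(
    "388 packets transmitted, 386 received, 0% packet loss, time 387044ms"
)
print(lost, round(percent, 2))
```

For the figures above this gives 2 lost packets, roughly 0.52% of 388, which is why comparing the transmitted and received counts is more reliable than the percentage for catching a single dropped packet.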

Comment 26 Jon Schlueter 2019-04-22 07:15:16 UTC
Eran,

Since this bug was shipped on an advisory, please clone this bug or file another one for the remaining failure.

Comment 27 Eran Kuris 2019-04-22 12:50:09 UTC
(In reply to Jon Schlueter from comment #26)
> Eran,
> 
> Since this bug was shipped on an advisory please clone this bug or file
> another for failure of issue please.

No problem, I cloned this issue: https://bugzilla.redhat.com/show_bug.cgi?id=1701893