Bug 1464061 - Traffic between two VMs having FIP is not working if the VMs are in the same compute node
Summary: Traffic between two VMs having FIP is not working if the VMs are in the same ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: opendaylight
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: Upstream M3
Target Release: 13.0 (Queens)
Assignee: Aswin Suryanarayanan
QA Contact: Itzik Brown
URL:
Whiteboard:
Depends On: 1501418
Blocks:
 
Reported: 2017-06-22 11:09 UTC by Aswin Suryanarayanan
Modified: 2018-10-18 07:21 UTC

Fixed In Version: opendaylight-8.0.0-3.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1501415 (view as bug list)
Environment:
N/A
Last Closed: 2018-06-27 13:31:39 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
OpenDaylight Bug NETVIRT-430 0 None None None 2018-03-14 13:10:39 UTC
OpenDaylight gerrit 69447 0 None None None 2018-03-22 06:57:15 UTC
Red Hat Bugzilla 1475273 0 high CLOSED [ODL/NetVirt] Traffic between two VMs having FIP is not working if the VMs are in the same compute node 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHEA-2018:2086 0 normal SHIPPED_LIVE Red Hat OpenStack Platform 13.0 Enhancement Advisory 2018-06-28 19:51:39 UTC

Internal Links: 1475273

Description Aswin Suryanarayanan 2017-06-22 11:09:33 UTC
Description of problem:
Traffic between two VMs that have floating IPs (FIPs) does not work when the VMs are on the same compute node and OpenStack is installed with OpenDaylight as the network controller.

The packets are dropped by the security groups, which are implemented using OVS conntrack. Netfilter fails to receive some of the packets submitted from the pipeline and marks them as invalid.

Version-Release number of selected component (if applicable):


How reproducible:
An OpenStack setup with OpenDaylight is required.
Steps to Reproduce:
1. Spawn two VMs on the same compute node.
2. Associate a FIP with each VM.
3. SSH from vm1 to vm2 using the FIP.

Actual results:
SSH fails.

Expected results:
SSH should succeed.

Additional info: The issue is discussed in an ovs-discuss thread [1]. A similar issue is observed with the OVN controller as well.

[1] https://mail.openvswitch.org/pipermail/ovs-discuss/2017-June/044613.html

Comment 1 Flavio Leitner 2017-06-22 18:06:54 UTC
Please attach a sosreport from the system reproducing the issue.

Comment 3 Numan Siddique 2017-06-22 18:35:04 UTC
The issue can be reproduced using the script at [1] when OVN is used.

[1] - https://gist.github.com/russellb/4ab0a9641f12f8ac66fdd6822ee7789e

Comment 4 Numan Siddique 2017-06-22 18:37:20 UTC
I tried fixing the issue and proposed an RFC patch (https://patchwork.ozlabs.org/patch/739796/), but that was not the right approach.
Please see the comments there for more details.

Comment 6 Aswin Suryanarayanan 2017-06-23 08:34:02 UTC
(In reply to Flavio Leitner from comment #1)
> Please attach a sosreport from the system reproducing the issue.

The issue can be reproduced with two namespaces using the steps in [1] on OVS 2.7.

With [1], ping/ssh from 10.100.5.8 to 10.100.5.9 works, but ping/ssh from 10.100.5.8 to 192.168.56.32 does not.

It does seem to work, however, if I track the two VMs in different ct zones, as below (in tables 40, 41, 251 and 252):

"table=40,priority=61010,ip,dl_src=fa:16:3e:1d:3d:01,nw_src=10.100.5.8,actions=ct(table=41,zone=5001)"
"table=40,priority=61010,ip,dl_src=fa:16:3e:13:85:be,nw_src=10.100.5.9,actions=ct(table=41,zone=5002)"

"table=41,priority=1000,ct_state=+new+trk,ip,dl_src=fa:16:3e:1d:3d:01,nw_src=10.100.5.8,actions=ct(commit,zone=5001),resubmit(,21)"
"table=41,priority=1000,ct_state=+new+trk,ip,dl_src=fa:16:3e:13:85:be,nw_src=10.100.5.9,actions=ct(commit,zone=5002),resubmit(,21)"

[1] https://gist.github.com/aswinsuryan/c22919576ae19e14ed489bf1f6c668cb
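
Editor's sketch of the per-VM zone workaround described above. It only builds the table 40/41 flow strings quoted in this comment, giving each port its own ct zone. The `Port` class, the helper names, and the idea of numbering zones upward from 5001 are illustrative assumptions, not part of NetVirt or OVS; the MACs, IPs, priorities, and table numbers are copied from the flows above. The table 251/252 flows are not shown in this comment, so they are not generated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical holder for one VM port's MAC and fixed IP."""
    mac: str
    ip: str


def zone_flows(port, zone, track_table=40, commit_table=41, next_table=21):
    """Return the tracking flow and the commit flow for one port in one ct zone."""
    match = f"ip,dl_src={port.mac},nw_src={port.ip}"
    return [
        # Send the packet through conntrack in this port's own zone.
        f"table={track_table},priority=61010,{match},"
        f"actions=ct(table={commit_table},zone={zone})",
        # Commit new connections in the same zone, then continue the pipeline.
        f"table={commit_table},priority=1000,ct_state=+new+trk,{match},"
        f"actions=ct(commit,zone={zone}),resubmit(,{next_table})",
    ]


def build_flows(ports, base_zone=5001):
    """Assign consecutive zones starting at base_zone (an assumed scheme)."""
    flows = []
    for offset, port in enumerate(ports):
        flows.extend(zone_flows(port, base_zone + offset))
    return flows


PORTS = [
    Port("fa:16:3e:1d:3d:01", "10.100.5.8"),
    Port("fa:16:3e:13:85:be", "10.100.5.9"),
]
FLOWS = build_flows(PORTS)

if __name__ == "__main__":
    for flow in FLOWS:
        print(flow)
```

Each printed line could be passed to ovs-ofctl add-flow on the test bridge; the bridge name and OpenFlow version depend on the setup in [1].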

Comment 7 Nir Yechiel 2017-07-05 13:49:43 UTC
This bug affects both OVN and OpenDaylight, and is therefore high priority for RHOSP use cases.

Comment 9 Eric Garver 2017-07-05 18:48:27 UTC
(In reply to Aswin Suryanarayanan from comment #6)
> (In reply to Flavio Leitner from comment #1)
> > Please attach a sosreport from the system reproducing the issue.
> 
> The issue can be reproduced with
> two namespace using the steps in [1] in ovs 2.7.

I verified that it affects current upstream/master as well.

> 
> With [1]
> 
> From 10.100.5.8 if I try to ping/ssh 10.100.5.9 it works, but not when I
> try ping/ssh to 192.168.56.32 from 10.100.5.8.
> 
> But it seems to work if I track them in two different ct zones as below(in
> 40,41,251,252)
> 
> "table=40,priority=61010,ip,dl_src=fa:16:3e:1d:3d:01,nw_src=10.100.5.8,
> actions=ct(table=41,zone=5001)"
> "table=40,priority=61010,ip,dl_src=fa:16:3e:13:85:be,nw_src=10.100.5.9,
> actions=ct(table=41,zone=5002)"
> 
> "table=41,priority=1000,ct_state=+new+trk,ip,dl_src=fa:16:3e:1d:3d:01,
> nw_src=10.100.5.8,actions=ct(commit,zone=5001),resubmit(,21)"
> "table=41,priority=1000,ct_state=+new+trk,ip,dl_src=fa:16:3e:13:85:be,
> nw_src=10.100.5.9,actions=ct(commit,zone=5002),resubmit(,21)"
> 
> [1]https://gist.github.com/aswinsuryan/c22919576ae19e14ed489bf1f6c668cb

I also verified that using different zones works, so that is the workaround for now.

Comment 10 Numan Siddique 2017-07-06 09:41:16 UTC
I did some testing locally and shared my observations here: https://mail.openvswitch.org/pipermail/ovs-discuss/2017-July/044879.html

It looks to me like the workaround is either to use different zones, as Eric mentioned, or to bypass connection tracking for ICMP packets for the router IP.

Comment 21 Nir Yechiel 2017-07-26 11:21:56 UTC
BZ 1475273 was opened to track an immediate fix in OpenDaylight/NetVirt.

This bug will be used to track the long-term fix in OVS.

Comment 27 Aswin Suryanarayanan 2018-02-22 15:51:20 UTC
Once the fix for the dependent OVS bug is merged, the temporary workaround needs to be removed and the ODL pipeline needs to use the new ct_clear action instead.

Comment 29 Itzik Brown 2018-03-26 02:22:11 UTC
Verified with
ovs 2.9.0
opendaylight-8.0.0-3.el7ost.noarch

Comment 31 errata-xmlrpc 2018-06-27 13:31:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086

