Bug 1920532

Summary: Problem in trying to connect through the service to a member that is the same as the caller.
Product: OpenShift Container Platform Reporter: Manish Pandey <mapandey>
Component: Networking Assignee: Michał Dulko <mdulko>
Networking sub component: kuryr QA Contact: Itzik Brown <itbrown>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: medium CC: bbennett, ltomasbo, mdulko, mpatercz
Version: 4.6   
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: For hairpin traffic (traffic that originates in a member of a Service and is redirected by the load balancer back to the same member), OVN changes the source IP of the packets to the IP of the LB. This affects OSP deployments that use ovn-octavia-provider.
Consequence: When a Network Policy is applied, such traffic may be blocked unnecessarily.
Fix: When handling a Network Policy, Kuryr now also opens traffic from the IPs of all the Services in the NP's namespace.
Result: Hairpin traffic is allowed in OSP deployments that use ovn-octavia-provider.
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-27 22:36:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1959766    

Comment 1 Luis Tomas Bolivar 2021-01-26 14:19:56 UTC
I think this is not a Kuryr issue but an ovn/ovn-octavia issue

Comment 2 Michał Dulko 2021-02-03 17:21:05 UTC
(In reply to Luis Tomas Bolivar from comment #1)
> I think this is not a Kuryr issue but an ovn/ovn-octavia issue

This is only happening when I apply an NP. I was able to reproduce it with this one:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: networkpolicy-example
spec:
  podSelector: {}
  policyTypes:
  - Egress
  - Ingress
  ingress:
  - from:
    - podSelector: {}
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0

Comment 3 Luis Tomas Bolivar 2021-02-04 07:46:09 UTC
(In reply to Michał Dulko from comment #2)
> (In reply to Luis Tomas Bolivar from comment #1)
> > I think this is not a Kuryr issue but an ovn/ovn-octavia issue
> 
> This is only happening when I apply an NP. I was able to reproduce it with
> this one:
> 
> kind: NetworkPolicy
> apiVersion: networking.k8s.io/v1
> metadata:
>   name: networkpolicy-example
> spec:
>   podSelector: {}
>   policyTypes:
>   - Egress
>   - Ingress
>   ingress:
>   - from:
>     - podSelector: {}
>   egress:
>   - to:
>     - ipBlock:
>         cidr: 0.0.0.0/0

Ohh, without network policy it works as expected? Then it is a Kuryr issue messing with SGs

Comment 4 Michał Dulko 2021-02-05 12:17:12 UTC
Okay, after some investigation here are the findings. This is caused by OVN SNATing hairpin traffic with the Service/LB IP. Kuryr does not expect this, so we don't open that traffic on the NP SGs. The fix will certainly be non-trivial. There are mitigations in OVN [1] that could make this easier to solve, but they're not yet released, so we'll need to go the hard way here.

The solution will probably be to make sure that every port that is a member of an LB has an SG opening ingress for that LB's IP, port, and protocol.

[1] https://mail.openvswitch.org/pipermail/ovs-dev/2021-January/379594.html
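
To illustrate the findings above, here is a minimal Python sketch of why the SNATed hairpin packet is dropped and how an extra rule for the LB's IP would let it through. This is not Kuryr or OVN code. The names (IngressRule, Packet, is_allowed, lb_hairpin_rule), the IP addresses, and the port are all hypothetical. The sketch also assumes that "port" means the port on which the member receives the traffic.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class IngressRule:
    """Simplified, hypothetical stand-in for a security group ingress rule."""
    remote_cidr: str
    protocol: Optional[str] = None  # None matches any protocol
    port: Optional[int] = None      # None matches any destination port


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_port: int
    protocol: str


def is_allowed(packet: Packet, rules: List[IngressRule]) -> bool:
    """Return True if at least one ingress rule matches the packet."""
    src = ip_address(packet.src_ip)
    for rule in rules:
        if src not in ip_network(rule.remote_cidr):
            continue
        if rule.protocol is not None and rule.protocol != packet.protocol:
            continue
        if rule.port is not None and rule.port != packet.dst_port:
            continue
        return True
    return False


def lb_hairpin_rule(lb_ip: str, port: int, protocol: str) -> IngressRule:
    """Rule admitting traffic whose source was rewritten to the LB's IP."""
    return IngressRule(remote_cidr=f"{lb_ip}/32", protocol=protocol, port=port)


# Illustrative values only (assumptions, not taken from the bug report).
POD_A = "10.128.1.10"
POD_B = "10.128.1.11"
LB_IP = "172.30.10.5"
MEMBER_PORT = 8080

# Stand-in for the rules derived from "ingress: from: podSelector: {}":
# only pod IPs are admitted as sources.
np_rules = [IngressRule(f"{POD_A}/32"), IngressRule(f"{POD_B}/32")]

# Pod B reaching pod A directly keeps its own source IP.
direct = Packet(src_ip=POD_B, dst_port=MEMBER_PORT, protocol="tcp")

# Pod A calls the Service and is balanced back to itself; the source IP
# has been rewritten to the LB's IP by the time the packet arrives.
hairpin = Packet(src_ip=LB_IP, dst_port=MEMBER_PORT, protocol="tcp")

fixed_rules = np_rules + [lb_hairpin_rule(LB_IP, MEMBER_PORT, "tcp")]

print(is_allowed(direct, np_rules))      # True
print(is_allowed(hairpin, np_rules))     # False
print(is_allowed(hairpin, fixed_rules))  # True
```

The extra rule is scoped to a single /32 address, port, and protocol, so in this model it admits only traffic that arrives with the LB's IP as its source.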

Comment 8 Itzik Brown 2021-05-12 08:19:08 UTC
Ran kuryr_tempest_plugin.tests.scenario.test_network_policy.NetworkPolicyScenario.test_network_policy_hairpin_traffic (From https://review.opendev.org/c/openstack/kuryr-tempest-plugin/+/788977) and it passed.

4.8.0-0.nightly-2021-05-10-225140
RHOS-16.1-RHEL-8-20210323.n.0

Comment 11 errata-xmlrpc 2021-07-27 22:36:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2
bug fix and security update), and where to find the updated files, follow the
link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438