Bug 2106855 - [4.11.z] externalTrafficPolicy=Local is not working in local gateway mode if ovnkube-node is restarted
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Surya Seetharaman
QA Contact: Anurag saxena
URL:
Whiteboard:
Depends On: 2106862
Blocks: 2107903
Reported: 2022-07-13 17:05 UTC by Divyam Pateriya
Modified: 2022-08-10 11:21 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 2107903
Environment:
Last Closed: 2022-08-10 11:21:24 UTC
Target Upstream Version:
Embargoed:

Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ovn-kubernetes pull 1200 | 0 | None | Merged | [release-4.11] Bug 2106855: Append the SNAT rule in management chain | 2022-08-02 18:32:54 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 11:21:47 UTC

Description Divyam Pateriya 2022-07-13 17:05:53 UTC
Description of problem: `externalTrafficPolicy=Local` is not working in local gateway mode with OCP 4.10


Version-Release number of selected component (if applicable):
OCP 4.10.20

How reproducible:
100%

Steps to Reproduce:
1. Set `spec.defaultNetwork.ovnKubernetesConfig.gatewayConfig.routingViaHost` to `true` in the network.operator/cluster CR

$ oc patch network.operator/cluster --type merge -p '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"gatewayConfig":{"routingViaHost":true}}}}}'

2. Confirm that `local` gateway mode is set correctly.

$ oc -n openshift-ovn-kubernetes get cm ovnkube-config  -o yaml | grep local
 mode=local  -----> Output


3. Create a simple application

$ oc -n <PROJECT> new-app --name hello --image=quay.io/redhattraining/hello-world-nginx

$ oc get pods -o wide

NAME                     READY   STATUS    RESTARTS   AGE    IP           NODE                                               NOMINATED NODE   READINESS GATES
hello-7f77b57c87-9f2rd   1/1     Running   0          114m   10.131.2.5   worker-2.example.com   <none>           <none>

4. Change the `ClusterIP` service to `NodePort` and add `externalTrafficPolicy: Local`

$ oc get svc hello -o yaml
spec:
  clusterIP: 172.30.221.147
  clusterIPs:
  - 172.30.221.147
  externalTrafficPolicy: Local
  ports:
  - name: 8080-tcp
    nodePort: 31919
  sessionAffinity: None
  type: NodePort
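
The change in step 4 can also be applied from the command line with a merge patch. The following is a minimal sketch, not the command the reporter used: the `set_local_policy` function name is hypothetical, and it assumes the `hello` service created by `oc new-app` in step 3 already exists.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): switch a service to
# type NodePort and set externalTrafficPolicy=Local in a single merge patch.
# The nodePort itself is left for the cluster to allocate.
set_local_policy() (
  namespace=$1
  service=$2
  oc -n "$namespace" patch svc "$service" --type merge \
    -p '{"spec":{"type":"NodePort","externalTrafficPolicy":"Local"}}'
)

# Example call; replace <PROJECT> with the real namespace:
# set_local_policy "<PROJECT>" hello
```

After patching, `oc get svc hello -o yaml` should show the allocated `nodePort` alongside `externalTrafficPolicy: Local`, as in the output above.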

5. Access the NodePort service

$  curl worker-2.example.com:31919
<html>
  <body>
    <h1>Hello, world from nginx!</h1>
  </body>
</html>

6. Check the pod's logs to confirm that the real source IP is visible

$ oc exec  hello-7f77b57c87-9f2rd  -- tail -f /var/log/nginx/access.log
10.74.17.20 - - [13/Jul/2022:15:41:30 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.61.1"

7. Restart the `ovnkube-node` pod running on node `worker-2.example.com`

$ oc get pods -o wide -n openshift-ovn-kubernetes | grep worker-2
ovnkube-node-zvxzc     5/5     Running   0          79m   10.0.88.176   worker-2.example.com   <none>           <none>

$ oc -n openshift-ovn-kubernetes delete pod ovnkube-node-zvxzc 

8. Access the NodePort service again and check the pod logs

$ oc exec  hello-7f77b57c87-9f2rd  -- tail -f /var/log/nginx/access.log
10.74.17.20 - - [13/Jul/2022:15:41:30 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.61.1"
10.131.2.2 - - [13/Jul/2022:15:42:21 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.29.0"  -----> We got the IP from the Pod Network and this is not the real source IP.
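
To check the access log from step 8 without reading each line by eye, the source addresses can be compared against the pod network CIDR. The sketch below is an illustration only: the function names are hypothetical, it handles IPv4 only, and `10.128.0.0/14` is assumed to be the cluster network (the usual OpenShift default); the actual value can be read with `oc get network.config/cluster -o jsonpath='{.status.clusterNetwork[*].cidr}'`.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the original report).

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1   # split the address on dots into $1..$4
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Exit 0 when IPv4 address $1 lies inside CIDR $2, non-zero otherwise.
ip_in_cidr() (
  ip=$1
  net=${2%/*}
  bits=${2#*/}
  mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
  [ $(( $(ip_to_int "$ip") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
)

# Read access-log lines on stdin and label the first field (the source IP)
# as "pod-network" or "external" relative to the CIDR given as $1.
classify_sources() (
  cidr=$1
  while read -r ip rest; do
    [ -n "$ip" ] || continue   # skip blank lines
    if ip_in_cidr "$ip" "$cidr"; then
      echo "$ip pod-network"
    else
      echo "$ip external"
    fi
  done
)
```

For example, `oc exec hello-7f77b57c87-9f2rd -- cat /var/log/nginx/access.log | classify_sources 10.128.0.0/14` labels each request; any `pod-network` entry for a request made from outside the cluster indicates that the client address was rewritten.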


Actual results:

`externalTrafficPolicy=Local` works when the service is first created (the client IP address is visible to the pod). After the ovnkube-node pod is deleted, it stops working (the source IP address comes from the pod network) until the service object is modified again.

Expected results:
The real source IP should be visible in the pod logs, even after an ovnkube-node pod restart or a node reboot.

Additional info:

Comment 9 errata-xmlrpc 2022-08-10 11:21:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

