Bug 2074547 - Submariner Globalnet e2e tests failed on MTU between On-Prem to Public clusters
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Advanced Cluster Management for Kubernetes
Classification: Red Hat
Component: Submariner
Version: rhacm-2.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: rhacm-2.5.2
Assignee: Yossi Boaron
QA Contact: Noam Manos
Docs Contact: Christopher Dawson
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-04-12 13:21 UTC by Noam Manos
Modified: 2022-09-13 20:06 UTC
CC: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-13 20:06:21 UTC
Target Upstream Version:
Embargoed:
bot-tracker-sync: rhacm-2.5.z+




Links:
- GitHub stolostron backlog issue 21633 (last updated 2022-04-12 16:32:04 UTC)
- GitHub stolostron backlog issue 22261#issuecomment-1128845467 (last updated 2022-06-09 12:43:05 UTC)
- GitHub submariner-io/submariner issue 1774, open: "MTU issues seen between an OnPrem cluster vs Public cloud" (last updated 2022-04-12 13:21:01 UTC)
- Red Hat Product Errata RHSA-2022:6507 (last updated 2022-09-13 20:06:30 UTC)

Description Noam Manos 2022-04-12 13:21:02 UTC
**What happened**:
Submariner Globalnet e2e tests failed due to an MTU issue between on-prem and public clusters:
https://qe-jenkins-csb-skynet.apps.ocp4.prod.psi.redhat.com/job/ACM-2.5.0-Submariner-0.12.0-AWS-OSP-Globalnet/134/testReport/

Test failure example:

 Submariner E2E suite.[dataplane-globalnet] Basic TCP connectivity tests across overlapping clusters without discovery when a pod connects via TCP to the globalIP of a remote service when the pod is not on a gateway and the remote service is not on a gateway should have sent the expected data from the pod to the other pod

 Stack Trace:

/home/jenkins/go/src/github.com/submariner-io/submariner/test/e2e/dataplane/tcp_gn_pod_connectivity.go:35
Expected
    <string>: listening on 0.0.0.0:1234 ...
    connect to 10.255.1.60:1234 from 242.0.0.2:33981 (242.0.0.2:33981)
    
to contain substring
    <string>: 95feeed9-6d3d-4fe2-ac30-8caab38cfbba
/home/jenkins/go/src/github.com/submariner-io/submariner/test/e2e/framework/dataplane.go:168

 Standard Output:

STEP: Creating namespace objects with basename "dataplane-gn-conn-nd"
STEP: Generated namespace "e2e-tests-dataplane-gn-conn-nd-brrkv" in cluster "acm-nmanos-devcluster-a2-aws" to execute the tests in
STEP: Creating namespace "e2e-tests-dataplane-gn-conn-nd-brrkv" in cluster "acm-nmanos-osp-skynet-b2"
STEP: Creating a listener pod in cluster "acm-nmanos-osp-skynet-b2", which will wait for a handshake over TCP
STEP: Pointing a ClusterIP service to the listener pod in cluster "acm-nmanos-osp-skynet-b2"
STEP: Creating a connector pod in cluster "acm-nmanos-devcluster-a2-aws", which will attempt the specific UUID handshake over TCP
Apr 12 10:22:40.096: INFO: ExecWithOptions &{Command:[sh -c for j in $(seq 50); do echo [dataplane] connector says 95feeed9-6d3d-4fe2-ac30-8caab38cfbba; done | for i in $(seq 3); do if nc -v 242.1.255.251 1234 -w 30; then break; else sleep 15; fi; done] Namespace:e2e-tests-dataplane-gn-conn-nd-brrkv PodName:customdn68q ContainerName:connector-pod Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:true}
STEP: Waiting for the listener pod "tcp-check-listenertlcgk" on node "default-cl2-dblg8-worker-0-lb96m" to exit, returning what listener sent
Apr 12 10:25:46.336: INFO: Pod "tcp-check-listenertlcgk" output:
listening on 0.0.0.0:1234 ...
connect to 10.255.1.60:1234 from 242.0.0.2:33981 (242.0.0.2:33981)

STEP: Verifying that the listener got the connector's data and the connector got the listener's data
STEP: Deleting namespace "e2e-tests-dataplane-gn-conn-nd-brrkv" on cluster "acm-nmanos-devcluster-a2-aws"
STEP: Deleting namespace "e2e-tests-dataplane-gn-conn-nd-brrkv" on cluster "acm-nmanos-osp-skynet-b2"


**What you expected to happen**:

E2E tests on a Globalnet env (overlapping CIDRs) should pass

**How to reproduce it (as minimally and precisely as possible)**:
1. Install one cluster behind NAT (on-prem) and another cluster on a public cloud (e.g. AWS)
2. Install Submariner with Globalnet
3. Run Submariner E2E tests
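Assuming a standard subctl-based deployment, the steps above can be sketched roughly as follows (v0.12-era subctl syntax; cluster IDs and kubeconfig paths are placeholders, not taken from this report):

```shell
# Hedged sketch of the reproduction steps, not the exact QE pipeline.

# Step 2: install Submariner with Globalnet - deploy the broker on one cluster ...
subctl deploy-broker --kubeconfig broker-cluster.kubeconfig --globalnet

# ... then join both clusters: one behind NAT (on-prem) and one public (AWS)
subctl join broker-info.subm --kubeconfig onprem.kubeconfig --clusterid onprem
subctl join broker-info.subm --kubeconfig public.kubeconfig --clusterid public

# Step 3: run the Submariner E2E connectivity tests between the two clusters
subctl verify onprem.kubeconfig public.kubeconfig --only connectivity
```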


**Anything else we need to know?**:

U/s issue:
https://github.com/submariner-io/submariner/issues/1774
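For context (general MTU reasoning, not stated in this report): encapsulating inter-cluster traffic in a tunnel lowers the effective path MTU, and if the sender's TCP MSS still assumes the full interface MTU, large segments are dropped while small ones get through. That matches the symptom above, where the TCP connection opens ("connect to 10.255.1.60:1234 ...") but the connector's payload never reaches the listener. A quick sanity check of the numbers, using an assumed overhead value for illustration:

```shell
# Illustrative arithmetic only; the 100-byte tunnel overhead is an assumed
# example value, not a figure taken from this bug.
IF_MTU=1500          # typical interface MTU
TUNNEL_OVERHEAD=100  # assumed encapsulation overhead (e.g. IPsec/VXLAN)
IP_HDR=20            # IPv4 header
TCP_HDR=20           # TCP header

PATH_MTU=$((IF_MTU - TUNNEL_OVERHEAD))
SAFE_MSS=$((PATH_MTU - IP_HDR - TCP_HDR))

echo "path MTU over tunnel: $PATH_MTU"
echo "largest TCP MSS that fits: $SAFE_MSS"
```

Any TCP segment built against the untunneled MSS (1460 here) exceeds the tunnel's path MTU and is dropped, while short handshake-sized writes fit and succeed.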

**Environment**:

# Cloud platform: Amazon

# OCP version: 4.8.0

# ACM version: 2.5.0

# Cloud platform: Openstack

# OCP version: 4.9.0

### Submariner components ###

subctl version: v0.12.0
Cluster "nmanos-osp-skynet-b2"
 • Showing versions  ...
 ✓ Showing versions
COMPONENT                       REPOSITORY                                            VERSION         
submariner                      registry.redhat.io/rhacm2                             v0.12.0         
submariner-operator             registry.redhat.io/rhacm2                             e83e1e4b0abf2fb 
service-discovery               registry.redhat.io/rhacm2                             v0.12.0         

###############################################################
#       Images of Pods in namespace submariner-operator       #
###############################################################


### submariner-operator-bundle-index Image ###
id=image-registry.openshift-image-registry.svc:5000/submariner-operator/submariner-operator-bundle-index@sha256:d885ecb0d892c361c47381d63381bf5203fd08b370e959ab1e7c5127f2acc188
name=openshift/ose-operator-registry
release=202203120157.p0.g0c0d23c.assembly.stream
url =https://access.redhat.com/containers/#/registry.access.redhat.com/openshift/ose-operator-registry/images/v4.9.0-202203120157.p0.g0c0d23c.assembly.stream 
version=v4.9.0

### ocp-v4.0-art-dev Image ###
id=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3ee2b825aa8ecc5711b914e6e98ceb324b1960f34ab56a97e0b817e91c291c6c

### ocp-v4.0-art-dev Image ###
id=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bb7321686220855dfc3f435e0f653dbce87699e0e019c531a1629ec38c227984

### rhacm2-submariner-operator-bundle Image ###
id=registry-proxy.engineering.redhat.com/rh-osbs/rhacm2-submariner-operator-bundle@sha256:4b3b3d3eaf2eb26c0eb1f7c30c4cd8a7af1ff34077b7187654511a680a73acf4
name=rhacm2/submariner-operator-bundle
release=27
url =https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2/submariner-operator-bundle/images/v0.12.0-27 
version=v0.12.0

### lighthouse-agent-rhel8 Image ###
id=registry.redhat.io/rhacm2/lighthouse-agent-rhel8@sha256:c2e94eed47735410f3c38e3212e90225633ea2c55b7727eac1b367378cc36d96
name=rhacm2/lighthouse-agent-rhel8
release=14
url =https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2/lighthouse-agent-rhel8/images/v0.12.0-14 
version=v0.12.0

### lighthouse-coredns-rhel8 Image ###
id=registry.redhat.io/rhacm2/lighthouse-coredns-rhel8@sha256:77a96198848f8f35a26d1a6460587594546a3f4910e14a5977b2212a7596a2cf
name=rhacm2/lighthouse-coredns-rhel8
release=16
url =https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2/lighthouse-coredns-rhel8/images/v0.12.0-16 
version=v0.12.0

### submariner-addon-rhel8 Image ###
id=registry.redhat.io/rhacm2/submariner-addon-rhel8@sha256:0de8a495abd6874f1afb3b2f56045cb7c0c55b54c8d91b5911ab7e930a42c2d9

### submariner-gateway-rhel8 Image ###
id=registry.redhat.io/rhacm2/submariner-gateway-rhel8@sha256:86883aea450372cbf81948e24331755b48799ea2e86cb4a254612c268110e872
name=rhacm2/submariner-gateway-rhel8
release=18
url =https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2/submariner-gateway-rhel8/images/v0.12.0-18 
version=v0.12.0

### submariner-globalnet-rhel8 Image ###
id=registry.redhat.io/rhacm2/submariner-globalnet-rhel8@sha256:c5869974ff6caac49efcc9824c0f22fefd9e045f71979a3f97423c5064254f31
name=rhacm2/submariner-globalnet-rhel8
release=18
url =https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2/submariner-globalnet-rhel8/images/v0.12.0-18 
version=v0.12.0

### submariner-rhel8-operator Image ###
id=registry.redhat.io/rhacm2/submariner-rhel8-operator@sha256:ac899263b1e8fe70d2f4f1b314ba9e83e0cd6d9aa758c47a06092b4492c43cea
name=rhacm2/submariner-rhel8-operator
release=32
url =https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2/submariner-rhel8-operator/images/v0.12.0-32 
version=v0.12.0

### submariner-route-agent-rhel8 Image ###
id=registry.redhat.io/rhacm2/submariner-route-agent-rhel8@sha256:f6ad4530020510cd45ea02de48c3d5431300c8aff6bddad99cc80de040346bed
name=rhacm2/submariner-route-agent-rhel8
release=17
url =https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2/submariner-route-agent-rhel8/images/v0.12.0-17 
version=v0.12.0

Comment 1 Nir Yechiel 2022-04-13 07:15:02 UTC
The root cause is described in https://github.com/submariner-io/submariner/issues/1774.

Once a fix is available, we will assess the possibility of backporting it to ACM 2.5.z.

Comment 8 Maayan Friedman 2022-06-09 12:41:17 UTC
We already merged a Release Notes update that mentions this as a known issue.
The fix is merged in the upstream 0.12 branch and will be available in the next release.

Comment 9 Noam Manos 2022-08-16 19:43:41 UTC
ACM 2.5.2 with Submariner 0.12.2 has been tested successfully with Globalnet, including between the on-prem OSP and AWS clusters.

All E2E and system tests passed, including those that previously failed due to the MTU issue (on version 0.12.1):
https://qe-jenkins-csb-skynet.apps.ocp-c1.prod.psi.redhat.com/job/ACM-2.5.2-Submariner-0.12.2-AWS-OSP-Globalnet/7/testReport/(root)/

Comment 14 errata-xmlrpc 2022-09-13 20:06:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Critical: Red Hat Advanced Cluster Management 2.5.2 security fixes and bug fixes), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:6507

