Bug 2066085 - Submariner (with Globalnet enabled) exported service - ip was not resolved after recreating same service
Summary: Submariner (with Globalnet enabled) exported service - ip was not resolved af...
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Advanced Cluster Management for Kubernetes
Classification: Red Hat
Component: Submariner
Version: rhacm-2.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
: ---
Assignee: Sridhar Gaddam
QA Contact: Noam Manos
Christopher Dawson
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-03-20 15:56 UTC by Noam Manos
Modified: 2022-11-17 14:41 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-17 14:41:41 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github stolostron backlog issues 20941 0 None None None 2022-03-20 16:20:45 UTC
Github submariner-io submariner issues 1734 0 None open GlobalNet- exported service isn't reachable after service re creation 2022-03-20 15:56:36 UTC

Description Noam Manos 2022-03-20 15:56:37 UTC
**What happened**:

On cluster B: re-create a previously deleted service that had already been exported:

$ export KUBECONFIG=kubconf_nmanos-osp-skynet-b2

$ oc  get svc nginx-cl-bc -n test-submariner

NAME          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
nginx-cl-bc   ClusterIP   100.97.80.188   <none>        8080/TCP   21m

$ oc  describe pod nginx-cl-bc-766cb5c66-8rdsz -n test-submariner

Name:         nginx-cl-bc-766cb5c66-8rdsz
Namespace:    test-submariner
Priority:     0
Node:         default-cl1-7cdfw-worker-0-ztzb8/10.167.2.132
Start Time:   Sun, 20 Mar 2022 14:49:36 +0200
Labels:       app=nginx-cl-bc
              pod-template-hash=766cb5c66
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "",
                    "interface": "eth0",
                    "ips": [
                        "10.228.2.5"
                    ],
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "",
                    "interface": "eth0",
                    "ips": [
                        "10.228.2.5"
                    ],
                    "default": true,
                    "dns": {}
                }]
              openshift.io/scc: restricted
Status:       Running
IP:           10.228.2.5
IPs:
  IP:           10.228.2.5
Controlled By:  ReplicaSet/nginx-cl-bc-766cb5c66
Containers:
  nginx:
    Container ID:   cri-o://2ce08e7faa2d3e57513cbee4be14115548fd06e6e5cf23d18a2db879e2745d8f
    Image:          quay.io/bitnami/nginx
    Image ID:       quay.io/bitnami/nginx@sha256:d095971ce8f34adb65267f41d2bfa6eef7952375e55a1e6f06412519f4fee12d
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Sun, 20 Mar 2022 14:49:39 +0200
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-lhwqv (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-lhwqv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-lhwqv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason          Age   From               Message
  ----    ------          ----  ----               -------
  Normal  Scheduled       179m  default-scheduler  Successfully assigned test-submariner/nginx-cl-bc-766cb5c66-8rdsz to default-cl1-7cdfw-worker-0-ztzb8
  Normal  AddedInterface  179m  multus             Add eth0 [10.228.2.5/23]
  Normal  Pulling         179m  kubelet            Pulling image "quay.io/bitnami/nginx"
  Normal  Pulled          179m  kubelet            Successfully pulled image "quay.io/bitnami/nginx" in 217.035951ms
  Normal  Created         179m  kubelet            Created container nginx
  Normal  Started         179m  kubelet            Started container nginx

$ oc get serviceexport -A

NAMESPACE         NAME          AGE
test-submariner   nginx-cl-bc   7h9m

$ oc describe serviceexport nginx-cl-bc

Name:         nginx-cl-bc
Namespace:    test-submariner
Labels:       <none>
Annotations:  <none>
API Version:  multicluster.x-k8s.io/v1alpha1
Kind:         ServiceExport
Metadata:
  Creation Timestamp:  2022-03-20T08:43:05Z
  Generation:          1
  Resource Version:    114469293
  Self Link:           /apis/multicluster.x-k8s.io/v1alpha1/namespaces/test-submariner/serviceexports/nginx-cl-bc
  UID:                 451a626e-978f-4c24-b7c3-a417f00ad525
Status:
  Conditions:
    Last Transition Time:  2022-03-20T12:39:33Z
    Message:               Service to be exported doesn't exist
    Reason:                ServiceUnavailable
    Status:                False
    Type:                  Valid
    Last Transition Time:  2022-03-20T12:39:33Z
    Message:               Service doesn't have a global IP yet
    Reason:                ServiceGlobalIPUnavailable
    Status:                False
    Type:                  Valid
    Last Transition Time:  2022-03-20T12:41:44Z
    Message:               Awaiting sync of the ServiceImport to the broker
    Reason:                AwaitingSync
    Status:                False
    Type:                  Valid
    Last Transition Time:  2022-03-20T12:41:44Z
    Message:               Service was successfully synced to the broker
    Reason:                
    Status:                True
    Type:                  Valid
    Last Transition Time:  2022-03-20T12:43:49Z
    Message:               Service to be exported doesn't exist
    Reason:                ServiceUnavailable
    Status:                False
    Type:                  Valid
    Last Transition Time:  2022-03-20T12:43:49Z
    Message:               Service doesn't have a global IP yet
    Reason:                ServiceGlobalIPUnavailable
    Status:                False
    Type:                  Valid
    Last Transition Time:  2022-03-20T12:43:50Z
    Message:               Awaiting sync of the ServiceImport to the broker
    Reason:                AwaitingSync
    Status:                False
    Type:                  Valid
    Last Transition Time:  2022-03-20T12:43:51Z
    Message:               Service was successfully synced to the broker
    Reason:                
    Status:                True
    Type:                  Valid
    Last Transition Time:  2022-03-20T12:49:37Z
    Message:               Service to be exported doesn't exist
    Reason:                ServiceUnavailable
    Status:                False
    Type:                  Valid
    Last Transition Time:  2022-03-20T12:49:37Z
    Message:               Service doesn't have a global IP yet
    Reason:                ServiceGlobalIPUnavailable
    Status:                False
    Type:                  Valid
Events:                    <none>
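
The condition list above is long, and only the last entry reflects the current state. A minimal sketch for printing just that entry follows; the function name `latest_export_condition` is hypothetical, and it assumes `oc` is logged in to the exporting cluster:

```shell
# Print "Reason: Message" for the most recent ServiceExport condition.
# $1 = ServiceExport name, $2 = namespace.
# latest_export_condition is a hypothetical helper, not part of oc or subctl.
latest_export_condition() {
  oc get serviceexport "$1" -n "$2" \
    -o jsonpath='{.status.conditions[-1:].reason}{": "}{.status.conditions[-1:].message}{"\n"}'
}
```

For the export shown above it would be called as `latest_export_condition nginx-cl-bc test-submariner`.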



# On Cluster A - Try to reach the service IP: 

$ export KUBECONFIG=kubconf_nmanos-devcluster-a2-aws

$ oc get pods -n test-submariner

NAME                READY   STATUS      RESTARTS   AGE
netshoot-cl-a       1/1     Running     0          56m

$ oc  exec netshoot-cl-a -n test-submariner -- ping -c 1 nginx-cl-bc.test-submariner.svc.clusterset.local

ping: nginx-cl-bc.test-submariner.svc.clusterset.local: Name does not resolve


**What you expected to happen**:
The re-created service should resolve and be reachable from cluster A.

**How to reproduce it (as minimally and precisely as possible)**:
As described above.

**Anything else we need to know?**:
# Cloud platform: Amazon

# OCP version: 4.10.0

# ACM version: 2.5.0

### Submariner components ###

subctl version: v0.12.0
Cluster "api-nmanos-devcluster-a2-aws-devcluster-openshift-com:6443"
 • Showing versions  ...
 ✓ Showing versions
COMPONENT                       REPOSITORY                                            VERSION         
submariner                      registry.redhat.io/rhacm2-tech-preview                v0.12.0

Comment 1 bot-tracker-sync 2022-03-22 12:01:05 UTC
G2Bsync 1075029648 comment 
 skitt Tue, 22 Mar 2022 11:02:17 UTC 
 G2BSync @vthapar could you look into this, and open an issue in Lighthouse if necessary?

Comment 6 Sridhar Gaddam 2022-05-16 11:26:56 UTC
Problem description:

Deploy submariner-addon on two (or more) clusters and wait for the connections to be established.
Create a regular service (say service-A) in Cluster-A and export it.
Access the exported service from a pod in Cluster-B; this step passes.
Now delete service-A.
Try to access the exported service from the Cluster-B pod; it fails, as expected.
Now re-create service-A in Cluster-A.
Try to access the exported service once again. Ideally this step should pass, but it fails.

It fails today because the Submariner Globalnet controller does not listen to Service events, as there would be too many services to watch.
It listens only to ServiceExport events and allocates or de-allocates global IPs in response, so a re-created service whose ServiceExport already exists is never assigned a new global IP.
We can enhance the Globalnet controller to support this use case as part of the 0.13 release.
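
One way to confirm that no global IP was allocated for the re-created service is to query the Globalnet ingress objects directly. This is a rough sketch only: the function name `service_global_ip` is hypothetical, and the resource name `globalingressip` and the fields `spec.serviceRef.name` and `status.allocatedIP` are assumptions about the Globalnet CRD that should be checked against the installed Submariner version.

```shell
# Print the global IP allocated to an exported service, if any.
# $1 = service name, $2 = namespace.
# service_global_ip is a hypothetical helper; the CRD and field names
# used in the query are assumptions, not confirmed by this report.
service_global_ip() {
  oc get globalingressip -n "$2" \
    -o jsonpath="{.items[?(@.spec.serviceRef.name==\"$1\")].status.allocatedIP}"
}
```

Empty output from `service_global_ip nginx-cl-bc test-submariner` would be consistent with the "Service doesn't have a global IP yet" condition reported on the ServiceExport.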


Work-around:
In the meantime, there is a work-around for this issue.
The user can delete the ServiceExport object corresponding to service-A and re-create it; the use case then works.
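
The work-around could be scripted roughly as follows. This is a sketch, not a tested procedure: the function name `recreate_service_export` is hypothetical, and it assumes `oc` is logged in to the cluster that hosts the service and that the service itself has already been re-created.

```shell
# Delete and re-create a ServiceExport so that Globalnet sees a fresh
# export event for the re-created service.
# $1 = service name, $2 = namespace.
# recreate_service_export is a hypothetical helper, not part of oc or subctl.
recreate_service_export() {
  # Remove the stale export; do not fail if it is already gone.
  oc delete serviceexport "$1" -n "$2" --ignore-not-found

  # Re-create the export with the same name and namespace as the service.
  cat <<EOF | oc apply -f -
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: $1
  namespace: $2
EOF
}
```

For the service in this report it would be called as `recreate_service_export nginx-cl-bc test-submariner`, after which the name lookup from the other cluster can be retried.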

Since there is an easy work-around for this problem, can we consider this a low-priority issue?

