Bug 2037914

Summary: [vSphere] TargetDown alerts for submariner components in managed clusters
Product: Red Hat Advanced Cluster Management for Kubernetes Reporter: Sidhant Agrawal <sagrawal>
Component: Submariner Assignee: Maayan Friedman <maafried>
Status: CLOSED CURRENTRELEASE QA Contact: Noam Manos <nmanos>
Severity: low Docs Contact: Christopher Dawson <cdawson>
Priority: unspecified    
Version: rhacm-2.4 CC: dfarrell, nyechiel, ocs-bugs, prsurve
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-06-16 08:24:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
screenshots_and_submariner_logs none

Description Sidhant Agrawal 2022-01-06 20:02:01 UTC
Created attachment 1849338
screenshots_and_submariner_logs

**What happened**:

Two TargetDown alerts are reported for Submariner components in both managed clusters. They are misleading because the ACM console shows the status as Healthy.

Alert details:
```
100% of the submariner-lighthouse-coredns/submariner-lighthouse-coredns targets in submariner-operator namespace have been unreachable for more than 15 minutes. This may be a symptom of network connectivity issues, down nodes, or failures within these components. Assess the health of the infrastructure and nodes running these targets and then contact support.

50% of the submariner-gateway-metrics/submariner-gateway-metrics targets in submariner-operator namespace have been unreachable for more than 15 minutes. This may be a symptom of network connectivity issues, down nodes, or failures within these components. Assess the health of the infrastructure and nodes running these targets and then contact support.
```
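To make the percentages in these alerts concrete, here is a minimal sketch of how such a figure could be derived. It assumes the alert is based on the Prometheus `up` metric grouped by namespace, job, and service; the sample data and the `percent_down` helper are hypothetical and are not taken from the affected clusters.

```python
from collections import defaultdict


def percent_down(samples):
    """Return {(namespace, job, service): percent of targets with up == 0}.

    Hypothetical helper: mirrors the assumed grouping of the alert,
    not the actual alerting rule shipped with the cluster.
    """
    total = defaultdict(int)
    down = defaultdict(int)
    for sample in samples:
        key = (sample["namespace"], sample["job"], sample["service"])
        total[key] += 1
        if sample["up"] == 0:
            down[key] += 1
    return {key: 100.0 * down[key] / total[key] for key in total}


NS = "submariner-operator"
COREDNS = "submariner-lighthouse-coredns"
GATEWAY = "submariner-gateway-metrics"

# Hypothetical scrape results: both CoreDNS targets unreachable,
# one of two gateway metrics targets unreachable.
samples = [
    {"namespace": NS, "job": COREDNS, "service": COREDNS, "up": 0},
    {"namespace": NS, "job": COREDNS, "service": COREDNS, "up": 0},
    {"namespace": NS, "job": GATEWAY, "service": GATEWAY, "up": 1},
    {"namespace": NS, "job": GATEWAY, "service": GATEWAY, "up": 0},
]

result = percent_down(samples)
for (namespace, job, service), pct in sorted(result.items()):
    print(f"{pct:.0f}% of the {job}/{service} targets in {namespace} are down")
```

Under this assumption the alert measures only whether the metrics endpoints can be scraped, which would explain how it can fire while the Submariner data path itself is healthy.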

**What you expected to happen**:

No false alerts in managed clusters.


**How to reproduce it (as minimally and precisely as possible)**:

1. Install ACM
2. Import 2 Managed clusters with non-overlapping networks
3. Connect the Managed clusters using Submariner add-ons via console
4. Observe the console for alerts related to Submariner components in the managed clusters.


**Anything else we need to know?**:

In the ACM console everything looks fine: Connection status and Agent status are Healthy for both managed clusters.
The results below from `subctl verify kubeconfig.c1 kubeconfig.c2 --only connectivity,service-discovery` also look good.
```
Ran 23 of 41 Specs in 641.705 seconds
SUCCESS! -- 23 Passed | 0 Failed | 0 Pending | 18 Skipped
```
Only the alerts in the managed clusters indicate issues with Submariner.
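As a quick consistency check on the test summary above, the following sketch parses the two summary lines and confirms that the counts add up. The regular expressions and the `parse_summary` helper are illustrative assumptions about this output format, not part of `subctl`.

```python
import re

# The two summary lines quoted in this report.
SUMMARY = """Ran 23 of 41 Specs in 641.705 seconds
SUCCESS! -- 23 Passed | 0 Failed | 0 Pending | 18 Skipped"""


def parse_summary(text):
    """Extract the spec counts from a test summary (hypothetical helper)."""
    ran = re.search(r"Ran (\d+) of (\d+) Specs", text)
    counts = re.search(
        r"(\d+) Passed \| (\d+) Failed \| (\d+) Pending \| (\d+) Skipped", text
    )
    if ran is None or counts is None:
        raise ValueError("unrecognized summary format")
    passed, failed, pending, skipped = (int(n) for n in counts.groups())
    return {
        "ran": int(ran.group(1)),
        "total": int(ran.group(2)),
        "passed": passed,
        "failed": failed,
        "pending": pending,
        "skipped": skipped,
    }


summary = parse_summary(SUMMARY)

# Every executed spec either passed or failed.
ran_matches = summary["passed"] + summary["failed"] == summary["ran"]
# Executed, pending, and skipped specs together cover the whole suite.
total_matches = (
    summary["ran"] + summary["pending"] + summary["skipped"] == summary["total"]
)
print(summary, ran_matches, total_matches)
```

All 23 executed specs passed, and the 18 skipped specs account for the rest of the 41, which is consistent with restricting the run to `connectivity,service-discovery`.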


**Environment**:
- Platform: VMware vSphere
- Versions:
    OCP: 4.9.0-0.nightly-2021-12-23-045233
    RHACM: 2.4.1
    Submariner: 0.11.0
- The Submariner version and image repository, along with the diagnose and gather information from both managed clusters, can be found in the attachment.

Comment 4 Daniel Farrell 2022-05-26 12:50:09 UTC
Submariner had no support for vSphere in 0.11, and in the (upcoming) 0.12 it's tech preview. I think there have been some relevant changes in the UI and in how we pass statuses. It's possible this was fixed, but it would be good to re-test with ACM 2.5 and SubM 0.12.*.

Comment 6 Nir Yechiel 2022-06-16 08:24:37 UTC
This was retested with ACM 2.5/Submariner 0.12.1 and seems to be working fine now.