Bug 2090311

Summary: Submariner addon status doesn't track all deployment failures
Product: Red Hat Advanced Cluster Management for Kubernetes Reporter: Vishal Thapar <vthapar>
Component: Submariner Assignee: Vishal Thapar <vthapar>
Status: CLOSED ERRATA QA Contact: Noam Manos <nmanos>
Severity: medium Docs Contact: Christopher Dawson <cdawson>
Priority: unspecified    
Version: rhacm-2.5 CC: dfarrell, maafried, mbabushk, njean, nyechiel, skitt
Target Milestone: --- Flags: maafried: rhacm-2.5.z+
Target Release: rhacm-2.6   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ACM 2.6 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-09-13 20:06:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vishal Thapar 2022-05-25 13:44:27 UTC
**What happened**:
When deploying Submariner 0.12.0 or later through the Submariner addon, the submariner ManagedClusterAddon tracks the status of the Submariner deployments and reports "Agent Degraded" if any of them fails to come up. However, it tracks only some of the deployments; if any untracked deployment fails, it erroneously reports the agent status as Ready.

Currently tracked:
submariner-gateway
submariner-routeagent
submariner-operator

Not tracked:
submariner-globalnet
submariner-networkplugin-syncer
lighthouse-agent
lighthouse-coredns

**What you expected to happen**:
When any of the untracked deployments fails, Agent Status should be Degraded.
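
The gap between the current and the expected behaviour can be illustrated with a minimal sketch. It is not the addon's actual implementation (the addon is not written in Python); the `WorkloadStatus` type, the `degraded_condition` helper and the `ComponentsDegraded` reason are hypothetical names introduced only for illustration, and the sketch assumes that a component which is not deployed at all is simply skipped.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

# Component names as listed in this report.
TRACKED_BEFORE_FIX = (
    "submariner-gateway",
    "submariner-routeagent",
    "submariner-operator",
)
ALL_COMPONENTS = TRACKED_BEFORE_FIX + (
    "submariner-globalnet",
    "submariner-networkplugin-syncer",
    "lighthouse-agent",
    "lighthouse-coredns",
)


@dataclass(frozen=True)
class WorkloadStatus:
    """Hypothetical summary of one workload: desired vs. available pods."""
    desired: int
    available: int


def degraded_condition(statuses: Dict[str, WorkloadStatus],
                       names: Iterable[str]) -> dict:
    """Build a SubmarinerAgentDegraded-style condition from the given names.

    Assumption: components absent from `statuses` are not deployed and are
    skipped rather than treated as failures.
    """
    failing = sorted(
        name for name in names
        if name in statuses
        and statuses[name].available < statuses[name].desired
    )
    if failing:
        return {
            "type": "SubmarinerAgentDegraded",
            "status": "True",
            "reason": "ComponentsDegraded",  # hypothetical reason string
            "message": "Unavailable: " + ", ".join(failing),
        }
    return {
        "type": "SubmarinerAgentDegraded",
        "status": "False",
        "reason": "SubmarinerAgentDeployed",
        "message": "All checked components are available.",
    }


# lighthouse-agent cannot pull its image, everything else is healthy.
statuses = {
    "submariner-gateway": WorkloadStatus(desired=1, available=1),
    "submariner-routeagent": WorkloadStatus(desired=3, available=3),
    "submariner-operator": WorkloadStatus(desired=1, available=1),
    "lighthouse-agent": WorkloadStatus(desired=1, available=0),
    "lighthouse-coredns": WorkloadStatus(desired=2, available=2),
}

before = degraded_condition(statuses, TRACKED_BEFORE_FIX)
after = degraded_condition(statuses, ALL_COMPONENTS)
print(before["status"], after["status"])  # prints: False True
```

With only the three tracked components checked, the failing lighthouse-agent goes unnoticed and the condition stays "False"; checking the full list turns it "True", which is the behaviour this report asks for.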

**How to reproduce it (as minimally and precisely as possible)**:
To reproduce:

1. Install submariner-addon on a managed cluster.
2. Patch the submarinerconfig for that managed cluster to use an invalid lighthouse-agent or lighthouse-coredns image.

kubectl patch submarinerconfigs submariner -n <managed-cluster-name> --type "json" -p '[
{"op":"add","path":"/spec/imagePullSpecs/lighthouseAgentImagePullSpec","value":"quay.io/submariner/lighthouse-agent:invalid"}]'

3. Check the status of the lighthouse-agent pod. It should be in an Error state (ImagePullError).

4. Get the managedclusteraddon submariner for that managed cluster.

kubectl get managedclusteraddon submariner -n <managed-cluster-name> -o yaml

5. Check StatusTypeAgentDegraded (the SubmarinerAgentDegraded condition). Even though the lighthouse-agent pods are in error, it will be set to "False".

  - lastTransitionTime: "2022-06-07T07:16:46Z"
    message: Submariner (submariner.v0.11.2) is deployed on managed cluster.   ---> This means everything is up
    reason: SubmarinerAgentDeployed
    status: "False" -----> This means everything is up.
    type: SubmarinerAgentDegraded
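
The check in the last step can also be scripted. The following is a minimal sketch, assuming the ManagedClusterAddon has been fetched as JSON (for example with `-o json`) and that its conditions live under `status.conditions`, as the excerpt above suggests. The `find_condition` helper is a hypothetical name, and the sample document only mirrors the condition shown in this report.

```python
import json
from typing import Optional


def find_condition(addon: dict, cond_type: str) -> Optional[dict]:
    """Return the first condition of the given type, or None if absent."""
    for cond in addon.get("status", {}).get("conditions", []):
        if cond.get("type") == cond_type:
            return cond
    return None


# Sample input mirroring the condition quoted in this report.
sample = json.loads("""
{
  "status": {
    "conditions": [
      {
        "lastTransitionTime": "2022-06-07T07:16:46Z",
        "message": "Submariner (submariner.v0.11.2) is deployed on managed cluster.",
        "reason": "SubmarinerAgentDeployed",
        "status": "False",
        "type": "SubmarinerAgentDegraded"
      }
    ]
  }
}
""")

cond = find_condition(sample, "SubmarinerAgentDegraded")
print(cond["status"], cond["reason"])  # prints: False SubmarinerAgentDeployed
```

A status of "False" while the lighthouse-agent pod is failing to pull its image reproduces the bug; once the fix is in place, the same check is expected to report "True".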

**Anything else we need to know?**:

**Environment**:
- Submariner version (use `subctl version`):
- Kubernetes version (use `kubectl version`):
- Diagnose information (use `subctl diagnose all`):
- Gather information (use `subctl gather`)
- Cloud provider or hardware configuration:
- OS (e.g `cat /etc/os-release`):
- Kernel (e.g `uname -a`):
- Install tools:
- Others:

Comment 1 Stephen Kitt 2022-07-04 10:11:44 UTC
Fixed by https://github.com/stolostron/submariner-addon/pull/378

Comment 5 Maxim Babushkin 2022-08-02 14:05:11 UTC
The patch has been verified on ACM 2.6 / Submariner 0.13.0.

Comment 10 errata-xmlrpc 2022-09-13 20:06:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Critical: Red Hat Advanced Cluster
Management 2.5.2 security fixes and bug fixes), and where to find the
updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:6507