Bug 2187277 - [Fusion-aaS] managed-fusion-agent.v2.0.11 csv failed in deployment
Summary: [Fusion-aaS] managed-fusion-agent.v2.0.11 csv failed in deployment
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-managed-service
Version: 4.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ohad
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-04-17 10:35 UTC by suchita
Modified: 2023-08-09 17:00 UTC
3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-04-19 06:05:25 UTC
Embargoed:



Description suchita 2023-04-17 10:35:08 UTC
Description of problem:
The managed-fusion-agent.v2.0.11 CSV failed during deployment with the following error in the log:
-----------------------
2023-04-17T06:45:14.621Z    INFO    controllers.ManagedFusion    reconciling PrometheusProxyNetworkPolicy resources
2023-04-17T06:45:14.622Z    ERROR    controllers.ManagedFusion    An error was encountered during reconcilePhases    {"error": "failed to update egressFirewall: unable to get AWS IMDS ConfigMap: ConfigMap \"aws-data\" not found"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:298
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:214
2023-04-17T06:45:14.622Z    ERROR    controller-runtime.manager.controller.secret    Reconciler error    {"reconciler group": "", "reconciler kind": "Secret", "name": "builder-token-dntdr", "namespace": "openshift-logging", "error": "failed to update egressFirewall: unable to get AWS IMDS ConfigMap: ConfigMap \"aws-data\" not found"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:214
-----------------------

$ oc get csv -n openshift-storage
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
managed-fusion-agent.v2.0.11              Managed Fusion Agent          2.0.11                                                      Failed

Version-Release number of selected component (if applicable):
oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.12.12   True        False         6h15m

$ oc get csv
NAME                                      DISPLAY                  VERSION           REPLACES                                  PHASE
managed-fusion-agent.v2.0.11              Managed Fusion Agent     2.0.11                                                      Succeeded
observability-operator.v0.0.20            Observability Operator   0.0.20            observability-operator.v0.0.19            Succeeded
ose-prometheus-operator.4.10.0            Prometheus Operator      4.10.0                                                      Succeeded
route-monitor-operator.v0.1.494-a973226   Route Monitor Operator   0.1.494-a973226   route-monitor-operator.v0.1.493-a866e7c   Succeeded

$ oc get csv
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
mcg-operator.v4.11.6                      NooBaa Operator               4.11.6            mcg-operator.v4.11.5                      Succeeded
observability-operator.v0.0.20            Observability Operator        0.0.20            observability-operator.v0.0.19            Succeeded
ocs-operator.v4.11.6                      OpenShift Container Storage   4.11.6            ocs-operator.v4.11.5                      Succeeded
ocs-osd-deployer.v2.0.12                  OCS OSD Deployer              2.0.12            ocs-osd-deployer.v2.0.11                  Succeeded
odf-csi-addons-operator.v4.11.6           CSI Addons                    4.11.6            odf-csi-addons-operator.v4.11.5           Succeeded
odf-operator.v4.11.6                      OpenShift Data Foundation     4.11.6            odf-operator.v4.11.5                      Succeeded
ose-prometheus-operator.4.10.0            Prometheus Operator           4.10.0            ose-prometheus-operator.4.8.0             Succeeded
route-monitor-operator.v0.1.494-a973226   Route Monitor Operator        0.1.494-a973226   route-monitor-operator.v0.1.493-a866e7c   Succeeded


How reproducible:
4/4

Steps to Reproduce:
1. Deploy a new Managed Fusion agent using the deployment procedure at https://docs.google.com/document/d/1Jdx8czlMjbumvilw8nZ6LtvWOMAx3H4TfwoVwiBs0nE/edit?usp=sharing

Actual results:
The managed-fusion-agent.v2.0.11 CSV is in Failed status.

Expected results:
The managed-fusion-agent.v2.0.11 CSV should reach the Succeeded phase.

Additional info:
Workaround:
The issue occurs because a set of labels was added to the namespace to enforce pod security, and because the AWS data-gather pod requires host networking, the pod was not allowed to come up.
As a workaround, apply these labels to the managed-fusion namespace if you see this issue:

labels:
    kubernetes.io/metadata.name: managed-fusion
    pod-security.kubernetes.io/audit: baseline
    pod-security.kubernetes.io/audit-version: v1.24
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/warn: baseline
    pod-security.kubernetes.io/warn-version: v1.24
    security.openshift.io/scc.podSecurityLabelSync: "false"
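
The labels above can be applied by patching the existing namespace. A minimal sketch as a Namespace manifest (the namespace name is taken from the labels above; the structure is standard Kubernetes pod-security admission labeling, not an official Fusion manifest):

```yaml
# Hypothetical manifest illustrating the workaround: relax pod-security
# enforcement on the managed-fusion namespace so the host-network pod
# can start. Apply with: oc apply -f managed-fusion-ns.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: managed-fusion
  labels:
    kubernetes.io/metadata.name: managed-fusion
    pod-security.kubernetes.io/audit: baseline
    pod-security.kubernetes.io/audit-version: v1.24
    # "privileged" enforcement permits hostNetwork pods
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/warn: baseline
    pod-security.kubernetes.io/warn-version: v1.24
    # Prevent OpenShift from overwriting these labels via SCC sync
    security.openshift.io/scc.podSecurityLabelSync: "false"
```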

Reference Threadlink: https://chat.google.com/room/AAAANBK1onY/fGOB7s8Or6E

Comment 1 Dhruv Bindra 2023-04-19 06:05:25 UTC
The pod-security.kubernetes.io/enforce label was added by the automation that the QE team uses. The automation has been updated to not add the label and to create the namespace using the oc new-project command, as that is the command the lambda will use to create the namespace. Closing the BZ as not a bug.

