Bug 2238720 - [ACM Tracker] [IBMZ / MDR]: Ramen-dr-cluster-operator is not deployed on managed clusters after applying drpolicy
Summary: [ACM Tracker] [IBMZ / MDR]: Ramen-dr-cluster-operator is not deployed on mana...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-dr
Version: 4.14
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.14.0
Assignee: Shyamsundar
QA Contact: avdhoot
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-09-13 10:36 UTC by Sravika
Modified: 2023-11-08 18:55 UTC
CC List: 11 users

Fixed In Version: acm-operator-bundle-container-v2.9.0-150
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-11-08 18:54:31 UTC
Embargoed:


Attachments
ramen-hub-operator.log (317.87 KB, text/plain)
2023-09-13 10:36 UTC, Sravika


Links
Red Hat Product Errata RHSA-2023:6832 (last updated 2023-11-08 18:55:10 UTC)

Description Sravika 2023-09-13 10:36:07 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

Ramen-dr-cluster-operator is not deployed on the managed clusters after applying the drpolicy, and the drpolicy status shows the managed clusters as DRClustersUnavailable. Only the volsync CSV is deployed on the managed clusters.

hub:

# oc get drpolicy ocsm4205001-ocpm4202001 -oyaml
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRPolicy
metadata:
  creationTimestamp: "2023-09-13T08:29:51Z"
  generation: 1
  name: ocsm4205001-ocpm4202001
  resourceVersion: "851658"
  uid: 91c92432-0e73-4e5f-a962-aa324487dab0
spec:
  drClusters:
  - ocsm4205001
  - ocpm4202001
  schedulingInterval: 0m
status:
  conditions:
  - lastTransitionTime: "2023-09-13T08:35:20Z"
    message: none of the DRClusters are validated ([ocsm4205001 ocpm4202001])
    observedGeneration: 1
    reason: DRClustersUnavailable
    status: "False"
    type: Validated

[root@a3e25001 ~]# oc get drclusters -A
NAME          AGE
ocpm4202001   101m
ocsm4205001   101m


[root@a3e25001 ~]# oc describe drclusters ocsm4205001
Name:         ocsm4205001
Namespace:
Labels:       cluster.open-cluster-management.io/backup=resource
Annotations:  <none>
API Version:  ramendr.openshift.io/v1alpha1
Kind:         DRCluster
Metadata:
  Creation Timestamp:  2023-09-13T08:35:20Z
  Finalizers:
    drclusters.ramendr.openshift.io/ramen
  Generation:        1
  Resource Version:  851682
  UID:               f5213978-ead9-4f9f-9c36-0fd50b511225
Spec:
  Region:         778d5284-ddf7-11ed-a790-525400c41d12
  s3ProfileName:  s3profile-ocsm4205001-ocs-external-storagecluster
Status:
  Conditions:
    Last Transition Time:  2023-09-13T08:35:20Z
    Message:               Cluster Clean
    Observed Generation:   1
    Reason:                Clean
    Status:                False
    Type:                  Fenced
    Last Transition Time:  2023-09-13T08:35:20Z
    Message:               Cluster Clean
    Observed Generation:   1
    Reason:                Clean
    Status:                True
    Type:                  Clean
    Last Transition Time:  2023-09-13T08:35:21Z
    Message:               DRCluster ManifestWork is not in applied state
    Observed Generation:   1
    Reason:                DrClustersDeployStatusCheckFailed
    Status:                False
    Type:                  Validated
  Phase:                   Available
Events:                    <none>
[root@a3e25001 ~]#

[root@a3e25001 ~]# oc describe drclusters ocpm4202001
Name:         ocpm4202001
Namespace:
Labels:       cluster.open-cluster-management.io/backup=resource
Annotations:  <none>
API Version:  ramendr.openshift.io/v1alpha1
Kind:         DRCluster
Metadata:
  Creation Timestamp:  2023-09-13T08:35:20Z
  Finalizers:
    drclusters.ramendr.openshift.io/ramen
  Generation:        1
  Resource Version:  851719
  UID:               be820d41-47a7-462b-bb0d-070934778443
Spec:
  Region:         778d5284-ddf7-11ed-a790-525400c41d12
  s3ProfileName:  s3profile-ocpm4202001-ocs-external-storagecluster
Status:
  Conditions:
    Last Transition Time:  2023-09-13T08:35:21Z
    Message:               Cluster Clean
    Observed Generation:   1
    Reason:                Clean
    Status:                False
    Type:                  Fenced
    Last Transition Time:  2023-09-13T08:35:21Z
    Message:               Cluster Clean
    Observed Generation:   1
    Reason:                Clean
    Status:                True
    Type:                  Clean
    Last Transition Time:  2023-09-13T08:35:22Z
    Message:               DRCluster ManifestWork is not in applied state
    Observed Generation:   1
    Reason:                DrClustersDeployStatusCheckFailed
    Status:                False
    Type:                  Validated
  Phase:                   Available
Events:                    <none>
[root@a3e25001 ~]#
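
Both DRClusters report "DRCluster ManifestWork is not in applied state", which points at the ManifestWork that should roll out the DR cluster operator to the managed clusters. As a hedged diagnostic sketch (the exact ManifestWork name can differ between Ramen versions), the ManifestWorks Ramen creates in each managed cluster's namespace on the hub can be listed and their Applied/Available conditions inspected with:

# oc get manifestwork -n <managed_cluster_name>
# oc describe manifestwork <manifestwork_name> -n <managed_cluster_name>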


mc1:

[root@m4205001 ~]# oc get csv,pod -n openshift-dr-system
NAME                                                                DISPLAY   VERSION   REPLACES                 PHASE
clusterserviceversion.operators.coreos.com/volsync-product.v0.7.4   VolSync   0.7.4     volsync-product.v0.7.3   Succeeded
[root@m4205001 ~]#


mc2:

[root@m4202001 ~]# oc get csv,pod -n openshift-dr-system
NAME                                                                DISPLAY   VERSION   REPLACES                 PHASE
clusterserviceversion.operators.coreos.com/volsync-product.v0.7.4   VolSync   0.7.4     volsync-product.v0.7.3   Succeeded
[root@m4202001 ~]#



Version of all relevant components (if applicable):

odf-multicluster-orchestrator: v4.14.0-132.stable
odr-hub-operator: v4.14.0-132.stable
volsync-product : v0.7.4 
odf-operator: v4.14.0-132.stable

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes, MDR 4.14 testing is blocked


Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Can this issue reproducible?
Yes, reproduced it twice consistently

Can this issue reproduce from the UI?
NA

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Configure the Metro DR environment (deploy Red Hat Ceph Storage, install ODF in external mode on the managed clusters, install ACM and MCO on the hub cluster)
2. Configure SSL access across clusters
3. Import the managed clusters on the hub cluster
4. Create the DRPolicy on the hub cluster:
     All Clusters → Data Services → Data policies → Create DRPolicy
5. Verify that the DRPolicy is created successfully (an illustrative healthy output for steps 5 and 6 is shown after these steps).
 # oc get drpolicy <drpolicy_name> -o jsonpath='{.status.conditions[].reason}{"\n"}'

6. Verify that the OpenShift DR Cluster operator installation was successful on the Primary managed cluster and the Secondary managed cluster.

# oc get csv,pod -n openshift-dr-system
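
For reference, assuming a healthy deployment, step 5 is expected to report a reason such as Succeeded, and step 6 should show the Openshift DR Cluster Operator CSV and the ramen-dr-cluster-operator pod in addition to VolSync. The output below is only an illustrative sketch (build numbers, pod hash, and age are placeholders, and columns are trimmed for brevity; it is not output from this environment):

# oc get drpolicy <drpolicy_name> -o jsonpath='{.status.conditions[].reason}{"\n"}'
Succeeded

# oc get csv,pod -n openshift-dr-system
NAME                                                                               DISPLAY                         VERSION           PHASE
clusterserviceversion.operators.coreos.com/odr-cluster-operator.v4.14.0-<build>    Openshift DR Cluster Operator   4.14.0-<build>    Succeeded
clusterserviceversion.operators.coreos.com/volsync-product.v0.7.4                  VolSync                         0.7.4             Succeeded

NAME                                            READY   STATUS    RESTARTS   AGE
pod/ramen-dr-cluster-operator-<pod_hash>        2/2     Running   0          <age>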


Actual results:
Ramen-dr-cluster-operator is not deployed on the managed clusters after applying the drpolicy.

Expected results:
Ramen-dr-cluster-operator should be deployed on the managed clusters after applying the drpolicy.

Additional info:

https://drive.google.com/file/d/1xOlrW211_hrAKQO5b2H6BcGQ168H8-Re/view?usp=sharing

Comment 3 Sravika 2023-09-13 10:36:54 UTC
Created attachment 1988631 [details]
ramen-hub-operator.log

Comment 10 Shyamsundar 2023-09-20 13:47:03 UTC
Fixed in acm-operator-bundle-container-v2.9.0-150 and later ACM 2.9 internal releases.

Comment 11 Sravika 2023-09-21 12:50:40 UTC
With the latest acm-operator-bundle-container-v2.9.0-150, I no longer see this issue, and the ramen-dr-cluster-operator is deployed on the managed clusters after applying the drpolicy.

# oc get sa -n open-cluster-management-agent
NAME                 SECRETS   AGE
builder              1         16h
default              1         16h
deployer             1         16h
klusterlet           1         16h
klusterlet-work-sa   1         16h
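
For completeness, a quick complementary check, assuming the default agent namespace, is to confirm that the klusterlet agent pods (which apply the hub's ManifestWorks on the managed cluster) are running:

# oc get pods -n open-cluster-management-agent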

Comment 12 avdhoot 2023-10-16 06:30:31 UTC
Observation- 
---------------

With the latest acm-operator-bundle-container-v2.9.0-165, the ramen-dr-cluster-operator is deployed on the managed clusters after deleting and re-applying the drpolicy.

Also, the ramen-hub-operator restarted on the hub and the ramen-dr-cluster-operator restarted on the primary managed cluster. No ramen-dr-cluster-operator restarts were seen on the secondary cluster.


Steps Performed:
---------------------
1. On the existing Metro DR environment, deleted all apps (Subscription + AppSet) and the drpolicy.
2. Created a new drpolicy with the same name.
3. Created a Subscription-based app.
4. Applied the drpolicy to the app.


hub%  oc get pods -n openshift-operators  
NAME                                        READY   STATUS    RESTARTS      AGE
odf-multicluster-console-854b88488b-q6tpn   1/1     Running   0             4d21h
odfmo-controller-manager-8585fbddb8-jctpj   1/1     Running   2 (45h ago)   4d21h
ramen-hub-operator-7dc77db778-szcw4         2/2     Running   1 (45h ago)   4d21h

clust1%  oc get csv,pod -n openshift-dr-system
NAME                                                                                 DISPLAY                         VERSION             REPLACES                                  PHASE
clusterserviceversion.operators.coreos.com/odr-cluster-operator.v4.14.0-146.stable   Openshift DR Cluster Operator   4.14.0-146.stable   odr-cluster-operator.v4.14.0-139.stable   Succeeded
clusterserviceversion.operators.coreos.com/volsync-product.v0.7.4                    VolSync                         0.7.4               volsync-product.v0.7.3                    Succeeded

NAME                                            READY   STATUS    RESTARTS        AGE
pod/ramen-dr-cluster-operator-9c78ffc78-xqz9m   2/2     Running   1 (2d20h ago)   2d20h

Comment 15 avdhoot 2023-10-25 04:15:33 UTC
Raised a new Bugzilla for the issue mentioned in comment 12:
https://bugzilla.redhat.com/show_bug.cgi?id=2245230

Hence marking this BZ as verified.

Comment 18 errata-xmlrpc 2023-11-08 18:54:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6832

