Bug 2034279 - Restore/backup shows up as Validation failed but the restore backup status in ACM shows success
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Advanced Cluster Management for Kubernetes
Classification: Red Hat
Component: Cluster Lifecycle
Version: rhacm-2.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: rhacm-2.5
Assignee: Sahar Ebrahimi
QA Contact: Thuy Nguyen
Docs Contact: Christopher Dawson
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-12-20 14:56 UTC by David Luong
Modified: 2022-06-09 02:07 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-06-09 02:07:01 UTC
Target Upstream Version:
Embargoed:
bot-tracker-sync: rhacm-2.5+


Links
System ID Private Priority Status Summary Last Updated
Github open-cluster-management backlog issues 18630 0 None None None 2021-12-20 17:32:51 UTC
Red Hat Product Errata RHSA-2022:4956 0 None None None 2022-06-09 02:07:34 UTC

Description David Luong 2021-12-20 14:56:03 UTC
Description of the problem:
The ACM Restore resource reports success (Phase: Finished) while the underlying Velero restore reports FailedValidation.

Release version:
2.4

Operator snapshot version:

OCP version:

Browser Info:

Steps to reproduce:
1.
2.
3.

Actual results:
```
oc describe restore
Name:         restore-acm
Namespace:    openshift-adp
Labels:       <none>
Annotations:  <none>
API Version:  cluster.open-cluster-management.io/v1beta1
Kind:         Restore
Metadata:
  Creation Timestamp:  2021-12-17T17:11:50Z
  Generation:          1
  Managed Fields:
    API Version:  cluster.open-cluster-management.io/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:veleroCredentialsBackupName:
        f:veleroManagedClustersBackupName:
        f:veleroResourcesBackupName:
    Manager:      kubectl-create
    Operation:    Update
    Time:         2021-12-17T17:11:50Z
    API Version:  cluster.open-cluster-management.io/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:lastMessage:
        f:phase:
        f:veleroCredentialsRestoreName:
    Manager:         manager
    Operation:       Update
    Time:            2021-12-17T17:11:50Z
  Resource Version:  180363839
  UID:               77471152-7d7a-4723-b57b-26414bb4d89b
Spec:
  Velero Credentials Backup Name:       latest
  Velero Managed Clusters Backup Name:  latest
  Velero Resources Backup Name:         latest
Status:
  Last Message:                     Restore Complete restore-acm-acm-credentials-schedule-20211217164355
  Phase:                            Finished
  Velero Credentials Restore Name:  restore-acm-acm-credentials-schedule-20211217164355
Events:
  Type    Reason                   Age                   From                Message
  ----    ------                   ----                  ----                -------
  Normal  Velero Restore finished  45m (x56 over 2d21h)  Restore controller  restore-acm-acm-resources-schedule-20211217164402 finished
  Normal  Velero Restore finished  45m (x56 over 2d21h)  Restore controller  restore-acm-acm-managed-clusters-schedule-20211217164402 finished
  Normal  Velero Restore finished  45m (x56 over 2d21h)  Restore controller  restore-acm-acm-credentials-schedule-20211217164355 finished
  Normal  Velero Restore finished  45m (x58 over 2d21h)  Restore controller  restore-acm-acm-resources-schedule-20211217164402 finished
  Normal  Velero Restore finished  45m (x58 over 2d21h)  Restore controller  restore-acm-acm-managed-clusters-schedule-20211217164402 finished
  Normal  Velero Restore finished  45m (x58 over 2d21h)  Restore controller  restore-acm-acm-credentials-schedule-20211217164355 finished
```
```
oc describe restores.velero.io restore-acm-acm-credentials-schedule-20211217164355
I1220 09:54:16.560763   19924 request.go:645] Throttling request took 1.024575739s, request: GET:https://api.acm-ansible-ocp-02.cee.ral3.lab.eng.rdu2.redhat.com:6443/apis/view.open-cluster-management.io/v1beta1?timeout=32s
Name:         restore-acm-acm-credentials-schedule-20211217164355
Namespace:    openshift-adp
Labels:       <none>
Annotations:  <none>
API Version:  velero.io/v1
Kind:         Restore
Metadata:
  Creation Timestamp:  2021-12-17T17:11:50Z
  Generation:          2
  Managed Fields:
    API Version:  velero.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:ownerReferences:
          .:
          k:{"uid":"77471152-7d7a-4723-b57b-26414bb4d89b"}:
            .:
            f:apiVersion:
            f:blockOwnerDeletion:
            f:controller:
            f:kind:
            f:name:
            f:uid:
      f:spec:
        .:
        f:backupName:
        f:hooks:
      f:status:
    Manager:      manager
    Operation:    Update
    Time:         2021-12-17T17:11:50Z
    API Version:  velero.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:excludedResources:
      f:status:
        f:phase:
        f:validationErrors:
    Manager:    velero-server
    Operation:  Update
    Time:       2021-12-17T17:11:50Z
  Owner References:
    API Version:           cluster.open-cluster-management.io/v1beta1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Restore
    Name:                  restore-acm
    UID:                   77471152-7d7a-4723-b57b-26414bb4d89b
  Resource Version:        172384463
  UID:                     413bc9de-e233-46f5-b5ca-2c0a3a225499
Spec:
  Backup Name:  acm-credentials-schedule-20211217164355
  Excluded Resources:
    nodes
    events
    events.events.k8s.io
    backups.velero.io
    restores.velero.io
    resticrepositories.velero.io
  Hooks:
Status:
  Phase:  FailedValidation
  Validation Errors:
    Error retrieving backup: BackupStorageLocation.velero.io "velero-sample-1" not found
```
Expected results:
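The ACM restore status matches the status of the Velero restore: if the Velero restore fails validation, the ACM restore does not report success.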
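
One way to picture the expected behavior is a small helper that collapses the phases of the Velero restores into a single overall phase. This is only a sketch: the function name `aggregate_phase`, the overall values `Error`, `Running`, and `Unknown`, and the aggregation rule itself are assumptions made for illustration, not the logic of the ACM restore controller. Only `Finished` and `FailedValidation` appear in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: collapse several Velero restore phases into one overall phase.
# The rule below is an assumption for illustration, not the ACM controller's logic.
aggregate_phase() {
  # No phases supplied: nothing to report on.
  if [ "$#" -eq 0 ]; then
    printf 'Unknown\n'
    return 0
  fi
  local overall="Finished" phase
  for phase in "$@"; do
    case "$phase" in
      Completed) ;;  # this restore succeeded; keep the current overall value
      FailedValidation|Failed|PartiallyFailed)
        overall="Error"  # any failed restore makes the whole result an error
        break ;;
      *) overall="Running" ;;  # New, InProgress, or any other phase
    esac
  done
  printf '%s\n' "$overall"
}

# Example with phases typed in by hand (not read from a cluster):
aggregate_phase Completed Completed FailedValidation
```

On a live cluster the arguments could come from `oc get restores.velero.io -n openshift-adp -o jsonpath='{.items[*].status.phase}'`, which lists every Velero restore in the namespace, not only the ones owned by restore-acm.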

Additional info:

The validation fails sporadically with both backups and restores.

Comment 1 Sahar Ebrahimi 2021-12-20 17:10:40 UTC
Hi David, thanks for providing the information. Could you please check which version of the backup operator you are using? Based on the values seen in the ACM restore CR above, it does not appear to be the latest version. If that is the case, please install the 2.4 version as described below and let us know whether you still see this issue. To enable the latest backup and restore operator:

The cluster backup and restore operator can be enabled when the MultiClusterHub resource is created for the first time. From the OpenShift Container Platform console, select the Enable Cluster Backup switch to enable the operator. The enableClusterBackup parameter is set to true. When the operator is enabled, the operator resources are installed.

More info here: https://access.redhat.com/documentation/en-us/red_hat_advanced_cluster_management_for_kubernetes/2.4/html/clusters/managing-your-clusters#backup-restore-enable 
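
For an existing hub, the same parameter could presumably be set from the command line. The sketch below only builds the merge patch; the helper name `backup_patch` is hypothetical, and the field path `spec.enableClusterBackup`, the resource name `multiclusterhub`, and the namespace `open-cluster-management` are assumptions that should be checked against the actual installation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a JSON merge patch that sets enableClusterBackup.
# Assumes the parameter lives at spec.enableClusterBackup on the MultiClusterHub.
backup_patch() {
  printf '{"spec":{"enableClusterBackup":%s}}\n' "$1"
}

# Show the patch that would be sent.
backup_patch true

# Possible usage on a cluster (resource name and namespace are assumptions):
# oc patch multiclusterhub multiclusterhub -n open-cluster-management \
#   --type=merge -p "$(backup_patch true)"
```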

Thanks.

Comment 2 David Luong 2021-12-20 17:25:46 UTC
I'm using the beta since I couldn't get the stable to work with ACM 2.4.

Comment 3 Sahar Ebrahimi 2021-12-20 17:37:20 UTC
Sorry, what do you mean by beta? Did you mean the OADP Operator from the beta channel? I was actually asking about the ACM backup operator and wondering how you installed it.

Comment 4 David Luong 2021-12-20 17:40:08 UTC
Yes, I meant the OADP operator from the beta channel.  By the ACM backup operator, I just enabled it by setting the enableClusterBackup to True.  Wouldn't that be a prerequisite of showing the restore-acm object I listed in the bug first?  Is there something else that needed to happen? It is 2.4 GA.

Comment 8 errata-xmlrpc 2022-06-09 02:07:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Advanced Cluster Management 2.5 security updates, images, and bug fixes), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:4956

