Bug 2066834 - Hibernating cluster(s) in cluster pool stuck in 'Stopping' status after restore activation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Advanced Cluster Management for Kubernetes
Classification: Red Hat
Component: DR4Hub
Version: rhacm-2.5
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: rhacm-2.5
Assignee: vbirsan
QA Contact: Thuy Nguyen
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-03-22 15:11 UTC by Thuy Nguyen
Modified: 2023-01-09 08:20 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-06-09 02:10:05 UTC
Target Upstream Version:
Embargoed:
bot-tracker-sync: rhacm-2.5+


Attachments
Cluster pool UI (321.71 KB, image/png)
2022-03-22 15:11 UTC, Thuy Nguyen


Links
System                             ID               Private   Priority   Status   Summary   Last Updated
Github stolostron backlog issues   21018            0         None       None     None      2022-03-22 16:38:43 UTC
Red Hat Product Errata             RHSA-2022:4956   0         None       None     None      2022-06-09 02:10:14 UTC

Description Thuy Nguyen 2022-03-22 15:11:29 UTC
Created attachment 1867509
Cluster pool UI

Description of problem: Hibernating cluster(s) in cluster pool stuck in 'Stopping' status after restore activation


Version-Release number of selected component (if applicable):
ACM 2.5.0-DOWNSTREAM-2022-03-17-03-36-41 (Final S4)

How reproducible:


Steps to Reproduce:
1. Primary hub has a cluster pool including 1 running cluster and 1 hibernating cluster
2. Create backup for the primary hub
3. Shut down the primary hub, then run the activation restore on the secondary hub

Actual results:
The hibernating cluster's status shows 'Stopping' after the restore completes on the secondary hub

Expected results:


Additional info:

oc get clusterpool -n default
NAME         SIZE   STANDBY   READY   BASEDOMAIN                      IMAGESET
az-pool-tn   2      2         0       az.dev06.red-chesterfield.com   img4.10.4-x86-64-appsub


oc get cd --all-namespaces
NAMESPACE                             NAME                                  INFRAID                       PLATFORM   REGION      VERSION   CLUSTERTYPE   PROVISIONSTATUS   POWERSTATE              AGE
az-pool-tn-7jrlm                      az-pool-tn-7jrlm                      az-pool-tn-7jrlm-97wmb        azure      centralus   4.10.4                  Provisioned       FailedToStartMachines   69m
az-pool-tn-nrj62                      az-pool-tn-nrj62                      az-pool-tn-nrj62-k8q4x        azure      centralus   4.10.4                  Provisioned                               69m
clc-test-aws-auto-sno-1647575763754   clc-test-aws-auto-sno-1647575763754   clc-test-aws-auto-sno-rxf2t   aws        us-east-2   4.9.9                   Provisioned       Running                 69m
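
The blank POWERSTATE cell for az-pool-tn-nrj62 is easy to miss in the listing above. The following is a minimal Python sketch for flagging such rows, assuming the listing has been captured as text and that no value overflows its column (the usual fixed-width layout). The helper names parse_fixed_width and blank_power_state are hypothetical, and the sample is abridged to four columns and two of the rows from this report.

```python
import re

def parse_fixed_width(text):
    """Split a fixed-width table into dicts, using header offsets as column bounds."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    # (column name, start offset) for every header word
    cols = [(m.group(), m.start()) for m in re.finditer(r"\S+", header)]
    records = []
    for row in rows:
        rec = {}
        for i, (name, start) in enumerate(cols):
            end = cols[i + 1][1] if i + 1 < len(cols) else None
            rec[name] = row[start:end].strip()  # empty string for a blank cell
        records.append(rec)
    return records

def blank_power_state(text):
    """Return the NAME of every row whose POWERSTATE cell is empty."""
    return [r["NAME"] for r in parse_fixed_width(text) if not r.get("POWERSTATE")]

# Abridged sample; format specs pad each cell the way the CLI aligns columns.
SAMPLE = "\n".join([
    f"{'NAME':<19}{'PROVISIONSTATUS':<18}{'POWERSTATE':<24}AGE",
    f"{'az-pool-tn-7jrlm':<19}{'Provisioned':<18}{'FailedToStartMachines':<24}69m",
    f"{'az-pool-tn-nrj62':<19}{'Provisioned':<18}{'':<24}69m",
])

if __name__ == "__main__":
    print(blank_power_state(SAMPLE))
```

Splitting on whitespace alone would shift AGE into the POWERSTATE slot for the blank row, which is why the sketch slices by header offsets instead.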


oc get cd -n az-pool-tn-nrj62 az-pool-tn-nrj62 -oyaml
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  annotations:
    hive.openshift.io/cluster-pool-spec-hash: 23ab68cbc48e4895
    open-cluster-management.io/user-group: c3lzdGVtOnNlcnZpY2VhY2NvdW50cyxzeXN0ZW06c2VydmljZWFjY291bnRzOm9wZW4tY2x1c3Rlci1tYW5hZ2VtZW50LWJhY2t1cCxzeXN0ZW06YXV0aGVudGljYXRlZA==
    open-cluster-management.io/user-identity: c3lzdGVtOnNlcnZpY2VhY2NvdW50Om9wZW4tY2x1c3Rlci1tYW5hZ2VtZW50LWJhY2t1cDp2ZWxlcm8=
  creationTimestamp: "2022-03-22T13:55:07Z"
  finalizers:
  - hive.openshift.io/deprovision
  generation: 1
  labels:
    hive.openshift.io/cluster-platform: azure
    hive.openshift.io/cluster-region: centralus
    hive.openshift.io/version-major: "4"
    hive.openshift.io/version-major-minor: "4.10"
    hive.openshift.io/version-major-minor-patch: 4.10.4
    velero.io/backup-name: acm-resources-schedule-20220322134022
    velero.io/restore-name: restore-acm-passive-sync-acm-resources-schedule-20220322134022
  name: az-pool-tn-nrj62
  namespace: az-pool-tn-nrj62
  resourceVersion: "23187141"
  uid: fcae07ff-bb60-495d-a07c-768637ab5b06
spec:
  baseDomain: az.dev06.red-chesterfield.com
  clusterMetadata:
    adminKubeconfigSecretRef:
      name: az-pool-tn-nrj62-0-2lkrg-admin-kubeconfig
    adminPasswordSecretRef:
      name: az-pool-tn-nrj62-0-2lkrg-admin-password
    clusterID: d610fc8e-17f9-4fb5-bbb3-e0d900ffaf71
    infraID: az-pool-tn-nrj62-k8q4x
  clusterName: az-pool-tn-nrj62
  clusterPoolRef:
    namespace: default
    poolName: az-pool-tn
  controlPlaneConfig:
    servingCertificates: {}
  installed: true
  platform:
    azure:
      baseDomainResourceGroupName: domain
      credentialsSecretRef:
        name: az-pool-tn-nrj62-azure-creds
      region: centralus
  powerState: Hibernating
  provisioning:
    imageSetRef:
      name: img4.10.4-x86-64-appsub
    installConfigSecretRef:
      name: az-pool-tn-nrj62-install-config
  pullSecretRef:
    name: az-pool-tn-nrj62-pull-secret
status:
  conditions:
  - lastProbeTime: "2022-03-22T13:55:13Z"
    lastTransitionTime: "2022-03-22T13:55:13Z"
    message: ClusterSync has not yet been created
    reason: MissingClusterSync
    status: "True"
    type: SyncSetFailed
  - lastProbeTime: "2022-03-22T13:56:10Z"
    lastTransitionTime: "2022-03-22T13:55:08Z"
    message: 'Get "https://api.az-pool-tn-nrj62.az.dev06.red-chesterfield.com:6443/api?timeout=32s":
      dial tcp 20.221.106.172:6443: i/o timeout'
    reason: ErrorConnectingToCluster
    status: "True"
    type: Unreachable
  - lastProbeTime: "2022-03-22T13:55:08Z"
    lastTransitionTime: "2022-03-22T13:55:08Z"
    message: Control plane certificates are present
    reason: ControlPlaneCertificatesFound
    status: "False"
    type: ControlPlaneCertificateNotFound
  - lastProbeTime: "2022-03-22T13:55:13Z"
    lastTransitionTime: "2022-03-22T13:55:13Z"
    message: Cluster is provisioned
    reason: Provisioned
    status: "True"
    type: Provisioned
  - lastProbeTime: "2022-03-22T13:55:10Z"
    lastTransitionTime: "2022-03-22T13:55:10Z"
    message: no ClusterRelocates match
    reason: NoMatchingRelocates
    status: "False"
    type: RelocationFailed
  - lastProbeTime: "2022-03-22T13:55:09Z"
    lastTransitionTime: "2022-03-22T13:55:09Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: AWSPrivateLinkFailed
  - lastProbeTime: "2022-03-22T13:55:09Z"
    lastTransitionTime: "2022-03-22T13:55:09Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: AWSPrivateLinkReady
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: ActiveAPIURLOverride
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: AuthenticationFailure
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: ClusterInstallCompleted
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: ClusterInstallFailed
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: ClusterInstallRequirementsMet
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: ClusterInstallStopped
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: DNSNotReady
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: DeprovisionLaunchError
  - lastProbeTime: "2022-03-22T13:55:08Z"
    lastTransitionTime: "2022-03-22T13:55:08Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: Hibernating
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: IngressCertificateNotFound
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: InstallImagesNotResolved
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: InstallLaunchError
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: InstallerImageResolutionFailed
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: ProvisionFailed
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: ProvisionStopped
  - lastProbeTime: "2022-03-22T13:55:08Z"
    lastTransitionTime: "2022-03-22T13:55:08Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: Ready
  - lastProbeTime: "2022-03-22T13:55:07Z"
    lastTransitionTime: "2022-03-22T13:55:07Z"
    message: Condition Initialized
    reason: Initialized
    status: Unknown
    type: RequirementsMet
  installedTimestamp: "2022-03-22T13:55:07Z"
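
In the dump above, spec.powerState is Hibernating while the Hibernating and Ready conditions are still Unknown with reason Initialized, and the status carries no power state at all. The following is a minimal Python sketch of that comparison, assuming the ClusterDeployment has already been loaded into a dict (for example from the YAML output). The function name summarize_power_state is hypothetical, and reading status.powerState as the field behind the POWERSTATE column is an assumption, not something this report confirms.

```python
def summarize_power_state(cd):
    """Compare the desired power state with what the status reports."""
    desired = cd.get("spec", {}).get("powerState")
    status = cd.get("status", {})
    reported = status.get("powerState")  # assumption: backs the POWERSTATE column
    conditions = {c["type"]: c for c in status.get("conditions", [])}
    hibernating = conditions.get("Hibernating", {})
    return {
        "desired": desired,
        "reported": reported,
        "hibernating_condition": hibernating.get("status"),
        "hibernating_reason": hibernating.get("reason"),
        # True when a desired state is set but the status does not match it yet
        "unreconciled": desired is not None and reported != desired,
    }

# Abridged from the ClusterDeployment in this report.
SAMPLE_CD = {
    "spec": {"powerState": "Hibernating"},
    "status": {
        "conditions": [
            {"type": "Unreachable", "status": "True",
             "reason": "ErrorConnectingToCluster"},
            {"type": "Hibernating", "status": "Unknown",
             "reason": "Initialized"},
            {"type": "Ready", "status": "Unknown", "reason": "Initialized"},
        ],
    },
}

if __name__ == "__main__":
    print(summarize_power_state(SAMPLE_CD))
```

For the sample, the summary shows a desired state with no reported state and a Hibernating condition that is still Unknown, which matches what the restored resource looks like in this report.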

Comment 1 bot-tracker-sync 2022-04-21 19:34:15 UTC
G2Bsync 1105442425 comment 
 thuyn-581 Thu, 21 Apr 2022 16:26:36 UTC 
 G2BSync -
Validated on ACM 2.5.0-DOWNSTREAM-2022-04-21-01-26-54.

Comment 4 errata-xmlrpc 2022-06-09 02:10:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Advanced Cluster Management 2.5 security updates, images, and bug fixes), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:4956


