Bug 2108692 - CGU status does not reflect timeouts from earlier batches
Summary: CGU status does not reflect timeouts from earlier batches
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Telco Edge
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 4.10.z
Assignee: jun
QA Contact: yliu1
URL:
Whiteboard:
Depends On: 2108639
Blocks:
Reported: 2022-07-19 17:30 UTC by jun
Modified: 2023-07-04 15:16 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2087125
Environment:
Last Closed: 2023-07-04 15:16:46 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift-kni cluster-group-upgrades-operator pull 264 | 0 | None | open | [release-4.10] Bug 2108692: Check all batches for upgrade complete | 2022-08-17 01:01:17 UTC

Description jun 2022-07-19 17:30:35 UTC
+++ This bug was initially created as a clone of Bug #2087125 +++

Description of problem:
For an upgrade with multiple batches, the final CGU status is set to completed as long as all clusters in the last batch become compliant within the overall timeout period. Timeouts from earlier batches are not reflected in the final status.

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. 
2.
3.

Actual results:


Expected results:
After the last batch completes, the final status should be set to completed only if all batches are compliant.
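
The difference between the reported and the expected behaviour can be sketched as follows. This is only a minimal Python illustration, not the operator's code; the names `BatchResult`, `final_status_buggy`, and `final_status_expected`, the cluster names, and the status strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BatchResult:
    """Outcome of one remediation batch (hypothetical model)."""
    clusters: List[str]
    timed_out: bool  # True if the batch timed out before all clusters were compliant


def final_status_buggy(batches: List[BatchResult]) -> str:
    # Reported behaviour: only the last batch is consulted.
    return "TimedOut" if batches[-1].timed_out else "Completed"


def final_status_expected(batches: List[BatchResult]) -> str:
    # Expected behaviour: every batch must have become compliant.
    return "TimedOut" if any(b.timed_out for b in batches) else "Completed"


# Assumed scenario: the first batch times out, the second one passes.
batches = [
    BatchResult(clusters=["cluster-a"], timed_out=True),
    BatchResult(clusters=["cluster-b"], timed_out=False),
]
print(final_status_buggy(batches))     # Completed
print(final_status_expected(batches))  # TimedOut
```

With the expected logic, a timeout in any batch carries through to the final status even when the last batch succeeds.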

Additional info:

Comment 2 yliu1 2022-08-24 20:22:02 UTC
Verification is currently blocked because the fix for the following BZ is not in 4.10 yet: https://bugzilla.redhat.com/show_bug.cgi?id=2117038
Changing state to POST.

Comment 4 yliu1 2022-10-25 23:40:43 UTC
InstallPlans now get approved in about 1 minute using the latest 4.11.2 TALM.

Comment 5 yliu1 2022-10-26 00:28:35 UTC
Build used: topology-aware-lifecycle-manager.4.10.0-202210241500   Topology Aware Lifecycle Manager   4.10.0-202210241500   Succeeded
 
The overall status reports a timeout while the second batch passed.

 status:
    computedMaxConcurrency: 1
    conditions:
    - lastTransitionTime: "2022-10-26T00:05:05Z"
      message: The ClusterGroupUpgrade CR policies are taking too long to complete
      reason: UpgradeTimedOut
      status: "False"
      type: Ready
    managedPoliciesContent:
      du-upgrade-cluster-version-policy1: "null"
    managedPoliciesForUpgrade:
    - name: du-upgrade-cluster-version-policy1
      namespace: ztp-upgrade
    managedPoliciesNs:
      du-upgrade-cluster-version-policy1: ztp-upgrade
    precaching:
      spec: {}
    remediationPlan:
    - - spoke-5
    - - spoke-3
    status:
      currentBatch: 2
      currentBatchRemediationProgress:
        spoke-3:
          state: Completed
      currentBatchStartedAt: "2022-10-26T00:15:05Z"
      startedAt: "2022-10-26T00:05:05Z"
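
The check made by eye on the status above can also be sketched in code. This is a minimal Python sketch, assuming the status has already been loaded into a dict (for example from the CR's JSON output); the helper name `timed_out_despite_last_batch` is hypothetical, and the dict copies only the fields shown in this comment.

```python
from typing import Any, Dict

# Subset of the status pasted above, transcribed by hand.
status: Dict[str, Any] = {
    "conditions": [
        {
            "lastTransitionTime": "2022-10-26T00:05:05Z",
            "message": "The ClusterGroupUpgrade CR policies are taking too long to complete",
            "reason": "UpgradeTimedOut",
            "status": "False",
            "type": "Ready",
        }
    ],
    "remediationPlan": [["spoke-5"], ["spoke-3"]],
    "status": {
        "currentBatch": 2,
        "currentBatchRemediationProgress": {"spoke-3": {"state": "Completed"}},
    },
}


def timed_out_despite_last_batch(st: Dict[str, Any]) -> bool:
    """Return True if the CGU reports a timeout although the last batch completed."""
    ready = next(c for c in st["conditions"] if c["type"] == "Ready")
    timed_out = ready["status"] == "False" and ready["reason"] == "UpgradeTimedOut"
    inner = st["status"]
    # currentBatch is 1-based in the output above, so the last batch
    # has an index equal to the number of batches in the plan.
    on_last_batch = inner["currentBatch"] == len(st["remediationPlan"])
    last_done = all(
        progress["state"] == "Completed"
        for progress in inner["currentBatchRemediationProgress"].values()
    )
    return timed_out and on_last_batch and last_done


print(timed_out_despite_last_batch(status))  # True
```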

