Bug 2017547

Summary: Siteconfig application sync fails with The AgentClusterInstall is invalid: spec.provisionRequirements.controlPlaneAgents: Required value when updating images references
Product: OpenShift Container Platform
Reporter: Marius Cornea <mcornea>
Component: Telco Edge
Assignee: Ian Miller <imiller>
Telco Edge sub component: ZTP
QA Contact: Marius Cornea <mcornea>
Status: CLOSED ERRATA
Docs Contact:
Severity: high    
Priority: unspecified    
Version: 4.9   
Target Milestone: ---   
Target Release: 4.10.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-03-10 16:22:07 UTC
Type: Bug
Bug Depends On:    
Bug Blocks: 2021512    

Description Marius Cornea 2021-10-26 19:26:33 UTC
Description of problem:

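In an upgrade scenario from 4.8 to 4.9, the siteconfig application sync fails when the git repo is updated to reference 4.9 images. The apply step is rejected with: The AgentClusterInstall "kni-qe-1" is invalid: spec.provisionRequirements.controlPlaneAgents: Required value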

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:

1. Deploy a 4.8 hub and a DU node via the ZTP process
2. Upgrade the hub to OCP 4.9
3. Upgrade ACM to 2.4
4. Upgrade the GitOps operator to 1.3
5. Destroy the DU cluster from ACM
6. Update the git repo with references to the 4.9 images:

http://registry.kni-qe-0.lab.eng.rdu2.redhat.com:3000/kni-qe/ztp-site-configs/commit/5a58f3ea20a211eebee31c15d4d03b67113ded2e

7. Wait for the siteconfig app to sync

Actual results:

clusterdeployment.hive.openshift.io/kni-qe-1 created
nmstateconfig.agent-install.openshift.io/sno.kni-qe-1.lab.eng.rdu2.redhat.com created
klusterletaddonconfig.agent.open-cluster-management.io/kni-qe-1 created
managedcluster.cluster.open-cluster-management.io/kni-qe-1 created
infraenv.agent-install.openshift.io/kni-qe-1 created
baremetalhost.metal3.io/sno.kni-qe-1.lab.eng.rdu2.redhat.com created
configmap/kni-qe-1 created
 The AgentClusterInstall "kni-qe-1" is invalid: spec.provisionRequirements.controlPlaneAgents: Required value
Traceback (most recent call last):
  File "watcher.py", line 100, in __init__
    check=True)
  File "/usr/lib64/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['oc', 'apply', '-f', '/tmp/tmpe1nplg_l/update/customResource/site-plan-kni-qe-1/kni-qe-1.yaml']' returned non-zero exit status 1.
ztp-hooks.watcher 2021-10-26 18:32:01 UTC [ERROR]             [watcher:160]: Exception by ApiResponseParser: Failed to apply target manifests
Traceback (most recent call last):
  File "watcher.py", line 100, in __init__
    check=True)
  File "/usr/lib64/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['oc', 'apply', '-f', '/tmp/tmpe1nplg_l/update/customResource/site-plan-kni-qe-1/kni-qe-1.yaml']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "watcher.py", line 148, in __init__
    OcWrapper('apply', out_upd_path)
  File "watcher.py", line 109, in __init__
    raise Exception(f"Failed to {action} target manifests")
Exception: Failed to apply target manifests
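
The traceback above shows a subprocess call made with check=True failing and then being re-raised as a generic "Failed to apply target manifests" exception, which hides the validation message printed by oc. The following is a minimal sketch of that pattern with stderr carried into the raised exception. It is not the actual watcher.py code; the names ApplyError and run_checked are hypothetical, and the command is passed in as a parameter so the sketch does not depend on oc being installed.

```python
import subprocess
import sys


class ApplyError(Exception):
    """Hypothetical exception type; not part of watcher.py."""


def run_checked(cmd):
    """Run cmd and return its stdout; on a non-zero exit, raise ApplyError
    that includes the exit status and whatever the command wrote to stderr."""
    try:
        result = subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # text mode; also works on Python 3.6
        )
    except subprocess.CalledProcessError as err:
        detail = (err.stderr or "").strip()
        raise ApplyError(
            f"command {cmd} failed with exit status {err.returncode}: {detail}"
        ) from err
    return result.stdout


if __name__ == "__main__":
    # Stand-in for a failing 'oc apply': writes a message to stderr, exits 1.
    failing = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('Required value'); sys.exit(1)",
    ]
    try:
        run_checked(failing)
    except ApplyError as exc:
        print(exc)
```

With this shape, the log line emitted by the caller would include the underlying validation error instead of only the generic failure text.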


Expected results:
All resources are created without errors, triggering a DU deployment. 
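
One way to catch this condition before the sync is to check the generated AgentClusterInstall for the field named in the error. The following is a minimal sketch, assuming the manifest has already been loaded into a Python dict (for example with a YAML parser). The function name, the REQUIRED_PATHS list, and the sample data are hypothetical and not part of the ZTP tooling; the only detail taken from this report is the field path spec.provisionRequirements.controlPlaneAgents.

```python
# Field path taken from the validation error in this report.
REQUIRED_PATHS = ["spec.provisionRequirements.controlPlaneAgents"]


def missing_paths(manifest, required_paths=REQUIRED_PATHS):
    """Return the dotted paths from required_paths that are absent
    (or set to None) in the given manifest dict."""
    missing = []
    for path in required_paths:
        node = manifest
        for key in path.split("."):
            if not isinstance(node, dict) or node.get(key) is None:
                missing.append(path)
                break
            node = node[key]
    return missing


if __name__ == "__main__":
    # Hypothetical manifest resembling the failing case: the
    # provisionRequirements block exists but controlPlaneAgents does not.
    sample = {
        "kind": "AgentClusterInstall",
        "metadata": {"name": "kni-qe-1"},
        "spec": {"provisionRequirements": {}},
    }
    print(missing_paths(sample))
```

An empty list means every listed path is present; a value of 0 counts as present, since only a missing key or None is treated as absent.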

Additional info:

Comment 1 Marius Cornea 2021-10-26 19:51:02 UTC
The same failure occurs even after deleting/re-creating the ArgoCD apps.

Comment 5 errata-xmlrpc 2022-03-10 16:22:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056