Bug 1977450

Summary: [4.8.0] Fix flaky test: invalid NMState config YAML
Product: Red Hat Advanced Cluster Management for Kubernetes Reporter: Nir Magnezi <nmagnezi>
Component: unspecified Assignee: Nir Magnezi <nmagnezi>
Status: CLOSED CURRENTRELEASE QA Contact: bjacot
Severity: high Docs Contact:
Priority: high    
Version: rhacm-2.3 CC: aos-bugs, juhsu, yobshans
Target Milestone: --- Keywords: Triaged
Target Release: rhacm-2.3.1 Flags: ming: rhacm-2.3.z+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: AI-Team-Hive KNI-EDGE-4.8
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1977449 Environment:
Last Closed: 2021-09-21 17:28:14 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1977449    
Bug Blocks:    

Description Nir Magnezi 2021-06-29 18:55:52 UTC
+++ This bug was initially created as a clone of Bug #1977449 +++

Description of problem:
=======================
Originally reported here https://issues.redhat.com/browse/MGMT-5324

We had inconsistent results, as shown here:
https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_assisted-service/1456/pull-ci-openshift-assisted-service-master-subsystem-kubeapi-aws/1385056760468869120

The root cause is how InfraEnv used to requeue for errors, which changed with other fixes.
We will now requeue after 20s (because the backend limits the window between image generation requests to be above 10s), and avoid requeueing for non-recoverable issues such as BadRequest (invalid config).
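
The requeue policy described above can be sketched roughly as follows. This is only an illustration, not the actual assisted-service code: the names ImageRequestError, Result, and decide_requeue are hypothetical, and only the 20s delay and the BadRequest exception come from the description.

```python
from dataclasses import dataclass
from datetime import timedelta
from http import HTTPStatus
from typing import Optional

# From the description: requeue after 20s, which is above the backend's 10s window.
REQUEUE_AFTER = timedelta(seconds=20)


class ImageRequestError(Exception):
    """Hypothetical error carrying the HTTP status returned by the backend."""

    def __init__(self, status: int, message: str = ""):
        super().__init__(message)
        self.status = status


@dataclass(frozen=True)
class Result:
    """Hypothetical reconcile outcome; requeue_after=None means no requeue."""

    requeue_after: Optional[timedelta] = None


def decide_requeue(err: Optional[Exception]) -> Result:
    """Decide whether a reconcile attempt should be retried."""
    if err is None:
        # Success: nothing to retry.
        return Result()
    if isinstance(err, ImageRequestError) and err.status == HTTPStatus.BAD_REQUEST:
        # Non-recoverable (e.g. invalid NMState config YAML): retrying
        # cannot help, so do not requeue.
        return Result()
    # Any other error is assumed to be transient: retry after a fixed delay.
    return Result(requeue_after=REQUEUE_AFTER)
```

With this shape, an invalid config fails once and stays failed until the user changes it, while other errors are retried no more often than every 20s. Treating every non-BadRequest error as recoverable is an assumption of this sketch.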

Comment 11 ximhan 2021-08-20 07:26:57 UTC
OpenShift engineering has decided to NOT ship 4.8.6 on 8/23 due to the following issue.
https://bugzilla.redhat.com/show_bug.cgi?id=1995785
All of the fixes will now be included in 4.8.7 on 8/30.

Comment 15 Mike Ng 2021-09-03 13:52:07 UTC
G2Bsync 912080871 comment 
 CrystalChun Thu, 02 Sep 2021 21:46:07 UTC 
 G2Bsync
Fix was merged as part of ACM 2.3 GA
Picked up in https://github.com/open-cluster-management/backlog/issues/14081

Comment 17 juhsu 2021-09-20 21:17:47 UTC
Nir, can you verify that this works? The ACM 2.3.1 and 2.3.2 builds should have this fix. Thanks.