+++ This bug was initially created as a clone of Bug #1971640 +++ Description of problem: ======================= InfraEnv controller should always requeue for backend response HTTP StatusConflict (code 409) The InfraEnv controller is expected to get HTTP StatusConflict (code 409) from the backend for requests sent in a gap smaller than 10 seconds[1]. The backend generates that response without looking at the Generate ISO request content (which may either be valid or not). The image should always be on par with the spec. For that, each request must be served by the backend. Currently, the controller will not requeue in a case of 409, if the conditions indicate an error[2]. The controller should always requeue, and set RequeueAfter to something larger than 10 seconds to avoid repeated 409s. [1] https://github.com/openshift/assisted-service/blob/52dd09d9b935b35a5a92d146581bd142e3339940/internal/bminventory/inventory.go#L897-L908 [2] https://github.com/openshift/assisted-service/blob/01e814cbe3ddb273197c70b7767bceffe275a87a/internal/controller/controllers/infraenv_controller.go#L344-L346
Merged to ocm-2.3 https://github.com/openshift/assisted-service/pull/2073
Verified with ACM: 2.3.0-DOWNSTREAM-2021-07-09-20-21-59 Steps to verify: 1. trigger infraenv reconcile by patching the ssh key: oc patch infraenvs.agent-install.openshift.io sno-0 --type merge -p '{"spec":{"sshAuthorizedKey":"ssh-rsa <ssh_key_1>"}}' 2. sleep 3 seconds 3. trigger a second infraenv reconcile by patching a different ssh keyoc patch infraenvs.agent-install.openshift.io sno-0 --type merge -p '{"spec":{"sshAuthorizedKey":"ssh-rsa <ssh_key_2>"}}' The first attempt to patch the second ssh key fails as this is less than 10 seconds since the last request to generate the ISO: time="2021-07-11T08:28:25Z" level=info msg="InfraEnv Reconcile started" func="github.com/openshift/assisted-service/internal/controller/controllers.(*InfraEnvReconciler).Reconcile" file="/remote-source/assisted-service/app/internal/controller/controllers/infraenv_controller.go:81" go-id=3237 infra_env=sno-0 infra_env_namespace=sno-0 request_id=6aa1667d-b300-4fbb-9cc9-d5eead1230af time="2021-07-11T08:28:25Z" level=info msg="prepare image for cluster daeee389-0377-4bc4-a078-f65a81bd488c" func="github.com/openshift/assisted-service/internal/bminventory.(*bareMetalInventory).GenerateClusterISOInternal" file="/remote-source/assisted-service/app/internal/bminventory/inventory.go:850" go-id=3237 pkg=Inventory request_id=6aa1667d-b300-4fbb-9cc9-d5eead1230af time="2021-07-11T08:28:25Z" level=error msg="request came too soon after previous request" func="github.com/openshift/assisted-service/internal/bminventory.(*bareMetalInventory).GenerateClusterISOInternal" file="/remote-source/assisted-service/app/internal/bminventory/inventory.go:904" go-id=3237 pkg=Inventory request_id=6aa1667d-b300-4fbb-9cc9-d5eead1230af time="2021-07-11T08:28:25Z" level=error msg="generate cluster ISO failed" func="github.com/openshift/assisted-service/internal/bminventory.(*bareMetalInventory).GenerateClusterISOInternal.func1" file="/remote-source/assisted-service/app/internal/bminventory/inventory.go:875" go-id=3237 pkg=Inventory request_id=6aa1667d-b300-4fbb-9cc9-d5eead1230af time="2021-07-11T08:28:25Z" level=info msg="Image sno-0 being prepared for cluster sno-0" func="github.com/openshift/assisted-service/internal/controller/controllers.(*InfraEnvReconciler).handleEnsureISOErrors" file="/remote-source/assisted-service/app/internal/controller/controllers/infraenv_controller.go:360" go-id=3237 infra_env=sno-0 infra_env_namespace=sno-0 request_id=6aa1667d-b300-4fbb-9cc9-d5eead1230af time="2021-07-11T08:28:25Z" level=info msg="InfraEnv Reconcile ended" func="github.com/openshift/assisted-service/internal/controller/controllers.(*InfraEnvReconciler).Reconcile.func1" file="/remote-source/assisted-service/app/internal/controller/controllers/infraenv_controller.go:78" go-id=3237 infra_env=sno-0 infra_env_namespace=sno-0 request_id=6aa1667d-b300-4fbb-9cc9-d5eead1230af The infraenv reconcile is re-queued ~20 secs later: time="2021-07-11T08:28:45Z" level=info msg="InfraEnv Reconcile started" func="github.com/openshift/assisted-service/internal/controller/controllers.(*InfraEnvReconciler).Reconcile" file="/remote-source/assisted-service/app/internal/controller/controllers/infraenv_controller.go:81" go-id=3237 infra_env=sno-0 infra_env_namespace=sno-0 request_id=ed5e60b3-edc5-4eea-83da-fe03c1213f57 time="2021-07-11T08:28:45Z" level=info msg="prepare image for cluster daeee389-0377-4bc4-a078-f65a81bd488c" func="github.com/openshift/assisted-service/internal/bminventory.(*bareMetalInventory).GenerateClusterISOInternal" file="/remote-source/assisted-service/app/internal/bminventory/inventory.go:850" go-id=3237 pkg=Inventory request_id=ed5e60b3-edc5-4eea-83da-fe03c1213f57 time="2021-07-11T08:28:45Z" level=info msg="Successfully uploaded file daeee389-0377-4bc4-a078-f65a81bd488c/discovery.ign" func="github.com/openshift/assisted-service/pkg/s3wrapper.(*FSClient).Upload" file="/remote-source/assisted-service/app/pkg/s3wrapper/filesystem.go:76" go-id=3237 request_id=ed5e60b3-edc5-4eea-83da-fe03c1213f57 time="2021-07-11T08:28:45Z" level=info msg="Creating minimal ISO for cluster daeee389-0377-4bc4-a078-f65a81bd488c" func="github.com/openshift/assisted-service/internal/bminventory.(*bareMetalInventory).generateClusterMinimalISO.func1" file="/remote-source/assisted-service/app/internal/bminventory/inventory.go:1087" go-id=3237 pkg=Inventory request_id=ed5e60b3-edc5-4eea-83da-fe03c1213f57 time="2021-07-11T08:28:45Z" level=info msg="Uploading minimal ISO for cluster daeee389-0377-4bc4-a078-f65a81bd488c" func="github.com/openshift/assisted-service/internal/bminventory.(*bareMetalInventory).generateClusterMinimalISO" file="/remote-source/assisted-service/app/internal/bminventory/inventory.go:1103" go-id=3237 pkg=Inventory request_id=ed5e60b3-edc5-4eea-83da-fe03c1213f57 time="2021-07-11T08:28:45Z" level=info msg="Successfully uploaded file discovery-image-daeee389-0377-4bc4-a078-f65a81bd488c.iso" func="github.com/openshift/assisted-service/pkg/s3wrapper.(*FSClient).UploadStream" file="/remote-source/assisted-service/app/pkg/s3wrapper/filesystem.go:159" go-id=3237 request_id=ed5e60b3-edc5-4eea-83da-fe03c1213f57 ... time="2021-07-11T08:28:45Z" level=info msg="ISODownloadURL changed from https://assisted-service-rhacm.apps.seal12-cluster-assisted-0.qe.lab.redhat.com/api/assisted-install/v1/clusters/daeee389-0377-4bc4-a078-f65a81bd488c/downloads/image?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbHVzdGVyX2lkIjoiZGFlZWUzODktMDM3Ny00YmM0LWEwNzgtZjY1YTgxYmQ0ODhjIn0.GWj9Fa5clVTS7oh5NxSPVgKYg1sdtNR6eliqhz4cO4WMMYxQDcbRknbr5Kvc2QXiyCP3DFh8lHTM28MZiGBYVw to https://assisted-service-rhacm.apps.seal12-cluster-assisted-0.qe.lab.redhat.com/api/assisted-install/v1/clusters/daeee389-0377-4bc4-a078-f65a81bd488c/downloads/image?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbHVzdGVyX2lkIjoiZGFlZWUzODktMDM3Ny00YmM0LWEwNzgtZjY1YTgxYmQ0ODhjIn0.WeRl73aq14ocUbhsohBunrYdEP2mEqSQz7jcL-lKKrB0YzCZ5-6EKX0eJZr9UcuWRXU6KQuI1H-hVPZie7gpYg" func="github.com/openshift/assisted-service/internal/controller/controllers.(*InfraEnvReconciler).updateEnsureISOSuccess" file="/remote-source/assisted-service/app/internal/controller/controllers/infraenv_controller.go:328" go-id=3237 infra_env=sno-0 infra_env_namespace=sno-0 request_id=ed5e60b3-edc5-4eea-83da-fe03c1213f57 time="2021-07-11T08:28:45Z" level=info msg="InfraEnv Reconcile ended" func="github.com/openshift/assisted-service/internal/controller/controllers.(*InfraEnvReconciler).Reconcile.func1" file="/remote-source/assisted-service/app/internal/controller/controllers/infraenv_controller.go:78" go-id=3237 infra_env=sno-0 infra_env_namespace=sno-0 request_id=ed5e60b3-edc5-4eea-83da-fe03c1213f57
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438