Description of problem:

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_installer/1348/pull-ci-openshift-installer-master-e2e-aws/4334

CI run failed with cluster-image-registry-operator failing to complete.

From https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/1348/pull-ci-openshift-installer-master-e2e-aws/4334/artifacts/e2e-aws/clusteroperators.json the image registry operator is reporting failure since `2019-03-01T23:51:37Z`

```
{
  "apiVersion": "config.openshift.io/v1",
  "kind": "ClusterOperator",
  "metadata": {
    "creationTimestamp": "2019-03-01T23:41:36Z",
    "generation": 1,
    "name": "image-registry",
    "resourceVersion": "19813",
    "selfLink": "/apis/config.openshift.io/v1/clusteroperators/image-registry",
    "uid": "8d76e26d-3c7b-11e9-8519-0ed54486e940"
  },
  "spec": {},
  "status": {
    "conditions": [
      {
        "lastTransitionTime": "2019-03-01T23:51:37Z",
        "message": "Deployment does not exist",
        "status": "False",
        "type": "Available"
      },
      {
        "lastTransitionTime": "2019-03-01T23:51:37Z",
        "message": "Unable to apply resources: unable to sync storage configuration: unable to get cluster minted credentials \"kube-system/installer-cloud-credentials\": timed out waiting for the condition",
        "status": "True",
        "type": "Progressing"
      },
      {
        "lastTransitionTime": "2019-03-01T23:51:37Z",
        "status": "False",
        "type": "Failing"
      }
    ],
    "extension": null,
    "relatedObjects": null,
    "versions": [
      {
        "name": "operator",
        "version": "4.0.0-87-gbf6c0c9-dirty"
      }
    ]
  }
},
```

From https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/1348/pull-ci-openshift-installer-master-e2e-aws/4334/artifacts/e2e-aws/clusteroperators.json the cloud-creds-operator is reporting completed creating creds since 2019-03-01T23:52:23Z

```
{
  "apiVersion": "config.openshift.io/v1",
  "kind": "ClusterOperator",
  "metadata": {
    "creationTimestamp": "2019-03-01T23:38:30Z",
    "generation": 1,
    "name": "openshift-cloud-credential-operator",
    "resourceVersion": "20301",
    "selfLink": "/apis/config.openshift.io/v1/clusteroperators/openshift-cloud-credential-operator",
    "uid": "1e79a670-3c7b-11e9-8519-0ed54486e940"
  },
  "spec": {},
  "status": {
    "conditions": [
      {
        "lastTransitionTime": "2019-03-01T23:52:23Z",
        "message": "No credentials requests reporting errors.",
        "reason": "NoCredentialsFailing",
        "status": "False",
        "type": "Failing"
      },
      {
        "lastTransitionTime": "2019-03-01T23:52:23Z",
        "message": "4 of 4 credentials requests provisioned and reconciled.",
        "reason": "ReconcilingComplete",
        "status": "False",
        "type": "Progressing"
      },
      {
        "lastTransitionTime": "2019-03-01T23:38:30Z",
        "status": "True",
        "type": "Available"
      }
    ],
    "extension": null,
    "version": ""
  }
},
```

And the secret that cluster-image-registry-operator is waiting on was created at 2019-03-01T23:52:18Z

```
{
  "apiVersion": "v1",
  "data": {
    "aws_access_key_id": "QUtJQUpZTkVTVktaNEtNNk9MT1E=",
    "aws_secret_access_key": "WHp6QmpXd0E4SW5PTzZNTTZHV1VMU0cvTnc2WmgrSlVGKzhlemJFTg=="
  },
  "kind": "Secret",
  "metadata": {
    "annotations": {
      "cloudcredential.openshift.io/aws-policy-last-applied": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Action\":[\"s3:CreateBucket\",\"s3:DeleteBucket\",\"s3:PutBucketTagging\",\"s3:GetBucketTagging\",\"s3:PutEncryptionConfiguration\",\"s3:GetEncryptionConfiguration\",\"s3:PutLifecycleConfiguration\",\"s3:GetLifecycleConfiguration\",\"s3:GetBucketLocation\",\"s3:ListBucket\",\"s3:HeadBucket\",\"s3:GetObject\",\"s3:PutObject\",\"s3:DeleteObject\",\"s3:ListBucketMultipartUploads\",\"s3:AbortMultipartUpload\"],\"Resource\":\"*\"},{\"Effect\":\"Allow\",\"Action\":[\"iam:GetUser\"],\"Resource\":\"arn:aws:iam::460538899914:user/ci-op-3xwgvjmw-1d3f3-openshift-image-registry-5n6bs\"}]}",
      "cloudcredential.openshift.io/credentials-request": "openshift-cloud-credential-operator/openshift-image-registry"
    },
    "creationTimestamp": "2019-03-01T23:52:18Z",
    "name": "installer-cloud-credentials",
    "namespace": "openshift-image-registry",
    "resourceVersion": "20249",
    "selfLink": "/api/v1/namespaces/openshift-image-registry/secrets/installer-cloud-credentials",
    "uid": "0c578f99-3c7d-11e9-8519-0ed54486e940"
  },
  "type": "Opaque"
},
```

from https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/1348/pull-ci-openshift-installer-master-e2e-aws/4334/artifacts/e2e-aws/secrets.json

And the installer reported failure to initialize at 2019-03-02T00:30:02Z, from https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/1348/pull-ci-openshift-installer-master-e2e-aws/4334/artifacts/e2e-aws/installer/.openshift_install.log

```
time="2019-03-02T00:30:02Z" level=fatal msg="failed to initialize the cluster: Cluster operator image-registry has not yet reported success"
```
Digging into the actual operator logs, the operator is reporting a missing region:

E0302 00:28:33.356279 1 controller.go:222] unable to sync: unable to sync storage configuration: MissingRegion: could not find region configuration, requeuing

Logs are here: https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/1348/pull-ci-openshift-installer-master-e2e-aws/4334/artifacts/e2e-aws/pods/openshift-image-registry_cluster-image-registry-operator-59f7c6cc56-62zrf_cluster-image-registry-operator.log.gz

So a couple of issues:
1) We're bubbling up the wrong error condition: the ClusterOperator status still shows the credentials timeout, while the operator log shows MissingRegion.
2) I don't know why we'd be missing region information.
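To make the first issue concrete, here is a rough sketch that lines up the timestamps already quoted in this report. It uses only the Python standard library; the variable names are mine, and the year and UTC zone for the `E0302 00:28:33` log line are assumed to match the other artifacts.

```python
from datetime import datetime, timezone


def parse_utc(ts: str) -> datetime:
    """Parse a 'Z'-suffixed timestamp such as 2019-03-01T23:51:37Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


# Timestamps copied from the artifacts quoted in this report.
registry_condition = parse_utc("2019-03-01T23:51:37Z")  # image-registry lastTransitionTime
secret_created = parse_utc("2019-03-01T23:52:18Z")      # installer-cloud-credentials creationTimestamp
# Assumption: the klog line "E0302 00:28:33" is 2019-03-02 in UTC.
missing_region_log = parse_utc("2019-03-02T00:28:33Z")
installer_fatal = parse_utc("2019-03-02T00:30:02Z")     # installer "failed to initialize"

# Seconds between the stale status condition and the secret appearing.
secret_lag = (secret_created - registry_condition).total_seconds()
# Seconds the operator was still failing after the secret existed.
still_failing = (missing_region_log - secret_created).total_seconds()
# Seconds between the secret appearing and the installer giving up.
until_fatal = (installer_fatal - secret_created).total_seconds()

print(f"secret appeared {secret_lag:.0f}s after the reported condition")
print(f"operator still logging MissingRegion {still_failing:.0f}s after the secret existed")
print(f"installer gave up {until_fatal:.0f}s after the secret existed")
```

If these assumptions hold, the status condition was last updated before the secret existed, and the operator kept failing with a different error for more than half an hour afterwards without the status message changing.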
One possibility is that because we failed to get the secret initially:

E0301 23:51:37.805743 1 controller.go:222] unable to sync: unable to sync storage configuration: unable to get cluster minted credentials "kube-system/installer-cloud-credentials": timed out waiting for the condition, requeuing

we ended up in a bad state in terms of region configuration that we could never get out of.

Anyway, suffice it to say the S3 config and credential management/syncing logic needs to be looked into. We should be able to handle the scenario where the S3 cred secret isn't there when we come up and shows up later.
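As an illustration of the behavior being asked for (not the operator's actual code, which is written in Go), here is a minimal Python sketch of a sync loop that re-reads both the secret and the region on every attempt and always reports the most recent error. Every name in it (`SecretNotFound`, `MissingRegion`, `StorageConfig`, `build_storage_config`, `sync_until_ready`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


class SecretNotFound(Exception):
    """Hypothetical error: the credentials secret does not exist yet."""


class MissingRegion(Exception):
    """Hypothetical error: no region could be determined."""


@dataclass
class StorageConfig:
    access_key: str
    secret_key: str
    region: str


def build_storage_config(
    get_secret: Callable[[], Dict[str, str]],
    get_region: Callable[[], str],
) -> StorageConfig:
    # Resolve everything from scratch; nothing is cached from a failed attempt.
    secret = get_secret()  # may raise SecretNotFound
    region = get_region()
    if not region:
        raise MissingRegion("could not find region configuration")
    return StorageConfig(
        secret["aws_access_key_id"], secret["aws_secret_access_key"], region
    )


def sync_until_ready(
    get_secret: Callable[[], Dict[str, str]],
    get_region: Callable[[], str],
    max_attempts: int,
) -> Tuple[Optional[StorageConfig], Optional[str]]:
    """Return (config, None) on success, or (None, latest error) on failure."""
    last_error: Optional[str] = None
    for _ in range(max_attempts):
        try:
            return build_storage_config(get_secret, get_region), None
        except (SecretNotFound, MissingRegion) as err:
            # Overwrite the status with the newest error, never an older one.
            last_error = f"{type(err).__name__}: {err}"
    return None, last_error


# Usage: the secret only "shows up" on the third attempt.
calls = {"n": 0}


def fake_get_secret() -> Dict[str, str]:
    calls["n"] += 1
    if calls["n"] < 3:
        raise SecretNotFound("installer-cloud-credentials not found")
    return {"aws_access_key_id": "example-id", "aws_secret_access_key": "example-key"}


config, last_error = sync_until_ready(fake_get_secret, lambda: "us-east-1", max_attempts=5)
failed_config, failed_error = sync_until_ready(
    lambda: {"aws_access_key_id": "a", "aws_secret_access_key": "b"},
    lambda: "",
    max_attempts=2,
)
```

The point of the sketch is that a late secret is picked up on a later attempt, and that the reported error is whatever failed most recently rather than the first failure seen.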
https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/1348/pull-ci-openshift-installer-master-e2e-aws/4334/artifacts/e2e-aws/configmaps.json:

> \nplatform:\n aws:\n region: us-east-1\n

The region is set in kube-system/cluster-config-v1. Corey, do you know what could cause "unable to sync storage configuration: MissingRegion: could not find region configuration"?
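For anyone who wants to double-check the region in a captured install config without extra dependencies, here is a rough Python sketch. It assumes the YAML is plain block-style mappings and that the original text nested `aws` under `platform` and `region` under `aws` (the indentation in the quote above looks collapsed). The helper `find_nested_value` is mine and is not a general YAML parser.

```python
from typing import List, Optional, Tuple


def find_nested_value(yaml_text: str, path: List[str]) -> Optional[str]:
    """Return the scalar at `path` in simple, indentation-nested YAML mappings."""
    stack: List[Tuple[int, str]] = []  # (indent, key) for each open mapping level
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or ":" not in stripped:
            continue
        indent = len(line) - len(line.lstrip(" "))
        key, _, value = stripped.partition(":")
        # Close any levels that are at the same depth or deeper than this line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        stack.append((indent, key.strip()))
        if [k for _, k in stack] == path:
            return value.strip() or None
    return None


# Assumed shape of the relevant fragment, with indentation restored.
install_config = "platform:\n  aws:\n    region: us-east-1\n"

region = find_nested_value(install_config, ["platform", "aws", "region"])
missing = find_nested_value(install_config, ["platform", "gcp", "region"])
print(region)
```

A lookup like this only confirms that the region is present in the cluster config; it says nothing about whether the operator actually passes it to the SDK.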
I'm looking into this; it seems like that might be coming from the AWS SDK.
It's certainly coming from the SDK, but presumably it implies we did not configure things properly when we invoked the SDK.
In the e2e logs of the latest PR, https://github.com/openshift/installer/pull/1448, the test suite "operator Run template e2e-aws - e2e-aws container setup" passed. So this bug has been fixed by https://github.com/openshift/cluster-image-registry-operator/pull/238.

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_installer/1448/pull-ci-openshift-installer-master-e2e-aws/4564/
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758