Created attachment 1675932 [details]
full openshift install log

Description of problem:

In the default installation, all master and worker volumes are encrypted with the default AWS KMS key. If I specify a valid custom KMS key for the worker volumes, those volumes should be encrypted with that key instead. This does not work: the cluster cannot be created successfully when a custom key is specified for the worker volumes.

Configuration:

apiVersion: v1
baseDomain: qe.devcluster.openshift.com
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    aws:
      rootVolume:
        kmsKeyARN: arn:aws:kms:us-east-2:301721915996:key/4f5265b4-16f7-4d85-9a09-7209ab0c8456
  replicas: 3
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 3
metadata:
  creationTimestamp: null
  name: yunjiang-usv
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    region: us-east-2
publish: External
pullSecret: xxx
sshKey: XXX

How reproducible:

Always.

Steps to Reproduce:
1. Create a KMS key with a "Key users" policy.
2. Create the install-config file and specify the above KMS key for the worker nodes.
3. Create the cluster.

Actual results:

Failed to create cluster. (full log is attached)

time="2020-04-03T00:47:00-04:00" level=debug msg="Still waiting for the cluster to initialize: Working towards 4.5.0-0.nightly-2020-03-29-195504: 99% complete, waiting on authentication, console, csi-snapshot-controller, image-registry, ingress, kube-storage-version-migrator, monitoring"
time="2020-04-03T00:50:11-04:00" level=debug msg="Still waiting for the cluster to initialize: Cluster operator console is reporting a failure: RouteHealthDegraded: failed to GET route (https://console-openshift-console.apps.yunjiang-usv.qe.devcluster.openshift.com/health): Get https://console-openshift-console.apps.yunjiang-usv.qe.devcluster.openshift.com/health: EOF"
time="2020-04-03T01:14:45-04:00" level=error msg="Cluster operator authentication Degraded is True with IngressStateEndpoints_MissingSubsets::RouteStatus_FailedHost: IngressStateEndpointsDegraded: No subsets found for the endpoints of oauth-server\nRouteStatusDegraded: route is not available at canonical host oauth-openshift.apps.yunjiang-usv.qe.devcluster.openshift.com: []"
time="2020-04-03T01:14:45-04:00" level=info msg="Cluster operator authentication Progressing is Unknown with NoData: "
time="2020-04-03T01:14:45-04:00" level=info msg="Cluster operator authentication Available is Unknown with NoData: "
time="2020-04-03T01:14:45-04:00" level=error msg="Cluster operator console Degraded is True with RouteHealth_FailedGet: RouteHealthDegraded: failed to GET route (https://console-openshift-console.apps.yunjiang-usv.qe.devcluster.openshift.com/health): Get https://console-openshift-console.apps.yunjiang-usv.qe.devcluster.openshift.com/health: EOF"
time="2020-04-03T01:14:45-04:00" level=info msg="Cluster operator console Progressing is True with OAuthClientSync_FailedHost::RouteSync_FailedHost: RouteSyncProgressing: route is not available at canonical host []\nOAuthClientSyncProgressing: waiting on route host"
time="2020-04-03T01:14:45-04:00" level=info msg="Cluster operator console Available is False with Route_FailedAdmittedIngress: RouteAvailable: console route is not admitted"
time="2020-04-03T01:14:45-04:00" level=info msg="Cluster operator image-registry Available is False with NoReplicasAvailable: The deployment does not have available replicas"
time="2020-04-03T01:14:45-04:00" level=info msg="Cluster operator image-registry Progressing is True with DeploymentNotCompleted: The deployment has not completed"
time="2020-04-03T01:14:45-04:00" level=error msg="Cluster operator ingress Degraded is True with IngressControllersDegraded: Some ingresscontrollers are degraded: default"
time="2020-04-03T01:14:45-04:00" level=info msg="Cluster operator ingress Progressing is True with Reconciling: Not all ingress controllers are available.\nMoving to release version \"4.5.0-0.nightly-2020-03-29-195504\".\nMoving to ingress-controller image version \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:559e38c1171467ee375f2ea873495624920accf3ae0ff4b99cae98964e708897\"."
time="2020-04-03T01:14:45-04:00" level=info msg="Cluster operator ingress Available is False with IngressUnavailable: Not all ingress controllers are available."
time="2020-04-03T01:14:45-04:00" level=info msg="Cluster operator insights Disabled is False with : "
time="2020-04-03T01:14:45-04:00" level=info msg="Cluster operator kube-storage-version-migrator Available is False with _NoMigratorPod: Available: deployment/migrator.openshift-kube-storage-version-migrator: no replicas are available"
time="2020-04-03T01:14:45-04:00" level=info msg="Cluster operator monitoring Available is False with : "
time="2020-04-03T01:14:45-04:00" level=info msg="Cluster operator monitoring Progressing is True with RollOutInProgress: Rolling out the stack."
time="2020-04-03T01:14:45-04:00" level=error msg="Cluster operator monitoring Degraded is True with UpdatingUserWorkloadThanosRulerFailed: Failed to rollout the stack. Error: running task Updating User Workload Thanos Ruler failed: failed to retrieve Grafana datasources config: secrets \"grafana-datasources\" not found"
time="2020-04-03T01:14:45-04:00" level=fatal msg="failed to initialize the cluster: Cluster operator console is reporting a failure: RouteHealthDegraded: failed to GET route (https://console-openshift-console.apps.yunjiang-usv.qe.devcluster.openshift.com/health): Get https://console-openshift-console.apps.yunjiang-usv.qe.devcluster.openshift.com/health: EOF"
time="2020-04-03T01:17:45-04:00" level=debug msg="OpenShift Installer 4.5.0-0.nightly-2020-03-29-195504"
time="2020-04-03T01:17:45-04:00" level=debug msg="Built from commit 6aea75e9e5760924991c2a38c021d7f835aef296"
time="2020-04-03T01:17:45-04:00" level=debug msg="Fetching Install Config..."
time="2020-04-03T01:17:45-04:00" level=debug msg="Loading Install Config..."
time="2020-04-03T01:17:45-04:00" level=debug msg=" Loading SSH Key..."
time="2020-04-03T01:17:45-04:00" level=debug msg=" Using SSH Key loaded from state file"
time="2020-04-03T01:17:45-04:00" level=debug msg=" Loading Base Domain..."
time="2020-04-03T01:17:45-04:00" level=debug msg=" Loading Platform..."
time="2020-04-03T01:17:45-04:00" level=debug msg=" Using Platform loaded from state file"
time="2020-04-03T01:17:45-04:00" level=debug msg=" Using Base Domain loaded from state file"
time="2020-04-03T01:17:45-04:00" level=debug msg=" Loading Cluster Name..."
time="2020-04-03T01:17:45-04:00" level=debug msg=" Loading Base Domain..."
time="2020-04-03T01:17:45-04:00" level=debug msg=" Loading Platform..."
time="2020-04-03T01:17:45-04:00" level=debug msg=" Using Cluster Name loaded from state file"
time="2020-04-03T01:17:45-04:00" level=debug msg=" Loading Pull Secret..."
time="2020-04-03T01:17:45-04:00" level=debug msg=" Using Pull Secret loaded from state file"
time="2020-04-03T01:17:45-04:00" level=debug msg=" Loading Platform..."
time="2020-04-03T01:17:45-04:00" level=debug msg="Using Install Config loaded from state file" time="2020-04-03T01:17:45-04:00" level=debug msg="Reusing previously-fetched Install Config" time="2020-04-03T01:17:45-04:00" level=fatal msg="failed to get bootstrap and control plane host addresses from \"ikms_us_verify/terraform.tfstate\": failed to lookup bootstrap: resource not found" Expected results: * Cluster is created successfully * worker volumes are encrypted by above custom KMS key. * master volumes are encrypted by AWS default KMS key. Additional info: Attempts: 1. Create cluster by default configuration * Cluster created successfully * All master and worker volumes are encrypted by default AWS KMS key. 2. Apply custom KMS key ONLY on master volumes * Cluster created successfully * Master volumes are encrypted by custome AWS KMS key. * Worker volumes are encrypted by default AWS KMS key.
Can you attach the oc adm must-gather?
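For reference, the standard invocation should be enough; the destination directory here is just an example:

oc adm must-gather --dest-dir=./must-gather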
1. Create cluster by default configuration
   * Cluster created successfully
   * All master and worker volumes are encrypted by default AWS KMS key.
2. Apply custom KMS key ONLY on master volumes
   * Cluster created successfully
   * Master volumes are encrypted by custom AWS KMS key.
   * Worker volumes are encrypted by default AWS KMS key.

^^ these succeed..

3. Apply custom KMS key ONLY on worker volumes
   * Cluster should be created successfully
   * Master volumes should be encrypted by default AWS KMS key.
   * Worker volumes should be encrypted with custom AWS KMS key, but this is failing..

^^ (3) failing and (2) working makes it sound like the machine-api is failing..
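To narrow that down, it may be worth checking whether the kmsKeyARN from the install config actually made it into the worker MachineSets. A rough sketch (the field path follows the AWS machine provider spec, and jq is assumed to be available):

oc get machinesets -n openshift-machine-api -o json \
  | jq '.items[].spec.template.spec.providerSpec.value.blockDevices[].ebs.kmsKey'

If the key ARN shows up there but the volumes still use the default key, the problem is likely in the machine-api controller rather than the installer.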
Created attachment 1676807 [details]
must-gather log
(In reply to Abhinav Dahiya from comment #1)
> Can you attach the oc adm must-gather?

The must-gather log is attached.
hmm the must-gather is missing the machine-api namespace..

we should open a separate bug for the cloud team for that..

Can you grab the logs from all the containers running in the openshift-machine-api namespace?
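For example, something along these lines (a rough sketch; pod names will vary per cluster):

for pod in $(oc get pods -n openshift-machine-api -o name); do
  oc logs -n openshift-machine-api "$pod" --all-containers > "${pod#pod/}.log"
done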
(In reply to Abhinav Dahiya from comment #5)
> hmm the must-gather is missing the machine-api namespace..
>
> we should open a separate bug for the cloud team for that..
>
> Can you grab the logs from all the containers running in the
> openshift-machine-api namespace?

Collected all logs using must-gather, including the machine-api namespace. (Shared via cloud storage due to the file size limitation.)

https://drive.google.com/open?id=1I1dR65FFsvcMmgL-pBhtzu8kydd_Zgv5
*** This bug has been marked as a duplicate of bug 1815219 ***