Created attachment 1527756 [details]
Openshift destroy logs

Description of problem:
The OpenShift installer's "create cluster" command fails, saying terraform.tfstate is present, even though "destroy cluster" was run prior to it. The destroy command does not clean up the terraform.tfstate file after a failed installation.

Version-Release number of the following components:
rpm -q openshift-ansible
rpm -q ansible
ansible --version

How reproducible:

Steps to Reproduce:
1. Create a role with the same name as the cluster name you are planning to provide for the install.
2. Create a cluster using the OpenShift installer. The installation will fail, complaining that the same role already exists.
3. Since the installer has partially created resources, try destroying the cluster.
4. The destroy will not remove the terraform.tfstate file.
5. Try running openshift delete cluster; the terraform.tfstate file still doesn't get removed.
6. Try creating a new cluster. It will complain: "FATAL failed to fetch Cluster: failed to load asset "Cluster": "terraform.tfstate" already exists. There may already be a running cluster"

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated

./openshift-install create cluster --log-level=debug
DEBUG Fetching "Terraform Variables"...
DEBUG Loading "Terraform Variables"...
DEBUG Loading "Cluster ID"...
DEBUG Using "Cluster ID" loaded from state file
DEBUG Loading "Install Config"...
DEBUG Loading "SSH Key"...
DEBUG Using "SSH Key" loaded from state file
DEBUG Loading "Base Domain"...
DEBUG Loading "Platform"...
DEBUG Using "Platform" loaded from state file
DEBUG Using "Base Domain" loaded from state file
DEBUG Loading "Cluster Name"...
DEBUG Using "Cluster Name" loaded from state file
DEBUG Loading "Pull Secret"...
DEBUG Using "Pull Secret" loaded from state file
DEBUG Loading "Platform"...
DEBUG Using "Install Config" loaded from state file
DEBUG Loading "Image"...
DEBUG Loading "Install Config"...
DEBUG Using "Image" loaded from state file
DEBUG Loading "Bootstrap Ignition Config"...
DEBUG Loading "Install Config"...
DEBUG Loading "Root CA"...
DEBUG Using "Root CA" loaded from state file
DEBUG Loading "Certificate (etcd)"...
DEBUG Loading "Root CA"...
DEBUG Using "Certificate (etcd)" loaded from state file
DEBUG Loading "Certificate (kube-ca)"...
DEBUG Loading "Root CA"...
DEBUG Using "Certificate (kube-ca)" loaded from state file
DEBUG Loading "Certificate (aggregator)"...
DEBUG Loading "Root CA"...
DEBUG Using "Certificate (aggregator)" loaded from state file
DEBUG Loading "Certificate (service-serving)"...
DEBUG Loading "Root CA"...
DEBUG Using "Certificate (service-serving)" loaded from state file
DEBUG Loading "Certificate (etcd)"...
DEBUG Loading "Certificate (etcd)"...
DEBUG Using "Certificate (etcd)" loaded from state file
DEBUG Loading "Certificate (kube-apiaserver)"...
DEBUG Loading "Certificate (kube-ca)"...
DEBUG Loading "Install Config"...
DEBUG Using "Certificate (kube-apiaserver)" loaded from state file
DEBUG Loading "Certificate (system:kube-apiserver-proxy)"...
DEBUG Loading "Certificate (aggregator)"...
DEBUG Using "Certificate (system:kube-apiserver-proxy)" loaded from state file
DEBUG Loading "Certificate (system:admin)"...
DEBUG Loading "Certificate (kube-ca)"...
DEBUG Using "Certificate (system:admin)" loaded from state file
DEBUG Loading "Certificate (system:serviceaccount:kube-system:default)"...
DEBUG Loading "Certificate (kube-ca)"...
DEBUG Using "Certificate (system:serviceaccount:kube-system:default)" loaded from state file
DEBUG Loading "Certificate (mcs)"...
DEBUG Loading "Root CA"...
DEBUG Loading "Install Config"...
DEBUG Using "Certificate (mcs)" loaded from state file
DEBUG Loading "Key Pair (service-account.pub)"...
DEBUG Using "Key Pair (service-account.pub)" loaded from state file
DEBUG Loading "Certificate (journal-gatewayd)"...
DEBUG Loading "Root CA"...
DEBUG Loading "Kubeconfig Admin"...
DEBUG Loading "Root CA"...
DEBUG Loading "Certificate (system:admin)"...
DEBUG Loading "Install Config"...
DEBUG Loading "Kubeconfig Kubelet"...
DEBUG Loading "Root CA"...
DEBUG Loading "Certificate (system:serviceaccount:kube-system:default)"...
DEBUG Loading "Install Config"...
DEBUG Using "Kubeconfig Kubelet" loaded from state file
DEBUG Loading "Common Manifests"...
DEBUG Loading "Cluster ID"...
DEBUG Loading "Install Config"...
DEBUG Loading "Ingress Config"...
DEBUG Loading "Install Config"...
DEBUG Using "Ingress Config" loaded from state file
DEBUG Loading "DNS Config"...
DEBUG Loading "Install Config"...
DEBUG Using "DNS Config" loaded from state file
DEBUG Loading "Infrastructure Config"...
DEBUG Loading "Install Config"...
DEBUG Loading "Infrastructure"...
DEBUG Using "Infrastructure" loaded from state file
DEBUG Using "Infrastructure Config" loaded from state file
DEBUG Loading "Network Config"...
DEBUG Loading "Install Config"...
DEBUG Using "Network Config" loaded from state file
DEBUG Loading "Root CA"...
DEBUG Loading "Certificate (etcd)"...
DEBUG Loading "Certificate (ingress)"...
DEBUG Loading "Certificate (kube-ca)"...
DEBUG Loading "Install Config"...
DEBUG Using "Certificate (ingress)" loaded from state file
DEBUG Loading "Certificate (kube-ca)"...
DEBUG Loading "Certificate (service-serving)"...
DEBUG Loading "Certificate (etcd)"...
DEBUG Loading "Certificate (mcs)"...
DEBUG Loading "Certificate (system:serviceaccount:kube-system:default)"...
DEBUG Loading "KubeCloudConfig"...
DEBUG Using "KubeCloudConfig" loaded from state file
DEBUG Loading "MachineConfigServerTLSSecret"...
DEBUG Using "MachineConfigServerTLSSecret" loaded from state file
DEBUG Loading "OpenshiftServiceCertSignerSecret"...
DEBUG Using "OpenshiftServiceCertSignerSecret" loaded from state file
DEBUG Loading "Pull"...
DEBUG Using "Pull" loaded from state file
DEBUG Loading "CVOOverrides"...
DEBUG Using "CVOOverrides" loaded from state file
DEBUG Loading "HostEtcdServiceEndpointsKubeSystem"...
DEBUG Using "HostEtcdServiceEndpointsKubeSystem" loaded from state file
DEBUG Loading "KubeSystemConfigmapEtcdServingCA"...
DEBUG Using "KubeSystemConfigmapEtcdServingCA" loaded from state file
DEBUG Loading "KubeSystemConfigmapRootCA"...
DEBUG Using "KubeSystemConfigmapRootCA" loaded from state file
DEBUG Loading "KubeSystemSecretEtcdClient"...
DEBUG Using "KubeSystemSecretEtcdClient" loaded from state file
DEBUG Loading "OpenshiftMachineConfigOperator"...
DEBUG Using "OpenshiftMachineConfigOperator" loaded from state file
DEBUG Loading "OpenshiftClusterAPINamespace"...
DEBUG Using "OpenshiftClusterAPINamespace" loaded from state file
DEBUG Loading "OpenshiftServiceCertSignerNamespace"...
DEBUG Using "OpenshiftServiceCertSignerNamespace" loaded from state file
DEBUG Loading "EtcdServiceKubeSystem"...
DEBUG Using "EtcdServiceKubeSystem" loaded from state file
DEBUG Loading "HostEtcdServiceKubeSystem"...
DEBUG Using "HostEtcdServiceKubeSystem" loaded from state file
DEBUG Using "Common Manifests" loaded from state file
DEBUG Loading "Openshift Manifests"...
DEBUG Loading "Install Config"...
DEBUG Loading "Cluster.cluster.k8s.io/v1alpha1"...
DEBUG Loading "Install Config"...
DEBUG Loading "Network Config"...
DEBUG Using "Cluster.cluster.k8s.io/v1alpha1" loaded from state file
DEBUG Loading "Worker Machines"...
DEBUG Loading "Cluster ID"...
DEBUG Loading "Install Config"...
DEBUG Loading "Image"...
DEBUG Loading "Worker Ignition Config"...
DEBUG Loading "Install Config"...
DEBUG Loading "Root CA"...
DEBUG Using "Worker Ignition Config" loaded from state file
DEBUG Using "Worker Machines" loaded from state file
DEBUG Loading "Master Machines"...
DEBUG Loading "Cluster ID"...
DEBUG Loading "Install Config"...
DEBUG Loading "Image"...
DEBUG Loading "Master Ignition Config"...
DEBUG Loading "Install Config"...
DEBUG Loading "Root CA"...
DEBUG Using "Master Ignition Config" loaded from state file
DEBUG Using "Master Machines" loaded from state file
DEBUG Loading "Kubeadmin Password"...
DEBUG Using "Kubeadmin Password" loaded from state file
DEBUG Loading "BindingDiscovery"...
DEBUG Using "BindingDiscovery" loaded from state file
DEBUG Loading "CloudCredsSecret"...
DEBUG Using "CloudCredsSecret" loaded from state file
DEBUG Loading "KubeadminPasswordSecret"...
DEBUG Using "KubeadminPasswordSecret" loaded from state file
DEBUG Loading "RoleCloudCredsSecretReader"...
DEBUG Using "RoleCloudCredsSecretReader" loaded from state file
DEBUG Using "Openshift Manifests" loaded from state file
DEBUG Using "Bootstrap Ignition Config" loaded from state file
DEBUG Loading "Master Ignition Config"...
DEBUG Fetching "Cluster ID"...
DEBUG Reusing previously-fetched "Cluster ID"
DEBUG Fetching "Install Config"...
DEBUG Reusing previously-fetched "Install Config"
DEBUG Fetching "Image"...
DEBUG Reusing previously-fetched "Image"
DEBUG Fetching "Bootstrap Ignition Config"...
DEBUG Reusing previously-fetched "Bootstrap Ignition Config"
DEBUG Fetching "Master Ignition Config"...
DEBUG Reusing previously-fetched "Master Ignition Config"
DEBUG Generating "Terraform Variables"...
DEBUG Fetching "Kubeconfig Admin"...
DEBUG Fetching "Root CA"...
DEBUG Reusing previously-fetched "Root CA"
DEBUG Fetching "Certificate (system:admin)"...
DEBUG Reusing previously-fetched "Certificate (system:admin)"
DEBUG Fetching "Install Config"...
DEBUG Reusing previously-fetched "Install Config"
DEBUG Generating "Kubeconfig Admin"...
DEBUG Fetching "Certificate (journal-gatewayd)"...
DEBUG Fetching "Root CA"...
DEBUG Reusing previously-fetched "Root CA"
DEBUG Generating "Certificate (journal-gatewayd)"...
DEBUG Fetching "Cluster"...
DEBUG Loading "Cluster"...
DEBUG Loading "Cluster ID"...
DEBUG Loading "Install Config"...
DEBUG Loading "Terraform Variables"...
DEBUG Loading "Kubeadmin Password"...
FATAL failed to fetch Cluster: failed to load asset "Cluster": "terraform.tfstate" already exists. There may already be a running cluster

Expected results:
The openshift destroy command should have deleted all the resources, removed the terraform.tfstate file, and allowed a new cluster to be created.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Regarding the IAM role: for instance, if my cluster name is test, create an IAM role named test-master-role and then try the installation. This is bound to fail. After that, try destroying the cluster; the terraform.tfstate file never gets deleted.
We leave the Terraform state around in order to help with debugging. We expect every cluster to be installed from a new asset directory and we've added a note [1] to our docs. Do you disagree with this behavior? [1]: https://github.com/openshift/installer#cleanup
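The "new asset directory per cluster" expectation can be scripted on the user side. The following is a minimal Python sketch, not part of the installer; the helper name `new_asset_dir` and the directory layout are hypothetical.

```python
import tempfile
from pathlib import Path


def new_asset_dir(parent: Path, cluster_name: str) -> Path:
    """Create and return a fresh, empty directory for one install attempt.

    Hypothetical helper: the installer itself does not provide this.
    """
    parent.mkdir(parents=True, exist_ok=True)
    # mkdtemp always creates a directory that did not exist before, so it
    # cannot contain state left over from an earlier attempt.
    return Path(tempfile.mkdtemp(prefix=f"{cluster_name}-", dir=parent))
```

Each install attempt would then be pointed at the returned directory, so a failed attempt never shares state with the next one.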
Hi Alex, I agree that every installation should be carried out from a new asset directory. However, here I am trying to reuse a directory that had a failed installation. Even after running "destroy cluster" on the directory, the Terraform files are not cleared, leaving the directory unusable and causing future installs to fail. I just want to know why destroy cluster didn't clean up the Terraform file. Please let me know if any other information is required. Thanks, Dixit
Since [1], the installer has been more aggressive about clearing the asset directory during a successful 'destroy cluster', although I'd have expected a successful 'destroy cluster' to remove the Terraform state since [2]. Which version of the installer are you using? [1]: http://github.com/openshift/installer/pull/1086 v0.10.1 [2]: https://github.com/openshift/installer/pull/547 v0.4.0
Using the 0.13 installer, after hitting an elastic IP limit:

ERROR module.vpc.aws_eip.nat_eip[0]: 1 error occurred:
ERROR aws_eip.nat_eip.0: Error creating EIP: AddressLimitExceeded: The maximum number of addresses has been reached.
ERROR status code: 400, request id: b754746f-e87b-4633-a3e6-4d0fbc92bb9f
ERROR

I ran into tfstate problems:

```
FATAL failed to fetch Cluster: failed to load asset "Cluster": "terraform.tfstate" already exists. There may already be a running cluster
```

Log file attached.
Created attachment 1539318 [details] openshift 0.13 installer log with tfstate left
William, this is a separate issue. Your cluster failed to install because your AWS account doesn't have the resources available. Your subsequent installation failed immediately with the terraform.tfstate error because your first attempt failed halfway through. You need to destroy that first cluster, increase your resource limits, and then try recreating the cluster.
I'm closing the original issue due to inactivity.
FATAL failed to fetch Cluster: failed to load asset "Cluster": "terraform.tfstate" already exists. There may already be a running cluster

I am facing the same issue. Please let me know the solution for this.
I'm also facing this issue after having "destroyed" a cluster after a failed deployment
For anyone who also faces this, I was able to work around the issue quite easily by manually deleting "terraform.tfstate" from my deployment directory. As far as I can tell, that's all that was left behind. Why it isn't cleaned up, I'm not sure. Deployments are working again.
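The manual workaround above can be sketched in Python as follows. This is an illustration only: the function name `remove_stale_state` is hypothetical, and it assumes you have already confirmed that no resources from the failed attempt remain, since the file is otherwise kept to help with debugging.

```python
from pathlib import Path

# File name taken from the error message quoted in this report.
STATE_FILE = "terraform.tfstate"


def remove_stale_state(asset_dir: str) -> bool:
    """Delete a leftover state file; return True if one was removed."""
    state = Path(asset_dir) / STATE_FILE
    if not state.is_file():
        # Nothing left behind, so the directory is not blocked by this file.
        return False
    state.unlink()
    return True
```

The boolean return value lets a wrapper script log whether a stale file was actually found before retrying the install.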
> We expect every cluster to be installed from a new asset directory and we've added a note [1] to our docs.

If this is the expectation, could we have the installer enforce it? Also, I noticed that you added a note to the GitHub documentation. Is this apparent in the product documentation or in the destroy command's help output?
The installer will automatically remove the terraform.tfstate file once it has destroyed the cluster. If the cluster destruction is interrupted or otherwise fails, the state file will be left behind. It is safe to attempt the destruction again if you encounter this. If you are still seeing a failure, please open a new issue with more detail (e.g. the installer version).
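The "attempt the destruction again" advice can be expressed as a small retry loop. This is a sketch under stated assumptions: `destroy_until_clean` and `run_destroy` are hypothetical names, and the caller supplies the actual destroy invocation.

```python
from pathlib import Path
from typing import Callable


def destroy_until_clean(
    asset_dir: str,
    run_destroy: Callable[[str], None],
    attempts: int = 3,
) -> bool:
    """Run the destroy step until terraform.tfstate is gone.

    run_destroy is supplied by the caller and receives the asset directory.
    Returns True once the state file no longer exists, False if it is still
    present after the given number of attempts.
    """
    state = Path(asset_dir) / "terraform.tfstate"
    for _ in range(attempts):
        run_destroy(asset_dir)
        if not state.exists():
            return True
    return False
```

As an assumed usage, `run_destroy` could wrap something like `subprocess.run(["openshift-install", "destroy", "cluster", "--dir", asset_dir])`; check the flags against your installer version. If the function returns False, that is the point at which to open a new issue with the installer version and logs.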
> The installer will automatically remove the terraform.tfstate file once it has destroyed the cluster. Not in all cases, but there's an existing bug 1791400 with an open installer PR for that.