Description of problem:
Deleting a machine does not trigger a requeue error.
Version-Release number of selected component (if applicable):
$ ./openshift-install version
$ ./terraform version
$ oc version
features: Basic-Auth GSSAPI Kerberos SPNEGO
Steps to Reproduce:
1. $ oc delete machine qe-zhsun-1-worker-us-east-2a-2
2. $ oc logs -f clusterapi-manager-controllers-5b4996fb88-bzst7 -c machine-controller
Actual results:
$ oc delete machine qe-zhsun-1-worker-us-east-2a-2
machine.cluster.k8s.io "qe-zhsun-1-worker-us-east-2a-2" deleted
$ oc logs -f clusterapi-manager-controllers-5b4996fb88-bzst7 -c machine-controller
I1126 09:26:22.779527 1 controller.go:113] Running reconcile Machine for qe-zhsun-1-worker-us-east-2a-2
I1126 09:26:22.779663 1 controller.go:136] reconciling machine object qe-zhsun-1-worker-us-east-2a-2 triggers delete.
I1126 09:26:22.779759 1 actuator.go:454] deleting machine
I1126 09:26:22.915894 1 utils.go:165] Cleaning up extraneous instance for machine: i-037313b896d114c3b, state: running, launchTime: 2018-11-26 08:41:23 +0000 UTC
I1126 09:26:22.915926 1 utils.go:169] Terminating i-037313b896d114c3b instance
I1126 09:26:23.015070 1 controller.go:143] machine object qe-zhsun-1-worker-us-east-2a-2 deletion successful, removing finalizer.
Expected results:
Deleting a machine triggers a requeue error.
Are you saying the machine object is deleted before the AWS instance is? Why would you expect deleting a machine to trigger a requeue error? I don't see any error message in the logs saying the AWS instance destruction failed.
Though, we have https://github.com/kubernetes-sigs/cluster-api/pull/598 that will re-queue a machine object in case the operation fails (for any reason).
Hi Jan Chaloupka,
Compare this with creating a machine: while the instance status is pending, the machine status is also pending and a requeue error is returned. So for deleting a machine, while the instance status is shutting-down, I think we should set the machine status to shutting-down and return a requeue error until the instance status changes to terminated. If I'm wrong, please correct me.
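To make the proposal concrete, here is a minimal Go sketch of the requested delete behaviour. It is not the actuator's actual code: `RequeueAfterError` is a local stand-in type, and `reconcileDelete`, `isRequeue`, and the 20-second delay are hypothetical names and values chosen for illustration. Only the EC2 state names (shutting-down, terminated) come from AWS.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// RequeueAfterError is a local stand-in (an assumption for this sketch) for an
// error that tells the controller to retry the reconcile after a delay.
type RequeueAfterError struct {
	RequeueAfter time.Duration
}

func (e *RequeueAfterError) Error() string {
	return fmt.Sprintf("requeue in %s", e.RequeueAfter)
}

// reconcileDelete is a hypothetical helper: given the EC2 instance state seen
// after the terminate call, it returns nil only once the instance is
// "terminated". Any other state (for example "shutting-down") yields a requeue
// error, so the finalizer would not be removed yet.
func reconcileDelete(instanceState string) error {
	if instanceState == "terminated" {
		return nil
	}
	// The 20s delay is an arbitrary illustrative value.
	return &RequeueAfterError{RequeueAfter: 20 * time.Second}
}

// isRequeue reports whether err asks the controller to retry later.
func isRequeue(err error) bool {
	var r *RequeueAfterError
	return errors.As(err, &r)
}
```

With this shape, the controller would keep the machine object (and its finalizer) until a later reconcile observes the terminated state.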
> For deleting machine, while the instance status is shutting-down, I think we should set machine status is shutting-down and return requeue error until instance status changes to terminated.
Every AWS instance in the shutting-down state will eventually be deleted. Plus, according to the AWS documentation, instances that are not running are not charged.
Closing the bug as expected behavior until we identify non-trivial long-running processes that need to be re-queued on deletion.