Description of problem:

A node became NotReady and the VMs running on it did not fail over to other healthy nodes. It looks like the customer did not have any cluster HA enabled, so the VMs were not restarted.

Per Peter Lauterbach: Even though this is not the default behavior, we should warn the admin that the cluster is not set up correctly. Please open a DEFECT. This is NOT an RFE, but bad product behavior.

Case 1: VM doesn't support live migration, node is down, and no mechanism exists to bring the node back up.

We see the alerts below:

VMCannotBeEvicted
    Eviction policy for fedora-dcl8kcbo31ttq3lf (on node c01-gkbug412-dbgc7-worker-0-qbr6s) is set to Live Migration but the VM is not migratable
KubeNodeUnreachable
    c01-gkbug412-dbgc7-worker-0-qbr6s is unreachable and some workloads may be rescheduled.
KubeNodeNotReady
    c01-gkbug412-dbgc7-worker-0-qbr6s has been unready for more than 15 minutes.

Case 2: VM supports live migration, node is down, and no mechanism exists to bring the node back up.

Set the storage class to ocs-storagecluster-ceph-rbd:

$ oc get pvc -A | grep t5dwa7ddhh02vall
testing   fedora-t5dwa7ddhh02vall   Bound   pvc-3d2f5c1d-d43d-42cf-a96b-a00af7a12a09   30Gi   RWX   ocs-storagecluster-ceph-rbd   16m

$ oc get vmi -A
NAMESPACE   NAME                      AGE    PHASE     IP             NODENAME                            READY
testing     fedora-t5dwa7ddhh02vall   7m4s   Running   10.129.2.144   c01-gkbug412-dbgc7-worker-0-qbr6s   True

The VM is running without any VM eviction alerts.

Shut down the node. The VMI goes to Ready=False:

$ oc get vmi -A
NAMESPACE   NAME                      AGE    PHASE     IP             NODENAME                            READY
testing     fedora-t5dwa7ddhh02vall   9m4s   Running   10.129.2.144   c01-gkbug412-dbgc7-worker-0-qbr6s   False

Monitor whether the VM gets started on a different node:

$ oc get vmi -A
NAMESPACE   NAME                      AGE   PHASE     IP             NODENAME                            READY
testing     fedora-t5dwa7ddhh02vall   13m   Running   10.129.2.144   c01-gkbug412-dbgc7-worker-0-qbr6s   False

Note: It has been 4 minutes since the VM went down, and no alerts have been seen so far. This is a problem. Alerts from the network side and the storage side are being raised because the node is not up, but no alert is raised for a VM that can migrate yet is down for some reason. As a user, I would want to see an alert if my VM is not up.
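
Until such an alert exists, the Case 2 condition can be spotted by hand by filtering the `oc get vmi -A` output for VMIs that still report phase Running but READY False. The following is only a minimal Python sketch of that check, not product behavior. The function name `find_not_ready_vmis` is hypothetical, and the sketch assumes the seven-column output format shown in this report with every column populated (rows with a missing field, such as an empty IP, are skipped rather than guessed at).

```python
from typing import Dict, List


def find_not_ready_vmis(oc_output: str) -> List[Dict[str, str]]:
    """Return rows from `oc get vmi -A` text that are Running but not Ready.

    Assumption: the first non-empty line is the header and columns are
    whitespace-separated, as in the output pasted in this report.
    """
    lines = [ln for ln in oc_output.splitlines() if ln.strip()]
    if not lines:
        return []

    header = lines[0].split()
    stuck = []
    for ln in lines[1:]:
        fields = ln.split()
        # Skip rows whose field count does not match the header
        # (for example, a VMI with no IP yet); we cannot align them safely.
        if len(fields) != len(header):
            continue
        row = dict(zip(header, fields))
        if row.get("PHASE") == "Running" and row.get("READY") == "False":
            stuck.append(row)
    return stuck


if __name__ == "__main__":
    import sys

    # Usage: oc get vmi -A | python3 this_script.py
    for vmi in find_not_ready_vmis(sys.stdin.read()):
        print(f"{vmi['NAMESPACE']}/{vmi['NAME']} on {vmi['NODENAME']} is Running but not Ready")
```

Run against the last output in Case 2, this would flag testing/fedora-t5dwa7ddhh02vall on c01-gkbug412-dbgc7-worker-0-qbr6s. It is a manual workaround only and does not replace the missing alert.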