A machine that was stopped (via AWS / machine shutdown) shows "Running" in the console even though the instance is actually stopped. The API shows the following:

    phase: Running
    providerStatus:
      apiVersion: awsproviderconfig.openshift.io/v1beta1
      conditions:
        - lastProbeTime: '2020-01-17T16:51:07Z'
          lastTransitionTime: '2020-01-17T16:51:07Z'
          message: machine successfully created
          reason: MachineCreationSucceeded
          status: 'True'
          type: MachineCreation
      instanceId: i-0f8dac58189b2cfab
      instanceState: stopped
      kind: AWSMachineProviderStatus

The console should show "Stopped" everywhere it displays this status; the machine is definitely not running. Severity is High because this is an unusual state and we are actively telling the user the wrong thing.
Setting target release to the active development branch (4.4). For fixes, if any, which require backport to prior versions, clones of this BZ will be created.
instanceState is arbitrary, provider-specific information: whatever states each cloud provider chooses to expose for its instance life cycle. There is no convention there; it is an implementation detail. Phase is the provider-agnostic OCP machine API semantic for the machine life cycle, following the criteria in https://github.com/openshift/enhancements/blob/master/enhancements/machine-api/machine-instance-lifecycle.md, i.e.:

- Provisioning (attempting to create an instance for the machine)
- Provisioned (machine was given IPs/providerID)
- Running (machine was ever given a node)
- Deleting (machine has a deletion timestamp)
- Failed (the cloud instance for a machine is gone)

I think making phases visible to the user in the console is a good thing: each phase has a concise meaning, and together they give a consistent view of the OCP machine life cycle regardless of provider. A combination such as phase: Running and instanceState: stopped is valid. If the issue is that this is counterintuitive, maybe we could consider renaming "Running" to something else?
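To illustrate why phase: Running and instanceState: stopped can coexist, here is a minimal TypeScript sketch of the phase criteria listed above. It is not the machine-api controller code. The `MachineView` shape, its field names, and the order in which the checks are applied are all assumptions made for illustration. The point is that the provider's instance state is never consulted.

```typescript
// Hypothetical, simplified view of a Machine. Field names are illustrative
// assumptions, not the real Machine API schema.
interface MachineView {
  deletionTimestamp?: string; // set once deletion has been requested
  providerID?: string;        // set once the cloud instance was created
  addresses: string[];        // IPs assigned to the machine, if any
  hasNode: boolean;           // true if the machine was ever given a node
  instanceExists: boolean;    // false if the cloud instance is gone
  instanceState?: string;     // provider-specific, e.g. "running" or "stopped"
}

type Phase = "Provisioning" | "Provisioned" | "Running" | "Deleting" | "Failed";

// Applies the criteria from the list above. The precedence of the checks is
// an assumption; note that instanceState is deliberately ignored.
function derivePhase(m: MachineView): Phase {
  if (m.deletionTimestamp !== undefined) {
    return "Deleting";
  }
  if (m.providerID !== undefined && !m.instanceExists) {
    return "Failed"; // an instance was created earlier but is now gone
  }
  if (m.hasNode) {
    return "Running"; // holds even if the instance is currently stopped
  }
  if (m.providerID !== undefined || m.addresses.length > 0) {
    return "Provisioned";
  }
  return "Provisioning";
}
```

Under these assumptions, a machine with a node whose instance has been stopped in the cloud console still yields "Running", which matches the status reported in this bug.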
As discussed with Brad.ison, the console could check the phase values and communicate the Machine phases in more human-friendly terms, e.g. "Running" >> "Machine provisioned as Node" or something like that, and maybe additionally show the raw provider state as "Cloud Provider State".
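A rough TypeScript sketch of that idea follows. It is not the actual console implementation. The function name, the label table, and the "-" fallback are hypothetical, and only the "Running" label is reworded, loosely following the wording suggested in this thread.

```typescript
type MachinePhase = "Provisioning" | "Provisioned" | "Running" | "Deleting" | "Failed";

// Hypothetical display labels. Only "Running" is reworded, so that it no
// longer implies the cloud instance is currently powered on.
const PHASE_LABELS: Record<MachinePhase, string> = {
  Provisioning: "Provisioning",
  Provisioned: "Provisioned",
  Running: "Provisioned as node",
  Deleting: "Deleting",
  Failed: "Failed",
};

interface MachineStatusDisplay {
  phase: string;         // human-friendly Machine API phase
  providerState: string; // raw provider-specific state, shown unmodified
}

// Builds the two values the console would show side by side.
// The "-" fallback for a missing provider state is an assumption.
function describeMachineStatus(
  phase: MachinePhase,
  instanceState?: string,
): MachineStatusDisplay {
  return {
    phase: PHASE_LABELS[phase],
    providerState: instanceState ?? "-",
  };
}
```

With this mapping, the machine from the original report would be displayed with the phase "Provisioned as node" next to the provider state "stopped", instead of a bare "Running".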
Tested on an OCP 4.4 environment with payload 4.4.0-0.nightly-2020-02-12-191550. The machines list table now has "Phase" and "Provider State" columns. When the instance is running, "Phase" and "Provider State" are "Provisioned as node" and "running". When the instance is stopped, "Phase" and "Provider State" are "Provisioned as node" and "stopped". The bug is fixed, so moving it to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0581