Bug 1792487

Summary: Stopped machine still shows as "running" in console [openshift-4.4]
Product: OpenShift Container Platform
Reporter: Clayton Coleman <ccoleman>
Component: Management Console
Assignee: Rastislav Wagner <rawagner>
Status: CLOSED ERRATA
QA Contact: Yadan Pei <yapei>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.4
CC: agarcial, aos-bugs, bpeterse, jokerman, scuppett, yanpzhan
Target Milestone: ---   
Target Release: 4.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Feature: Added a new 'Provider State' column for Machines to better represent the actual state of the instance.
Reason: When a machine was stopped, the state column kept showing Running, which was confusing. The cause was that the state column actually represented the Machine phase. After this enhancement the console shows two columns: Phase, showing the Machine phase, and Provider State.
Result: When a machine is stopped, the user can see this in the Provider State column.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-05-04 11:25:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Clayton Coleman 2020-01-17 17:25:30 UTC
A machine that was stopped (via AWS / machine shutdown) shows "Running" in the console but is actually "stopped".  The API shows the following:

  phase: Running
  providerStatus:
    apiVersion: awsproviderconfig.openshift.io/v1beta1
    conditions:
      - lastProbeTime: '2020-01-17T16:51:07Z'
        lastTransitionTime: '2020-01-17T16:51:07Z'
        message: machine successfully created
        reason: MachineCreationSucceeded
        status: 'True'
        type: MachineCreation
    instanceId: i-0f8dac58189b2cfab
    instanceState: stopped
    kind: AWSMachineProviderStatus

The console should show "Stopped" everywhere it displays the machine state; the machine is definitely not running.
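
As a minimal sketch of the mismatch, the two values can be read side by side from the status excerpt above. The helper name `machine_states` is hypothetical, and `instanceState` is the field name from this AWS example; other providers may expose their state under a different key.

```python
# Hypothetical helper: the field paths mirror the status excerpt in this report.
def machine_states(status):
    """Return (phase, provider_state) from a Machine's status mapping."""
    phase = status.get("phase")
    # providerStatus may be absent or null before the provider reports anything.
    provider_status = status.get("providerStatus") or {}
    return phase, provider_status.get("instanceState")


# Values copied from the API output quoted in the description.
status = {
    "phase": "Running",
    "providerStatus": {
        "instanceId": "i-0f8dac58189b2cfab",
        "instanceState": "stopped",
        "kind": "AWSMachineProviderStatus",
    },
}

phase, provider_state = machine_states(status)
print(f"phase={phase} providerState={provider_state}")
```

Reading both fields makes it clear that a view showing only `phase` cannot reflect a stopped instance.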

High because this is an unusual state and we actively tell the user the wrong thing.

Comment 1 Stephen Cuppett 2020-01-17 18:07:53 UTC
Setting target release to the active development branch (4.4). For fixes, if any, which require backport to prior versions, clones of this BZ will be created.

Comment 2 Alberto 2020-01-31 16:04:37 UTC
instanceState is arbitrary, provider-specific information: whatever states each cloud provider chooses to convey its life cycle. There is no convention there; it is an implementation detail.

Phase is the provider-agnostic OCP Machine API semantic for the machine life cycle. It follows the criteria in https://github.com/openshift/enhancements/blob/master/enhancements/machine-api/machine-instance-lifecycle.md, i.e.:
-Provisioning (attempting to create an instance for the machine)
-Provisioned (machine was given IPs/providerID)
-Running (machine was ever given a node)
-Deleting (machine has a deletion timestamp)
-Failed (the cloud instance for a machine is gone)
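
The criteria listed above can be sketched as a small decision function. This is an illustration only, not the machine controller's code: the function name, the flat dictionary keys, and the order in which the checks are applied are all assumptions made for the example.

```python
# Illustrative sketch only. The keys below (deletion_timestamp, instance_gone,
# node_ref, provider_id, addresses) are hypothetical, as is the check order.
def derive_phase(machine):
    """Map a simplified machine description to one of the listed phases."""
    if machine.get("deletion_timestamp"):
        return "Deleting"      # machine has a deletion timestamp
    if machine.get("instance_gone"):
        return "Failed"        # the cloud instance for the machine is gone
    if machine.get("node_ref"):
        return "Running"       # machine was ever given a node
    if machine.get("provider_id") or machine.get("addresses"):
        return "Provisioned"   # machine was given IPs/providerID
    return "Provisioning"      # still attempting to create an instance


# The provider's instance state is never consulted, so a stopped instance
# that already has a node still maps to "Running".
stopped = {"provider_id": "example-id", "node_ref": "example-node",
           "instance_state": "stopped"}
print(derive_phase(stopped))
```

Because none of the criteria look at the provider's instance state, the phase alone cannot tell a stopped instance from a running one.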

I think making phases visible to the user in the console is a good thing because it gives a consistent view of the OCP machine life cycle no matter the provider, and each phase has a concise meaning.
A combination such as phase: Running and providerState: stopped is valid. If the issue is that this is counterintuitive, maybe we could consider renaming "Running" to something else?

Comment 3 Alberto 2020-02-06 15:17:27 UTC
As discussed with Brad.ison, the console could check the phase values and communicate the Machine phases in more human-friendly terms, e.g. "Running" >> "Machine provisioned as Node" or something like that, and maybe additionally show the raw provider state as "Cloud Provider State".

Comment 5 Yanping Zhang 2020-02-13 09:19:14 UTC
Tested on OCP 4.4 env with payload: 4.4.0-0.nightly-2020-02-12-191550.
Now there are "Phase" and "Provider State" columns in the machines list table. When the instance is running, "Phase" and "Provider State" show "Provisioned as node" and "running". When the instance is stopped, "Phase" and "Provider State" show "Provisioned as node" and "stopped".
The bug is fixed, so moving it to verified.
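
The behaviour verified above can be approximated with the following sketch. It is not the console's implementation: `PHASE_LABELS` and `machine_row` are hypothetical names, only the "Running" label is taken from this comment, and passing other phases through unchanged is an assumption.

```python
# Hypothetical mapping: only the "Running" entry comes from the verification
# comment; any other phase is displayed unchanged in this sketch.
PHASE_LABELS = {"Running": "Provisioned as node"}


def machine_row(status):
    """Build the two list columns from a Machine's status mapping."""
    phase = status.get("phase", "")
    provider_status = status.get("providerStatus") or {}
    return {
        "Phase": PHASE_LABELS.get(phase, phase),
        # instanceState is the AWS field from this report; shown as-is.
        "Provider State": provider_status.get("instanceState", ""),
    }


for state in ("running", "stopped"):
    row = machine_row({"phase": "Running",
                       "providerStatus": {"instanceState": state}})
    print(row["Phase"], "|", row["Provider State"])
```

Keeping the raw provider value in its own column avoids having to translate every provider's state vocabulary into a common set.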

Comment 7 errata-xmlrpc 2020-05-04 11:25:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581