Bug 2087687

Summary: MCO does not generate event when user applies Default -> LowUpdateSlowReaction WorkerLatencyProfile
Product: OpenShift Container Platform Reporter: Harshal Patil <harpatil>
Component: Node Assignee: Sai Ramesh Vanka <svanka>
Node sub component: Kubelet QA Contact: Weinan Liu <weinliu>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: weinliu
Version: 4.11 Keywords: Reopened
Target Milestone: ---   
Target Release: 4.11.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-08-10 11:12:53 UTC Type: Bug

Description Harshal Patil 2022-05-18 09:00:25 UTC
MCO is supposed to generate an event to let the user know that such a transition is prohibited. In the latest nightly, the MCO, as expected, takes no action when a user applies the Default -> LowUpdateSlowReaction WorkerLatencyProfile transition. However, it should also emit an event informing the user that this transition is prohibited [1].


[1] https://github.com/openshift/machine-config-operator/pull/3129/files#diff-f37515e1ebe0812972d628b0f81dd29fc25fd8289fbef268b1391a8a3273e0beR212
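
For illustration, the expected behaviour can be sketched as follows. This is a minimal sketch and not the MCO code: the helper names transitionAllowed and prohibitedMessage are hypothetical, treating an empty profile as Default is an assumption, and MediumUpdateAverageReaction is assumed to be the intermediate profile. The message text is copied from the event the controller is expected to emit.

```go
package main

import "fmt"

// Profile names. Default and LowUpdateSlowReaction come from this report;
// MediumUpdateAverageReaction is assumed to be the intermediate profile.
const (
	profileDefault = "Default"
	profileMedium  = "MediumUpdateAverageReaction"
	profileLow     = "LowUpdateSlowReaction"
)

// transitionAllowed is a hypothetical helper. It rejects a direct jump
// between Default and LowUpdateSlowReaction in either direction and
// allows everything else. An empty profile is treated as Default
// (an assumption made for this sketch).
func transitionAllowed(oldProfile, newProfile string) bool {
	if oldProfile == "" {
		oldProfile = profileDefault
	}
	if newProfile == "" {
		newProfile = profileDefault
	}
	defaultToLow := oldProfile == profileDefault && newProfile == profileLow
	lowToDefault := oldProfile == profileLow && newProfile == profileDefault
	return !(defaultToLow || lowToDefault)
}

// prohibitedMessage formats the event text expected for a rejected update.
func prohibitedMessage(name, oldProfile, newProfile string) string {
	return fmt.Sprintf(
		"Skipping the Update Node event, name: %s, transition not allowed from old WorkerLatencyProfile: %s to new WorkerLatencyProfile: %s",
		name, oldProfile, newProfile)
}

func main() {
	oldProfile, newProfile := profileDefault, profileLow
	if !transitionAllowed(oldProfile, newProfile) {
		// In the controller this text would be sent through an event
		// recorder; here it is only printed.
		fmt.Println(prohibitedMessage("cluster", oldProfile, newProfile))
	}
}
```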

Comment 1 Sai Ramesh Vanka 2022-05-19 13:15:26 UTC
The events are verified on the latest CI build as follows.

[svanka@svanka machine-config-operator]$ oc get events -A | grep Skipping
default                                            13m         Normal    ActionProhibited                                   node/cluster                                                                    Skipping the Update Node event, name: cluster, transition not allowed from old WorkerLatencyProfile: LowUpdateSlowReaction to new WorkerLatencyProfile: Default
default                                            7s          Normal    ActionProhibited                                   node/cluster                                                                    Skipping the Update Node event, name: cluster, transition not allowed from old WorkerLatencyProfile: Default to new WorkerLatencyProfile: LowUpdateSlowReaction

The events can also be viewed in the Events section while describing the nodes.config custom resource as follows.
[svanka@svanka ~]$ oc describe nodes.config cluster 
Name:         cluster
Namespace:    
Labels:       <none>
Annotations:  include.release.openshift.io/ibm-cloud-managed: true
              include.release.openshift.io/self-managed-high-availability: true
              include.release.openshift.io/single-node-developer: true
              release.openshift.io/create-only: true
API Version:  config.openshift.io/v1
Kind:         Node
Metadata:
  Creation Timestamp:  2022-05-19T11:20:20Z
  Generation:          4
  Managed Fields:
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:include.release.openshift.io/ibm-cloud-managed:
          f:include.release.openshift.io/self-managed-high-availability:
          f:include.release.openshift.io/single-node-developer:
          f:release.openshift.io/create-only:
        f:ownerReferences:
          .:
          k:{"uid":"10fef1fa-bcb9-4c83-a7b5-9272b388eb9c"}:
      f:spec:
    Manager:      cluster-version-operator
    Operation:    Update
    Time:         2022-05-19T11:20:20Z
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:workerLatencyProfile:
    Manager:    kubectl-edit
    Operation:  Update
    Time:       2022-05-19T11:44:19Z
  Owner References:
    API Version:     config.openshift.io/v1
    Kind:            ClusterVersion
    Name:            version
    UID:             10fef1fa-bcb9-4c83-a7b5-9272b388eb9c
  Resource Version:  45401
  UID:               42340ebe-65a1-4642-924c-e4c4959a2647
Spec:
  Worker Latency Profile:  LowUpdateSlowReaction
Events:
  Type    Reason            Age   From                                             Message
  ----    ------            ----  ----                                             -------
  Normal  ActionProhibited  66m   machineconfigcontroller-kubeletconfigcontroller  Skipping the Update Node event, name: cluster, transition not allowed from old WorkerLatencyProfile: LowUpdateSlowReaction to new WorkerLatencyProfile: Default
  Normal  ActionProhibited  53m   machineconfigcontroller-kubeletconfigcontroller  Skipping the Update Node event, name: cluster, transition not allowed from old WorkerLatencyProfile: Default to new WorkerLatencyProfile: LowUpdateSlowReaction
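
When verifying these events in a script, the old and new profile can be pulled out of the message text. The following is a small sketch using only the Go standard library; parseProhibited is a hypothetical helper, and the pattern assumes the message format shown in the output above.

```go
package main

import (
	"fmt"
	"regexp"
)

// prohibitedRe matches the tail of the ActionProhibited message and
// captures the old and new WorkerLatencyProfile names.
var prohibitedRe = regexp.MustCompile(
	`transition not allowed from old WorkerLatencyProfile: (\S+) to new WorkerLatencyProfile: (\S+)`)

// parseProhibited is a hypothetical helper. It returns the old and new
// profile named in an event message, and false if the message does not
// match the expected format.
func parseProhibited(message string) (oldProfile, newProfile string, ok bool) {
	m := prohibitedRe.FindStringSubmatch(message)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	msg := "Skipping the Update Node event, name: cluster, transition not allowed from old WorkerLatencyProfile: Default to new WorkerLatencyProfile: LowUpdateSlowReaction"
	if oldProfile, newProfile, ok := parseProhibited(msg); ok {
		fmt.Printf("prohibited: %s -> %s\n", oldProfile, newProfile)
	}
}
```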

Comment 2 Sai Ramesh Vanka 2022-05-24 09:28:04 UTC
The following error is sometimes observed while updating the Node CR, so this bug has been reopened to track progress.

Could not construct reference to: '&v1.Node{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"cluster", GenerateName:"", Namespace:"", SelfLink:"", UID:"5514379c-d76f-4310-82d7-026a411f38a6", ResourceVersion:"53495", Generation:2, CreationTimestamp:time.Date(2022, time.May, 24, 6, 31, 35, 0, time.Local), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"include.release.openshift.io/ibm-cloud-managed":"true", "include.release.openshift.io/self-managed-high-availability":"true", "include.release.openshift.io/single-node-developer":"true", "release.openshift.io/create-only":"true"}, OwnerReferences:[]v1.OwnerReference{v1.OwnerReference{APIVersion:"config.openshift.io/v1", Kind:"ClusterVersion", Name:"version", UID:"57bc48e0-8da1-4f8f-9e6e-24b5033fbebf", Controller:(*bool)(nil), BlockOwnerDeletion:(*bool)(nil)}}, Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:"cluster-version-operator", Operation:"Update", APIVersion:"config.openshift.io/v1", Time:time.Date(2022, time.May, 24, 6, 31, 35, 0, time.Local), FieldsType:"FieldsV1", FieldsV1:(*v1.FieldsV1)(0xc000117308), Subresource:""}, v1.ManagedFieldsEntry{Manager:"kubectl-edit", Operation:"Update", APIVersion:"config.openshift.io/v1", Time:time.Date(2022, time.May, 24, 8, 7, 23, 0, time.Local), FieldsType:"FieldsV1", FieldsV1:(*v1.FieldsV1)(0xc000117338), Subresource:""}}}, Spec:v1.NodeSpec{CgroupMode:"", WorkerLatencyProfile:"LowUpdateSlowReaction"}, Status:v1.NodeStatus{}}' due to: 'no kind is registered for the type v1.Node in scheme "github.com/openshift/machine-config-operator/pkg/generated/clientset/versioned/scheme/register.go:14"'. Will not report event: 'Normal' 'ActionProhibited' 'Skipping the Update Node event, name: cluster, transition not allowed from old WorkerLatencyProfile: LowUpdateSlowReaction to new WorkerLatencyProfile: Default'

Debugging the above issue.
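
The error text suggests that the event recorder looks up the object's kind in a scheme where the type is not registered, so no object reference can be built and the event is dropped. The following is a toy model of that lookup, written with only the Go standard library rather than k8s.io/apimachinery. toyScheme, configNode, and the method names are hypothetical, and the sketch does not claim to show the actual fix.

```go
package main

import (
	"fmt"
	"reflect"
)

// toyScheme is a stand-in for a type registry: it maps Go types to kinds.
type toyScheme struct {
	name  string
	kinds map[reflect.Type]string
}

func newToyScheme(name string) *toyScheme {
	return &toyScheme{name: name, kinds: map[reflect.Type]string{}}
}

// register records the kind for the concrete type of obj.
func (s *toyScheme) register(kind string, obj interface{}) {
	s.kinds[reflect.TypeOf(obj)] = kind
}

// kindFor returns the registered kind for obj, or an error shaped like
// the one in the log when the type was never registered.
func (s *toyScheme) kindFor(obj interface{}) (string, error) {
	if kind, ok := s.kinds[reflect.TypeOf(obj)]; ok {
		return kind, nil
	}
	return "", fmt.Errorf("no kind is registered for the type %T in scheme %q", obj, s.name)
}

// configNode is a hypothetical stand-in for the config.openshift.io/v1 Node type.
type configNode struct {
	Name string
}

func main() {
	scheme := newToyScheme("toy")
	node := &configNode{Name: "cluster"}

	// Lookup fails while the type is unknown to the scheme.
	if _, err := scheme.kindFor(node); err != nil {
		fmt.Println("before registration:", err)
	}

	// Once the type is registered, the same lookup succeeds.
	scheme.register("Node", &configNode{})
	if kind, err := scheme.kindFor(node); err == nil {
		fmt.Println("after registration:", kind)
	}
}
```

In this model the lookup starts succeeding as soon as the type is registered with the scheme that the recorder consults.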

Comment 6 errata-xmlrpc 2022-08-10 11:12:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069