Bug 2042366

Summary: Lifecycle hooks should be independently managed
Product: OpenShift Container Platform
Reporter: Joel Speed <jspeed>
Component: Cloud Compute
Cloud Compute sub component: Other Providers
Assignee: Joel Speed <jspeed>
QA Contact: Milind Yadav <miyadav>
Status: CLOSED ERRATA
Severity: high
Priority: high
Version: 4.10
Target Milestone: ---
Target Release: 4.10.0
Hardware: Unspecified
OS: Unspecified
Last Closed: 2022-03-10 16:40:51 UTC
Type: Bug

Description Joel Speed 2022-01-19 10:49:59 UTC
Description of problem:

Currently, the lifecycle hook lists are treated as atomic: a controller managing its own lifecycle hook may accidentally overwrite another controller's hook.

Server-side apply accounts for this by allowing the list to be specified as a map, so that entries keyed by name can be owned and merged independently.
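For illustration, the fix amounts to marking the hook lists as map-typed in the Machine CRD schema. A minimal sketch of the relevant OpenAPI extensions for preDrain (schema abbreviated, not the full CRD):

    preDrain:
      type: array
      # treat the list as a map keyed by "name" for server-side apply
      x-kubernetes-list-type: map
      x-kubernetes-list-map-keys:
      - name
      items:
        type: object
        required:
        - name
        - owner
        properties:
          name:
            type: string
          owner:
            type: string

In Go API types the same thing is declared with the +listType=map and +listMapKey=name markers.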

Version-Release number of selected component (if applicable):


How reproducible: 100%


Steps to Reproduce:
1. Create two Machine YAMLs for the same machine, each with a different hook, e.g. machine-hook1.yaml:

apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  name: list-map-machine
spec:
  lifecycleHooks:
    preDrain:
    - name: hook1
      owner: hookOwner1

and machine-hook2.yaml:

apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  name: list-map-machine
spec:
  lifecycleHooks:
    preDrain:
    - name: hook2
      owner: hookOwner2

2. Apply the first file with one client: kubectl apply --server-side --field-manager client1 -f machine-hook1.yaml
3. Then apply the second file using a second client: kubectl apply --server-side --field-manager client2 -f machine-hook2.yaml

Actual results:
A conflict error occurs because the list is treated as atomic; the second apply is rejected, and the only way through is to force-apply the second file, which overwrites the first hook (see the command below).
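For reference, the pre-fix workaround is to force the conflict, taking ownership of the field away from client1 and dropping hook1 (same command as step 3 plus the standard --force-conflicts flag for server-side apply):

kubectl apply --server-side --force-conflicts --field-manager client2 -f machine-hook2.yaml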

Expected results:
Both hooks should be accepted and listed under separate field managers:

apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  creationTimestamp: "2022-01-19T10:48:47Z"
  generation: 2
  managedFields:
  - apiVersion: machine.openshift.io/v1beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:lifecycleHooks:
          f:preDrain:
            k:{"name":"hook1"}:
              .: {}
              f:name: {}
              f:owner: {}
    manager: client1
    operation: Apply
    time: "2022-01-19T10:48:47Z"
  - apiVersion: machine.openshift.io/v1beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:lifecycleHooks:
          f:preDrain:
            k:{"name":"hook2"}:
              .: {}
              f:name: {}
              f:owner: {}
    manager: client2
    operation: Apply
    time: "2022-01-19T10:49:05Z"
  name: list-map-machine
  namespace: default
  resourceVersion: "2297"
  uid: ba6012e1-10b8-426f-8ed0-9495282fb3eb
spec:
  lifecycleHooks:
    preDrain:
    - name: hook1
      owner: hookOwner1
    - name: hook2
      owner: hookOwner2
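As a quick way to inspect the field ownership shown above (managedFields is hidden by default in recent kubectl versions):

kubectl get machine list-map-machine -o yaml --show-managed-fields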

Additional info:

Comment 1 Joel Speed 2022-01-19 13:22:01 UTC
We need to update the MAO (machine-api-operator) repo as well.

Comment 4 Milind Yadav 2022-01-24 05:16:40 UTC
Validated on -
[miyadav@miyadav aws]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-22-102609   True        False         57m     Cluster version is 4.10.0-0.nightly-2022-01-22-102609


Steps:
1. Edited the machine in the machineset to have more than one hook (via oc edit, as sketched below).
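A sketch of the edit, using the machine name from the output below:

oc edit machine miyadav2401-newbz-4xv99 -n openshift-machine-api

then adding the second entry under spec.lifecycleHooks.preDrain.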

Expected and Actual:

Both hooks were added and shown successfully on the machine:

apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  annotations:
    machine.openshift.io/instance-state: running
  creationTimestamp: "2022-01-24T04:35:14Z"
  finalizers:
  - machine.machine.openshift.io
  generateName: miyadav2401-newbz-
  generation: 4
  labels:
    machine.openshift.io/cluster-api-cluster: miyadav2401-fzgpc
    machine.openshift.io/cluster-api-machine-role: worker
    machine.openshift.io/cluster-api-machine-type: worker
    machine.openshift.io/cluster-api-machineset: miyadav2401-newbz
    machine.openshift.io/instance-type: m6i.large
    machine.openshift.io/region: us-east-2
    machine.openshift.io/zone: us-east-2a
  name: miyadav2401-newbz-4xv99
  namespace: openshift-machine-api
  ownerReferences:
  - apiVersion: machine.openshift.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: MachineSet
    name: miyadav2401-newbz
    uid: b29445df-2252-48aa-83d4-228bc949e3cd
  resourceVersion: "47075"
  uid: e05a53bb-f727-4232-98bf-1fb412a46902
spec:
  lifecycleHooks:
    preDrain:
    - name: hook1
      owner: hookOwner1
    - name: hook2
      owner: hookowner2
...

Additional Info:

This is covered by one of the steps in test - https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-47230

Comment 7 errata-xmlrpc 2022-03-10 16:40:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056