Bug 1826811

Summary: [OCP 4.4][NMO] Accurately reflect the prior existence of a NMO CR for a given nodeName in the UI
Product: OpenShift Container Platform Reporter: gharden
Component: Node Maintenance Operator Assignee: Jiri Tomasek <jtomasek>
Status: CLOSED ERRATA QA Contact: gharden
Severity: medium Docs Contact:
Priority: medium    
Version: 4.4 CC: abeekhof, aos-bugs, gharden, jtomasek, mlammon, msluiter, scuppett, yjoseph
Target Milestone: --- Keywords: Triaged
Target Release: 4.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-27 15:58:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description gharden 2020-04-22 15:23:36 UTC
Description of problem:

Creating a duplicate NMO object for the same nodeName with a different metadata.name succeeds, so the same node ends up with two NMO objects.

metadata.name is auto-populated in the UI with a random name. To prevent duplicate node maintenance for the same node, we need to match on spec.nodeName, not metadata.name.
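
A minimal sketch of the check being asked for, written in Python purely for illustration. It assumes NodeMaintenance objects are available as plain dicts; `find_conflicts` is a hypothetical helper, not part of NMO. A new object counts as a duplicate when its spec.nodeName matches an existing one, regardless of metadata.name.

```python
from typing import Any, Dict, List

# Hypothetical in-memory representation of a NodeMaintenance CR.
NodeMaintenance = Dict[str, Any]


def find_conflicts(existing: List[NodeMaintenance], candidate: NodeMaintenance) -> List[str]:
    """Return metadata.name of every existing CR that targets the same node."""
    node = candidate["spec"]["nodeName"]
    return [
        cr["metadata"]["name"]
        for cr in existing
        if cr.get("spec", {}).get("nodeName") == node
    ]


existing = [
    {"metadata": {"name": "nodemaintenance-master-0-0"},
     "spec": {"nodeName": "master-0-0"}},
]
candidate = {
    "metadata": {"name": "nodemaintenance-master-0-0-diff-metatdata-name"},
    "spec": {"nodeName": "master-0-0"},
}

# Comparing metadata.name alone would let this candidate through;
# comparing spec.nodeName flags it as a duplicate.
conflicts = find_conflicts(existing, candidate)
if conflicts:
    print(f"reject: node {candidate['spec']['nodeName']} already has {conflicts}")
```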


Version-Release number of selected component (if applicable):
4.4

How reproducible:
100%

Steps to Reproduce:
1. Create NMO yaml for master-0-0
2. Apply NMO yaml for master-0-0
3. Verify NMO was created
4. Create duplicate NMO yaml with same nodeName and different metadata.name
5. Apply duplicate NMO yaml

Actual results:
#### TEST1: Exact same host_maintenance_test.yaml
[root@sealusa9 ~]# cat host_maintenance_test.yaml 
apiVersion: kubevirt.io/v1alpha1
kind: NodeMaintenance
metadata:
  name: nodemaintenance-master-0-0
spec:
  nodeName: master-0-0
  reason: "Test node maintenance for master-0-0"
[root@sealusa9 ~]# oc create -f host_maintenance_test.yaml 
nodemaintenance.kubevirt.io/nodemaintenance-master-0-0 created
[root@sealusa9 ~]# oc get nodemaintenances.kubevirt.io 
NAME                         AGE
nodemaintenance-master-0-0   6s
[root@sealusa9 ~]# oc create -f host_maintenance_test.yaml 
Error from server (AlreadyExists): error when creating "host_maintenance_test.yaml": nodemaintenances.kubevirt.io "nodemaintenance-master-0-0" already exists
[root@sealusa9 ~]# 

#### TEST2: Different metadata.name
[root@sealusa9 ~]# cat host_maintenance_test.yaml 
apiVersion: kubevirt.io/v1alpha1
kind: NodeMaintenance
metadata:
  name: nodemaintenance-master-0-0
spec:
  nodeName: master-0-0
  reason: "Test node maintenance for master-0-0"
[root@sealusa9 ~]# cat host_maintenance_test_diff_metadata_name.yaml 
apiVersion: kubevirt.io/v1alpha1
kind: NodeMaintenance
metadata:
  name: nodemaintenance-master-0-0-diff-metatdata-name
spec:
  nodeName: master-0-0
  reason: "Test node maintenance for master-0-0 diff metadata.name"
[root@sealusa9 ~]# oc create -f host_maintenance_test.yaml 
nodemaintenance.kubevirt.io/nodemaintenance-master-0-0 created
[root@sealusa9 ~]# oc get nodemaintenances.kubevirt.io 
NAME                         AGE
nodemaintenance-master-0-0   11s
[root@sealusa9 ~]# oc create -f host_maintenance_test_diff_metadata_name.yaml 
nodemaintenance.kubevirt.io/nodemaintenance-master-0-0-diff-metatdata-name created
[root@sealusa9 ~]# oc get nodemaintenances.kubevirt.io 
NAME                                             AGE
nodemaintenance-master-0-0                       27s
nodemaintenance-master-0-0-diff-metatdata-name   9s
[root@sealusa9 ~]# 


Expected results:
[root@sealusa9 ~]# cat host_maintenance_test.yaml 
apiVersion: kubevirt.io/v1alpha1
kind: NodeMaintenance
metadata:
  name: nodemaintenance-master-0-0
spec:
  nodeName: master-0-0
  reason: "Test node maintenance for master-0-0"
[root@sealusa9 ~]# cat host_maintenance_test_diff_metadata_name.yaml 
apiVersion: kubevirt.io/v1alpha1
kind: NodeMaintenance
metadata:
  name: nodemaintenance-master-0-0-diff-metatdata-name
spec:
  nodeName: master-0-0
  reason: "Test node maintenance for master-0-0 diff metadata.name"
[root@sealusa9 ~]# oc create -f host_maintenance_test.yaml 
nodemaintenance.kubevirt.io/nodemaintenance-master-0-0 created
[root@sealusa9 ~]# oc get nodemaintenances.kubevirt.io 
NAME                         AGE
nodemaintenance-master-0-0   11s
[root@sealusa9 ~]# oc create -f host_maintenance_test_diff_metadata_name.yaml 
Error from server (AlreadyExists): error when creating "host_maintenance_test.yaml": nodemaintenances.kubevirt.io "nodemaintenance-master-0-0" already exists


Additional info:

Comment 1 Stephen Cuppett 2020-04-22 16:18:19 UTC
Setting target release to current development version (4.5) for investigation. Where fixes (if any) are required/requested for prior versions, cloned BZs will be created when appropriate.

Comment 3 Stephen Cuppett 2020-04-23 20:05:41 UTC
Setting target release to current development version (4.5) for investigation. Where fixes (if any) are required/requested for prior versions, cloned BZs will be created when appropriate.

Comment 5 Andrew Beekhof 2020-07-01 00:07:16 UTC
With bug #1826457, metadata.name will include spec.nodeName but also a random suffix in 4.6.
Bug #1825997 (also targeted for 4.6) will prevent the creation of duplicates, but we need to make sure the UI accurately reflects the presence of an existing CR (even if the node has not reached the target state).

Jiri: How does the UI decide if the button should allow the admin to enter or leave maintenance mode?

If that is sorted, we can probably close this one.

Comment 8 Jiri Tomasek 2020-10-01 10:37:32 UTC
To find that a host/node is in maintenance the UI searches for maintenance CR with matching 'spec.nodeName'.
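
A rough sketch of that lookup, in Python for illustration only. The console code is not shown in this bug, so `find_maintenance`, `maintenance_action` and the "start"/"stop" labels are hypothetical names. The point it illustrates is that the decision depends only on a CR with a matching spec.nodeName existing, not on its metadata.name or on whether the node has reached the target state.

```python
from typing import Any, Dict, List, Optional

# Hypothetical in-memory representation of a NodeMaintenance CR.
NodeMaintenance = Dict[str, Any]


def find_maintenance(crs: List[NodeMaintenance], node_name: str) -> Optional[NodeMaintenance]:
    """Return the first CR whose spec.nodeName matches node_name, or None."""
    for cr in crs:
        if cr.get("spec", {}).get("nodeName") == node_name:
            return cr
    return None


def maintenance_action(crs: List[NodeMaintenance], node_name: str) -> str:
    """Offer 'stop' when a CR exists for the node, otherwise offer 'start'."""
    return "stop" if find_maintenance(crs, node_name) is not None else "start"


crs = [
    # The name carries a random suffix; only spec.nodeName is inspected.
    {"metadata": {"name": "worker-0-0-x7k2p"}, "spec": {"nodeName": "worker-0-0"}},
]

print(maintenance_action(crs, "worker-0-0"))
print(maintenance_action(crs, "worker-0-1"))
```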

Comment 9 Andrew Beekhof 2020-10-01 23:58:18 UTC
(In reply to Jiri Tomasek from comment #8)
> To find that a host/node is in maintenance the UI searches for maintenance
> CR with matching 'spec.nodeName'.

Ah, a little reading provided the additional context.

Is spec.nodeName being used in 4.5 or does this need to go through QE for 4.6/7?

Comment 11 gharden 2020-10-02 18:50:11 UTC
[root@sealusa8 ~]# cat worker00_host_maintenance_test.yaml 
apiVersion: nodemaintenance.kubevirt.io/v1beta1
kind: NodeMaintenance
metadata:
  name: worker-0-0
spec:
  nodeName: worker-0-0
  reason: "Test node maintenance for worker-0-0"
[root@sealusa8 ~]# cat worker00_host_maintenance_test_diff_metatdata_name.yaml 
apiVersion: nodemaintenance.kubevirt.io/v1beta1
kind: NodeMaintenance
metadata:
  name: worker-0-0-diff
spec:
  nodeName: worker-0-0
  reason: "Test node maintenance for worker-0-0"
[root@sealusa8 ~]# oc create -f worker00_host_maintenance_test.yaml 
nodemaintenance.nodemaintenance.kubevirt.io/worker-0-0 created
[root@sealusa8 ~]# oc create -f worker00_host_maintenance_test_diff_metatdata_name.yaml 
Error from server (invalid nodeName, a NodeMaintenance for node worker-0-0 already exists): error when creating "worker00_host_maintenance_test_diff_metatdata_name.yaml": admission webhook "nodemaintenance-validation.kubevirt.io" denied the request: invalid nodeName, a NodeMaintenance for node worker-0-0 already exists
[root@sealusa8 ~]# oc get clusterversions.config.openshift.io 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-09-29-162625   True        False         22h     Cluster version is 4.6.0-0.nightly-2020-09-29-162625
[root@sealusa8 ~]#

Comment 14 errata-xmlrpc 2020-10-27 15:58:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196