Bug 2108014 - Nutanix: the e2e-nutanix-operator webhooks test suite does not support provider Nutanix
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 4.11.z
Assignee: Andy Daniel
QA Contact: Milind Yadav
URL:
Whiteboard:
Depends On: 2106403
Blocks:
Reported: 2022-07-18 09:46 UTC by OpenShift BugZilla Robot
Modified: 2022-09-07 20:49 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-07 20:49:23 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-api-actuator-pkg pull 237 | 0 | None | open | [release-4.11] Bug 2108014: Add Nutanix as the supported provider platform for e2e-nutanix-operator test | 2022-07-18 09:47:30 UTC
Red Hat Product Errata | RHSA-2022:6287 | 0 | None | None | None | 2022-09-07 20:49:54 UTC

Description OpenShift BugZilla Robot 2022-07-18 09:46:43 UTC
+++ This bug was initially created as a clone of Bug #2106403 +++

Description of problem:
The e2e-nutanix-operator webhooks test suite does not support provider Nutanix

Version-Release number of selected component (if applicable):
4.12

How reproducible:
Always

Steps to Reproduce:
1. Create an OCP cluster with Nutanix platform, using the latest 4.12 nightly release image
2. Clone the repo github.com/openshift/cluster-api-actuator-pkg
3. Set the KUBECONFIG to the OCP cluster created in step 1
4. Run the command: 
  NAMESPACE=kube-system ./hack/ci-integration.sh -focus "Webhooks" -v

Actual results:
The tests failed and the default machineset of the OCP cluster was deleted.

Expected results:
All the tests in the suite pass.

Additional info:

--- Additional comment from mimccune on 2022-07-12 16:05:48 UTC ---

@yanhli does this fail even with the change from Sid? (https://github.com/openshift/machine-api-operator/pull/1034)

--- Additional comment from sishukla on 2022-07-13 13:40:37 UTC ---

@mimccune The failure is actually caused by the setup clause of the webhook spec in the cluster-api-actuator-pkg test. It skips out early without setting the labels, and the cleanup phase then uses the unset labels, which match everything, so all machinesets and machines get marked for deletion.
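
To illustrate why unset labels "match all", here is a minimal sketch in plain Go. It assumes equality-based label matching; the `matches` helper and the machineset names are hypothetical stand-ins written for this example, not the Kubernetes or cluster-api-actuator-pkg code.

```go
package main

import "fmt"

// matches reports whether every key/value pair in selector is present in labels.
// It is a stand-in for equality-based label selection, written for illustration
// only; it is not the Kubernetes implementation.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	// An empty (or nil) selector has no requirement to violate, so it matches.
	return true
}

func main() {
	// Hypothetical machinesets and their labels.
	machineSets := map[string]map[string]string{
		"cluster-worker": {"role": "worker"},
		"webhook-test":   {"role": "worker", "test": "webhooks"},
	}

	var unset map[string]string // never initialized because setup skipped early
	set := map[string]string{"test": "webhooks"}

	for _, name := range []string{"cluster-worker", "webhook-test"} {
		labels := machineSets[name]
		fmt.Printf("%s: unset selector=%v, set selector=%v\n",
			name, matches(unset, labels), matches(set, labels))
	}
}
```

With the selector left unset, both the test machineset and the cluster's own machineset are selected, which is consistent with the default machineset being deleted during cleanup.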

--- Additional comment from sishukla on 2022-07-13 13:42:06 UTC ---

https://github.com/openshift/cluster-api-actuator-pkg/blob/master/pkg/infra/webhooks.go#L37-L66

--- Additional comment from sishukla on 2022-07-13 13:43:23 UTC ---

Two things:
- that conditional should be in the very beginning of the spec description and not in the setup clause.
- we need to add nutanix platform to that conditional
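
A rough sketch of both points in plain Go, as a model of the phase ordering rather than real Ginkgo code. The platform names are illustrative strings, and `platformSupported` and `runSpec` are hypothetical helpers; the actual conditional lives in webhooks.go.

```go
package main

import "fmt"

// platformSupported reports whether the webhook specs know how to build a
// providerSpec for the platform. The names are illustrative strings, not the
// constants used in webhooks.go.
func platformSupported(platform string) bool {
	switch platform {
	case "AWS", "Azure", "GCP", "VSphere", "Nutanix":
		return true
	default:
		return false
	}
}

// runSpec models the order of phases for one spec. The support check comes
// first, so an unsupported platform never reaches setup or cleanup.
func runSpec(platform string) []string {
	if !platformSupported(platform) {
		return []string{"skip"}
	}
	return []string{"setup", "test", "cleanup"}
}

func main() {
	for _, p := range []string{"Nutanix", "SomeOtherPlatform"} {
		fmt.Println(p, runSpec(p))
	}
}
```

The point of the ordering is that cleanup can only run after setup has completed, so it never sees uninitialized state.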

--- Additional comment from mimccune on 2022-07-13 13:45:52 UTC ---

(In reply to Sid Shukla from comment #4)
> Two things:
> - that conditional should be in the very beginning of the spec description
> and not in the setup clause.

ah yeah, so it adds the skip instead of failing on non-supported platforms?

--- Additional comment from sishukla on 2022-07-13 15:17:22 UTC ---

So, here's an example:
```go
package ginkgo_playground_test

import (
	"fmt"
	"testing"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
)

func TestGinkgoPlayground(t *testing.T) {
	RegisterFailHandler(Fail)
	RunSpecs(t, "GinkgoPlayground Suite")
}

var _ = Describe("Printing execution order of closures", func() {
	fmt.Println("describe block")
	BeforeEach(func() {
		fmt.Println("before each block")
		Skip("skip")
	})

	AfterEach(func() {
		fmt.Println("after each block")
	})

	It("executes the It block", func() {
		fmt.Println("it block")
	})
})
```

When you run ginkgo test on this, here's the output
```
$ ginkgo run .
describe block
Running Suite: GinkgoPlayground Suite - /Users/sid.shukla/go/src/github.com/thunderboltsid/ginkgo-playground
============================================================================================================
Random Seed: 1657725350

Will run 1 of 1 specs
before each block
after each block
------------------------------
S [SKIPPED] [0.000 seconds]
Printing execution order of closures [BeforeEach]
/Users/sid.shukla/go/src/github.com/thunderboltsid/ginkgo-playground/ginkgo_playground_suite_test.go:18
  executes the It block
  /Users/sid.shukla/go/src/github.com/thunderboltsid/ginkgo-playground/ginkgo_playground_suite_test.go:27

  skip
  In [BeforeEach] at: /Users/sid.shukla/go/src/github.com/thunderboltsid/ginkgo-playground/ginkgo_playground_suite_test.go:20
------------------------------

Ran 0 of 1 Specs in 0.001 seconds
SUCCESS! -- 0 Passed | 0 Failed | 0 Pending | 1 Skipped
PASS

Ginkgo ran 1 suite in 2.339198756s
Test Suite Passed
```

As you can see, the AfterEach closure and the BeforeEach closure get executed if the skip happens inside the BeforeEach closure.

--- Additional comment from sishukla on 2022-07-13 15:22:45 UTC ---

If the skip clause is moved over to the Describe closure, the BeforeEach and AfterEach closures are not executed. 
```go
package ginkgo_playground_test

import (
	"fmt"
	"testing"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
)

func TestGinkgoPlayground(t *testing.T) {
	RegisterFailHandler(Fail)
	RunSpecs(t, "GinkgoPlayground Suite")
}

var _ = Describe("Printing execution order of closures", func() {
	fmt.Println("describe block")
	defer GinkgoRecover()
	Skip("skip")
	BeforeEach(func() {
		fmt.Println("before each block")
	})

	AfterEach(func() {
		fmt.Println("after each block")
	})

	It("executes the It block", func() {
		fmt.Println("it block")
	})
})
```
This can be seen from running it:
```
$ ginkgo run .
describe block
Running Suite: GinkgoPlayground Suite - /Users/sid.shukla/go/src/github.com/thunderboltsid/ginkgo-playground
============================================================================================================
Random Seed: 1657725679

Will run 0 of 0 specs

Ran 0 of 0 Specs in 0.000 seconds
SUCCESS! -- 0 Passed | 0 Failed | 0 Pending | 0 Skipped
PASS

Ginkgo ran 1 suite in 2.262312576s
Test Suite Passed
```

--- Additional comment from sishukla on 2022-07-13 15:26:41 UTC ---

What that entails for the Webhook spec is that if a platform is not explicitly listed in this switch (https://github.com/openshift/cluster-api-actuator-pkg/blob/master/pkg/infra/webhooks.go#L39), the testSelector (https://github.com/openshift/cluster-api-actuator-pkg/blob/master/pkg/infra/webhooks.go#L51-L53) never gets initialized. As a result, when the `AfterEach` closure runs, it ends up marking all machines and machinesets for deletion (https://github.com/openshift/cluster-api-actuator-pkg/blob/master/pkg/infra/webhooks.go#L57-L65).
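
One defensive option, independent of where the skip lives, would be for cleanup to refuse an empty selector. The sketch below is plain Go with a hypothetical `selectForDeletion` helper and made-up object names; it is not taken from webhooks.go and is not necessarily what the eventual fix does.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// errEmptySelector is returned when cleanup is asked to delete with no labels set.
var errEmptySelector = errors.New("refusing to delete: selector is empty and would match every object")

// selectForDeletion returns the sorted names of objects whose labels contain
// every selector pair. It refuses to run with an empty or nil selector.
func selectForDeletion(selector map[string]string, objects map[string]map[string]string) ([]string, error) {
	if len(selector) == 0 {
		return nil, errEmptySelector
	}
	var names []string
	for name, labels := range objects {
		match := true
		for k, v := range selector {
			if got, ok := labels[k]; !ok || got != v {
				match = false
				break
			}
		}
		if match {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names, nil
}

func main() {
	// Hypothetical objects standing in for machinesets.
	objects := map[string]map[string]string{
		"cluster-worker": {"role": "worker"},
		"webhook-test":   {"test": "webhooks"},
	}

	// Uninitialized selector: cleanup bails out instead of matching everything.
	if _, err := selectForDeletion(nil, objects); err != nil {
		fmt.Println("cleanup skipped:", err)
	}

	// Initialized selector: only the test object is selected.
	names, _ := selectForDeletion(map[string]string{"test": "webhooks"}, objects)
	fmt.Println("would delete:", names)
}
```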

--- Additional comment from mimccune on 2022-07-13 16:14:25 UTC ---

great analysis Sid, it makes sense to me. would you like to propose a patch for this? (otherwise i can make something from your samples here)

--- Additional comment from yanhli on 2022-07-13 18:57:22 UTC ---

I filed the PR https://github.com/openshift/cluster-api-actuator-pkg/pull/236 and tested it manually.

@mimccune Please review the fix at https://github.com/openshift/cluster-api-actuator-pkg/pull/236.

--- Additional comment from mimccune on 2022-07-13 19:23:36 UTC ---

awesome, thank you Yanhua!

Comment 2 Milind Yadav 2022-08-29 05:39:34 UTC

Validated as below - 

[miyadav@miyadav ~]$ vi ~/.kube/config
[miyadav@miyadav ~]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-08-26-162248   True        False         30m     Cluster version is 4.11.0-0.nightly-2022-08-26-162248
[miyadav@miyadav ~]$ git clone github.com/openshift/cluster-api-actuator-pkg
fatal: repository 'github.com/openshift/cluster-api-actuator-pkg' does not exist
[miyadav@miyadav ~]$ df -h
Filesystem                Size  Used Avail Use% Mounted on
devtmpfs                  7.6G     0  7.6G   0% /dev
tmpfs                     7.7G  230M  7.4G   3% /dev/shm
tmpfs                     7.7G  2.0M  7.7G   1% /run
tmpfs                     7.7G     0  7.7G   0% /sys/fs/cgroup
/dev/mapper/RHELCSB-Root   50G   48G  2.6G  95% /
/dev/nvme0n1p2            3.0G  467M  2.6G  16% /boot
/dev/nvme0n1p1            200M   17M  184M   9% /boot/efi
/dev/mapper/RHELCSB-Home  100G   29G   72G  29% /home
tmpfs                     1.6G   56K  1.6G   1% /run/user/119637
[miyadav@miyadav ~]$ git clone git:openshift/cluster-api-actuator-pkg.git
Cloning into 'cluster-api-actuator-pkg'...
remote: Enumerating objects: 42267, done.
remote: Counting objects: 100% (4627/4627), done.
remote: Compressing objects: 100% (2212/2212), done.
remote: Total 42267 (delta 2198), reused 4446 (delta 2119), pack-reused 37640
Receiving objects: 100% (42267/42267), 63.52 MiB | 1.06 MiB/s, done.
Resolving deltas: 100% (19202/19202), done.
[miyadav@miyadav ~]$ df -h
Filesystem                Size  Used Avail Use% Mounted on
devtmpfs                  7.6G     0  7.6G   0% /dev
tmpfs                     7.7G  227M  7.4G   3% /dev/shm
tmpfs                     7.7G  2.0M  7.7G   1% /run
tmpfs                     7.7G     0  7.7G   0% /sys/fs/cgroup
/dev/mapper/RHELCSB-Root   50G   48G  2.6G  95% /
/dev/nvme0n1p2            3.0G  467M  2.6G  16% /boot
/dev/nvme0n1p1            200M   17M  184M   9% /boot/efi
/dev/mapper/RHELCSB-Home  100G   29G   72G  29% /home
tmpfs                     1.6G   56K  1.6G   1% /run/user/119637
[miyadav@miyadav ~]$ cd cluster-api-actuator-pkg/
[miyadav@miyadav cluster-api-actuator-pkg]$ NAMESPACE=kube-system ./hack/ci-integration.sh -focus "Webhooks" -v

You're using deprecated Ginkgo functionality:
=============================================
Ginkgo 2.0 is under active development and will introduce several new features, improvements, and a small handful of breaking changes.
A release candidate for 2.0 is now available and 2.0 should GA in Fall 2021.  Please give the RC a try and send us feedback!
  - To learn more, view the migration guide at https://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md
  - For instructions on using the Release Candidate visit https://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md#using-the-beta
  - To comment, chime in at https://github.com/onsi/ginkgo/issues/711

  --stream is deprecated and will be removed in Ginkgo 2.0
  Learn more at: https://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md#removed--stream

To silence deprecations that can be silenced set the following environment variable:
  ACK_GINKGO_DEPRECATIONS=1.16.5

I0829 10:57:31.672543   39384 request.go:601] Waited for 1.049373323s due to client-side throttling, not priority and fairness, request: GET:https://api.sgao-0.qe.devcluster.openshift.com:6443/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=32s
Running Suite: Machine Suite
============================
Random Seed: 1661750808
Will run 4 of 37 specs

SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
[Feature:Machines] Webhooks 
  should be able to create a machine from a minimal providerSpec
  /home/miyadav/cluster-api-actuator-pkg/pkg/infra/webhooks.go:68

• [SLOW TEST:144.080 seconds]
[Feature:Machines] Webhooks
/home/miyadav/cluster-api-actuator-pkg/pkg/infra/webhooks.go:21
  should be able to create a machine from a minimal providerSpec
  /home/miyadav/cluster-api-actuator-pkg/pkg/infra/webhooks.go:68
------------------------------
[Feature:Machines] Webhooks 
  should be able to create machines from a machineset with a minimal providerSpec
  /home/miyadav/cluster-api-actuator-pkg/pkg/infra/webhooks.go:94
I0829 11:00:02.602291   39384 request.go:601] Waited for 1.050221197s due to client-side throttling, not priority and fairness, request: GET:https://api.sgao-0.qe.devcluster.openshift.com:6443/apis/whereabouts.cni.cncf.io/v1alpha1?timeout=32s

• [SLOW TEST:177.254 seconds]
[Feature:Machines] Webhooks
/home/miyadav/cluster-api-actuator-pkg/pkg/infra/webhooks.go:21
  should be able to create machines from a machineset with a minimal providerSpec
  /home/miyadav/cluster-api-actuator-pkg/pkg/infra/webhooks.go:94
------------------------------
[Feature:Machines] Webhooks 
  should return an error when removing required fields from the Machine providerSpec
  /home/miyadav/cluster-api-actuator-pkg/pkg/infra/webhooks.go:101
I0829 11:02:59.749116   39384 request.go:601] Waited for 1.000195504s due to client-side throttling, not priority and fairness, request: GET:https://api.sgao-0.qe.devcluster.openshift.com:6443/apis/storage.k8s.io/v1beta1?timeout=32s

• [SLOW TEST:22.425 seconds]
[Feature:Machines] Webhooks
/home/miyadav/cluster-api-actuator-pkg/pkg/infra/webhooks.go:21
  should return an error when removing required fields from the Machine providerSpec
  /home/miyadav/cluster-api-actuator-pkg/pkg/infra/webhooks.go:101
------------------------------
[Feature:Machines] Webhooks 
  should return an error when removing required fields from the MachineSet providerSpec
  /home/miyadav/cluster-api-actuator-pkg/pkg/infra/webhooks.go:135
I0829 11:03:22.535854   39384 request.go:601] Waited for 1.049392136s due to client-side throttling, not priority and fairness, request: GET:https://api.sgao-0.qe.devcluster.openshift.com:6443/apis/performance.openshift.io/v2?timeout=32s

• [SLOW TEST:23.756 seconds]
[Feature:Machines] Webhooks
/home/miyadav/cluster-api-actuator-pkg/pkg/infra/webhooks.go:21
  should return an error when removing required fields from the MachineSet providerSpec
  /home/miyadav/cluster-api-actuator-pkg/pkg/infra/webhooks.go:135
------------------------------

Ran 4 of 37 Specs in 371.092 seconds
SUCCESS! -- 4 Passed | 0 Failed | 0 Pending | 33 Skipped
PASS

You're using deprecated Ginkgo functionality:
=============================================
Ginkgo 2.0 is under active development and will introduce several new features, improvements, and a small handful of breaking changes.
A release candidate for 2.0 is now available and 2.0 should GA in Fall 2021.  Please give the RC a try and send us feedback!
  - To learn more, view the migration guide at https://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md
  - For instructions on using the Release Candidate visit https://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md#using-the-beta
  - To comment, chime in at https://github.com/onsi/ginkgo/issues/711

  You are using a custom reporter.  Support for custom reporters will likely be removed in V2.  Most users were using them to generate junit or teamcity reports and this functionality will be merged into the core reporter.  In addition, Ginkgo 2.0 will support emitting a JSON-formatted report that users can then manipulate to generate custom reports.

  If this change will be impactful to you please leave a comment on https://github.com/onsi/ginkgo/issues/711
  Learn more at: https://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md#removed-custom-reporters

To silence deprecations that can be silenced set the following environment variable:
  ACK_GINKGO_DEPRECATIONS=1.16.5


Ginkgo ran 1 suite in 6m56.348282187s
Test Suite Passed
[miyadav@miyadav cluster-api-actuator-pkg]



Additional info:
The machineset remained the same before and after the run:
[miyadav@miyadav cluster-api-actuator-pkg]$ oc get machineset -n openshift-machine-api
NAME                  DESIRED   CURRENT   READY   AVAILABLE   AGE
sgao-0-6v47t-worker   2         2         2       2           55m
[miyadav@miyadav cluster-api-actuator-pkg]$ oc get machineset -n openshift-machine-api
NAME                  DESIRED   CURRENT   READY   AVAILABLE   AGE
sgao-0-6v47t-worker   2         2         2       2           60m

Comment 5 errata-xmlrpc 2022-09-07 20:49:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.11.3 packages and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:6287

