Bug 1928856

Summary: OCP Conformance test fails if MachineSet resource type is not present
Product: OpenShift Container Platform Reporter: Steve halverson <shalver>
Component: Cloud Compute Assignee: Joel Speed <jspeed>
Cloud Compute sub component: Other Providers QA Contact: sunzhaohua <zhsun>
Status: CLOSED ERRATA Docs Contact:
Severity: low    
Priority: unspecified CC: aos-bugs, eparis, jokerman, mfojtik, mgugino, mstaeble
Version: 4.6   
Target Milestone: ---   
Target Release: 4.9.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-10-18 17:29:20 UTC Type: Bug

Description Steve halverson 2021-02-15 16:52:38 UTC
Description of problem:


[sig-cluster-lifecycle][Feature:Machines][Early] Managed cluster should have same number of Machines and Nodes [Suite:openshift/conformance/parallel]

This test should be skipped if the machine and machineset resource types are not present in the environment.


Version-Release number of selected component (if applicable):


How reproducible:  always


Steps to Reproduce:
1. Run conformance test on cluster where machineset resource type is not present

Actual results:

I0212 15:02:09.424762 34349 test_context.go:459] Tolerating taints "node-role.kubernetes.io/master" when considering if nodes are ready
[BeforeEach] [Top Level]
/Users/shalver/github/origin/_output/local/go/src/github.com/openshift/origin/test/extended/util/framework.go:1440
[BeforeEach] [Top Level]
/Users/shalver/github/origin/_output/local/go/src/github.com/openshift/origin/test/extended/util/framework.go:1440
[BeforeEach] [Top Level]
/Users/shalver/github/origin/_output/local/go/src/github.com/openshift/origin/test/extended/util/test.go:59
[It] have same number of Machines and Nodes [Suite:openshift/conformance/parallel]
/Users/shalver/github/origin/_output/local/go/src/github.com/openshift/origin/test/extended/machines/cluster.go:24
STEP: getting MachineSet list
fail [github.com/openshift/origin/test/extended/machines/cluster.go:35]: Unexpected error:
    <*errors.StatusError | 0xc0023cf7c0>: {
        ErrStatus: {
            TypeMeta: {Kind: "", APIVersion: ""},
            ListMeta: {
                SelfLink: "",
                ResourceVersion: "",
                Continue: "",
                RemainingItemCount: nil,
            },
            Status: "Failure",
            Message: "the server could not find the requested resource",
            Reason: "NotFound",
            Details: {
                Name: "",
                Group: "",
                Kind: "",
                UID: "",
                Causes: [
                    {
                        Type: "UnexpectedServerResponse",
                        Message: "404 page not found",
                        Field: "",
                    },
                ],
                RetryAfterSeconds: 0,
            },
            Code: 404,
        },
    }
    the server could not find the requested resource
occurred



Expected results:

Test should be skipped


Additional info:

Comment 1 Stefan Schimanski 2021-02-19 12:56:31 UTC
@Sudha please assign this to the owners of the test.

Comment 2 Michal Fojtik 2021-02-19 14:34:51 UTC
Moving this to the Cloud Compute component, as that team owns the cluster lifecycle tests. This does look like a problem in MachineSet, but it does not seem to be related to the API server. Sorry for the confusion in the initial bug triage.

Comment 3 Michael Gugino 2021-02-19 15:14:44 UTC
As far as I know, the machine-api CRDs should always be installed, even in UPI clusters, because users can add machines later. The exception is platform==None, where the machine controller won't be deployed.

Need to know more about how this cluster was installed.  Moving over to the release team for now.

Comment 4 Steve halverson 2021-02-19 16:42:10 UTC
This is running in IBM public cloud.  Cesar Wong from Red Hat indicated we could skip the test for now, as Machine and MachineSet are not expected to be there.

"ROKS clusters should not contain MachineSets/Machines because machines are not managed via the machine api inside the cluster, but rather outside."

Would rather have the test auto-detect and skip.

Comment 5 Michael Gugino 2021-02-19 20:59:59 UTC
How is the suite being run?  The testing framework supports skipping tests, see this example: https://github.com/openshift/cluster-api-actuator-pkg/blob/master/Makefile#L67

Comment 6 Steve halverson 2021-02-24 14:20:01 UTC
We would rather run all the tests without explicitly skipping any.  The test should detect and skip if appropriate.

Comment 8 Matthew Staebler 2021-03-01 03:02:23 UTC
The installer team does not own this conformance test and the openshift-installer is not used in ROKS installations.

Comment 9 Joel Speed 2021-03-01 10:54:01 UTC
Some of the other tests in this suite already check whether we are running on IBM. We could copy that check across to the particular test that is failing: https://github.com/openshift/origin/blob/master/test/extended/machines/machines.go#L36-L38
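
As a rough illustration of the platform-check approach (not the code at the link above), the idea is a lookup from provider name to skip reason. Everything here is an assumption made for the sketch: the provider identifier "ibmcloud", the reason text, and the helper name skipReason.

```go
package main

import "fmt"

// providersWithoutMachineAPI maps provider names to the reason the
// Machine tests do not apply there. The "ibmcloud" key and the reason
// string are assumptions for this sketch, not values from the suite.
var providersWithoutMachineAPI = map[string]string{
	"ibmcloud": "IBM Cloud clusters do not contain MachineSet resources",
}

// skipReason returns the skip reason for provider and true when the
// Machine tests should be skipped on that provider.
func skipReason(provider string) (string, bool) {
	reason, ok := providersWithoutMachineAPI[provider]
	return reason, ok
}

func main() {
	for _, p := range []string{"ibmcloud", "aws"} {
		if reason, skip := skipReason(p); skip {
			fmt.Printf("%s: skip (%s)\n", p, reason)
		} else {
			fmt.Printf("%s: run\n", p)
		}
	}
}
```

The drawback of this shape is that the list has to be updated by hand for every new environment that runs without Machines.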

Alternatively, we could write something that can be called at the beginning of each of the Machine tests and check whether Machines are installed or not. Longer term as we investigate different ways of running OpenShift, we may not always have Machines installed, so this may be a better long term solution.
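
A minimal sketch of that second option, assuming the helper asks API discovery which resources a group version serves. The resourceLister interface, the fakeCluster type and the machineAPIInstalled name are hypothetical and exist only for this example; the group version "machine.openshift.io/v1beta1" is also an assumption here. With client-go, the lookup would go through the discovery client's ServerResourcesForGroupVersion instead.

```go
package main

import (
	"errors"
	"fmt"
)

// errGroupNotServed is returned by a lister when the cluster does not
// serve the requested group version at all.
var errGroupNotServed = errors.New("group version not served")

// resourceLister is a hypothetical abstraction over API discovery.
type resourceLister interface {
	ResourcesFor(groupVersion string) ([]string, error)
}

// fakeCluster is an in-memory lister used only for this sketch.
type fakeCluster map[string][]string

func (c fakeCluster) ResourcesFor(gv string) ([]string, error) {
	res, ok := c[gv]
	if !ok {
		return nil, errGroupNotServed
	}
	return res, nil
}

// machineGroupVersion is assumed to be the Machine API group version.
const machineGroupVersion = "machine.openshift.io/v1beta1"

// machineAPIInstalled reports whether both machines and machinesets
// are served. A missing group version means "not installed"; any
// other discovery error is returned to the caller.
func machineAPIInstalled(l resourceLister) (bool, error) {
	resources, err := l.ResourcesFor(machineGroupVersion)
	if errors.Is(err, errGroupNotServed) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	found := map[string]bool{"machines": false, "machinesets": false}
	for _, r := range resources {
		if _, wanted := found[r]; wanted {
			found[r] = true
		}
	}
	return found["machines"] && found["machinesets"], nil
}

func main() {
	withAPI := fakeCluster{machineGroupVersion: {"machines", "machinesets"}}
	withoutAPI := fakeCluster{}

	ok, _ := machineAPIInstalled(withAPI)
	fmt.Println("with Machine API, run tests:", ok) // true
	ok, _ = machineAPIInstalled(withoutAPI)
	fmt.Println("without Machine API, run tests:", ok) // false
}
```

Each Machine test would call the helper first and skip when it returns false, which avoids keeping a per-platform list.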

Comment 14 sunzhaohua 2021-07-15 07:56:00 UTC
Moving to verified.
From the code, we can see that the test is skipped if the Machine API is not installed in the cluster.

Comment 17 errata-xmlrpc 2021-10-18 17:29:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759