1928805 – subctl e2e fails on first test, but message is misleading

Summary: subctl e2e fails on first test, but message is misleading

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Advanced Cluster Management for Kubernetes
Classification:	Red Hat
Component:	Submariner
Sub Component:
Version:	rhacm-2.2
Hardware:	Unspecified
OS:	Unspecified
Priority:	low
Severity:	low
Target Milestone:	---
Target Release:	---
Assignee:	tpanteli
QA Contact:	Noam Manos
Docs Contact:	Christopher Dawson
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2021-02-15 15:28 UTC by Noam Manos
Modified:	2021-05-31 11:34 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-05-04 19:31:22 UTC
Target Upstream Version:
Embargoed:
Flags:	smattar: rhacm-2.2.z+

Links
System	ID	Priority	Status	Summary	Last Updated
Github	open-cluster-management backlog issues 9484	None	None	None	2021-02-22 14:20:29 UTC
Github	submariner-io shipyard pull 456	None	closed	Improve E2E cluster ID check	2021-02-21 11:40:16 UTC
Red Hat Product Errata	RHEA-2021:1500	None	None	None	2021-05-04 19:31:26 UTC

Description Noam Manos 2021-02-15 15:28:33 UTC

Description of problem:
subctl e2e assumes the kubeconfig context name is the same as the cluster ID. The two may differ, and in that case e2e fails.

Version-Release number of selected component (if applicable):
Submariner 0.8.1

How reproducible:
Always

Steps to Reproduce:

https://qe-jenkins-csb-skynet.cloud.paas.psi.redhat.com/job/Maintenance/job/debug_job/1136/Test-Report/


Actual results:

$ oc  config view

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://api.pkomarov-cluster-a.devcluster.openshift.com:6443
  name: api-pkomarov-cluster-a-devcluster-openshift-com:6443
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://api.pkomarov-cluster-a.devcluster.openshift.com:6443
  name: pkomarov-cluster-a
contexts:
- context:
    cluster: api-pkomarov-cluster-a-devcluster-openshift-com:6443
    namespace: default
    user: master/api-pkomarov-cluster-a-devcluster-openshift-com:6443
  name: default/api-pkomarov-cluster-a-devcluster-openshift-com:6443/master
- context:
    cluster: pkomarov-cluster-a
    namespace: default
    user: admin
  name: pkomarov-cluster-a
current-context: pkomarov-cluster-a
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED
- name: master/api-pkomarov-cluster-a-devcluster-openshift-com:6443
  user:
    token: REDACTED

$ oc  config view

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://api.default-cl2.devcluster.openshift.com:6443
  name: api-default-cl2-devcluster-openshift-com:6443
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://api.default-cl2.devcluster.openshift.com:6443
  name: default-cl2
contexts:
- context:
    cluster: api-default-cl2-devcluster-openshift-com:6443
    namespace: default
    user: master/api-default-cl2-devcluster-openshift-com:6443
  name: default/api-default-cl2-devcluster-openshift-com:6443/master
- context:
    cluster: default-cl2
    namespace: default
    user: admin
  name: pkomarov-cluster-b
current-context: pkomarov-cluster-b
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED
- name: master/api-default-cl2-devcluster-openshift-com:6443
  user:
    token: REDACTED
- name: ocp_usr/api-default-cl2-devcluster-openshift-com:6443
  user: {}


$ subctl verify --only service-discovery,connectivity --verbose /mnt/skynet-data/pkomarov-env/pkomarov-cluster-a/auth/kubeconfig /mnt/skynet-data/pkomarov-env/ocpup/.config/cl2/auth/kubeconfig
 
Performing the following verifications: service-discovery, connectivity
Running Suite: Submariner E2E suite
===================================
Random Seed: 1613399570
Will run 32 of 34 specs

STEP: Creating kubernetes clients
STEP: Setting cluster ID "pkomarov-cluster-a" for kube context name "pkomarov-cluster-a"
STEP: Setting cluster ID "default-cl2" for kube context name "default-cl2"
STEP: Creating lighthouse clients
[] E2E failed

subctl version: v0.8.1

Failure [245.390 seconds]
[BeforeSuite] BeforeSuite 
/go/src/github.com/submariner-io/submariner-operator/vendor/github.com/submariner-io/shipyard/test/e2e/e2e.go:24

  Failed to find Clusters to detect if Globalnet is enabled. No Cluster found
  Unexpected error:
      <*errors.errorString | 0xc000404200>: {
          s: "timed out waiting for the condition",
      }
      timed out waiting for the condition
  occurred

  /go/src/github.com/submariner-io/submariner-operator/vendor/github.com/submariner-io/shipyard/test/e2e/framework/framework.go:458
------------------------------

Ran 32 of 0 Specs in 245.390 seconds
FAIL! -- 0 Passed | 32 Failed | 0 Pending | 0 Skipped


Expected results:
The subctl e2e framework (shipyard) should use the correct kubeconfig context name
(for cluster B the name is "pkomarov-cluster-b" and not "default-cl2").

Comment 1 tpanteli 2021-02-15 19:18:04 UTC

The error “Failed to find Clusters to detect if Globalnet is enabled. No Cluster found” means that no Cluster resource was found in ClusterA (the first cluster passed in, which I assume was "pkomarov-cluster-a"). This is unrelated to the assertion in the issue title ("assumes kubeconfig context name is the same as cluster ID"). In fact, e2e no longer assumes that: it obtains the cluster ID from the SUBMARINER_CLUSTERID env var in the gateway DaemonSet spec. This is reflected in the message "STEP: Setting cluster ID "default-cl2" for kube context name "default-cl2"", which indicates that the kube context name and the obtained cluster ID are one and the same for clusterB. However, the cluster ID/name is only used for display in messages, except for one case in the Lighthouse (LH) E2E where it's used to obtain the health check IP.
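The lookup described in this comment can be sketched roughly as follows. This is only an illustration, not the shipyard code: EnvVar and Container are simplified stand-ins for the Kubernetes core/v1 types, and the helper name clusterIDFromContainers and the container name are hypothetical. Only the SUBMARINER_CLUSTERID variable name and the "default-cl2" value come from this report.

```go
package main

import "fmt"

// EnvVar and Container are simplified stand-ins for the Kubernetes core/v1
// types of the same names; only the fields used here are modeled.
type EnvVar struct {
	Name  string
	Value string
}

type Container struct {
	Name string
	Env  []EnvVar
}

// clusterIDFromContainers (hypothetical helper) returns the value of
// SUBMARINER_CLUSTERID from the first container that defines it, and
// false if no container defines it.
func clusterIDFromContainers(containers []Container) (string, bool) {
	for _, c := range containers {
		for _, e := range c.Env {
			if e.Name == "SUBMARINER_CLUSTERID" {
				return e.Value, true
			}
		}
	}
	return "", false
}

func main() {
	// Containers as they might appear in the gateway DaemonSet pod template.
	gateway := []Container{{
		Name: "gateway", // hypothetical container name
		Env:  []EnvVar{{Name: "SUBMARINER_CLUSTERID", Value: "default-cl2"}},
	}}

	if id, ok := clusterIDFromContainers(gateway); ok {
		fmt.Printf("cluster ID: %q\n", id)
	}
}
```

Returning a boolean alongside the value lets the caller distinguish "variable not set" from "variable set to an empty string".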

Comment 2 Noam Manos 2021-02-16 12:46:27 UTC

But "STEP: Setting cluster ID "default-cl2" for kube context name "default-cl2"" is indicating that e2e evaluates wrong data -
There's no such context name "default-cl2", but only cluster id "default-cl2":

- context:
    cluster: default-cl2
    namespace: default
    user: admin
  name: pkomarov-cluster-b

Comment 3 tpanteli 2021-02-16 12:54:02 UTC

The context name comes from what you pass in on the command line, presumably /mnt/skynet-data/pkomarov-env/ocpup/.config/cl2/auth/kubeconfig.

Comment 4 tpanteli 2021-02-16 13:18:01 UTC

Actually it is correct, i.e. it extracts the Cluster field from the current context in the config file. TestContext.ClusterIDs is intended to hold the cluster ID/name and not the context name, as it's used for display in output messages. When running e2e from the make target, TestContext.ClusterIDs is initialized to the context name passed in.
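As a rough illustration of that extraction, the sketch below resolves the Cluster field of the current context using the cluster B values from the description. The Config and Context types are minimal stand-ins that model only the kubeconfig fields involved, and clusterOfCurrentContext is a hypothetical helper name, not a function from shipyard or client-go.

```go
package main

import (
	"errors"
	"fmt"
)

// Context and Config model only the kubeconfig fields needed here.
type Context struct {
	Cluster   string
	Namespace string
	User      string
}

type Config struct {
	CurrentContext string
	Contexts       map[string]Context
}

// clusterOfCurrentContext (hypothetical helper) returns the Cluster field of
// the context named by CurrentContext.
func clusterOfCurrentContext(cfg Config) (string, error) {
	ctx, ok := cfg.Contexts[cfg.CurrentContext]
	if !ok {
		return "", fmt.Errorf("current context %q not found", cfg.CurrentContext)
	}
	if ctx.Cluster == "" {
		return "", errors.New("current context has no cluster set")
	}
	return ctx.Cluster, nil
}

func main() {
	// Values taken from the cluster B kubeconfig shown in the description.
	cfg := Config{
		CurrentContext: "pkomarov-cluster-b",
		Contexts: map[string]Context{
			"pkomarov-cluster-b": {Cluster: "default-cl2", Namespace: "default", User: "admin"},
		},
	}

	cluster, err := clusterOfCurrentContext(cfg)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("context %q uses cluster %q\n", cfg.CurrentContext, cluster)
}
```

With these inputs the context name is "pkomarov-cluster-b" while the resolved cluster is "default-cl2", which matches the value seen in the e2e output.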

Comment 5 tpanteli 2021-02-16 13:29:47 UTC

(In reply to tpanteli from comment #4)
> Actually it is correct, ie extracts the Cluster field from the current
> context in the config file. TestContext.ClusterIDs is intended to be the
> cluster ID/name and not the context name as it's used for display in output
> messages. When running e2e from the make target, TestContext.ClusterIDs is
> initialized to the context name passed in.

To clarify, the functionality is correct, i.e. TestContext.ClusterIDs is set correctly, but the message is misleading. The format params for the message are reversed, although in this case it doesn't matter:

    By(fmt.Sprintf("Setting cluster ID %q for kube context name %q", TestContext.ClusterIDs[i], envVar.Value))

Also we shouldn't print the message if both values are the same.
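A minimal sketch of both points (argument order and skipping the message when the values match) might look like the following. It assumes that the env var value is the cluster ID and that the pre-existing TestContext.ClusterIDs entry is the kube context name; clusterIDMessage is a hypothetical helper, and this is not necessarily what the submitted change does.

```go
package main

import "fmt"

// clusterIDMessage (hypothetical helper) builds the step message with the
// cluster ID and context name in the order the format string expects.
// It returns false when both values are equal, in which case the message
// carries no information and should not be printed.
func clusterIDMessage(contextName, clusterID string) (string, bool) {
	if contextName == clusterID {
		return "", false
	}
	return fmt.Sprintf("Setting cluster ID %q for kube context name %q", clusterID, contextName), true
}

func main() {
	// Differing values: the message is printed.
	if msg, ok := clusterIDMessage("pkomarov-cluster-b", "default-cl2"); ok {
		fmt.Println(msg)
	}

	// Identical values: nothing is printed.
	if msg, ok := clusterIDMessage("default-cl2", "default-cl2"); ok {
		fmt.Println(msg)
	}
}
```

Using named parameters rather than positional values at the call site makes it harder to swap the two arguments by accident.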

Comment 6 tpanteli 2021-02-16 16:24:33 UTC

(In reply to tpanteli from comment #5)
> (In reply to tpanteli from comment #4)
> > Actually it is correct, ie extracts the Cluster field from the current
> > context in the config file. TestContext.ClusterIDs is intended to be the
> > cluster ID/name and not the context name as it's used for display in output
> > messages. When running e2e from the make target, TestContext.ClusterIDs is
> > initialized to the context name passed in.
> 
> To clarify, the functionality is correct, ie TestContext.ClusterIDs is set
> correctly, but the message is misleading. The format params for the message
> are reversed although in this case it doesn't matter:
> 
>     By(fmt.Sprintf("Setting cluster ID %q for kube context name %q",
> TestContext.ClusterIDs[i], envVar.Value))
> 
> Also we shouldn't print the message if both values are the same.

Submitted https://github.com/submariner-io/shipyard/pull/456

Comment 7 Noam Manos 2021-02-21 10:09:26 UTC

The root cause of the failure was a Submariner installation failure, which led to E2E failing on the first test.
The message printed was misleading; there should probably be a Ginkgo "BeforeTest" step that verifies that submariner is uninstall
(e.g. as subctl show all would return "Submariner is not installed" - e2e should do the same).

Comment 8 Noam Manos 2021-02-21 10:11:31 UTC

* step that verifies that submariner is not installed

Comment 11 Noam Manos 2021-04-22 11:41:20 UTC

Verified:
https://qe-jenkins-csb-skynet.cloud.paas.psi.redhat.com/job/Submariner-0.8-OSP-AWS-Globalnet/180/Test-Report/

Comment 15 errata-xmlrpc 2021-05-04 19:31:22 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHEA: Submariner 0.8 - bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:1500
