1972703 – Subctl fails to join cluster, since it cannot auto-generate a valid cluster id

Summary: Subctl fails to join cluster, since it cannot auto-generate a valid cluster id

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Advanced Cluster Management for Kubernetes
Classification:	Red Hat
Component:	Submariner
Sub Component:
Version:	rhacm-2.3
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	rhacm-2.3
Assignee:	Nir Yechiel
QA Contact:	Noam Manos
Docs Contact:	Christopher Dawson
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2021-06-16 12:56 UTC by Noam Manos
Modified:	2021-08-06 00:53 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-08-06 00:52:39 UTC
Target Upstream Version:
Embargoed:
Flags:	ming: rhacm-2.3+

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	open-cluster-management backlog issues 13415	0	None	None	None	2021-06-16 15:39:11 UTC
Red Hat Product Errata	RHSA-2021:3016	0	None	None	None	2021-08-06 00:53:10 UTC

Description Noam Manos 2021-06-16 12:56:32 UTC

**What happened**:

$ export KUBECONFIG=/mnt/skynet-data/pkomarov-env/pkomarov-cluster-a/auth/kubeconfig


$ subctl join   ./broker-info.subm    --ikeport 502 --nattport 4502 --health-check --pod-debug --ipsec-debug --image-override submariner-operator=registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator:v0.9
* ./broker-info.subm says broker is at: https://api.pkomarov-cluster-a.devcluster.openshift.com:6443
? What is your cluster ID? 
subctl version: v0.9

panic: EOF

goroutine 1 [running]:
github.com/submariner-io/submariner-operator/pkg/subctl/cmd/utils.PanicOnError(0x2114400, 0xc000174060)
	/remote-source/app/pkg/subctl/cmd/utils/utils.go:34 +0x1bf
github.com/submariner-io/submariner-operator/pkg/subctl/cmd.panicOnError(...)
	/remote-source/app/pkg/subctl/cmd/root.go:91
github.com/submariner-io/submariner-operator/pkg/subctl/cmd.joinSubmarinerCluster(0x2156320, 0xc000a60fa0, 0x0, 0x0, 0xc000a60f00)
	/remote-source/app/pkg/subctl/cmd/join.go:227 +0xf65
github.com/submariner-io/submariner-operator/pkg/subctl/cmd.glob..func5(0x2ebb2c0, 0xc000778640, 0x1, 0xa)
	/remote-source/app/pkg/subctl/cmd/join.go:170 +0x2b3
github.com/spf13/cobra.(*Command).execute(0x2ebb2c0, 0xc000778500, 0xa, 0xa, 0x2ebb2c0, 0xc000778500)
	/remote-source/deps/gomod/pkg/mod/github.com/spf13/cobra.1/command.go:854 +0x2c2
github.com/spf13/cobra.(*Command).ExecuteC(0x2ebb020, 0x44beea, 0x2dc08c0, 0xc000000180)
	/remote-source/deps/gomod/pkg/mod/github.com/spf13/cobra.1/command.go:958 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
	/remote-source/deps/gomod/pkg/mod/github.com/spf13/cobra.1/command.go:895
github.com/submariner-io/submariner-operator/pkg/subctl/cmd.Execute(...)
	/remote-source/app/pkg/subctl/cmd/root.go:58
main.main()
	/remote-source/app/pkg/subctl/main.go:27 +0x32


**What you expected to happen**:

As stated in the docs:
https://submariner.io/operations/deployment/subctl/#join

--clusterid <string> 	

Cluster ID used to identify the tunnels. Every cluster needs to have a unique cluster ID. If not provided, one will be generated by default based on the cluster name in the kubeconfig file


**How to reproduce it (as minimally and precisely as possible)**:
https://qe-jenkins-csb-skynet.apps.ocp4.prod.psi.redhat.com/job/Submariner-0.9-AWSx2-OVN/Test-Report/

**Anything else we need to know?**:

It used to work up until 18/05 (upstream commit id 93e142b326d19cbdfe66d31c5a3f8f2933f0d6a3).

**Environment**:

### OCP Cluster pkomarov-cluster-a ###
Client Version: 4.7.16
Server Version: 4.7.16
Kubernetes Version: v1.20.0+2817867


### OCP Cluster pkomarov-cluster-c ###
Client Version: 4.7.16
Server Version: 4.7.16
Kubernetes Version: v1.20.0+2817867


### Submariner components ###

subctl version: v0.9


### submariner-rhel8-operator Image ###
id=registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator@sha256:aaa192ede09a9b837bc6eee8fc0193170b920f88bf1fdd26484715fbcb49e494
name=rhacm2-tech-preview/submariner-rhel8-operator
release=87
url=https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2-tech-preview/submariner-rhel8-operator/images/v0.9-87
version=v0.9

### ImageStream Tags (in namespace submariner-operator) ###

### lighthouse-agent-rhel8:v0.9 Image-Stream tag ###
name=rhacm2-tech-preview/lighthouse-agent-rhel8
release=33
url=https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2-tech-preview/lighthouse-agent-rhel8/images/v0.9-33
version=v0.9

### lighthouse-coredns-rhel8:v0.9 Image-Stream tag ###
name=rhacm2-tech-preview/lighthouse-coredns-rhel8
release=33
url=https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2-tech-preview/lighthouse-coredns-rhel8/images/v0.9-33
version=v0.9

### submariner-gateway-rhel8:v0.9 Image-Stream tag ###
name=rhacm2-tech-preview/submariner-gateway-rhel8
release=61
url=https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2-tech-preview/submariner-gateway-rhel8/images/v0.9-61
version=v0.9

### submariner-globalnet-rhel8:v0.9 Image-Stream tag ###
name=rhacm2-tech-preview/submariner-globalnet-rhel8
release=54
url=https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2-tech-preview/submariner-globalnet-rhel8/images/v0.9-54
version=v0.9

### submariner-networkplugin-syncer-rhel8:v0.9 Image-Stream tag ###
name=rhacm2-tech-preview/submariner-networkplugin-syncer-rhel8
release=54
url=https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2-tech-preview/submariner-networkplugin-syncer-rhel8/images/v0.9-54
version=v0.9

### submariner-operator-bundle:v0.9 Image-Stream tag ###
name=rhacm2-tech-preview/submariner-operator-bundle
release=106
url=https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2-tech-preview/submariner-operator-bundle/images/v0.9-106
version=v0.9

### submariner-rhel8-operator:v0.9 Image-Stream tag ###
name=rhacm2-tech-preview/submariner-rhel8-operator
release=87
url=https://access.redhat.com/containers/#/registry.access.redhat.com/rhacm2-tech-preview/submariner-rhel8-operator/images/v0.9-87
version=v0.9

Comment 1 Stephen Kitt 2021-06-16 13:26:30 UTC

Could you attach the kubeconfig used?

Comment 2 Noam Manos 2021-06-16 13:32:47 UTC

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: ...
    server: https://api.pkomarov-cluster-a.devcluster.openshift.com:6443
  name: api-pkomarov-cluster-a-devcluster-openshift-com:6443
- cluster:
    certificate-authority-data: ...
    server: https://api.pkomarov-cluster-a.devcluster.openshift.com:6443
  name: pkomarov-cluster-a
contexts:
- context:
    cluster: pkomarov-cluster-a
    namespace: default
    user: admin
  name: pkomarov-cluster-a
current-context: pkomarov-cluster-a
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate-data: ...
    client-key-data: ...
- name: master/api-pkomarov-cluster-a-devcluster-openshift-com:6443
  user:
    token: sha256~...

Comment 3 Noam Manos 2021-06-16 14:04:09 UTC

Note that the kubeconfig is as in the previous comment,
but `oc config view` shows another context, "pkomarov-cluster-a_old":

$ oc config view

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://api.pkomarov-cluster-a.devcluster.openshift.com:6443
  name: api-pkomarov-cluster-a-devcluster-openshift-com:6443
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://api.pkomarov-cluster-a.devcluster.openshift.com:6443
  name: pkomarov-cluster-a
contexts:
- context:
    cluster: api-pkomarov-cluster-a-devcluster-openshift-com:6443
    user: master/api-pkomarov-cluster-a-devcluster-openshift-com:6443
  name: default/api-pkomarov-cluster-a-devcluster-openshift-com:6443/master
- context:
    cluster: pkomarov-cluster-a
    namespace: default
    user: admin
  name: pkomarov-cluster-a
- context:
    cluster: pkomarov-cluster-a
    namespace: default
    user: admin
  name: pkomarov-cluster-a_old
current-context: default/api-pkomarov-cluster-a-devcluster-openshift-com:6443/master
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED
- name: master/api-pkomarov-cluster-a-devcluster-openshift-com:6443
  user:
    token: REDACTED

Comment 4 Stephen Kitt 2021-06-16 15:27:13 UTC

What’s happening here is that the detected cluster name is “api-pkomarov-cluster-a-devcluster-openshift-com:6443”, which isn’t a valid cluster id (it can’t contain colons). So subctl asks the user for a valid cluster id.
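As a rough illustration of why the detected name is rejected, the sketch below checks candidate IDs against the rule quoted elsewhere in this thread (lowercase alphanumerics, '.' or '-', with alphanumeric first and last characters). The regular expression and the `isValidClusterID` name are assumptions made for this example, not subctl's actual validator, which may apply further limits such as a maximum length.

```go
package main

import (
	"fmt"
	"regexp"
)

// clusterIDPattern approximates the rule quoted in this thread: only lowercase
// alphanumerics, '.' or '-', starting and ending with an alphanumeric.
// It is an illustrative assumption, not the pattern subctl itself uses.
var clusterIDPattern = regexp.MustCompile(`^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$`)

// isValidClusterID (hypothetical helper) reports whether id satisfies the pattern.
func isValidClusterID(id string) bool {
	return clusterIDPattern.MatchString(id)
}

func main() {
	candidates := []string{
		"api-pkomarov-cluster-a-devcluster-openshift-com:6443", // contains ':'
		"pkomarov-cluster-a",
	}
	for _, id := range candidates {
		fmt.Printf("%s valid=%v\n", id, isValidClusterID(id))
	}
}
```

Under this rule the first name fails only because of the colon before the port number; the second name passes.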

Comment 5 Noam Manos 2021-06-17 13:44:55 UTC

This is a valid bug, since the cluster context name is a valid name:
"api-pkomarov-cluster-a-devcluster-openshift-com:6443"

It was generated by `oc login` with a new user "master", defined in an "HTPasswd" identity provider.

Please fix the subctl join command to handle such context names.

Comment 6 Stephen Kitt 2021-06-17 14:18:06 UTC

(In reply to Noam Manos from comment #5)
> This is a valid bug, since the cluster context name is a valid name:
> "api-pkomarov-cluster-a-devcluster-openshift-com:6443"

It’s a valid context name, but it’s not a valid cluster id.

Cluster IDs must be valid DNS-1123 names, with only lowercase alphanumerics, '.' or '-' (and the first and last characters must be alphanumerics).

We use the context name by default because that’s guaranteed to be locally unique; however if we start converting context names to avoid problematic characters, we will lose that guarantee.
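To make the uniqueness concern concrete, here is a minimal sketch of a hypothetical `sanitize` conversion (subctl does not do this; the function exists only for illustration). It replaces disallowed characters with '-', and two distinct names end up with the same ID.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// disallowed matches every character outside lowercase alphanumerics, '.' and '-'.
var disallowed = regexp.MustCompile(`[^a-z0-9.-]`)

// sanitize is a hypothetical conversion, not part of subctl: it lowercases the
// name, replaces disallowed characters with '-', and trims '.' and '-' from
// both ends so the result starts and ends with an alphanumeric.
func sanitize(name string) string {
	s := disallowed.ReplaceAllString(strings.ToLower(name), "-")
	return strings.Trim(s, ".-")
}

func main() {
	a := "api-pkomarov-cluster-a-devcluster-openshift-com:6443"
	b := "api-pkomarov-cluster-a-devcluster-openshift-com-6443" // a different, already valid name

	fmt.Println(sanitize(a))
	fmt.Println(sanitize(b))
	fmt.Println(sanitize(a) == sanitize(b)) // prints: true
}
```

Both inputs are distinct names, yet they map to the same sanitized ID, so a conversion of this kind cannot preserve the local uniqueness that the unmodified name provides.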

Comment 7 Stephen Kitt 2021-06-17 14:29:56 UTC

https://github.com/submariner-io/submariner-operator/pull/1424 will at least show a more explicit error message:

$ bin/subctl join --clusterid=test:123 output/broker-info.subm
* output/broker-info.subm says broker is at: https://172.18.0.5:6443
Error: cluster IDs must be valid DNS-1123 names, with only lowercase alphanumerics,
'.' or '-' (and the first and last characters must be alphanumerics).
test:123 doesn't meet these requirements
? What is your cluster ID?

Comment 8 Noam Manos 2021-06-17 14:34:57 UTC

(In reply to Stephen Kitt from comment #6)
> It’s a valid context name, but it’s not a valid cluster id.

I did not specify --clusterid <invalid cluster id> in the join command,
as I expected it to be auto-generated (according to https://submariner.io/operations/deployment/subctl/#join).

Comment 9 Stephen Kitt 2021-06-17 16:19:42 UTC

(In reply to Noam Manos from comment #8)
> (In reply to Stephen Kitt from comment #6)
> > It’s a valid context name, but it’s not a valid cluster id.
> 
> I did not specify --clusterid <invalid cluster id> in the join command, 
> as I've expected it to be auto-generated (according to
> https://submariner.io/operations/deployment/subctl/#join).

I know you didn’t. When a cluster id is not specified, subctl attempts to auto-generate one based on the context name. If it can’t, it asks the user to provide one.

Comment 10 Stephen Kitt 2021-06-29 12:20:54 UTC

The docs have been updated, see https://submariner.io/operations/deployment/subctl/#join-flags-general

Is that sufficient?

Comment 11 Noam Manos 2021-07-07 13:35:49 UTC

Docs look clear now:

--clusterid <string> 	Cluster ID used to identify the tunnels. Every cluster needs to have a unique cluster ID. If not provided, one will be generated by default based on the cluster name in the kubeconfig file; if the cluster name is not a valid cluster ID, the user will be prompted for one

Closing issue.

Comment 14 errata-xmlrpc 2021-08-06 00:52:39 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Advanced Cluster Management for Kubernetes version 2.3), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3016
