Bug 1929345 - subctl join --image-override submariner-operator : All pods failed, except operator pod
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Advanced Cluster Management for Kubernetes
Classification: Red Hat
Component: Submariner
Version: rhacm-2.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Steve Mattar
QA Contact: Noam Manos
Docs Contact: Christopher Dawson
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-02-16 17:28 UTC by Noam Manos
Modified: 2021-05-31 11:35 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-04 19:31:22 UTC
Target Upstream Version:
Embargoed:
smattar: rhacm-2.2.z+



Links
System ID                                            Private   Priority   Status   Summary                                     Last Updated
Github open-cluster-management backlog issues 9515   0         None       None     None                                        2021-02-22 14:30:07 UTC
Github submariner-io submariner-operator pull 1094   0         None       closed   fix: parse operator image url enhancement   2021-02-18 00:01:47 UTC
Red Hat Product Errata RHEA-2021:1500                0         None       None     None                                        2021-05-04 19:31:26 UTC

Description Noam Manos 2021-02-16 17:28:22 UTC
Description of problem:
Joining two clusters with subctl using:
--image-override submariner-operator=registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator:v0.8.1

The command completed with no warning (exit code 0), but only the operator image was installed. All the other images failed to pull, with messages similar to this one:

Failed to pull image "registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator/submariner-gateway-rhel8:v0.8.1": 

rpc error: code = Unknown desc = (Mirrors also failed: [image-registry.openshift-image-registry.svc:5000/submariner-operator/submariner-rhel8-operator/submariner-gateway-rhel8:v0.8.1: unable to retrieve auth token: 

invalid username/password: unauthorized: repository name "submariner-operator/submariner-rhel8-operator/submariner-gateway-rhel8" invalid: 

it must be of the format <project>/<name>]): registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator/submariner-gateway-rhel8:v0.8.1: 

Error reading manifest v0.8.1 in registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator/submariner-gateway-rhel8: unknown: Not Found
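
The mirror part of the error says the repository name must have the form <project>/<name>. A minimal sketch of that check follows, assuming the rule is simply "exactly two non-empty path components"; the function name is_project_name is hypothetical and not part of oc or subctl.

```shell
# Hypothetical check: succeeds only when the repository name has exactly
# two non-empty path components, i.e. <project>/<name>.
is_project_name() {
  case "$1" in
    */*/*)     return 1 ;;   # three or more components
    /*|*/|"")  return 1 ;;   # empty project or empty name
    */*)       return 0 ;;   # exactly <project>/<name>
    *)         return 1 ;;   # no separator at all
  esac
}

# Repository name from the error message above: three components, rejected.
is_project_name "submariner-operator/submariner-rhel8-operator/submariner-gateway-rhel8" \
  || echo "invalid: not <project>/<name>"

# Name without the extra operator component: accepted.
is_project_name "submariner-operator/submariner-gateway-rhel8" && echo "valid"
```

The rejected name carries the extra "submariner-rhel8-operator" component, the same extra component that appears in the registry.redhat.io reference that could not be found.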
  


Version-Release number of selected component (if applicable):
Submariner 0.8.1

How reproducible:
Always

Steps to Reproduce:
https://qe-jenkins-csb-skynet.cloud.paas.psi.redhat.com/job/Maintenance/job/debug_job/1148/Test-Report/

Actual results:
(full output in test report ^)

$ subctl join   ./broker-info.subm --cable-driver libreswan   --ikeport 501 --nattport 4501 --health-check --enable-pod-debugging --ipsec-debug --image-override submariner-operator=registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator:v0.8.1
https://api.nmanos-cluster-a.devcluster.openshift.com:6443
 • Discovering network details  ...
* There are 1 labeled nodes in the cluster:
  - ip-10-166-124-130.us-west-1.compute.internal
 ✓ Discovering network details
    Discovered network details:
        Network plugin:  OpenShiftSDN
        Service CIDRs:   [100.96.0.0/16]
        Cluster CIDRs:   [10.252.0.0/14]
        Global CIDR:     169.254.0.0/19
 • Discovering multi cluster details  ...
 • Validating Globalnet configurations  ...
 ✓ Validating Globalnet configurations
 • Assigning Globalnet IPs  ...
 ✓ Assigning Globalnet IPs
 ⚠ Cluster already has GlobalCIDR allocated: 169.254.0.0/19
 ✓ Discovering multi cluster details
 • Deploying the Submariner operator  ...
 ✓ Deploying the Submariner operator
 ✓ Created Lighthouse service accounts and roles
 • Creating SA for cluster  ...
 ✓ Creating SA for cluster
 • Deploying Submariner  ...
 ✓ Deploying Submariner
 ✓ Submariner is up and running


$ subctl join   ./broker-info.subm --cable-driver libreswan   --ikeport 501 --nattport 4501 --health-check --enable-pod-debugging --ipsec-debug --image-override submariner-operator=registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator:v0.8.1
https://api.nmanos-cluster-a.devcluster.openshift.com:6443
 • Discovering network details  ...
* There are 1 labeled nodes in the cluster:
  - default-cl1-hk655-worker-4wbkh
 ✓ Discovering network details
    Discovered network details:
        Network plugin:  OpenShiftSDN
        Service CIDRs:   [100.96.0.0/16]
        Cluster CIDRs:   [10.252.0.0/14]
 • Discovering multi cluster details  ...
 • Validating Globalnet configurations  ...
 ✓ Validating Globalnet configurations
 • Assigning Globalnet IPs  ...
 ✓ Assigning Globalnet IPs
 ✓ Allocated GlobalCIDR: 169.254.32.0/19
 ✓ Discovering multi cluster details
 • Deploying the Submariner operator  ...
 ✓ Deploying the Submariner operator
 ✓ Created operator CRDs
 ✓ Created operator service account and role
 ✓ Created lighthouse service account and role
 ✓ Created Lighthouse service accounts and roles
 ✓ Deployed the operator successfully
 • Creating SA for cluster  ...
 ✓ Creating SA for cluster
 • Deploying Submariner  ...
 ✓ Deploying Submariner
 ✓ Submariner is up and running

$ subctl show all

Showing information for cluster "nmanos-cluster-a":
Showing Network details
    Discovered network details:
        Network plugin:  OpenShiftSDN
        Service CIDRs:   [100.96.0.0/16]
        Cluster CIDRs:   [10.252.0.0/14]
        Global CIDR:     169.254.0.0/19


Showing Endpoint details
No resources found.

Showing Connection details
No resources found.

Showing Gateway details
No resources found.

Showing version details
COMPONENT                       REPOSITORY                                            VERSION         
submariner                      registry.redhat.io/rhacm2-tech-preview/submariner-rhe v0.8.1          
submariner-operator             registry.redhat.io/rhacm2-tech-preview/submariner-rhe v0.8.1          
service-discovery               registry.redhat.io/rhacm2-tech-preview/submariner-rhe v0.8.1          

Showing information for cluster "default-cl1":
Showing Network details
    Discovered network details:
        Network plugin:  OpenShiftSDN
        Service CIDRs:   [100.96.0.0/16]
        Cluster CIDRs:   [10.252.0.0/14]
        Global CIDR:     169.254.32.0/19


Showing Endpoint details
No resources found.

Showing Connection details
No resources found.

Showing Gateway details
No resources found.

Showing version details
COMPONENT                       REPOSITORY                                            VERSION         
submariner                      registry.redhat.io/rhacm2-tech-preview/submariner-rhe v0.8.1          
submariner-operator             registry.redhat.io/rhacm2-tech-preview/submariner-rhe v0.8.1          
service-discovery               registry.redhat.io/rhacm2-tech-preview/submariner-rhe v0.8.1       

$ oc get all -n submariner-operator --show-labels
NAME                                                READY   STATUS             RESTARTS   AGE     LABELS
pod/submariner-gateway-9kd7c                        0/1     ImagePullBackOff   0          5m20s   app=submariner-engine,controller-revision-hash=65c4ccd6f,pod-template-generation=1
pod/submariner-globalnet-kwzpb                      0/1     ImagePullBackOff   0          5m16s   app=submariner-globalnet,component=globalnet,controller-revision-hash=755c699d9b,pod-template-generation=1
pod/submariner-lighthouse-agent-64b578c887-mmmqm    0/1     ImagePullBackOff   0          5m15s   app=submariner-lighthouse-agent,component=submariner-lighthouse,pod-template-hash=64b578c887
pod/submariner-lighthouse-coredns-54894df67-nmpkn   0/1     ImagePullBackOff   0          5m15s   app=submariner-lighthouse-coredns,component=submariner-lighthouse,pod-template-hash=54894df67
pod/submariner-lighthouse-coredns-54894df67-qgf7c   0/1     ImagePullBackOff   0          5m15s   app=submariner-lighthouse-coredns,component=submariner-lighthouse,pod-template-hash=54894df67
pod/submariner-operator-647c7495fc-426jb            1/1     Running            0          30m     name=submariner-operator,pod-template-hash=647c7495fc
pod/submariner-routeagent-9k44z                     0/1     ImagePullBackOff   0          5m16s   app=submariner-routeagent,component=routeagent,controller-revision-hash=86869ddf8d,pod-template-generation=1
pod/submariner-routeagent-9vkp2                     0/1     ImagePullBackOff   0          5m16s   app=submariner-routeagent,component=routeagent,controller-revision-hash=86869ddf8d,pod-template-generation=1
pod/submariner-routeagent-fshpq                     0/1     ImagePullBackOff   0          5m17s   app=submariner-routeagent,component=routeagent,controller-revision-hash=86869ddf8d,pod-template-generation=1
pod/submariner-routeagent-hz55r                     0/1     ImagePullBackOff   0          5m16s   app=submariner-routeagent,component=routeagent,controller-revision-hash=86869ddf8d,pod-template-generation=1
pod/submariner-routeagent-jgp4h                     0/1     ImagePullBackOff   0          5m16s   app=submariner-routeagent,component=routeagent,controller-revision-hash=86869ddf8d,pod-template-generation=1
pod/submariner-routeagent-rq9bm                     0/1     ImagePullBackOff   0          5m16s   app=submariner-routeagent,component=routeagent,controller-revision-hash=86869ddf8d,pod-template-generation=1
pod/submariner-routeagent-xccf7                     0/1     ImagePullBackOff   0          5m16s   app=submariner-routeagent,component=routeagent,controller-revision-hash=86869ddf8d,pod-template-generation=1

NAME                                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE     LABELS
service/submariner-engine-metrics       ClusterIP   100.96.86.152    <none>        8080/TCP            5m20s   app=submariner-engine
service/submariner-lighthouse-coredns   ClusterIP   100.96.254.251   <none>        53/UDP              5m16s   app=submariner-lighthouse-coredns,component=submariner-lighthouse
service/submariner-operator-metrics     ClusterIP   100.96.193.162   <none>        8383/TCP,8686/TCP   89m     name=submariner-operator

NAME                                   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                AGE     LABELS
daemonset.apps/submariner-gateway      1         1         0       1            0           submariner.io/gateway=true   5m21s   app=submariner-engine,component=engine
daemonset.apps/submariner-globalnet    1         1         0       1            0           submariner.io/gateway=true   5m18s   app=submariner-globalnet,component=globalnet
daemonset.apps/submariner-routeagent   7         7         0       7            0           <none>                       5m18s   app=submariner-routeagent,component=routeagent

NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE     LABELS
deployment.apps/submariner-lighthouse-agent     0/1     1            0           5m17s   app=submariner-lighthouse-agent,component=submariner-lighthouse
deployment.apps/submariner-lighthouse-coredns   0/2     2            0           5m17s   app=submariner-lighthouse-coredns,component=submariner-lighthouse
deployment.apps/submariner-operator             1/1     1            1           89m     <none>

NAME                                                      DESIRED   CURRENT   READY   AGE     LABELS
replicaset.apps/submariner-lighthouse-agent-64b578c887    1         1         0       5m17s   app=submariner-lighthouse-agent,component=submariner-lighthouse,pod-template-hash=64b578c887
replicaset.apps/submariner-lighthouse-coredns-54894df67   2         2         0       5m17s   app=submariner-lighthouse-coredns,component=submariner-lighthouse,pod-template-hash=54894df67
replicaset.apps/submariner-operator-647c7495fc            1         1         1       89m     name=submariner-operator,pod-template-hash=647c7495fc

NAME                                                                   IMAGE REPOSITORY                                                                                             TAGS     UPDATED       LABELS
imagestream.image.openshift.io/lighthouse-agent-rhel8                  image-registry.openshift-image-registry.svc:5000/submariner-operator/lighthouse-agent-rhel8                  v0.8.1   2 hours ago   <none>
imagestream.image.openshift.io/lighthouse-coredns-rhel8                image-registry.openshift-image-registry.svc:5000/submariner-operator/lighthouse-coredns-rhel8                v0.8.1   2 hours ago   <none>
imagestream.image.openshift.io/submariner-gateway-rhel8                image-registry.openshift-image-registry.svc:5000/submariner-operator/submariner-gateway-rhel8                v0.8.1   2 hours ago   <none>
imagestream.image.openshift.io/submariner-globalnet-rhel8              image-registry.openshift-image-registry.svc:5000/submariner-operator/submariner-globalnet-rhel8              v0.8.1   2 hours ago   <none>
imagestream.image.openshift.io/submariner-networkplugin-syncer-rhel8   image-registry.openshift-image-registry.svc:5000/submariner-operator/submariner-networkplugin-syncer-rhel8   v0.8.1   2 hours ago   <none>
imagestream.image.openshift.io/submariner-rhel8-operator               image-registry.openshift-image-registry.svc:5000/submariner-operator/submariner-rhel8-operator               v0.8.1   2 hours ago   <none>
imagestream.image.openshift.io/submariner-route-agent-rhel8            image-registry.openshift-image-registry.svc:5000/submariner-operator/submariner-route-agent-rhel8            v0.8.1   2 hours ago   <none>
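
Since subctl join returned exit code 0 and reported "Submariner is up and running", the pod status has to be checked separately. A small sketch of such a check follows; the helper name not_running is hypothetical, and it only filters the plain-text table that oc get pods prints.

```shell
# Hypothetical filter: reads `oc get pods` table output on stdin and prints
# "NAME STATUS" for every pod whose STATUS is neither Running nor Completed.
not_running() {
  awk 'NF >= 3 && $1 != "NAME" && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Sample rows taken from the pod list above.
printf '%s\n' \
  'NAME READY STATUS RESTARTS AGE' \
  'submariner-gateway-9kd7c 0/1 ImagePullBackOff 0 5m20s' \
  'submariner-operator-647c7495fc-426jb 1/1 Running 0 30m' | not_running

# Against a live cluster (not run here):
#   oc get pods -n submariner-operator | not_running
#   oc get pods -n submariner-operator \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}'
```

The jsonpath form prints the image reference each pod is trying to pull, which makes a malformed reference visible without describing every pod.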

Comment 1 Noam Manos 2021-02-17 08:21:46 UTC
Comparing the image pull process of two images, for example the operator image and the globalnet image:

### Operator image succeeded:

$ oc  import-image -n submariner-operator submariner-rhel8-operator:v0.8.1 --from=brew.registry.redhat.io/rh-osbs/rhacm2-tech-preview-submariner-rhel8-operator:v0.8.1 --confirm

imagestream.image.openshift.io/submariner-rhel8-operator imported

Name:			submariner-rhel8-operator
Namespace:		submariner-operator
Created:		Less than a second ago
Labels:			<none>
Image Repository:	image-registry.openshift-image-registry.svc:5000/submariner-operator/submariner-rhel8-operator

v0.8.1
  tagged from brew.registry.redhat.io/rh-osbs/rhacm2-tech-preview-submariner-rhel8-operator:v0.8.1

Image Name:	submariner-rhel8-operator:v0.8.1
Docker Image:	brew.registry.redhat.io/rh-osbs/rhacm2-tech-preview-submariner-rhel8-operator@sha256:614a364389e70e1bd0122b7355d2ee04bad3f1ca17c7fef9f6b23d34f99a9ac9
Created:	Less than a second ago
Image Size:	53.81MB in 3 layers
Image Created:	3 days ago
Entrypoint:	/usr/local/bin/submariner-operator
Working Dir:	<none>
User:		1001010000

### Pod submariner-operator-647c7495fc-8h7wp in Namespace submariner-operator ###

Name:               submariner-operator-647c7495fc-8h7wp
Namespace:          submariner-operator

Status:             Running
IP:                 10.253.2.12
Controlled By:      ReplicaSet/submariner-operator-647c7495fc
Containers:
  submariner-operator:
    Container ID:  cri-o://1f38a2f3872f614388f7ff209324bab07db4138d90b7b93908d786f8b63d3147
    Image:         registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator:v0.8.1
    Image ID:      registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator@sha256:614a364389e70e1bd0122b7355d2ee04bad3f1ca17c7fef9f6b23d34f99a9ac9



### But Globalnet image failed:

# Note the difference in "Entrypoint" script, and in the Pod container "Image", and empty "Image ID":

oc  import-image -n submariner-operator submariner-gateway-rhel8:v0.8.1 --from=brew.registry.redhat.io/rh-osbs/rhacm2-tech-preview-submariner-gateway-rhel8:v0.8.1 --confirm
imagestream.image.openshift.io/submariner-gateway-rhel8 imported

Name:			submariner-gateway-rhel8
Namespace:		submariner-operator
Created:		Less than a second ago
Labels:			<none>
Image Repository:	image-registry.openshift-image-registry.svc:5000/submariner-operator/submariner-gateway-rhel8

v0.8.1
  tagged from brew.registry.redhat.io/rh-osbs/rhacm2-tech-preview-submariner-gateway-rhel8:v0.8.1

Image Name:	submariner-gateway-rhel8:v0.8.1
Docker Image:	brew.registry.redhat.io/rh-osbs/rhacm2-tech-preview-submariner-gateway-rhel8@sha256:c8dc48bf44179e4ded7c50188926c349b2e302f5290f038aca959129c1fe834e
Created:	Less than a second ago
Image Size:	105MB in 3 layers
Image Created:	5 days ago
Entrypoint:	/usr/local/bin/submariner.sh
Working Dir:	<none>
User:		<none>

### Pod submariner-gateway-sw542 in Namespace submariner-operator ###

Name:               submariner-gateway-sw542
Namespace:          submariner-operator

Status:             Pending
IP:                 10.166.103.128
Controlled By:      DaemonSet/submariner-gateway
Containers:
  submariner:
    Container ID:  
    Image:         registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator/submariner-gateway-rhel8:v0.8.1
    Image ID:

Comment 2 Noam Manos 2021-02-17 08:23:24 UTC
^ Comparing operator image pull and GATEWAY image pull (not Globalnet)

Comment 3 Steve Mattar 2021-02-18 00:00:07 UTC
The problem was related to parsing of the operator image URL during the join command.
This issue happens only when working with subctl.
Fixed upstream: https://github.com/submariner-io/submariner-operator/pull/1094
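
To illustrate the kind of parsing problem involved, here is a sketch inferred only from the failing image reference in the description. It is not the code from the upstream pull request: both function names are hypothetical, the operator reference is assumed to contain at least one "/", and the fallback to "latest" is an assumption.

```shell
# Illustration only: hypothetical helpers, not the upstream operator code.

# Naive derivation: strips only the tag, so the operator image name stays
# in the path and the component name is appended below it.
naive_component_image() (
  operator_ref="$1"; component="$2"
  base="${operator_ref%:*}"      # everything before the last ':'
  tag="${operator_ref##*:}"      # everything after the last ':'
  printf '%s/%s:%s\n' "$base" "$component" "$tag"
)

# Expected derivation: drop the operator image name too, keep repository and tag.
component_image() (
  operator_ref="$1"; component="$2"
  last="${operator_ref##*/}"     # final path component, e.g. name:tag
  repo="${operator_ref%/*}"      # registry and namespace
  case "$last" in
    *:*) tag="${last#*:}" ;;
    *)   tag="latest" ;;         # assumption: fall back to "latest" when no tag is given
  esac
  printf '%s/%s:%s\n' "$repo" "$component" "$tag"
)

op=registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator:v0.8.1
naive_component_image "$op" submariner-gateway-rhel8
component_image "$op" submariner-gateway-rhel8
```

The first call reproduces the reference from the pull error (…/submariner-rhel8-operator/submariner-gateway-rhel8:v0.8.1). The second yields a reference in the same repository as the operator image, which is presumably what the override was meant to produce.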

Comment 9 errata-xmlrpc 2021-05-04 19:31:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHEA: Submariner 0.8 - bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:1500

