Description of problem:
Submariner join failed: Deployment does not have minimum availability

Version-Release number of selected component (if applicable):
Submariner 0.8.0

How reproducible:
Always on my d/s CI: https://qe-jenkins-csb-skynet.cloud.paas.psi.redhat.com/job/debug_job/901/Test-Report/

Steps to Reproduce:
subctl join ./broker-info.subm --cable-driver libreswan --ikeport 501 --nattport 4501 \
  --enable-pod-debugging --ipsec-debug --health-check \
  --image-override submariner-operator=registry.gitlab.com/smattar/submariner-rhel8-operator:v0.8.0 \
  --image-override submariner-gateway=registry.gitlab.com/smattar/submariner-gateway-rhel8:v0.8.0 \
  --image-override submariner-route-agent=registry.gitlab.com/smattar/submariner-route-agent-rhel8:v0.8.0 \
  --image-override submariner-globalnet=registry.gitlab.com/smattar/submariner-globalnet-rhel8:v0.8.0 \
  --image-override submariner-networkplugin-syncer=registry.gitlab.com/smattar/submariner-networkplugin-syncer-rhel8:v0.8.0 \
  --image-override lighthouse-agent=registry.gitlab.com/smattar/lighthouse-agent-rhel8:v0.8.0 \
  --image-override lighthouse-coredns=registry.gitlab.com/smattar/lighthouse-coredns-rhel8:v0.8.0

Actual results:
https://api.nmanos-cluster-a.devcluster.openshift.com:6443
• Discovering network details ...
* There are 1 labeled nodes in the cluster:
  - ip-10-166-25-149.us-west-1.compute.internal
✓ Discovering network details
  Discovered network details:
    Network plugin: OpenShiftSDN
    Service CIDRs: [100.96.0.0/16]
    Cluster CIDRs: [10.252.0.0/14]
• Discovering multi cluster details ...
• Validating Globalnet configurations ...
✓ Validating Globalnet configurations
• Assigning Globalnet IPs ...
✓ Assigning Globalnet IPs
✓ Allocated GlobalCIDR: 169.254.0.0/19
✓ Discovering multi cluster details
• Deploying the Submariner operator ...
✗ Deploying the Submariner operator
✓ Created operator CRDs
✓ Created operator namespace: submariner-operator
✓ Created operator service account and role
✓ Created lighthouse service account and role
✓ Created Lighthouse service accounts and roles
Error deploying the operator: timed out waiting for the condition

Pod logs show:
        image: registry.gitlab.com/smattar/submariner-rhel8-operator:v0.8.0
        imagePullPolicy: Always
        name: submariner-operator
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: submariner-operator
      serviceAccountName: submariner-operator
      terminationGracePeriodSeconds: 30
status:
  conditions:
  - lastTransitionTime: "2020-12-28T09:17:41Z"
    lastUpdateTime: "2020-12-28T09:17:41Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2020-12-28T09:27:42Z"
    lastUpdateTime: "2020-12-28T09:27:42Z"
    message: ReplicaSet "submariner-operator-677668ff95" has timed out progressing.
    reason: ProgressDeadlineExceeded
    status: "False"
    type: Progressing
  observedGeneration: 1
  replicas: 1
  unavailableReplicas: 1
  updatedReplicas:

Expected results:
Join should complete
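When triaging runs like this one, the failing conditions in the Deployment status above can be picked out programmatically. The following is only a minimal Python sketch and is not part of subctl or the operator: `failing_conditions` is a hypothetical helper, and the sample dict simply mirrors the `status.conditions` entries quoted in this report, shaped the way they would look after loading the Deployment as JSON.

```python
def failing_conditions(deployment):
    """Return (type, reason, message) for each condition whose status is not "True"."""
    conditions = deployment.get("status", {}).get("conditions", [])
    return [
        (c.get("type"), c.get("reason"), c.get("message"))
        for c in conditions
        if c.get("status") != "True"
    ]


# Sample data copied from the Deployment status quoted in this report.
deployment = {
    "status": {
        "conditions": [
            {
                "type": "Available",
                "status": "False",
                "reason": "MinimumReplicasUnavailable",
                "message": "Deployment does not have minimum availability.",
            },
            {
                "type": "Progressing",
                "status": "False",
                "reason": "ProgressDeadlineExceeded",
                "message": 'ReplicaSet "submariner-operator-677668ff95" '
                           "has timed out progressing.",
            },
        ],
        "replicas": 1,
        "unavailableReplicas": 1,
    }
}

for ctype, reason, message in failing_conditions(deployment):
    print(f"{ctype}: {reason}: {message}")
```

Both conditions only say that the pod never became ready; the underlying cause still has to be read from the pod's own events or container states.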
G2Bsync 753781471 comment skeeey Mon, 04 Jan 2021 06:20:07 UTC G2Bsync
It seems the operator pod cannot be created. Could you increase the CPU/memory for the Submariner operator?
This seems to be related to the images availability on the remote registry.

In another attempt:

subctl join ./broker-info.subm --cable-driver libreswan --ikeport 501 --nattport 4501 --health-check \
  --image-override submariner-operator=registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator:v0.8.0 \
  --image-override submariner=registry.redhat.io/rhacm2-tech-preview/submariner-gateway-rhel8:v0.8.0

16:44:21 * ./broker-info.subm says broker is at: https://api.nmanos-cluster-a.devcluster.openshift.com:6443
16:44:21 * There are 1 labeled nodes in the cluster:
16:44:21 • Discovering network details ...
16:44:21 - default-cl1-l5mpb-worker-8d6gh
16:44:21 ✓ Discovering network details
16:44:21 Discovered network details:
16:44:21 Network plugin: OpenShiftSDN
16:44:21 Service CIDRs: [100.96.0.0/16]
16:44:21 Cluster CIDRs: [10.252.0.0/14]
16:44:22 • Discovering multi cluster details ...
16:44:22 • Validating Globalnet configurations ...
16:44:22 ✓ Validating Globalnet configurations
16:44:22 • Assigning Globalnet IPs ...
16:44:22 ✓ Assigning Globalnet IPs
16:44:22 ✓ Allocated GlobalCIDR: 169.254.32.0/19
16:44:22 ✓ Discovering multi cluster details
16:44:22 • Deploying the Submariner operator ...
16:54:22 ✗ Deploying the Submariner operator
16:54:22 ✓ Created operator CRDs
16:54:22 ✓ Created operator service account and role
16:54:22 ✓ Created lighthouse service account and role
16:54:22 ✓ Created Lighthouse service accounts and roles
16:54:22 Error deploying the operator: timed out waiting for the condition
16:54:22
16:54:22 subctl version: v0.8.0

Looking at globalnet pod I see:

16:54:39   brokerK8sRemoteNamespace: submariner-k8s-broker
16:54:39   Cable Driver: libreswan
16:54:39   Ce IP Sec Debug: false
16:54:39   Ce IP Sec IKE Port: 501
16:54:39   Ce IP Sec NATT Port: 4501
16:54:39   Ce IP Sec PSK: up9ryrrxCxn3ngjyOrJvyKLO+mw6r+wTxbV/Nj/2njBOZmG08/yIbs8VbylC6Pjn
16:54:39   Cluster CIDR:
16:54:39   Cluster ID: nmanos-cluster-a
16:54:39   Color Codes: blue
16:54:39   Connection Health Check:
16:54:39     Enabled: true
16:54:39     Interval Seconds: 1
16:54:39     Max Packet Loss Count: 5
16:54:39   Debug: false
16:54:39   Global CIDR: 169.254.0.0/19
16:54:39   Image Overrides:
16:54:39     Submariner: registry.redhat.io/rhacm2-tech-preview/submariner-gateway-rhel8:v0.8.0
16:54:39     Submariner - Operator: registry.redhat.io/rhacm2-tech-preview/submariner-rhel8-operator:v0.8.0
16:54:39   Namespace: submariner-operator
16:54:39   Nat Enabled: true
16:54:39   Repository: quay.io/submariner
16:54:39   Service CIDR:
16:54:39   Service Discovery Enabled: true
16:54:39   Version: 0.8.0
16:54:39 Status:
16:54:39   Cluster CIDR: 10.252.0.0/14
16:54:39   Cluster ID: nmanos-cluster-a
16:54:39   Color Codes: blue
16:54:39   Engine Daemon Set Status:
16:54:39     Last Resource Version: 183930
16:54:39     Mismatched Container Images: false
16:54:39     Non Ready Container States:
16:54:39     Status:
16:54:39       Current Number Scheduled: 1
16:54:39       Desired Number Scheduled: 1
16:54:39       Number Available: 1
16:54:39       Number Misscheduled: 0
16:54:39       Number Ready: 1
16:54:39       Observed Generation: 1
16:54:39       Updated Number Scheduled: 1
16:54:39   Gateways:
16:54:39     Connections:
16:54:39     Ha Status: active
16:54:39     Local Endpoint:
16:54:39       Backend: libreswan
16:54:39       cable_name: submariner-cable-nmanos-cluster-a-10-166-80-167
16:54:39       cluster_id: nmanos-cluster-a
16:54:39       Health Check IP: 10.254.2.1
16:54:39       Hostname: ip-10-166-80-167
16:54:39       nat_enabled: true
16:54:39       private_ip: 10.166.80.167
16:54:39       public_ip: 13.57.57.11
16:54:39       Subnets:
16:54:39         169.254.0.0/19
16:54:39     Status Failure:
16:54:39     Version: 0.8.0
16:54:39   Global CIDR: 169.254.0.0/19
16:54:39   Globalnet Daemon Set Status:
16:54:39     Last Resource Version: 183753
16:54:39     Mismatched Container Images: false
16:54:39     Non Ready Container States:
16:54:39       Waiting:
16:54:39         Message: rpc error: code = Unknown desc = Error reading manifest 0.8.0 in quay.io/submariner/submariner-globalnet-rhel8: unauthorized: access to the requested resource is not authorized
16:54:39         Reason: ErrImagePull
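The `Waiting` state with reason `ErrImagePull` in the output above is the kind of signal that can be checked automatically before blaming resources or timeouts. Below is a minimal Python sketch under stated assumptions: `image_pull_failures` is a hypothetical helper, the input follows the standard pod `status.containerStatuses[].state.waiting` layout rather than the operator's summarized status shown here, and the container name in the sample is assumed for illustration. Only the reason and message are copied from this report.

```python
# Waiting reasons that Kubernetes reports for failed image pulls.
IMAGE_PULL_REASONS = {"ErrImagePull", "ImagePullBackOff"}


def image_pull_failures(pod):
    """Return (container, reason, message) for containers stuck on an image pull."""
    failures = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        waiting = cs.get("state", {}).get("waiting")
        if waiting and waiting.get("reason") in IMAGE_PULL_REASONS:
            failures.append(
                (cs.get("name"), waiting["reason"], waiting.get("message", ""))
            )
    return failures


# Sample pod status; reason and message are taken from this report,
# the container name is an assumption made for the example.
pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "submariner-globalnet",  # assumed name
                "state": {
                    "waiting": {
                        "reason": "ErrImagePull",
                        "message": "rpc error: code = Unknown desc = Error reading "
                                   "manifest 0.8.0 in quay.io/submariner/"
                                   "submariner-globalnet-rhel8: unauthorized: access "
                                   "to the requested resource is not authorized",
                    }
                },
            }
        ]
    }
}

for name, reason, message in image_pull_failures(pod):
    print(f"{name}: {reason}: {message}")
```

A non-empty result points at the registry or image name rather than at cluster capacity, which matches the `unauthorized` message seen in this run.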
To work around this, I used the subctl devel version (newer than 0.8.0).