- **Description of problem:** The operator pod fails to start when the VPC tag `kubernetes.io/cluster/<infraID>=.*` is not set on clusters installed into an existing VPC (IPI). According to the documentation[0]:

~~~
- The VPC must not use the kubernetes.io/cluster/.*: owned tag. The installation program modifies your subnets to add the kubernetes.io/cluster/.*: shared tag, so your subnets must have at least one free tag slot available for it. See Tag Restrictions in the AWS documentation to confirm that the installation program can add a tag to each subnet that you specify.
~~~

So the VPC was created without this tag. When the ELB Operator was installed[1], the following error is raised in the logs:

~~~
$ oc logs pod/aws-load-balancer-operator-controller-manager-7d6c65fcc8-rh64h -n aws-load-balancer-operator
I0708 15:14:22.268164       1 request.go:601] Waited for 1.041569483s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/cloud.network.openshift.io/v1?timeout=32s
1.6572932639217587e+09  INFO    controller-runtime.metrics      Metrics server is starting to listen    {"addr": "127.0.0.1:8080"}
1.6572932640037425e+09  ERROR   setup   failed to get VPC ID    {"error": "no VPC with tag \"kubernetes.io/cluster/lzdemo-7k427\" found"}
main.main
        /workspace/main.go:133
runtime.main
        /usr/local/go/src/runtime/proc.go:255
~~~

After I set the required tag and recycled the pod, the installation finished successfully (see the workaround sketch below).

- **OpenShift release version:** 4.11.0-rc.1

- **Cluster Platform:** AWS (IPI)

- **How reproducible:** Always

- **Steps to Reproduce (in detail):**
  1. Create the VPC without the tag `kubernetes.io/cluster/<infraID>=shared`. Create the network dependencies (subnets, route tables, NAT gateways, etc.)
  2. Create the install-config.yaml, adding the subnets previously created
  3. Create the cluster
  4. Set up the operator following [Local Development](https://github.com/openshift/aws-load-balancer-operator#local-development)
  5. Check the operator logs

- **Actual results:** The error shown above is raised in the operator logs.

- **Expected results:** Clear guidance on how to create a cluster in an existing VPC without impacting the installation of the ELB Operator:
  * The operator starts without requiring the VPC cluster tag, in line with our documentation
  * Or the approach described in our documentation is revised

- **Impact of the problem:** There is work in progress to provide guidance on how to install OpenShift clusters in existing VPCs with subnets in Local Zones[2]. The ELB Operator is a key component for Local Zones use cases, as only ALB is supported there, so clear guidance will help users succeed in this scenario.
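For reference, a minimal sketch of the workaround applied above, assuming the AWS CLI is configured for the cluster's account and region, and using `<vpc-id>` as a placeholder for the VPC the cluster was installed into:

~~~
# Look up the cluster infrastructure ID (the <infraID> part of the tag key).
INFRA_ID=$(oc get infrastructure cluster -o jsonpath='{.status.infrastructureName}')

# Add the tag the operator looks up on startup. <vpc-id> is a placeholder.
aws ec2 create-tags \
  --resources <vpc-id> \
  --tags Key=kubernetes.io/cluster/${INFRA_ID},Value=shared

# Recycle the operator pod so it re-reads the VPC tags.
oc -n aws-load-balancer-operator \
  rollout restart deployment/aws-load-balancer-operator-controller-manager
~~~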
- **Additional info:**

[0] Installing a cluster on AWS into an existing VPC / Requirements for using your VPC: https://docs.openshift.com/container-platform/4.10/installing/installing_aws/installing-aws-vpc.html#installation-custom-aws-vpc-requirements_installing-aws-vpc

[1] Steps used to install from source: [Local Development](https://github.com/openshift/aws-load-balancer-operator#local-development)

~~~
# Building the Operand
git clone https://github.com/openshift/aws-load-balancer-controller.git
IMG=quay.io/$USER/aws-load-balancer-controller
podman build -t $IMG -f Dockerfile.openshift
podman push $IMG

# Update the Operand image (RELATED_IMAGE_CONTROLLER) on `config/manager/manager.yaml`

# Building the Operator
export IMG=quay.io/$USER/aws-load-balancer-operator:latest
make image-build image-push

# Running the Operator
oc new-project aws-load-balancer-operator
oc apply -f hack/operator-credentials-request.yaml
export IMG=quay.io/$USER/aws-load-balancer-operator:latest
make deploy
oc get all -n aws-load-balancer-operator
~~~

[2] Current work to use AWS Local Zones in OCP: https://issues.redhat.com/browse/RFE-2782
This is the expected behavior. The ALB operator requires that the VPC be tagged. Arjun will work on making this clearer in the documentation.
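For anyone hitting this, a quick pre-flight check (a sketch, assuming the AWS CLI is configured for the cluster's region and `INFRA_ID` holds the cluster's infrastructure ID as in the workaround above); empty output means the tag the operator requires is missing from the VPC:

~~~
# List VPCs carrying the cluster tag; no output means the tag is absent and the
# operator will fail with the "no VPC with tag" error shown in the description.
aws ec2 describe-vpcs \
  --filters "Name=tag-key,Values=kubernetes.io/cluster/${INFRA_ID}" \
  --query 'Vpcs[].VpcId' \
  --output text
~~~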
This issue is stale and has been closed because it has been open 90 days or more with no noted activity/comments in the last 60 days. If this issue is crucial and still needs resolution, please open a new jira issue and the engineering team will triage and prioritize accordingly.