Description of problem:
During installation of a cluster with Azure as the cloud provider, the node labels are not available in time, so the installation fails because it cannot find any node matching the node selector infra=true. The issue resolves itself if you run the cluster deployment a second time, which leads me to believe the nodes may not be available yet.

Version-Release number of selected component (if applicable):
commit 67df6c8380f1ee2f27d6b234cff882d77efd723e

How reproducible:
Perform an installation with Azure as the cloud provider

Steps to Reproduce:
1. Deploy Azure resources
2. Wait until the PLAY [Configure Cluster Monitoring Operator] steps
3. Failure

Actual results:
TASK [openshift_cluster_monitoring_operator : include_tasks] *******************
included: /var/lib/jenkins/workspace/OCP on Azure - Deployment/azure/openshift-ansible/roles/openshift_cluster_monitoring_operator/tasks/install.yaml for ocp-master-1

TASK [openshift_control_plane : Retrieve list of schedulable nodes matching selector] ***
ok: [ocp-master-1]

TASK [openshift_control_plane : Ensure that Cluster Monitoring Operator has nodes to run on] ***
fatal: [ocp-master-1]: FAILED! => {
    "assertion": false,
    "changed": false,
    "evaluated_to": false,
    "msg": "No schedulable nodes found matching node selector for Cluster Monitoring Operator - 'node-role.kubernetes.io/infra=true'"
}

Expected results:
Successful installation

Additional info:
[OSEv3:children]
masters
etcd
nodes

[OSEv3:vars]
ansible_ssh_user=cloud-user
ansible_become=true

#311
openshift_cloudprovider_kind=azure
openshift_cloudprovider_azure_client_id=redacted
openshift_cloudprovider_azure_client_secret=Redacted
openshift_cloudprovider_azure_tenant_id=redacted
openshift_cloudprovider_azure_subscription_id=redacted
openshift_cloudprovider_azure_resource_group=redacted
openshift_cloudprovider_azure_location=eastus
openshift_cloudprovider_azure_availability_set_name=ocp-app-instances
openshift_cloudprovider_azure_security_group_name=node-nsg
openshift_cloudprovider_azure_cloud=AzurePublicCloud
openshift_cloudprovider_azure_vnet_name=openshiftvnet
openshift_release=v3.11
openshift_docker_additional_registries=registry.reg-aws.openshift.com:443
oreg_url=registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}
oreg_auth_user=redacted
oreg_auth_password=redacted
openshift_disable_check=memory_availability,disk_availability,docker_image_availability
#310end

openshift_master_api_port=443
openshift_master_console_port=443
openshift_hosted_router_replicas=3
openshift_hosted_registry_replicas=1
openshift_master_cluster_method=native
openshift_master_cluster_hostname=openshift-master.redacted
openshift_master_cluster_public_hostname=openshift-master.redacted
openshift_master_default_subdomain=apps.redacted
deployment_type=openshift-enterprise
openshift_master_identity_providers=[{'name': 'google', 'challenge': 'false', 'login': 'true', 'kind': 'GoogleIdentityProvider', 'mapping_method': 'claim', 'clientID': 'redacted.apps.googleusercontent.com', 'clientSecret': 'redacted', 'hostedDomain': 'redacted'}]
networkPluginName=redhat/ovs-networkpolicy
openshift_examples_modify_imagestreams=true
openshift_storage_glusterfs_image=registry.access.redhat.com/rhgs3/rhgs-server-rhel7
openshift_storage_glusterfs_heketi_image=registry.access.redhat.com/rhgs3/rhgs-volmanager-rhel7

# Do not uninstall service catalog until post installation. Needs storage class object
openshift_enable_service_catalog=true
openshift_metrics_install_metrics=true
openshift_metrics_storage_kind=dynamic
openshift_metrics_storage_volume_size=25Gi
openshift_logging_install_logging=false
#openshift_logging_es_pvc_dynamic=true
#openshift_logging_es_pvc_size=30Gi

# Setup azure blob registry storage
openshift_hosted_registry_storage_kind=object
openshift_hosted_registry_storage_azure_blob_accountkey=redacted
openshift_hosted_registry_storage_provider=azure_blob
openshift_hosted_registry_storage_azure_blob_accountname=redacted
openshift_hosted_registry_storage_azure_blob_container=registry
openshift_hosted_registry_storage_azure_blob_realm=core.windows.net

[masters]
ocp-master-1
ocp-master-2
ocp-master-3

[etcd]
ocp-master-1
ocp-master-2
ocp-master-3

[nodes]
ocp-master-1 openshift_node_group_name="node-config-master" openshift_hostname=ocp-master-1
ocp-master-2 openshift_node_group_name="node-config-master" openshift_hostname=ocp-master-2
ocp-master-3 openshift_node_group_name="node-config-master" openshift_hostname=ocp-master-3
ocp-infra-1 openshift_node_group_name="node-config-infra" openshift_hostname=ocp-infra-1
ocp-infra-2 openshift_node_group_name="node-config-infra" openshift_hostname=ocp-infra-2
ocp-infra-3 openshift_node_group_name="node-config-infra" openshift_hostname=ocp-infra-3
ocp-app-1 openshift_node_group_name="node-config-compute" openshift_hostname=ocp-app-1
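
The timing problem described in this report (the assertion runs before any node carries the node-role.kubernetes.io/infra=true label) could in principle be worked around by polling for labelled nodes before the Cluster Monitoring Operator play. The following is only a minimal sketch under stated assumptions: it assumes the oc client is on PATH and already logged in with permission to list nodes, and the names wait_for_nodes and list_nodes, the retry count, and the delay are hypothetical choices for illustration, not part of openshift-ansible.

```python
import subprocess
import time
from typing import Callable, List

# Selector taken from the failed assertion message in this report.
SELECTOR = "node-role.kubernetes.io/infra=true"


def list_nodes(selector: str) -> List[str]:
    """Return the names of nodes matching the label selector via the oc CLI.

    Assumes `oc` is installed and logged in; returns an empty list on error.
    """
    result = subprocess.run(
        ["oc", "get", "nodes", "-l", selector, "-o", "name"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def wait_for_nodes(
    selector: str,
    lister: Callable[[str], List[str]] = list_nodes,
    retries: int = 30,   # hypothetical value
    delay: float = 10.0,  # hypothetical value, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until at least one node matches the selector, or give up."""
    for attempt in range(retries):
        nodes = lister(selector)
        if nodes:
            return nodes
        # Do not sleep after the final attempt.
        if attempt < retries - 1:
            sleep(delay)
    raise TimeoutError(
        "no nodes matched selector %r after %d attempts" % (selector, retries)
    )
```

Calling wait_for_nodes(SELECTOR) before re-running the deployment would block until a labelled infra node is reported or the retries are exhausted. The lister and sleep parameters exist only so the polling logic can be exercised without a live cluster.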
Marking as a dupe of 1628208 unless you feel there's something unique to Azure here; at first glance it doesn't seem that way.

*** This bug has been marked as a duplicate of bug 1628208 ***