Description of problem:

I ran ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml and it finished OK, but I see the metrics deployer pod failing:

[root@master01 ~]# for prj in default logging openshift-infra ; do oc get pods -n $prj ; done
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-2-vzw4h    1/1       Running   0          22m
registry-console-1-cr9cc   1/1       Running   0          10m
router-1-inwnw             1/1       Running   0          25m
NAME                          READY     STATUS      RESTARTS   AGE
logging-curator-1-u3f1i       1/1       Running     0          14m
logging-deployer-8mh4h        0/1       Completed   0          17m
logging-es-7i4bhlpr-1-e6van   1/1       Running     0          14m
logging-fluentd-132mv         1/1       Running     0          12m
logging-fluentd-m1uxw         1/1       Running     0          12m
logging-fluentd-tas3o         1/1       Running     0          12m
logging-fluentd-wujxv         1/1       Running     0          12m
logging-fluentd-x6dkn         1/1       Running     0          12m
logging-fluentd-zlqz4         1/1       Running     0          12m
logging-kibana-1-twf1a        2/2       Running     0          14m
NAME                     READY     STATUS    RESTARTS   AGE
metrics-deployer-dh3qu   0/1       Error     0          27m

[root@master01 ~]# oc describe pod metrics-deployer-dh3qu -n openshift-infra
Name:             metrics-deployer-dh3qu
Namespace:        openshift-infra
Security Policy:  anyuid
Node:             node01.example.com/192.168.122.201
Start Time:       Tue, 24 Jan 2017 07:00:08 -0500
Labels:           component=deployer
                  metrics-infra=deployer
                  provider=openshift
Status:           Failed
IP:               10.1.1.2
Controllers:      <none>
Containers:
  deployer:
    Container ID:   docker://8604f99338e20660c4193ebf5a2a2e309ad5f5e2853d8833411886f6bb0d9d14
    Image:          registry.access.redhat.com/openshift3/metrics-deployer:3.4.0
    Image ID:       docker-pullable://registry.access.redhat.com/openshift3/metrics-deployer@sha256:123825bc4576cbc4b2a699ccbc6e61666d9b5cb76a544104010298e9efbb1f7e
    Port:
    State:          Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Tue, 24 Jan 2017 07:07:10 -0500
      Finished:     Tue, 24 Jan 2017 07:07:25 -0500
    Ready:          False
    Restart Count:  0
    Volume Mounts:
      /etc/deploy from empty (rw)
      /secret from secret (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from metrics-deployer-token-3e8am (ro)
    Environment Variables:
      PROJECT:                        openshift-infra (v1:metadata.namespace)
      POD_NAME:                       metrics-deployer-dh3qu (v1:metadata.name)
      IMAGE_PREFIX:                   registry.access.redhat.com/openshift3/
      IMAGE_VERSION:                  3.4.0
      MASTER_URL:                     https://kubernetes.default.svc:443
      MODE:                           deploy
      CONTINUE_ON_ERROR:              false
      REDEPLOY:                       false
      IGNORE_PREFLIGHT:               false
      USE_PERSISTENT_STORAGE:         true
      DYNAMICALLY_PROVISION_STORAGE:  false
      HAWKULAR_METRICS_HOSTNAME:      metrics.example.com
      CASSANDRA_NODES:                1
      CASSANDRA_PV_SIZE:              10Gi
      METRIC_DURATION:                7
      USER_WRITE_ACCESS:              false
      HEAPSTER_NODE_ID:               nodename
      METRIC_RESOLUTION:              10s
      STARTUP_TIMEOUT:                500
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  empty:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  secret:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  metrics-deployer
  metrics-deployer-token-3e8am:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  metrics-deployer-token-3e8am
QoS Class:    BestEffort
Tolerations:  <none>
Events:
  FirstSeen  LastSeen  Count  From                          SubobjectPath              Type      Reason     Message
  ---------  --------  -----  ----                          -------------              --------  ------     -------
  27m        27m       1      {default-scheduler }                                     Normal    Scheduled  Successfully assigned metrics-deployer-dh3qu to node01.example.com
  22m        22m       1      {kubelet node01.example.com}  spec.containers{deployer}  Normal    Pulling    pulling image "registry.access.redhat.com/openshift3/metrics-deployer:3.4.0"
  20m        20m       1      {kubelet node01.example.com}  spec.containers{deployer}  Normal    Pulled     Successfully pulled image "registry.access.redhat.com/openshift3/metrics-deployer:3.4.0"
  20m        20m       1      {kubelet node01.example.com}  spec.containers{deployer}  Normal    Created    Created container with docker id 8604f99338e2; Security:[seccomp=unconfined]
  20m        20m       1      {kubelet node01.example.com}  spec.containers{deployer}  Normal    Started    Started container with docker id 8604f99338e2
[root@master01 ~]#

/etc/ansible/hosts contains:

openshift_hosted_metrics_deploy=true
openshift_hosted_metrics_storage_kind=nfs
openshift_hosted_metrics_storage_access_modes=['ReadWriteOnce']
openshift_hosted_metrics_storage_host=nfs01.example.com
openshift_hosted_metrics_storage_nfs_directory=/srv/nfs
openshift_hosted_metrics_storage_nfs_options='*(rw,root_squash)'
openshift_hosted_metrics_storage_volume_name=metrics
openshift_hosted_metrics_storage_volume_size=10Gi
openshift_hosted_metrics_public_url=https://metrics.example.com/hawkular/metrics

Version-Release number of selected component (if applicable):

openshift-ansible-playbooks-3.4.44-1.git.0.efa61c6.el7.noarch
[root@master01 ~]# oc logs metrics-deployer-dh3qu -n openshift-infra | tail -n 20
+ '[' -n 1 ']'
+ oc config use-context deployer-context
switched to context "deployer-context".
+ case $deployer_mode in
+ '[' false '!=' true ']'
+ validate_preflight
+ set +x
PREFLIGHT CHECK FAILED
========================
validate_master_accessible: unable to access master url https://kubernetes.default.svc:443
See the error from 'curl https://kubernetes.default.svc:443' below for details:
curl: (28) timed out before SSL handshake
Deployment has been aborted prior to starting, as these failures often indicate fatal problems.
Please evaluate any error messages above and determine how they can be addressed.
To ignore this validation failure and continue, specify IGNORE_PREFLIGHT=true.
PREFLIGHT CHECK FAILED
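
The name of the failing preflight check can be pulled out of the deployer log mechanically, which may help when several deployer pods need triage. The following is only a minimal sketch: failed_preflight_checks is a hypothetical helper, not something shipped with the deployer image, and it assumes each failed check is logged on its own line as "validate_<name>: <message>", as in the output above.

```shell
# Hypothetical helper (not part of openshift-ansible or the deployer image).
# Reads deployer log text on stdin and prints the unique names of the
# validate_* preflight checks that reported a failure, one per line.
# Assumption: failures are logged as "validate_<name>: <message>".
failed_preflight_checks() {
  sed -n 's/^[[:space:]]*\(validate_[A-Za-z0-9_]*\):.*$/\1/p' | sort -u
}

# Intended usage against the pod from this report (needs a logged-in oc client):
#   oc logs metrics-deployer-dh3qu -n openshift-infra | failed_preflight_checks
```

For the log above this would report validate_master_accessible, pointing at connectivity from the pod to the MASTER_URL rather than at the metrics configuration itself.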
3.7 may add the ability to determine template success. If so, we'll update the installer to leverage that feature.
Components should now report success. Unless a component is critical, its failure doesn't immediately halt the installation; instead, at the end of the run you'll receive output detailing which components failed, so you can investigate.