Description of problem:
Enable metrics and logging deployment with dynamic volume, start installation playbook. After ocp-3.5 cluster installation finished, metrics and logging pod were not running well.

Version-Release number of selected component (if applicable):
openshift-ansible-3.5.28-1.git.0.103513e.el7.noarch.rpm

How reproducible:
Always

Steps to Reproduce:
1. Set the following options in ansible inventory, start installation playbook

openshift_hosted_metrics_deploy=true
openshift_hosted_metrics_deployer_prefix=x.openshift.com/openshift3/
openshift_hosted_metrics_deployer_version=3.5.0
openshift_hosted_metrics_storage_kind=dynamic
openshift_hosted_logging_deploy=true
openshift_hosted_logging_deployer_prefix=x.openshift.com/openshift3/
openshift_hosted_logging_deployer_version=3.5.0
openshift_hosted_logging_storage_kind=dynamic

ansible-playbook -i inventory_file /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml

Actual results:
[root@ip-172-18-11-42 ~]# oc get pod -n openshift-infra
NAME                         READY     STATUS             RESTARTS   AGE
hawkular-cassandra-1-39cbp   0/1       Pending            0          2h
hawkular-metrics-b92p7       0/1       CrashLoopBackOff   29         2h
heapster-0jbr2               0/1       Running            14         2h

[root@ip-172-18-11-42 ~]# oc describe pod hawkular-cassandra-1-39cbp -n openshift-infra
Name:            hawkular-cassandra-1-39cbp
Namespace:       openshift-infra
Security Policy: restricted
Node:            /
Labels:          metrics-infra=hawkular-cassandra
                 name=hawkular-cassandra-1
                 type=hawkular-cassandra
Status:          Pending
IP:
Controllers:     ReplicationController/hawkular-cassandra-1
Containers:
  hawkular-cassandra-1:
    Image:  docker.io/openshift/origin-metrics-cassandra:latest
    Ports:  9042/TCP, 9160/TCP, 7000/TCP, 7001/TCP
    Command:
      /opt/apache-cassandra/bin/cassandra-docker.sh
      --cluster_name=hawkular-metrics
      --data_volume=/cassandra_data
      --internode_encryption=all
      --require_node_auth=true
      --enable_client_encryption=true
      --require_client_auth=true
      --keystore_file=/secret/cassandra.keystore
      --keystore_password_file=/secret/cassandra.keystore.password
      --truststore_file=/secret/cassandra.truststore
      --truststore_password_file=/secret/cassandra.truststore.password
      --cassandra_pem_file=/secret/cassandra.pem
    Limits:
      memory: 2G
    Requests:
      memory: 1G
    Readiness: exec [/opt/apache-cassandra/bin/cassandra-docker-ready.sh] delay=0s timeout=1s period=10s #success=1 #failure=3
    Volume Mounts:
      /cassandra_data from cassandra-data (rw)
      /secret from hawkular-cassandra-secrets (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from cassandra-token-9jx33 (ro)
    Environment Variables:
      CASSANDRA_MASTER:      true
      CASSANDRA_DATA_VOLUME: /cassandra_data
      JVM_OPTS:              -Dcassandra.commitlog.ignorereplayerrors=true
      POD_NAMESPACE:         openshift-infra (v1:metadata.namespace)
      MEMORY_LIMIT:          2000000000 (limits.memory)
      CPU_LIMIT:             node allocatable (limits.cpu)
Conditions:
  Type          Status
  PodScheduled  False
Volumes:
  cassandra-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  metrics-cassandra-1
    ReadOnly:   false
  hawkular-cassandra-secrets:
    Type:       Secret (a volume populated by a Secret)
    SecretName: hawkular-cassandra-secrets
  cassandra-token-9jx33:
    Type:       Secret (a volume populated by a Secret)
    SecretName: cassandra-token-9jx33
QoS Class:   Burstable
Tolerations: <none>
Events:
  FirstSeen  LastSeen  Count  From                  SubObjectPath  Type     Reason            Message
  ---------  --------  -----  ----                  -------------  -------- ------            -------
  2h         30s       456    {default-scheduler }                 Warning  FailedScheduling  SchedulerPredicates failed due to PersistentVolumeClaim is not bound: "metrics-cassandra-1", which is unexpected.

[root@ip-172-18-11-42 ~]# oc get pvc -n openshift-infra
NAME                  STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
metrics-cassandra-1   Pending                                      2h

[root@ip-172-18-11-42 ~]# oc describe pvc metrics-cassandra-1 -n openshift-infra
Name:          metrics-cassandra-1
Namespace:     openshift-infra
StorageClass:  dynamic
Status:        Pending
Volume:
Labels:        metrics-infra=hawkular-cassandra
Capacity:
Access Modes:
Events:
  FirstSeen  LastSeen  Count  From                             SubObjectPath  Type     Reason              Message
  ---------  --------  -----  ----                             -------------  -------- ------              -------
  2h         10s       563    {persistentvolume-controller }                  Warning  ProvisioningFailed  cannot find volume plugin for alpha provisioning

The same error for logging pvc

[root@ip-172-18-11-42 ~]# oc get pvc -n logging
NAME           STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
logging-es-0   Pending                                      2h

[root@ip-172-18-11-42 ~]# oc describe pvc -n logging
Name:          logging-es-0
Namespace:     logging
StorageClass:  dynamic
Status:        Pending
Volume:
Labels:        logging-infra=support
Capacity:
Access Modes:
Events:
  FirstSeen  LastSeen  Count  From                             SubObjectPath  Type     Reason              Message
  ---------  --------  -----  ----                             -------------  -------- ------              -------
  2h         11s       597    {persistentvolume-controller }                  Warning  ProvisioningFailed  cannot find volume plugin for alpha provisioning

Expected results:

Additional info:
After looking further into the error message above, I believe it is caused by the lack of a configured cloud provider to supply a storage provisioner, not by an error in the logging/metrics roles.
Gaoyun, can you confirm that the inventory has a cloud provider defined, and provide the entire inventory so that we can reproduce this more easily?
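For reference, a cloud provider is normally declared in the inventory alongside the metrics and logging options. The fragment below is only a sketch and assumes AWS: the hostnames in the report look like EC2 instances, but the report does not state which provider was used. The key values are placeholders, not values from this environment.

```ini
[OSEv3:vars]
# Assumption: AWS is the cloud provider; the report does not confirm this.
openshift_cloudprovider_kind=aws
# Placeholder credentials; substitute real values for the target account.
openshift_cloudprovider_aws_access_key=<aws_access_key_id>
openshift_cloudprovider_aws_secret_key=<aws_secret_access_key>

# Dynamic storage for metrics and logging, as set in the original report.
openshift_hosted_metrics_storage_kind=dynamic
openshift_hosted_logging_storage_kind=dynamic
```

If no such variables are present, the persistentvolume-controller has no provisioner to call, which would be consistent with the ProvisioningFailed events on both PVCs.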
Hi Eric/Scott,

The previous environment has been deleted, so I cannot check whether I enabled a cloud provider for it. In my latest attempt with openshift-ansible-3.5.32-1.git.0.42cf266.el7.noarch.rpm, I could not reproduce this issue; logging and metrics with dynamic PVs deployed successfully.

[root@ip-172-18-13-57 ~]# oc get pod -n logging
NAME                          READY     STATUS    RESTARTS   AGE
logging-curator-1-lh386       1/1       Running   0          10m
logging-es-37n9vqzk-1-x7hr8   1/1       Running   0          9m
logging-fluentd-3bmx3         1/1       Running   0          8m
logging-fluentd-8lff6         1/1       Running   0          8m
logging-kibana-1-rb6pk        2/2       Running   0          9m

[root@ip-172-18-13-57 ~]# oc get pvc -n logging
NAME           STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
logging-es-0   Bound     pvc-83c6dee9-0885-11e7-b678-0e0f089fb332   10Gi       RWO           9m

[root@ip-172-18-13-57 ~]# oc get pod -n openshift-infra
NAME                         READY     STATUS    RESTARTS   AGE
hawkular-cassandra-1-n3fjl   1/1       Running   0          14m
hawkular-metrics-fwpdf       1/1       Running   0          14m
heapster-0vb73               1/1       Running   0          13m

[root@ip-172-18-13-57 ~]# oc get pvc -n openshift-infra
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
metrics-cassandra-1   Bound     pvc-d62929ab-0884-11e7-b678-0e0f089fb332   10Gi       RWO           14m

Thanks for looking into this bug. Marking it as verified on openshift-ansible-3.5.32-1.git.0.42cf266.el7.noarch.rpm.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0903