Bug 1430625 - Metrics and logging deployment with dynamic pv failed
Summary: Metrics and logging deployment with dynamic pv failed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: ewolinet
QA Contact: Gaoyun Pei
URL:
Whiteboard:
Depends On:
Blocks: 1426536
Reported: 2017-03-09 07:17 UTC by Gaoyun Pei
Modified: 2017-07-24 14:11 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-12 19:03:22 UTC
Target Upstream Version:
Embargoed:


Links
System ID: Red Hat Product Errata RHBA-2017:0903
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: OpenShift Container Platform atomic-openshift-utils bug fix and enhancement
Last Updated: 2017-04-12 22:45:42 UTC

Description Gaoyun Pei 2017-03-09 07:17:59 UTC
Description of problem:
Metrics and logging deployment was enabled with dynamic volumes, and the installation playbook was run.
After the OCP 3.5 cluster installation finished, the metrics and logging pods were not running properly.

Version-Release number of selected component (if applicable):
openshift-ansible-3.5.28-1.git.0.103513e.el7.noarch.rpm

How reproducible:
Always

Steps to Reproduce:
1. Set the following options in the Ansible inventory, then run the installation playbook:
openshift_hosted_metrics_deploy=true
openshift_hosted_metrics_deployer_prefix=x.openshift.com/openshift3/
openshift_hosted_metrics_deployer_version=3.5.0
openshift_hosted_metrics_storage_kind=dynamic

openshift_hosted_logging_deploy=true
openshift_hosted_logging_deployer_prefix=x.openshift.com/openshift3/
openshift_hosted_logging_deployer_version=3.5.0
openshift_hosted_logging_storage_kind=dynamic


ansible-playbook -i inventory_file /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml


Actual results:
[root@ip-172-18-11-42 ~]# oc get pod -n openshift-infra
NAME                         READY     STATUS             RESTARTS   AGE
hawkular-cassandra-1-39cbp   0/1       Pending            0          2h
hawkular-metrics-b92p7       0/1       CrashLoopBackOff   29         2h
heapster-0jbr2               0/1       Running            14         2h

[root@ip-172-18-11-42 ~]# oc describe pod hawkular-cassandra-1-39cbp -n openshift-infra
Name:			hawkular-cassandra-1-39cbp
Namespace:		openshift-infra
Security Policy:	restricted
Node:			/
Labels:			metrics-infra=hawkular-cassandra
			name=hawkular-cassandra-1
			type=hawkular-cassandra
Status:			Pending
IP:			
Controllers:		ReplicationController/hawkular-cassandra-1
Containers:
  hawkular-cassandra-1:
    Image:	docker.io/openshift/origin-metrics-cassandra:latest
    Ports:	9042/TCP, 9160/TCP, 7000/TCP, 7001/TCP
    Command:
      /opt/apache-cassandra/bin/cassandra-docker.sh
      --cluster_name=hawkular-metrics
      --data_volume=/cassandra_data
      --internode_encryption=all
      --require_node_auth=true
      --enable_client_encryption=true
      --require_client_auth=true
      --keystore_file=/secret/cassandra.keystore
      --keystore_password_file=/secret/cassandra.keystore.password
      --truststore_file=/secret/cassandra.truststore
      --truststore_password_file=/secret/cassandra.truststore.password
      --cassandra_pem_file=/secret/cassandra.pem
    Limits:
      memory:	2G
    Requests:
      memory:	1G
    Readiness:	exec [/opt/apache-cassandra/bin/cassandra-docker-ready.sh] delay=0s timeout=1s period=10s #success=1 #failure=3
    Volume Mounts:
      /cassandra_data from cassandra-data (rw)
      /secret from hawkular-cassandra-secrets (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from cassandra-token-9jx33 (ro)
    Environment Variables:
      CASSANDRA_MASTER:		true
      CASSANDRA_DATA_VOLUME:	/cassandra_data
      JVM_OPTS:			-Dcassandra.commitlog.ignorereplayerrors=true
      POD_NAMESPACE:		openshift-infra (v1:metadata.namespace)
      MEMORY_LIMIT:		2000000000 (limits.memory)
      CPU_LIMIT:		node allocatable (limits.cpu)
Conditions:
  Type		Status
  PodScheduled 	False 
Volumes:
  cassandra-data:
    Type:	PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:	metrics-cassandra-1
    ReadOnly:	false
  hawkular-cassandra-secrets:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	hawkular-cassandra-secrets
  cassandra-token-9jx33:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	cassandra-token-9jx33
QoS Class:	Burstable
Tolerations:	<none>
Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----			-------------	--------	------			-------
  2h		30s		456	{default-scheduler }			Warning		FailedScheduling	SchedulerPredicates failed due to PersistentVolumeClaim is not bound: "metrics-cassandra-1", which is unexpected.


[root@ip-172-18-11-42 ~]# oc get pvc -n openshift-infra
NAME                  STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
metrics-cassandra-1   Pending                                      2h


[root@ip-172-18-11-42 ~]# oc describe pvc metrics-cassandra-1 -n openshift-infra
Name:		metrics-cassandra-1
Namespace:	openshift-infra
StorageClass:	dynamic
Status:		Pending
Volume:		
Labels:		metrics-infra=hawkular-cassandra
Capacity:	
Access Modes:	
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----				-------------	--------	------			-------
  2h		10s		563	{persistentvolume-controller }			Warning		ProvisioningFailed	cannot find volume plugin for alpha provisioning


The logging PVC shows the same error:
[root@ip-172-18-11-42 ~]# oc get pvc -n logging
NAME           STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
logging-es-0   Pending                                      2h

[root@ip-172-18-11-42 ~]# oc describe pvc -n logging
Name:		logging-es-0
Namespace:	logging
StorageClass:	dynamic
Status:		Pending
Volume:		
Labels:		logging-infra=support
Capacity:	
Access Modes:	
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----				-------------	--------	------			-------
  2h		11s		597	{persistentvolume-controller }			Warning		ProvisioningFailed	cannot find volume plugin for alpha provisioning


Expected results:


Additional info:

Comment 2 ewolinet 2017-03-10 15:26:00 UTC
After looking further into the error message above, I believe this is caused by no cloud provider being configured to supply a storage provisioner, not by an error in the logging/metrics roles.
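
For triage, unbound claims like the ones in the report can be picked out of `oc get pvc` output mechanically. The following is a minimal sketch, not part of openshift-ansible: the function name `pending_claims` and the `SAMPLE` text (copied from the description above) are illustrative, and it assumes the default table layout where NAME and STATUS are the first two columns.

```python
def pending_claims(oc_get_pvc_output):
    """Return the names of PVCs whose STATUS column reads 'Pending'.

    Assumes the default `oc get pvc` table layout, in which NAME and
    STATUS are the first two whitespace-separated columns.
    """
    lines = [ln for ln in oc_get_pvc_output.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    name_idx = header.index("NAME")
    status_idx = header.index("STATUS")
    pending = []
    for row in lines[1:]:
        cols = row.split()
        # Skip rows too short to hold both columns.
        if len(cols) > max(name_idx, status_idx) and cols[status_idx] == "Pending":
            pending.append(cols[name_idx])
    return pending


# Sample text taken from the bug description.
SAMPLE = """\
NAME                  STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
metrics-cassandra-1   Pending                                      2h
"""

if __name__ == "__main__":
    print(pending_claims(SAMPLE))
```

A Pending claim only says that nothing bound it; the reason (here, "cannot find volume plugin for alpha provisioning") still has to be read from the events in `oc describe pvc`.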

Comment 3 Scott Dodson 2017-03-13 20:01:26 UTC
Gaoyun, can you confirm that the inventory has a cloud provider defined, and provide the entire inventory so that we can reproduce this more easily?
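
The check being asked for can be sketched as a small pre-flight script. This is a hedged illustration only: it assumes the cloud provider is declared through an inventory variable named `openshift_cloudprovider_kind` (the name is an assumption here, since the reported inventory is incomplete), it handles only simple `key=value` lines, and the function name is hypothetical.

```python
def dynamic_storage_without_cloud_provider(
        inventory_text, provider_key="openshift_cloudprovider_kind"):
    """Return *_storage_kind variables set to 'dynamic' when no cloud
    provider variable is present in the inventory text.

    Simplification: only plain `key=value` lines are parsed; section
    headers, comments, and lines without '=' are skipped.
    `provider_key` is an assumed variable name, not confirmed by the report.
    """
    settings = {}
    for raw in inventory_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "[")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    if settings.get(provider_key):
        return []  # a provider is declared, nothing to flag
    return sorted(k for k, v in settings.items()
                  if k.endswith("_storage_kind") and v == "dynamic")


# Options from the "Steps to Reproduce" section of this report.
REPORTED = """\
openshift_hosted_metrics_deploy=true
openshift_hosted_metrics_storage_kind=dynamic
openshift_hosted_logging_deploy=true
openshift_hosted_logging_storage_kind=dynamic
"""

if __name__ == "__main__":
    for name in dynamic_storage_without_cloud_provider(REPORTED):
        print("dynamic storage requested without a cloud provider:", name)
```

Run against the options listed in the report, the sketch flags both storage variables, which matches the suspicion in comment 2; it cannot confirm what the full original inventory contained.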

Comment 4 Gaoyun Pei 2017-03-14 07:35:29 UTC
Hi, Eric/Scott

The previous environment has been deleted, so I can't check whether I enabled a cloud provider for it. In my latest attempt with openshift-ansible-3.5.32-1.git.0.42cf266.el7.noarch.rpm I could not reproduce this issue; logging and metrics with dynamic PVs were deployed successfully.

[root@ip-172-18-13-57 ~]# oc get pod -n logging
NAME                          READY     STATUS    RESTARTS   AGE
logging-curator-1-lh386       1/1       Running   0          10m
logging-es-37n9vqzk-1-x7hr8   1/1       Running   0          9m
logging-fluentd-3bmx3         1/1       Running   0          8m
logging-fluentd-8lff6         1/1       Running   0          8m
logging-kibana-1-rb6pk        2/2       Running   0          9m
[root@ip-172-18-13-57 ~]# oc get pvc -n logging
NAME           STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
logging-es-0   Bound     pvc-83c6dee9-0885-11e7-b678-0e0f089fb332   10Gi       RWO           9m

[root@ip-172-18-13-57 ~]# oc get pod -n openshift-infra
NAME                         READY     STATUS    RESTARTS   AGE
hawkular-cassandra-1-n3fjl   1/1       Running   0          14m
hawkular-metrics-fwpdf       1/1       Running   0          14m
heapster-0vb73               1/1       Running   0          13m
[root@ip-172-18-13-57 ~]# oc get pvc -n openshift-infra
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
metrics-cassandra-1   Bound     pvc-d62929ab-0884-11e7-b678-0e0f089fb332   10Gi       RWO           14m

Thanks for looking into this bug. Marking it as verified on openshift-ansible-3.5.32-1.git.0.42cf266.el7.noarch.rpm.

Comment 6 errata-xmlrpc 2017-04-12 19:03:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0903

