Bug 1399523 - The value of openshift_hosted_logging_elasticsearch_ops_pvc_prefix is the same as openshift_hosted_logging_elasticsearch_pvc_prefix in the installer
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Jeff Cantrill
QA Contact: Gaoyun Pei
 
Reported: 2016-11-29 09:10 UTC by Gaoyun Pei
Modified: 2017-07-24 14:11 UTC
CC List: 5 users

Doc Type: No Doc Update
Last Closed: 2017-04-12 18:48:40 UTC




Links:
Red Hat Product Errata RHBA-2017:0903 (SHIPPED_LIVE): OpenShift Container Platform atomic-openshift-utils bug fix and enhancement, last updated 2017-04-12 22:45:42 UTC

Description Gaoyun Pei 2016-11-29 09:10:25 UTC
Description of problem:
When openshift_hosted_logging_enable_ops_cluster=true is additionally set in a logging deployment with dynamic PV, the installer fails at TASK [openshift_hosted_logging : Wait for component pods to be running]

The logging-es deployer pod runs into Error status due to "Unable to mount volumes for pod "logging-es-puwie4np-4-x3dwz_logging""

Checking the related code shows that openshift_hosted_logging_ops_pvc_prefix and openshift_hosted_logging_pvc_prefix are both hardcoded to the same value, so only one PVC, named "logging-es1", is created under the logging project during installation.
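A minimal sketch of the collision (variable names as cited above, value inferred from the observed claim name "logging-es1"; not copied from the actual role source):

# Both prefixes are hardcoded to the same value, so the deployer templates
# generate the same claim name for the ops and non-ops ES clusters, and a
# single PVC ("logging-es1") ends up serving both deployments.
openshift_hosted_logging_pvc_prefix: logging-es
openshift_hosted_logging_ops_pvc_prefix: logging-es   # same value as above

Because the claim is RWO, whichever ES pod attaches the volume first runs, and the other times out on the mount, matching the Actual results below.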


Version-Release number of selected component (if applicable):
openshift-ansible-3.4.29-1.git.0.cdb211b.el7.noarch.rpm

How reproducible:
Always

Steps to Reproduce:
1. Set the following options in the Ansible inventory file and run the installation playbook:
openshift_hosted_logging_deploy=true
openshift_hosted_logging_deployer_prefix=x.openshift.com/openshift3/
openshift_hosted_logging_deployer_version=3.4.0
openshift_hosted_logging_storage_kind=dynamic
openshift_hosted_logging_enable_ops_cluster=true


Actual results:
[root@gpei-test-master-etcd-1 ~]# oc get pod
NAME                              READY     STATUS             RESTARTS   AGE
logging-curator-1-ax9ju           0/1       CrashLoopBackOff   34         4h
logging-curator-ops-1-gkqyz       1/1       Running            0          4h
logging-deployer-va6vy            0/1       Completed          0          4h
logging-es-ops-p4ocsboo-1-gdixt   1/1       Running            0          4h
logging-es-puwie4np-1-deploy      0/1       Error              0          4h
logging-kibana-1-lnytb            2/2       Running            0          4h
logging-kibana-ops-1-r8uau        2/2       Running            0          4h

Start a new deployment of logging-es-puwie4np:
[root@gpei-test-master-etcd-1 ~]# oc describe pod logging-es-puwie4np-4-x3dwz
Name:			logging-es-puwie4np-4-x3dwz
Namespace:		logging
Security Policy:	restricted
Node:			gpei-test-node-registry-router-1/10.240.0.7
Start Time:		Mon, 28 Nov 2016 02:54:55 -0500
Labels:			component=es
			deployment=logging-es-puwie4np-4
			deploymentconfig=logging-es-puwie4np
			provider=openshift
Status:			Pending
IP:			
Controllers:		ReplicationController/logging-es-puwie4np-4
Containers:
  elasticsearch:
    Container ID:	
    Image:		x.openshift.com/openshift3/logging-elasticsearch:3.4.0
    Image ID:		
    Ports:		9200/TCP, 9300/TCP
    Limits:
      memory:	8Gi
    Requests:
      memory:		512Mi
    State:		Waiting
      Reason:		ContainerCreating
    Ready:		False
    Restart Count:	0
    Volume Mounts:
      /elasticsearch/persistent from elasticsearch-storage (rw)
      /etc/elasticsearch/secret from elasticsearch (ro)
      /usr/share/java/elasticsearch/config from elasticsearch-config (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from aggregated-logging-elasticsearch-token-j8fb4 (ro)
    Environment Variables:
      NAMESPACE:		logging (v1:metadata.namespace)
      KUBERNETES_TRUST_CERT:	true
      SERVICE_DNS:		logging-es-cluster
      CLUSTER_NAME:		logging-es
      INSTANCE_RAM:		8G
      NODE_QUORUM:		1
      RECOVER_AFTER_NODES:	0
      RECOVER_EXPECTED_NODES:	1
      RECOVER_AFTER_TIME:	5m
Conditions:
  Type		Status
  Initialized 	True 
  Ready 	False 
  PodScheduled 	True 
Volumes:
  elasticsearch:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	logging-elasticsearch
  elasticsearch-config:
    Type:	ConfigMap (a volume populated by a ConfigMap)
    Name:	logging-elasticsearch
  elasticsearch-storage:
    Type:	PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:	logging-es1
    ReadOnly:	false
  aggregated-logging-elasticsearch-token-j8fb4:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	aggregated-logging-elasticsearch-token-j8fb4
QoS Class:	Burstable
Tolerations:	<none>
Events:
  FirstSeen	LastSeen	Count	From						SubobjectPath	Type		Reason		Message
  ---------	--------	-----	----						-------------	--------	------		-------
  3m		3m		1	{default-scheduler }						Normal		Scheduled	Successfully assigned logging-es-puwie4np-4-x3dwz to gpei-test-node-registry-router-1
  1m		1m		1	{kubelet gpei-test-node-registry-router-1}			Warning		FailedMount	Unable to mount volumes for pod "logging-es-puwie4np-4-x3dwz_logging(f37bee08-b53f-11e6-bd3f-42010af00005)": timeout expired waiting for volumes to attach/mount for pod "logging-es-puwie4np-4-x3dwz"/"logging". list of unattached/unmounted volumes=[elasticsearch-storage]
  1m		1m		1	{kubelet gpei-test-node-registry-router-1}			Warning		FailedSync	Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "logging-es-puwie4np-4-x3dwz"/"logging". list of unattached/unmounted volumes=[elasticsearch-storage]

[root@gpei-test-master-etcd-1 ~]# oc get pv
NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM                 REASON    AGE
pvc-6a64b62c-b517-11e6-8a33-42010af00004   10Gi       RWO           Delete          Bound     logging/logging-es1             4h

[root@gpei-test-master-etcd-1 ~]# oc get pvc
NAME          STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
logging-es1   Bound     pvc-6a64b62c-b517-11e6-8a33-42010af00004   10Gi       RWO           4h

[root@gpei-test-master-etcd-1 ~]# oc describe pvc logging-es1
Name:		logging-es1
Namespace:	logging
StorageClass:	dynamic
Status:		Bound
Volume:		pvc-6a64b62c-b517-11e6-8a33-42010af00004
Labels:		app=logging-pvc-dynamic-template
		component=support
		logging-infra=support
		provider=openshift
Capacity:	10Gi
Access Modes:	RWO
No events.


Expected results:
The logging-es and logging-es-ops pods should each have an available PV, and both pods should be running.

Additional info:

Comment 1 Jeff Cantrill 2017-02-10 16:08:17 UTC
fixed in PR

Comment 2 Jeff Cantrill 2017-02-10 16:13:55 UTC
fixed in PR https://github.com/openshift/openshift-ansible/pull/3330

Comment 3 openshift-github-bot 2017-02-10 19:10:03 UTC
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/8d103f6bdea2a02d6e4769e4868dd398ecf2a60f
bug 1399523. Ops pvc should have different prefix from non-ops for openshift_logging

https://github.com/openshift/openshift-ansible/commit/0826250be3d0fe2b6ae7e68dcc975d9937a4100a
Merge pull request #3330 from jcantrill/bz_1399523_default_ops_prefix

bug 1399523. Ops pvc should have different prefix from non-ops for op…
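
For context, a sketch of what the fix amounts to in the openshift_logging role defaults (variable names and values inferred from the commit message and from the PVC names verified in comment 5; not quoted from the PR diff):

# Give the ops ES cluster its own PVC prefix so its claims no longer
# collide with the non-ops cluster's; with dynamic storage this yields
# separate claims such as logging-es-0 and logging-es-ops-0.
openshift_logging_es_pvc_prefix: logging-es
openshift_logging_es_ops_pvc_prefix: logging-es-ops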

Comment 5 Gaoyun Pei 2017-03-14 07:20:38 UTC
Set the following options in the Ansible inventory file and run the installation playbook:

openshift_hosted_logging_deploy=true
openshift_hosted_logging_deployer_prefix=x.openshift.com/openshift3/
openshift_hosted_logging_deployer_version=3.5.0
openshift_hosted_logging_storage_kind=dynamic
openshift_hosted_logging_enable_ops_cluster=true
openshift_hosted_loggingops_storage_kind=dynamic


After installation, check logging pod status:
[root@ip-172-18-13-197 ~]# oc get pod -n logging
NAME                              READY     STATUS    RESTARTS   AGE
logging-curator-1-fqzn9           0/1       Error     0          5m
logging-curator-ops-1-hgs3j       1/1       Running   1          5m
logging-es-ops-1ugo46xt-1-2qp81   1/1       Running   0          3m
logging-es-v9sg5e4a-1-f6rm1       1/1       Running   0          3m
logging-fluentd-1gphk             1/1       Running   0          3m
logging-fluentd-pdfmv             1/1       Running   0          3m
logging-kibana-1-r0pkp            2/2       Running   0          4m
logging-kibana-ops-1-njs63        2/2       Running   0          4m

[root@ip-172-18-13-197 ~]# oc get pvc -n logging
NAME               STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
logging-es-0       Bound     pvc-2d6f481c-0885-11e7-8b88-0e754ba34252   10Gi       RWO           4m
logging-es-ops-0   Bound     pvc-2f44ad33-0885-11e7-8b88-0e754ba34252   10Gi       RWO           4m

Comment 7 errata-xmlrpc 2017-04-12 18:48:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0903

