Bug 1511110

Summary: Logging deployments are not maintaining the association of DC to PVC claim for Elasticsearch causing deployment failures
Product: OpenShift Container Platform
Component: Logging
Version: 3.7.0
Target Release: 3.7.0
Status: CLOSED ERRATA
Severity: high
Priority: high
Keywords: OpsBlocker
Hardware: All
OS: Linux
Type: Bug
Reporter: Peter Portante <pportant>
Assignee: ewolinet
QA Contact: Anping Li <anli>
CC: aos-bugs, ewolinet, rmeggins
Doc Type: No Doc Update
Doc Text: This fixes a regression introduced with 3.7 regarding reusing the pvc that was previously specified within a DC
Last Closed: 2017-12-18 13:23:26 UTC

Description Peter Portante 2017-11-08 17:28:25 UTC

On starter-ca-central-1, upgrading logging from an earlier 3.7 build (v3.7.0-0.143.7) to v3.7.0-0.178.2 left the ES pods unable to start because of "unable to attach volume" errors:

  Multi-Attach error for volume "pvc-61e094bc-b375-11e7-b95b-02d8407159d1"
  Volume is already exclusively attached to one node and can't be attached
  to another.

  kubelet, ip-172-31-19-199.ca-central-1.compute.internal   Unable to mount
  volumes for pod "logging-es-data-master-m5d1xc5i-11-8h4v7_logging
  (dd07bb1b-c493-11e7-84d1-02d8407159d1)": timeout expired waiting for
  volumes to attach/mount for pod "logging"/
  "logging-es-data-master-m5d1xc5i-11-8h4v7". list of unattached/unmounted
  volumes=[elasticsearch-storage]


Looking at the pods that are still running, we see the following association of DC to PVC claim name:

# oc describe pod -l component=es | grep -E "(ClaimName:|^Name:)"
Name:		logging-es-data-master-m5d1xc5i-4-gdm6q
    ClaimName:	logging-es-2
Name:		logging-es-data-master-wo54khfh-4-fxchw
    ClaimName:	logging-es-1
Name:		logging-es-data-master-yo5htett-4-qwg8d
    ClaimName:	logging-es-0

However, in the newly updated DCs the claim name associated with each DC has changed:

# oc describe dc -l component=es | grep -E "(ClaimName:|^Name:)"
Name:		logging-es-data-master-m5d1xc5i
    ClaimName:	logging-es-0
Name:		logging-es-data-master-wo54khfh
    ClaimName:	logging-es-2
Name:		logging-es-data-master-yo5htett
    ClaimName:	logging-es-1

When we updated the DCs to restore the proper association of PVC to DC, logging started working just fine.
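
The comparison above can be automated. The following is a minimal sketch, not part of the original report. It assumes that pod names follow the usual `<dc-name>-<deployment-number>-<suffix>` pattern and that the input is the grep-filtered `oc describe` output shown above. The function names `parse_claims` and `claim_mismatches` are hypothetical.

```python
def parse_claims(describe_output, is_pod):
    """Map DC name -> claim name from `oc describe ... | grep` output."""
    claims = {}
    name = None
    for line in describe_output.splitlines():
        key, _, value = line.strip().partition(":")
        value = value.strip()
        if key == "Name":
            # Assumption: pod names are <dc-name>-<deployment-number>-<suffix>,
            # so dropping the last two hyphen-separated parts yields the DC name.
            name = value.rsplit("-", 2)[0] if is_pod else value
        elif key == "ClaimName" and name is not None:
            claims[name] = value
    return claims


def claim_mismatches(pod_output, dc_output):
    """Return {dc_name: (claim_used_by_pod, claim_in_dc)} where they differ."""
    pods = parse_claims(pod_output, is_pod=True)
    dcs = parse_claims(dc_output, is_pod=False)
    return {
        dc: (pods[dc], claim)
        for dc, claim in dcs.items()
        if dc in pods and pods[dc] != claim
    }


# Sample data copied from the two listings in this report.
POD_OUTPUT = """\
Name:       logging-es-data-master-m5d1xc5i-4-gdm6q
    ClaimName:  logging-es-2
Name:       logging-es-data-master-wo54khfh-4-fxchw
    ClaimName:  logging-es-1
Name:       logging-es-data-master-yo5htett-4-qwg8d
    ClaimName:  logging-es-0
"""

DC_OUTPUT = """\
Name:       logging-es-data-master-m5d1xc5i
    ClaimName:  logging-es-0
Name:       logging-es-data-master-wo54khfh
    ClaimName:  logging-es-2
Name:       logging-es-data-master-yo5htett
    ClaimName:  logging-es-1
"""

if __name__ == "__main__":
    for dc, (pod_claim, dc_claim) in sorted(claim_mismatches(POD_OUTPUT, DC_OUTPUT).items()):
        print(f"{dc}: running pod uses {pod_claim}, DC now specifies {dc_claim}")
```

With the data from this report, all three DCs are flagged, which matches the manual comparison.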

Comment 1 Rich Megginson 2017-11-08 17:37:03 UTC

Eric, is this related to what you are working on?

Comment 2 ewolinet 2017-11-08 17:48:33 UTC

The logic to maintain the same PVC claim for a DC should be in openshift-ansible 3.7.0-0.192.0
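
For illustration only, the idea of keeping the same claim for a DC can be sketched as below. This is not the openshift-ansible implementation. The function name `assign_claims`, the `logging-es` prefix default, and the lowest-free-index rule for new DCs are all assumptions made for this example.

```python
def assign_claims(dc_names, existing, prefix="logging-es"):
    """Return DC name -> claim name, keeping any claim a DC already uses.

    dc_names: DCs that need a claim after the upgrade.
    existing: DC name -> claim name as recorded before the upgrade.
    """
    # First pass: every DC that already had a claim keeps it unchanged.
    assigned = {dc: existing[dc] for dc in dc_names if dc in existing}
    used = set(assigned.values())

    # Second pass: DCs without a prior claim get the lowest unused index
    # (hypothetical rule, chosen only so the example is deterministic).
    index = 0
    for dc in dc_names:
        if dc in assigned:
            continue
        while f"{prefix}-{index}" in used:
            index += 1
        claim = f"{prefix}-{index}"
        assigned[dc] = claim
        used.add(claim)
    return assigned


if __name__ == "__main__":
    before = {
        "logging-es-data-master-m5d1xc5i": "logging-es-2",
        "logging-es-data-master-wo54khfh": "logging-es-1",
        "logging-es-data-master-yo5htett": "logging-es-0",
    }
    for dc, claim in sorted(assign_claims(sorted(before), before).items()):
        print(dc, claim)
```

The point of the two-pass structure is that existing associations are settled before any new claim is handed out, so iteration order over the DCs cannot reshuffle claims that are already in use.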

Comment 3 Anping Li 2017-11-09 11:00:29 UTC

The claims were preserved during the upgrade with openshift-ansible-3.7.0-0.198.1.

1. Claim names on v3.6:

# oc describe pod -l component=es | grep -E "(ClaimName:|^Name:)"
Name:			logging-es-data-master-aglrm65m-1-qlq5z
    ClaimName:	logging-es-2
Name:			logging-es-data-master-ftcbo50f-1-ft3zp
    ClaimName:	logging-es-0
Name:			logging-es-data-master-u1j0vpoe-1-zmm01
    ClaimName:	logging-es-1

2. Claim names after the upgrade to v3.7:

# oc describe pod -l component=es | grep -E "(ClaimName:|^Name:)"
Name:			logging-es-data-master-aglrm65m-2-2kxmw
    ClaimName:	logging-es-2
Name:			logging-es-data-master-ftcbo50f-2-nzltr
    ClaimName:	logging-es-0
Name:			logging-es-data-master-u1j0vpoe-2-rpksq
    ClaimName:	logging-es-1

Comment 6 errata-xmlrpc 2017-12-18 13:23:26 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3464