Description of problem:
When deploying logging using the 3.4.0 images on OpenShift 3.4.1.5 with GlusterFS volumes, the Elasticsearch containers, as soon as they have initialized with their peers (if any), attempt to create some shards. Log: https://paste.fedoraproject.org/paste/kKClaBFb0xj86Hml-bHI0V5M1UNdIGYhyRLivL9gydE= This attempt seems to lock the filesystem, and then nothing works. The Elasticsearch logging deployment works with the 3.3.1 image when the GlusterFS volume has the following options set:
performance.write-behind off
performance.quick-read off
performance.readdir-ahead off
The same options were set on the PV for the 3.4.0 logging deployment.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Create a PV for the es-logging container on a GlusterFS backend.
2. Set the performance options above on the GlusterFS logging volume, as proven working for 3.3.1.
3. Deploy logging on the cluster, configuring it to use the PV created for it.
4. Check the pod logs for Elasticsearch. Try to access Kibana; Kibana reports that it cannot contact the Elasticsearch cluster.
5. curl the logging ES service on port 9000; it responds with "Searchguard not initialized..".

Actual results:
FS locks on write.

Expected results:
Logging deploys successfully.

Additional info:
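For reference, applying the options above on the Gluster side would look roughly like this (the volume name logging-es-vol is hypothetical; substitute the actual volume backing the PV):

gluster volume set logging-es-vol performance.write-behind off
gluster volume set logging-es-vol performance.quick-read off
gluster volume set logging-es-vol performance.readdir-ahead off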
Another person experiencing the same issue: https://forums.rancher.com/t/glusterfs-and-elasticsearch/2293
Hi, adding some context: I just found out that the 3.4 docs now even specify the following: "Using NFS storage as a volume or a persistent volume (or via NAS such as Gluster) is not supported for Elasticsearch storage." So I guess there is not much we can do except run with local storage in that case?
Yes. Local storage is the way to go for now. We'll re-test RHGS capabilities once we have iSCSI support for RWO workloads in a few months' time.
This happens because ES compares the ctime to check that the lock file is unchanged: https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/store/NativeFSLockFactory.java
In GlusterFS, stat() returns the ctime from one of the multiple backend bricks, so the ctime varies between calls: https://bugzilla.redhat.com/show_bug.cgi?id=1318493
As a result, ES believes the file has been changed by someone else.
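A simplified Python sketch of the kind of check Lucene performs (the real NativeFSLockFactory is Java and more involved; the class name and file path below are illustrative only):

import os

class CtimeValidatedLock:
    # Remember the lock file's ctime at acquisition time, then require it
    # to be unchanged before every subsequent use of the lock.
    def __init__(self, path):
        self.path = path
        self.ctime_at_acquire = os.stat(path).st_ctime

    def ensure_valid(self):
        # On GlusterFS, stat() may be answered by different replica bricks
        # whose ctimes disagree (see BZ#1318493), so this check can fail
        # even though nothing touched the lock file.
        if os.stat(self.path).st_ctime != self.ctime_at_acquire:
            raise IOError("lock file changed by someone else; lock invalid")

lock = CtimeValidatedLock("/elasticsearch/persistent/write.lock")  # hypothetical path
lock.ensure_valid()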
Note that ES recommends using local or Direct Attached Storage with SSDs for backing storage.
Moving to target 3.6 based on #8
This should not be closed WONTFIX, IMHO. The GlusterFS project appears to be working on this, as evident here:
1. https://github.com/gluster/glusterfs/issues/208
2. https://github.com/gluster/glusterfs/issues/517
3. https://bugzilla.redhat.com/show_bug.cgi?id=1318493