Description of problem:
=======================
Executed a case on a normal volume (4x2) and a tiered volume (HT: 2x2, CT: 2x2).
The time taken to complete the sync on the tiered volume is 2-4 times more than
on the normal volume.

Note: In both cases the total number of subvolumes is 4 and no promotions/demotions
ever happened. The slave volume in both cases was 4x2.

Normal Volume: 4x2
==================
> Time at which I started creating data: 04:53:33
> Time at which data creation completed: 05:05:42
> Time at which the used size of master and slave volume became equal: 05:09:12
> Sleep 2 mins
> Calculate arequal: it matches between master and slave @ 05:20:53
  {Arequal checksum took ~7 mins to finish}

Total time taken to complete this test, from data creation on the master to a
matching arequal checksum at the slave: ~30 min

Tiered Volume: HT: 2x2 and CT: 2x2
==================================
> Time at which I started creating data: 05:29:31
> Time at which data creation completed: 05:51:04
> Time at which the used size of master and slave volume became equal: 06:09:20
> Sleep 2 mins
> Calculate arequal: it matches between master and slave @ 06:19:40
  {Arequal checksum took ~8 mins to finish}

Total time taken to complete this test, from data creation on the master to the
arequal checksum match at the slave: ~50 min

Data for reference:
===================
<i>  for i in {1..5}; do dd if=/dev/zero of=dd.$i bs=2M count=1024; done
<ii> for i in {1..5}; do cp -rf /etc etc.$i ; done

I cannot confirm the exact percentage of degradation in syncing, because with the
tiered volume the data creation itself suffers performance degradation, and rsync
to the slave can proceed while creation is still in progress. In general, from
data creation to sync took approx. 16 mins on the normal volume vs. 40 mins on
the tiered volume.

[root@localhost scripts]# gluster volume rebal master tier status
Node            Promoted files   Demoted files    Status
---------       ---------        ---------        ---------
localhost       0                0                in progress
10.70.46.97     0                0                in progress
10.70.46.93     0                0                in progress
10.70.46.154    0                0                in progress
Tiering Migration Functionality: master: success
[root@localhost scripts]#

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.5-12.el7rhgs.x86_64

How reproducible:
=================
Always
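For reference, a minimal sketch of how the above timing run can be scripted.
The mount paths and the use of df/arequal-checksum below are assumptions (the
actual test harness was not attached to this bug); arequal-checksum is assumed
to be available on the client and to accept a path via -p.

    #!/bin/bash
    # Rough timing harness for the test above (a sketch, not the exact script used).
    MASTER_MNT=/mnt/master   # assumed master volume mount point
    SLAVE_MNT=/mnt/slave     # assumed slave volume mount point

    echo "data creation started: $(date +%T)"
    cd "$MASTER_MNT" || exit 1
    for i in {1..5}; do dd if=/dev/zero of=dd.$i bs=2M count=1024; done
    for i in {1..5}; do cp -rf /etc etc.$i; done
    echo "data creation finished: $(date +%T)"

    # Wait until the used size of master and slave mounts converge.
    while true; do
        m=$(df --output=used "$MASTER_MNT" | tail -1)
        s=$(df --output=used "$SLAVE_MNT" | tail -1)
        [ "$m" = "$s" ] && break
        sleep 30
    done
    echo "used size equal: $(date +%T)"

    sleep 120   # settle for 2 mins before checksumming

    # Compare arequal checksums of master and slave.
    m_sum=$(arequal-checksum -p "$MASTER_MNT")
    s_sum=$(arequal-checksum -p "$SLAVE_MNT")
    echo "arequal done: $(date +%T)"
    [ "$m_sum" = "$s_sum" ] && echo "arequal MATCHES" || echo "arequal MISMATCH"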
Performance issues from the Geo-rep side:

Entry operations are handled only by the Cold workers (the workers belonging to
the cold-tier bricks), while data is synced from all the workers. The Cold
workers can get overloaded, and data sync from the Hot workers may fail because
the entry on the slave is created by the Cold workers.

Possible fixes:
- Handle rsync errors and retries effectively (patch sent upstream for this:
  http://review.gluster.org/#/c/12856/)
- Synchronize Hot workers and Cold workers: do not sync file data from the Hot
  tier before the entry has been created on the Slave by the Cold workers.
  (Just an idea, design pending.)

Performance issues due to tiering:

Geo-rep uses rsync to sync data from the Master volume to the Slave volume. For
a given list of files, rsync syncs data from the Master volume mount to the
Slave volume mount. The read performance of tiering may have affected the sync
performance.
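To check whether the Hot-tier workers are actually hitting rsync failures while
the Cold-tier workers create entries, something like the following can be run on
the master nodes. This is a sketch: <SLAVEHOST>/<SLAVEVOL> are placeholders for
the actual session, and the log directory layout under
/var/log/glusterfs/geo-replication/ is an assumption based on a default install.

    # Per-brick worker status; hot-tier and cold-tier workers can be told apart
    # by the brick path each worker serves.
    gluster volume geo-replication master <SLAVEHOST>::<SLAVEVOL> status detail

    # Look for rsync failures and retries in the worker logs on this node.
    grep -iE "rsync|retr" /var/log/glusterfs/geo-replication/master/*.log | tail -50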
Hi Laura,

You could add an additional sentence: "As a consequence, geo-rep performance on
tiered volumes is slower than on non-tiered volumes." Those basic sentences seem
to capture what the customer needs to know. My reading of Aravinda's summary
from comment #4 is that incorporating additional low-level engineering details
into the release notes would not help the customer.

You could consider gathering some more information from the geo-rep team, e.g.:

1. How much slower is it? Does the degradation get worse depending on the
   hot/cold volume type or the number of subvolumes?
2. Does the degradation ever become significant enough to make geo-rep unusable?
Per discussion with Milind, changing the component to geo-rep. This should be
tested with the latest patches in the release-3.8 branch (see below).

>> I spoke with Aravinda regarding tiering + georep performance issues.
>> He said that some patches have been merged upstream to mitigate the
>> performance drop seen for tiered volumes. He insisted on getting the
>> performance benchmarked *before* any additional enhancements are
>> attempted.
>>
>> Having said this, he still has one recommendation: to synchronize hot
>> and cold tier georep worker processes w.r.t. entry creation by cold
>> tier worker followed by data sync by hot tier worker. This could be
>> attempted if the latest performance numbers seem unacceptable to QE.
>
> Should this be moved to the geo-rep group?

Yes, you could move this to the geo-rep group with a comment to test the
performance with the latest patches on the release-3.8 branch.
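One way to re-benchmark after the release-3.8 patches is to use geo-rep
checkpoints, which report when the slave has caught up to a marked point in
time. A sketch, assuming the same session names as above (placeholders for the
slave host/volume; the exact status field wording may vary by version):

    # After creating the test data on the master mount, mark a checkpoint.
    gluster volume geo-replication master <SLAVEHOST>::<SLAVEVOL> config checkpoint now

    # Poll the session until the checkpoint is reported as completed; the
    # completion time minus the data-creation start time gives the sync time.
    watch -n 60 "gluster volume geo-replication master <SLAVEHOST>::<SLAVEVOL> status detail"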