1474380 – [geo-rep]: few of the self healed hardlinks on master did not sync to slave

Bug 1474380 - [geo-rep]: few of the self healed hardlinks on master did not sync to slave

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	geo-replication
Sub Component:
Version:	rhgs-3.3
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	urgent
Target Milestone:	---
Target Release:	RHGS 3.3.0
Assignee:	Kotresh HR
QA Contact:	Rahul Hinduja
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1417151 1475308 1476208 1476212

Reported:	2017-07-24 13:46 UTC by Rahul Hinduja
Modified:	2021-03-11 15:29 UTC
CC List:	11 users
Fixed In Version:	glusterfs-3.8.4-37
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1475308
Environment:
Last Closed:	2017-09-21 05:04:21 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2017:2774	0	normal	SHIPPED_LIVE	glusterfs bug fix and enhancement update	2017-09-21 08:16:29 UTC

Description Rahul Hinduja 2017-07-24 13:46:01 UTC

Description of problem:
=======================

In the following scenarios, hardlinks are not synced to the slave.

Scenario 1:

1. Create a geo-rep session between master and slave
2. Mount the master volume
3. Create a file (file1)
4. Let the file sync to the slave
5. Kill one replica brick of the subvolume containing file1
6. Create a hardlink to file1 (ln file1 file2) => Ensure that file2 hashes to the same subvolume as file1
7. Force-start the master volume so that file2 is healed. Wait for the heal to complete
8. Kill the other replica brick (the one left running in step 5)
9. Start the geo-replication

In the above scenario, the hardlinks are not synced to the slave and no errors are reported.
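
The steps of Scenario 1 can be sketched as a dry-run command list. This is only an illustrative sketch, not the reporter's actual test script: the function name, volume names, host names, mount point and the <brick-pid-*> placeholders are all hypothetical, and brick PIDs would have to be looked up by hand (for example from `gluster volume status`). The extra "stop" step is the one mentioned in the verification comment later in this report. Nothing is executed; the function only returns the commands as strings.

```python
def scenario1_commands(master_vol, slave, mount_host, mount_point):
    """Return (step, command) pairs for Scenario 1 as a dry run.

    All arguments are hypothetical example values; <brick-pid-A> and
    <brick-pid-B> are placeholders for the two replica brick processes
    of the subvolume that holds file1.
    """
    geo = f"gluster volume geo-replication {master_vol} {slave}"
    return [
        ("1", f"{geo} create push-pem"),
        ("1", f"{geo} start"),
        ("2", f"mount -t glusterfs {mount_host}:/{master_vol} {mount_point}"),
        ("3", f"touch {mount_point}/file1"),
        ("4", f"{geo} status"),  # repeat until file1 has synced
        ("4a", f"{geo} stop"),   # extra step from the verification comment
        ("5", "kill -9 <brick-pid-A>"),
        ("6", f"ln {mount_point}/file1 {mount_point}/file2"),
        ("7", f"gluster volume start {master_vol} force"),
        ("7", f"gluster volume heal {master_vol} info"),  # repeat until healed
        ("8", "kill -9 <brick-pid-B>"),
        ("9", f"{geo} start"),
    ]


if __name__ == "__main__":
    for step, cmd in scenario1_commands(
        "master", "slavehost::slave", "masterhost", "/mnt/master"
    ):
        print(f"step {step}: {cmd}")
```

Whether file2 lands on the same subvolume as file1 depends on how its name hashes, so the name may need to be varied until the placement matches the scenario being tested.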

Scenario 2:

Steps 1 to 5 remain the same
6. Create a hardlink to file1 (ln file1 file2) => Ensure that file2 hashes to a different subvolume than file1
Steps 7 to 8 remain the same

In this scenario, the sync behaves as follows:
   a. If both ACTIVE bricks are self-healed bricks that have recorded the MKNOD, the sync happens.
   b. If the self-healed brick that recorded the MKNOD for the sticky bit file becomes PASSIVE, the hardlinks are not synced.


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-geo-replication-3.8.4-35.el7rhgs.x86_64


How reproducible:
=================

Always with the above steps.

Comment 5 Atin Mukherjee 2017-07-26 12:40:06 UTC

upstream patch : https://review.gluster.org/17880

Comment 6 Atin Mukherjee 2017-07-28 12:23:23 UTC

upstream 3.12 patch : https://review.gluster.org/17905
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/113790/

Comment 8 Rahul Hinduja 2017-08-06 16:18:15 UTC

Verified with build: glusterfs-geo-replication-3.8.4-37.el7rhgs.x86_64

One additional step was added between steps 4 and 5:

=> Stop the geo-replication

Validated both scenarios mentioned in the description; the hardlinks are properly synced with the 3.8.4-37 build.

One of the history changelogs processed, compared between builds:

 From .history/.processed for the active brick:

[root@dhcp43-168 .processed]# cat CHANGELOG.1502033284
E ce3d230f-59ba-4cd6-8c55-b0fe8e2dd62a MKNOD 33188 0 0 00000000-0000-0000-0000-000000000001%2Fnew_hardlink_file_to_file1
[root@dhcp43-168 .processed]# 

^^ Failed to sync to the slave with build glusterfs-geo-replication-3.8.4-18.6.el7rhgs.x86_64

[root@dhcp42-74 .processed]# cat CHANGELOG.1502034691
E 6b7f3186-9779-43cc-95a0-aa952e7fab51 MKNOD 33188 0 0 00000000-0000-0000-0000-000000000001%2Fnew_hardlink_file_to_file1
[root@dhcp42-74 .processed]# 

^^ Synced successfully to the slave with build glusterfs-geo-replication-3.8.4-37.el7rhgs.x86_64
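
To make the changelog lines above easier to read, here is a minimal parsing sketch. It assumes the seven-field layout visible in these two records (record type, GFID, operation, mode, uid, gid, then a URL-encoded parent-GFID/basename pair); the `EntryRecord` type and `parse_entry_record` function are hypothetical helpers written for illustration and are not part of glusterfs.

```python
import stat
from typing import NamedTuple
from urllib.parse import unquote

# GFID of the volume root, as seen in the parent field of the records above.
ROOT_GFID = "00000000-0000-0000-0000-000000000001"


class EntryRecord(NamedTuple):
    gfid: str
    op: str
    mode: int
    uid: int
    gid: int
    parent_gfid: str
    basename: str


def parse_entry_record(line: str) -> EntryRecord:
    """Parse an 'E' record that carries mode, uid and gid (e.g. MKNOD)."""
    fields = line.split()
    if len(fields) != 7 or fields[0] != "E":
        raise ValueError(f"not a 7-field entry record: {line!r}")
    _, gfid, op, mode, uid, gid, encoded_path = fields
    # '%2F' decodes to '/', separating the parent GFID from the basename.
    parent_gfid, _, basename = unquote(encoded_path).partition("/")
    return EntryRecord(gfid, op, int(mode), int(uid), int(gid),
                       parent_gfid, basename)


record = parse_entry_record(
    "E ce3d230f-59ba-4cd6-8c55-b0fe8e2dd62a MKNOD 33188 0 0 "
    "00000000-0000-0000-0000-000000000001%2Fnew_hardlink_file_to_file1"
)
print(record.op, record.basename)   # MKNOD new_hardlink_file_to_file1
print(stat.filemode(record.mode))   # -rw-r--r--  (33188 == 0o100644)
```

Read this way, both records describe the same operation: a MKNOD of a regular file with mode 0644, owned by uid 0 and gid 0, directly under the volume root. Only the GFID differs between the two test runs.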

Moving this bug to the verified state. Further hardlink-during-heal scenarios will be tested and, if needed, tracked in separate bugs.

Comment 16 errata-xmlrpc 2017-09-21 05:04:21 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774
