Bug 1302320 - [georep+Tier]: File is synced to slave with permissions "-r--r-S-wT"
Status: ASSIGNED
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: geo-replication
Version: 3.1
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Assigned To: Mohammed Rafi KC
QA Contact: storage-qa-internal@redhat.com
Keywords: ZStream
Depends On:
Blocks: 1268895
Reported: 2016-01-27 08:40 EST by Rahul Hinduja
Modified: 2017-11-18 13:03 EST (History)

See Also:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
During file promotion, the rebalance operation sets the sticky bit and suid/sgid bit. Normally, it removes these bits when the migration is complete. If readdirp is called on a file before migration completes, these bits are not removed, and remain applied on the client. This means that, if rsync happens while the bits are applied, the bits remain applied to the file as it is synced to the destination, impairing accessibility on the destination. This can happen in any geo-replicated configuration, but the likelihood increases with tiering because the rebalance process is continuous.
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Rahul Hinduja 2016-01-27 08:40:50 EST
Description of problem:
========================

During geo-replication and tiering interop testing, a file on the slave was found synced with permissions "-r--r-S-wT.", although the same file on the master has "-r--r---w-.".

On Master:
==========

[root@dj scripts]# ll /mnt/master/thread5/level08/level18/
total 110
--w--wxrw-. 1 22974 54638 17345 Jan 27  2016 56a7f56c%%237OHICJOQ
--w-rwx-w-. 1  3425 18861 12836 Jan 27  2016 56a7f56c%%7RZMAIM1QZ
-r--r---w-. 1 39857 50648 11631 Jan 27  2016 56a7f56c%%A0LGBBEK3A
----r---wx. 1 41939 34071 19524 Jan 27  2016 56a7f56c%%E9YSMFSZTN
-r-xrwxr--. 1 10593  2812 13769 Jan 27  2016 56a7f56c%%SQXKUF0JJZ
d---r-xrwx. 4 25608  7913  8456 Jan 27  2016 level28
drwxr-xr-x. 2 root  root    213 Jan 27  2016 symlink_to_files
[root@dj scripts]# 

On Slave:
=========

[root@dj scripts]# ll /mnt/slave/thread5/level08/level18/
total 75
--w--wxrw-. 1 22974 54638 17345 Jan 27  2016 56a7f56c%%237OHICJOQ
--w-rwx-w-. 1  3425 18861 12836 Jan 27  2016 56a7f56c%%7RZMAIM1QZ
-r--r-S-wT. 1 39857 50648 11631 Jan 27  2016 56a7f56c%%A0LGBBEK3A
----r---wx. 1 41939 34071 19524 Jan 27  2016 56a7f56c%%E9YSMFSZTN
-r-xrwxr--. 1 10593  2812 13769 Jan 27  2016 56a7f56c%%SQXKUF0JJZ
d---r-xrwx. 4 25608  7913   528 Jan 27  2016 level28
drwxr-xr-x. 2 root  root    225 Jan 27  2016 symlink_to_files
[root@dj scripts]#

While a file is being promoted, tiering exposes the sticky bit on the glusterfs mount. rsync, which runs with permissions preserved, could have picked up the file in that state and synced it to the slave with the sticky bit set.
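As a minimal sketch of what the two listings differ in, the following Python snippet decodes the modes with the standard `stat` module. The name `MIGRATION_BITS` and the helper `has_migration_bits` are hypothetical and used only for illustration; they are not part of glusterfs or geo-replication.

```python
import stat

# Bits that appear only on the slave copy: setgid without group-execute
# (shown as "S") and sticky without other-execute (shown as "T").
# MIGRATION_BITS is an illustrative name, not a glusterfs identifier.
MIGRATION_BITS = stat.S_ISGID | stat.S_ISVTX


def has_migration_bits(mode: int) -> bool:
    """Return True when both setgid and sticky are set in the mode."""
    return (mode & MIGRATION_BITS) == MIGRATION_BITS


# Mode of 56a7f56c%%A0LGBBEK3A as listed on the master.
master_mode = stat.S_IFREG | 0o442
# Mode of the same file as listed on the slave.
slave_mode = master_mode | MIGRATION_BITS

print(stat.filemode(master_mode))  # -r--r---w-
print(stat.filemode(slave_mode))   # -r--r-S-wT
print(has_migration_bits(master_mode), has_migration_bits(slave_mode))
```

The ordinary permission bits are identical on both sides; only the setgid and sticky bits were added to the slave copy.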

Appending to the file so that it is picked up for sync again resolves the issue, because the latest permissions are then synced.
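To find which files would need such a re-sync, one option is to compare permission bits between the two mounts. The sketch below is an assumption-laden helper, not a geo-replication tool: `find_mode_mismatches` is a hypothetical name, and it assumes both volumes are mounted locally (for example at `/mnt/master` and `/mnt/slave`, as in the listings above).

```python
import os
import stat


def find_mode_mismatches(master_root, slave_root):
    """Yield (relative_path, master_mode, slave_mode) for every entry whose
    permission bits (including setuid/setgid/sticky) differ between trees."""
    for dirpath, dirnames, filenames in os.walk(master_root):
        for name in dirnames + filenames:
            master_path = os.path.join(dirpath, name)
            rel = os.path.relpath(master_path, master_root)
            slave_path = os.path.join(slave_root, rel)
            try:
                master_mode = stat.S_IMODE(os.lstat(master_path).st_mode)
                slave_mode = stat.S_IMODE(os.lstat(slave_path).st_mode)
            except FileNotFoundError:
                # Entry not synced yet, or removed while scanning.
                continue
            if master_mode != slave_mode:
                yield rel, master_mode, slave_mode
```

Calling `list(find_mode_mismatches("/mnt/master", "/mnt/slave"))` would report candidates. Because the master mount itself can show the extra bits while a promotion is in progress, a reported mismatch is only a hint and is worth rechecking once migration has finished.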

A unit test on a developer's local system confirms the following:

During promotion the sticky bit is shown on the mount; during demotion it is not.

During promotion, on the mount (ll on the directory):
-rw-r-Sr-T. 1 root root 701000005 Jan 27 18:47 file1
[root@rafi 0]# ll
total 684571
-rw-r-Sr-T. 1 root root 701000005 Jan 27 18:47 file1
[root@rafi 0]# ll
total 684571
-rw-r-Sr-T. 1 root root 701000005 Jan 27 18:47 file1
[root@rafi 0]# ll
total 684571
-rw-r-Sr-T. 1 root root 701000005 Jan 27 18:47 file1
[root@rafi 0]# ls -lrt
total 684571
-rw-r-Sr-T. 1 root root 701000005 Jan 27 18:47 file1


During demotion, on the mount (ll on the directory):


[root@rafi 0]# ll
total 684571
-rw-r--r--. 1 root root 701000000 Jan 27 18:46 file1
[root@rafi 0]# ll
total 684571
-rw-r--r--. 1 root root 701000000 Jan 27 18:46 file1
[root@rafi 0]# ll
total 684571
-rw-r--r--. 1 root root 701000000 Jan 27 18:46 file1
[root@rafi 0]# ll
total 684571

Raising the bug against geo-replication because the consumer is rsync and the use case that fails is geo-replication.

Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.7.5-17.el7rhgs.x86_64


How reproducible:
=================

It's a race: the file has to be picked up for syncing from master to slave while its tiering promotion is in progress.


Steps to Reproduce:
===================
Found while testing geo-replication with different fops being synced to the slave while promotions and demotions were in progress on the master.


Actual results:
===============

The file is synced with the S bit set.

Expected results:
=================

Files should sync as regular files without any S bit set.
Comment 6 Mohammed Rafi KC 2016-02-12 07:46:00 EST
Minor comments; please feel free to edit as you wish.

During a file promotion, the rebalance process sets the sticky bit and the suid/sgid bit, and strips these bits out when it hands the stat to the client. It removes the bits when it completes the migration. However, if a file is listed through a readdirp call while it is migrating, the stripping step is missed and the two flags are passed on to clients.

As a consequence, if rsync runs while the bits are applied, the bits remain applied to the file as it is synced to the destination, impairing accessibility on the destination. This can happen in any geo-replicated configuration, but the likelihood increases with tiering because the rebalance process is continuous.
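The stripping step described above can be sketched in Python as follows. This is an illustration only, not the glusterfs implementation: `strip_migration_bits` and `readdirp_entries` are hypothetical names, and treating the marker as "setgid and sticky set together" is an assumption based on the "-rw-r-Sr-T" mode seen during promotion.

```python
import stat

# Assumed marker pattern: setgid and sticky set together on the file.
MIGRATION_BITS = stat.S_ISGID | stat.S_ISVTX


def strip_migration_bits(mode: int) -> int:
    """Clear setgid and sticky when both are set; otherwise return the mode
    unchanged, so a mode carrying only one of the bits is not altered."""
    if (mode & MIGRATION_BITS) == MIGRATION_BITS:
        return mode & ~MIGRATION_BITS
    return mode


def readdirp_entries(entries):
    """Apply the strip to every (name, mode) pair of a directory listing,
    mirroring what a single-file stat reply is described as doing."""
    return [(name, strip_migration_bits(mode)) for name, mode in entries]


# file1 as seen on the mount during promotion: -rw-r-Sr-T
listing = [("file1", stat.S_IFREG | 0o644 | MIGRATION_BITS)]
for name, mode in readdirp_entries(listing):
    print(name, stat.filemode(mode))  # file1 -rw-r--r--
```

The point of the sketch is that the same strip has to be applied on the directory-listing path as on the single-file stat path; when it is applied only on the latter, a listing taken during migration exposes the extra bits.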
Comment 8 Mohammed Rafi KC 2016-02-15 00:07:04 EST
Looks good to me.
