Bug 1171688 - AFR+SNAPSHOT: Read operation on file in USS is from wrong source brick
Summary: AFR+SNAPSHOT: Read operation on file in USS is from wrong source brick
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: snapshot
Version: rhgs-3.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: Rahul Hinduja
URL:
Whiteboard: USS
Depends On:
Blocks: 1174187
 
Reported: 2014-12-08 12:08 UTC by Anil Shah
Modified: 2018-04-16 16:03 UTC
CC List: 3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1174187 (view as bug list)
Environment:
Last Closed: 2018-04-16 16:03:37 UTC
Embargoed:



Description Anil Shah 2014-12-08 12:08:55 UTC
Description of problem:

A read operation on a file accessed through USS (User Serviceable Snapshots) shows the contents from the wrong source brick.

Version-Release number of selected component (if applicable):

[root@rhsauto001 arequal]# rpm -qa | grep glusterfs
samba-glusterfs-3.6.509-169.1.el6rhs.x86_64
glusterfs-cli-3.6.0.36-1.el6rhs.x86_64
glusterfs-api-3.6.0.36-1.el6rhs.x86_64
glusterfs-3.6.0.36-1.el6rhs.x86_64
glusterfs-geo-replication-3.6.0.36-1.el6rhs.x86_64
glusterfs-fuse-3.6.0.36-1.el6rhs.x86_64
glusterfs-rdma-3.6.0.36-1.el6rhs.x86_64
glusterfs-libs-3.6.0.36-1.el6rhs.x86_64
glusterfs-server-3.6.0.36-1.el6rhs.x86_64
glusterfs-debuginfo-3.6.0.36-1.el6rhs.x86_64

How reproducible:

100%

Steps to Reproduce:

1. Create a 1*2 distributed-replicate volume
2. Set the volume options 'metadata-self-heal', 'entry-self-heal' and 'data-self-heal' to "off"
3. Set self-heal-daemon to off
4. Create an NFS mount
5. Create a file on the mount point, e.g. echo "file before snapshot" >> file
6. Bring one brick down and modify the file content, e.g. echo "B1 is down" >> file
7. Bring the brick back up
8. Create snapshot snap1 and activate it
9. Enable USS, e.g. gluster volume set testvol features.uss enable
10. Read the content of the file through snap1
11. The file contents are served from the wrong source brick (a consolidated command sketch follows below)
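
A consolidated command sketch of the steps above (a minimal sketch, not verified on the reported build): the volume name, brick paths and file contents are taken from this report, while the client mount point /mnt/testvol, the NFS mount options, the way the brick is killed and the .snaps access path are assumptions.

# Create and start the 2x2 distributed-replicate volume from the report
gluster volume create testvol replica 2 \
    10.70.47.143:/rhs/brick1/b1 10.70.47.145:/rhs/brick1/b2 \
    10.70.47.150:/rhs/brick1/b3 10.70.47.151:/rhs/brick1/b4
gluster volume start testvol

# Disable all self-heal paths so the stale copy is not repaired
gluster volume set testvol cluster.metadata-self-heal off
gluster volume set testvol cluster.entry-self-heal off
gluster volume set testvol cluster.data-self-heal off
gluster volume set testvol cluster.self-heal-daemon off

# NFS mount on the client (Gluster NFS, NFSv3; mount point is an assumption)
mount -t nfs -o vers=3 10.70.47.143:/testvol /mnt/testvol
echo "file before snapshot" >> /mnt/testvol/file

# Kill the brick process for b1 (pid from 'gluster volume status testvol'),
# then write again so that only the surviving replica has the new data
echo "B1 is down" >> /mnt/testvol/file

# Bring the brick back up (restarts any stopped brick processes)
gluster volume start testvol force

# Snapshot, activate it, and enable USS
gluster snapshot create snap1 testvol
gluster snapshot activate snap1
gluster volume set testvol features.uss enable

# Read the file through USS; the bug is that this shows only
# "file before snapshot" (the stale copy) instead of both lines
cat /mnt/testvol/.snaps/snap1/file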


Actual results:

The file content is served from the brick that was down while the file was being modified (the stale copy).

Expected results:

The file content should be served from the brick that stayed up while the file was being modified.

Additional info:

[root@rhsauto001 arequal]# gluster v info
 
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 8c67bd61-1de4-42f3-a919-b40d6fa1e009
Status: Started
Snap Volume: no
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.47.143:/rhs/brick1/b1
Brick2: 10.70.47.145:/rhs/brick1/b2
Brick3: 10.70.47.150:/rhs/brick1/b3
Brick4: 10.70.47.151:/rhs/brick1/b4
Options Reconfigured:
cluster.self-heal-daemon: off
features.barrier: disable
features.uss: enable
cluster.entry-self-heal: off
cluster.data-self-heal: off
cluster.metadata-self-heal: off
performance.open-behind: off
performance.quick-read: off
performance.io-cache: off
performance.read-ahead: off
performance.write-behind: off
performance.readdir-ahead: on
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256


File content from the mount point:

[root@client testsnap]# cat file 
file before snapshot
B1 is down


File content from USS:

[root@client testsnap]# cat file 
file before snapshot

Comment 2 Sachin Pandit 2014-12-09 13:13:19 UTC
Vijaykumar and I did an RCA on this. We found that the lookup is not traversed down to AFR when a readdir has already happened, because the snapview server returns cached data. As a result the data was not fetched from the right brick, since AFR had no role to play. We are working to resolve this issue.

Comment 3 Sachin Pandit 2014-12-10 07:35:09 UTC
On the second run, we found that the data is served from md-cache. We need to figure out how we can effectively bypass md-cache in some of these corner cases.

Comment 4 Vijaikumar Mallikarjuna 2014-12-11 11:50:26 UTC
Currently there is a known issue with AFR when md-cache is enabled and a readdirp is performed.

The problem is that when a readdirp is performed, md-cache and gfapi both cache the stat data and serve the cached data when a lookup comes from the client. Because of this, the lookup will not reach AFR, and AFR decides from which source the data needs to be served only in lookup_cbk. Since the lookup is not reaching AFR, we hit this issue.

Also, because of a gfapi limitation, we cannot send an explicit lookup from the snapview server without creating a new 'glfs_object_t' handle.

The workaround can be one of the following:
1) Disable readdirp on the snapview server
2) Disable md-cache for snapshots in the snapview server (an option sketch follows below)
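
For illustration only, a minimal sketch of the md-cache knobs that exist as regular volume options (performance.stat-prefetch loads/unloads md-cache in the client graph; performance.md-cache-timeout controls how long cached stat data is trusted). These affect the main volume graph, not the snapview-server daemon, so they illustrate the idea behind workaround 2 rather than being the actual fix; whether the snapview server honours equivalent options is an assumption not confirmed here.

# Illustration: md-cache options on the regular client graph of testvol.
# These do not reconfigure the snapview-server daemon itself.
gluster volume set testvol performance.stat-prefetch off     # unload md-cache
gluster volume set testvol performance.md-cache-timeout 0    # or: do not trust cached stat data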

