Bug 893378 - DHT - User is able to modify a file when the cached sub-volume is down and the hashed sub-volume is up; this results in data loss, and multiple files with the same name can be created at the same level
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Kaushal
QA Contact: amainkar
URL:
Whiteboard:
Duplicates: 903917
Depends On:
Blocks:
 
Reported: 2013-01-09 09:41 UTC by Rachana Patel
Modified: 2015-04-20 11:56 UTC (History)
CC List: 7 users

Fixed In Version: glusterfs-3.4.0qa8, glusterfs-3.3.0.5rhs-42
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-23 22:34:48 UTC
Embargoed:


Attachments

Description Rachana Patel 2013-01-09 09:41:58 UTC
Description of problem:
DHT - The user is able to modify a file when the cached sub-volume is down and the hashed sub-volume is up; this results in data loss, and multiple files with the same name can be created at the same level.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0qa5-1.el6rhs.x86_64

How reproducible:
always

Steps to Reproduce:

1. Create a distributed volume with 3 or more sub-volumes across multiple servers, and start the volume.

2. FUSE-mount the volume from client-1 using "mount -t glusterfs server:/<volume> <client-1_mount_point>"

3. From the mount point, create some directories and some files inside them. Run rename commands on the files, and make sure that, because of the rename operation, the hashed and cached sub-volumes are different for a given file (the hashed sub-volume holds a ---------T linkfile):

server 3:-
-bash-4.1# ls -l renamefile10
---------T 2 root root    0 Jan  9 06:49 renamefile10

server 1:-
-bash-4.1# stat renamefile10
  File: `renamefile10'
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: 810h/2064d	Inode: 273839861   Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-01-09 09:10:31.647469095 +0000
Modify: 2013-01-09 06:39:41.000000000 +0000

4. Bring down the sub-volume where the file is cached. In our case it's server 1.

5. From the mount point, modify the file using vi/vim.
[root@client verify]# vi renamefile10
Make some changes, save, and quit.

6. Bring all the sub-volumes back up.

7. Now, from the mount point, run ls and verify that renamefile10 is listed twice:

[root@client verify]# ls
d1   d16  d22  d29  d35  d41  d48  d9            renamefile14  renamefile20  renamefile27  renamefile33  renamefile4   renamefile46  renamefile7
d10  d17  d23  d3   d36  d42  d49  renamefile1   renamefile15  renamefile21  renamefile28  renamefile34  renamefile40  renamefile47  renamefile8
d11  d18  d24  d30  d37  d43  d5   renamefile10  renamefile16  renamefile22  renamefile29  renamefile35  renamefile41  renamefile48  renamefile9
d12  d19  d25  d31  d38  d44  d50  renamefile10  renamefile17  renamefile23  renamefile3   renamefile36  renamefile42  renamefile49
d13  d2   d26  d32  d39  d45  d6   renamefile11  renamefile18  renamefile24  renamefile30  renamefile37  renamefile43  renamefile5
d14  d20  d27  d33  d4   d46  d7   renamefile12  renamefile19  renamefile25  renamefile31  renamefile38  renamefile44  renamefile50
d15  d21  d28  d34  d40  d47  d8   renamefile13  renamefile2   renamefile26 renamefile32  renamefile39  renamefile45  renamefile6



8. Check the files on the backend:

server 1:-
-bash-4.1# stat renamefile10
  File: `renamefile10'
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: 810h/2064d	Inode: 273839861   Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-01-09 09:10:31.647469095 +0000
Modify: 2013-01-09 06:39:41.000000000 +0000
Change: 2013-01-09 06:49:43.233807527 +0000

server 3:-
-bash-4.1# getfattr -d -m . renamefile10
# file: renamefile10
trusted.gfid=0snL5DI4LxQKu/FKMnTyD2WQ==

-bash-4.1# stat renamefile10
  File: `renamefile10'
  Size: 2         	Blocks: 8          IO Block: 4096   regular file
Device: 810h/2064d	Inode: 145987204   Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-01-09 09:10:31.649531504 +0000
Modify: 2013-01-09 09:06:55.293946024 +0000
Change: 2013-01-09 09:06:55.293946024 +0000

  
Actual results:
The user is able to modify the file when the cached sub-volume is down.

Expected results:
The user should not be able to modify the file when the cached sub-volume is down.

Comment 3 shishir gowda 2013-01-16 05:15:18 UTC
Upstream fix http://review.gluster.org/#change,4383 is in review.

Comment 4 Vijay Bellur 2013-01-21 20:03:16 UTC
CHANGE: http://review.gluster.org/4383 (cluster/distribute: If cached_subvol is down, return ENOTCONN in lookup) merged in master by Anand Avati (avati)

Comment 5 shishir gowda 2013-01-29 04:24:10 UTC
*** Bug 903917 has been marked as a duplicate of this bug. ***

Comment 6 shishir gowda 2013-02-18 10:43:14 UTC
*** Bug 903476 has been marked as a duplicate of this bug. ***

Comment 7 Rachana Patel 2013-03-19 13:40:56 UTC
verified with 3.3.0.6rhs-4.el6rhs.x86_64

bug 893378 and bug 903917 working as per expectation

but 

Bug 903476 - not working as per expectation
e.g.
[root@rhsauto037 new]# cat renamefile18
cat: renamefile18: No such file or directory
[root@rhsauto037 new]# cp renamefile18 abc
cp: cannot stat `renamefile18': No such file or directory
[root@rhsauto037 new]# ls -l renamefile18
ls: cannot access renamefile18: No such file or directory
[root@rhsauto037 new]# chmod 777 f1
chmod: cannot access `f1': No such file or directory


Hence, moving this back to ASSIGNED.

Comment 15 Rachana Patel 2013-03-20 10:11:25 UTC
sorry for the inconvenience caused, logs are attached to the bug.

Comment 16 shishir gowda 2013-03-21 04:59:34 UTC
After investigating the logs, it looks like an issue fixed in bug 884379.
Updated to the release below:

[root@localhost ~]# rpm -qa |grep glusterfs
glusterfs-devel-3.3.0.6rhs-6.el6rhs.x86_64
glusterfs-3.3.0.6rhs-6.el6rhs.x86_64
glusterfs-server-3.3.0.6rhs-6.el6rhs.x86_64
glusterfs-geo-replication-3.3.0.6rhs-6.el6rhs.x86_64
glusterfs-debuginfo-3.3.0.6rhs-6.el6rhs.x86_64
glusterfs-fuse-3.3.0.6rhs-6.el6rhs.x86_64
glusterfs-rdma-3.3.0.6rhs-6.el6rhs.x86_64


1. Create a 3-brick DHT volume, mount it, and rename a file until we get a linkfile.

[root@localhost export]# mount -t glusterfs localhost:/test /mnt/dht/
[root@localhost export]# cd /mnt/dht/
[root@localhost dht]# ls
[root@localhost dht]# touch 1
[root@localhost dht]# mv 1 2
[root@localhost dht]# ls -l /export/*
/export/sub1:
total 0
---------T. 2 root root 0 Mar 21 03:28 2

/export/sub2:
total 0
-rw-r--r--. 2 root root 0 Mar 21 03:28 2

/export/sub3:
total 0

2. Kill a brick and check the status.

[root@localhost ~]# gluster volume status
Status of volume: test
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick vm1:/export/sub1                                  24012   Y       29771
Brick vm1:/export/sub2                                  24013   N       N/A
Brick vm1:/export/sub3                                  24014   Y       29784
NFS Server on localhost                                 38467   Y       29790

3. Try to perform ops on the file.

[root@localhost dht]# cat 2
cat: 2: Transport endpoint is not connected
[root@localhost dht]# rm 2
rm: cannot remove `2': Transport endpoint is not connected
[root@localhost dht]# ls -l 2
ls: cannot access 2: Transport endpoint is not connected
[root@localhost dht]# mv 2 3
mv: cannot stat `2': Transport endpoint is not connected

Can you please rerun the test and check whether the issue is fixed?

Comment 17 Scott Haines 2013-04-11 17:02:26 UTC
Per 04-10-2013 Storage bug triage meeting, targeting for Big Bend.

Comment 18 Kaushal 2013-06-12 07:44:52 UTC
I'm not able to reproduce this bug with glusterfs-3.4.0.9rhs-1.el6rhs.


[root@localhost mnt]# gluster volume info test 
Volume Name: test
Type: Distribute
Volume ID: 713ad1ed-96ca-459b-8728-0209439b972f
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.70.42.223:/brick/test1
Brick2: 10.70.42.223:/brick/test2

[root@localhost mnt]# mount | grep glusterfs
localhost:test on /mnt type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)

[root@localhost mnt]# touch file
[root@localhost mnt]# ls -l /brick/*
/brick/test1:
total 0

/brick/test2:
total 0
-rw-r--r-- 2 root root 0 Jun 12 13:07 file

[root@localhost mnt]# mv file file.rename
[root@localhost mnt]# ls -l /brick/*
/brick/test1:
total 0
---------T 2 root root 0 Jun 12 13:07 file.rename

/brick/test2:
total 0
-rw-r--r-- 2 root root 0 Jun 12 13:07 file.rename
[root@localhost mnt]# gluster volume status test
Status of volume: test
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.223:/brick/test1                         49152   Y       28403
Brick 10.70.42.223:/brick/test2                         49153   Y       29028
NFS Server on localhost                                 2049    Y       29039
 
There are no active volume tasks
[root@localhost mnt]# kill 29028
[root@localhost mnt]# gluster volume status test
Status of volume: test
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.223:/brick/test1                         49152   Y       28403
Brick 10.70.42.223:/brick/test2                         N/A     N       N/A
NFS Server on localhost                                 2049    Y       29039
 
There are no active volume tasks

[root@localhost mnt]# ls
[root@localhost mnt]# touch file.rename
touch: cannot touch `file.rename': Transport endpoint is not connected
[root@localhost mnt]# cat file.rename
cat: file.rename: Transport endpoint is not connected
[root@localhost mnt]# echo "hello" > file.rename
-bash: file.rename: Transport endpoint is not connected



Can you rerun this test and confirm the same?

Comment 19 Nagaprasad Sathyanarayana 2013-06-18 09:00:48 UTC
Requesting QE to run the test as mentioned by Kaushal above.

Comment 20 Rachana Patel 2013-06-27 08:01:57 UTC
As mentioned in comment #7 of this bug
"
verified with 3.3.0.6rhs-4.el6rhs.x86_64

bug 893378 and bug 903917 working as per expectation

but 

Bug 903476 - not working as per expectation"

This bug was fixed, but it was reopened because one of its duplicates, Bug 903476, was not working as expected.
As we have removed Bug 903476 from the duplicate list, we can mark this bug as verified (bug 893378 and bug 903917 are working as expected).

Comment 21 Scott Haines 2013-09-23 22:34:48 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html

