Bug 1030022

Summary: AFR : conservative merge of hardlinks and symbolic links fails
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: spandura
Component: glusterfs Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED NOTABUG QA Contact: Sudhir D <sdharane>
Severity: high Docs Contact:
Priority: unspecified    
Version: 2.1 CC: vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-12-19 09:46:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description spandura 2013-11-13 17:56:36 UTC
Description of problem:
======================
Conservative merge of hard links and symbolic links fails on a 1 x 3 replicate volume. Executing "volume heal <volume_name> info healed" reports all the files as self-healed, including the hard link and the symbolic link. The hard link and symbolic link entries are created on the bricks, but the links are broken.

"ls <symbolic_link>/" returns "No such file or directory". The hard link to the file is removed.

Version-Release number of selected component (if applicable):
==============================================================
glusterfs 3.4.0.35.1u2rhs built on Oct 21 2013 14:00:58

How reproducible:
================

Steps to Reproduce:
======================
1. Create a 1 x 3 replicate volume. Start the volume. Set "self-heal-daemon" to "off".

2. Create a fuse mount. From the mount point, create the following directories and files:

mkdir testdir1 ; 
mkdir testdir1/subdir1 ; 
dd if=/dev/urandom of=testdir1/subdir1/file1 bs=1M count=1 ;
dd if=/dev/urandom of=testdir1/file1 bs=1M count=1 ;

3. Bring down brick1 and brick2. 

4. From the mount point, execute the following:

rm -f testdir1/subdir1/file1 ;
mkdir testdir1/subdir1/file1 ; 

5. Bring back brick2. Bring down brick3.

6. From the mount point, execute the following:

rm -f testdir1/file1
ln -s testdir1/subdir1/ testdir1/file1

7. Bring back brick1. Bring down brick2. 

8. From the mount point, execute the following:

ln testdir1/file1 ./hard_link

9. Bring brick2 and brick3 back online.

10. Set "self-heal-daemon" to "on".

11. Trigger self-heal on the volume. 
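
The gluster CLI side of steps 1, 10 and 11 can be sketched roughly as below. This is a dry-run sketch, not the exact commands used for this report: the volume name and brick paths are copied from the "gluster v info" output under "Actual results", while the run and cli_steps helpers and the DRY_RUN variable are hypothetical names introduced here. The report does not say how the bricks were brought down and back up, or whether an index or a full heal was triggered, so those parts are left out or assumed.

```shell
#!/bin/sh
# Dry-run sketch of the CLI portion of the reproduction steps.
# VOL and BRICKS come from the "gluster v info" output in this report.
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 executes them.
VOL=vol_rep
BRICKS="rhs-client11:/rhs/bricks/b1 rhs-client12:/rhs/bricks/b1-rep1 rhs-client13:/rhs/bricks/b1-rep2"
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

cli_steps() {
    # Step 1: create and start a 1 x 3 replicate volume, self-heal daemon off.
    run gluster volume create "$VOL" replica 3 $BRICKS
    run gluster volume start "$VOL"
    run gluster volume set "$VOL" cluster.self-heal-daemon off

    # Steps 2-9 (file operations on the mount, bricks going down and up)
    # are not scripted here; the report does not say how bricks were stopped.

    # Step 10: turn the self-heal daemon back on.
    run gluster volume set "$VOL" cluster.self-heal-daemon on
    # Step 11: trigger self-heal (assumed to be the plain heal command).
    run gluster volume heal "$VOL"
}

cli_steps
```

With the default DRY_RUN=1 the script only lists the five commands, which makes it safe to review before running anything against a real cluster.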

Actual results:
==============
root@rhs-client11 [Nov-13-2013-17:36:04] >gluster v info
 
Volume Name: vol_rep
Type: Replicate
Volume ID: d75d19c8-fb2f-475e-915c-d24d4dede1e3
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: rhs-client11:/rhs/bricks/b1
Brick2: rhs-client12:/rhs/bricks/b1-rep1
Brick3: rhs-client13:/rhs/bricks/b1-rep2
Options Reconfigured:
cluster.self-heal-daemon: on
nfs.disable: on

root@rhs-client11 [Nov-13-2013-17:35:42] >gluster volume heal vol_rep info
Gathering list of entries to be healed on volume vol_rep has been successful 

Brick rhs-client11:/rhs/bricks/b1
Number of entries: 0

Brick rhs-client12:/rhs/bricks/b1-rep1
Number of entries: 0

Brick rhs-client13:/rhs/bricks/b1-rep2
Number of entries: 0

root@rhs-client11 [Nov-13-2013-17:35:50] >gluster volume heal vol_rep info healed
Gathering list of healed entries on volume vol_rep has been successful 

Brick rhs-client11:/rhs/bricks/b1
Number of entries: 1
at                    path on brick
-----------------------------------
2013-11-13 11:42:29 /

Brick rhs-client12:/rhs/bricks/b1-rep1
Number of entries: 2
at                    path on brick
-----------------------------------
2013-11-13 11:42:29 /testdir1
2013-11-13 11:50:36 /testdir1/file1

Brick rhs-client13:/rhs/bricks/b1-rep2
Number of entries: 2
at                    path on brick
-----------------------------------
2013-11-13 11:42:29 /testdir1/subdir1
2013-11-13 11:42:29 /testdir1/subdir1/file1


root@rhs-client11 [Nov-13-2013-17:35:54] >gluster volume heal vol_rep info heal-failed
Gathering list of heal failed entries on volume vol_rep has been successful 

Brick rhs-client11:/rhs/bricks/b1
Number of entries: 0

Brick rhs-client12:/rhs/bricks/b1-rep1
Number of entries: 0

Brick rhs-client13:/rhs/bricks/b1-rep2
Number of entries: 0


root@rhs-client11 [Nov-13-2013-17:36:00] >gluster volume heal vol_rep info split-brain
Gathering list of split brain entries on volume vol_rep has been successful 

Brick rhs-client11:/rhs/bricks/b1
Number of entries: 0

Brick rhs-client12:/rhs/bricks/b1-rep1
Number of entries: 0

Brick rhs-client13:/rhs/bricks/b1-rep2
Number of entries: 0

Brick1:-
=========================
root@rhs-client11 [Nov-13-2013-17:44:27] >getfattr -d -e hex -m . /rhs/bricks/b1/hard_link 
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b1/hard_link
trusted.afr.vol_rep-client-0=0x000000000000000000000000
trusted.afr.vol_rep-client-1=0x000000000000000000000000
trusted.afr.vol_rep-client-2=0x000000000000000000000000
trusted.gfid=0xb1a74bd8393f49a2b9acc2b7af5a0589

root@rhs-client11 [Nov-13-2013-17:45:38] >getfattr -d -e hex -m . /rhs/bricks/b1/testdir1/file1 
getfattr: /rhs/bricks/b1/testdir1/file1: No such file or directory
root@rhs-client11 [Nov-13-2013-17:45:49] >getfattr -h -d -e hex -m . /rhs/bricks/b1/testdir1/file1 
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b1/testdir1/file1
trusted.afr.vol_rep-client-0=0x000000000000000000000000
trusted.afr.vol_rep-client-1=0x000000000000000000000000
trusted.afr.vol_rep-client-2=0x000000000000000000000000
trusted.gfid=0x4b28f34d69964bee97bd3797ad0d03d9

root@rhs-client11 [Nov-13-2013-17:45:59] >stat /rhs/bricks/b1/testdir1/file1
  File: `/rhs/bricks/b1/testdir1/file1' -> `testdir1/subdir1/'
  Size: 17        	Blocks: 0          IO Block: 4096   symbolic link
Device: fd02h/64770d	Inode: 503317707   Links: 2
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-11-13 11:31:00.148318000 +0000
Modify: 2013-11-13 11:31:00.148318000 +0000
Change: 2013-11-13 11:50:36.507465308 +0000

root@rhs-client11 [Nov-13-2013-17:46:46] >stat /rhs/bricks/b1/hard_link
  File: `/rhs/bricks/b1/hard_link'
  Size: 1048576   	Blocks: 2048       IO Block: 4096   regular file
Device: fd02h/64770d	Inode: 503317706   Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-11-13 11:25:19.584037000 +0000
Modify: 2013-11-13 11:25:19.778028000 +0000
Change: 2013-11-13 12:04:39.454975411 +0000

root@rhs-client11 [Nov-13-2013-17:55:53] >ls -l /rhs/bricks/b1/testdir1/file1/
ls: cannot access /rhs/bricks/b1/testdir1/file1/: No such file or directory
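
For reading the trusted.afr.* values in the getfattr output, a small decoder can help. The sketch below assumes the commonly documented AFR changelog layout, where the 12-byte value holds three 32-bit big-endian pending counters in the order data, metadata, entry. That layout is an assumption and is not stated in this report; decode_afr is a hypothetical helper name.

```shell
#!/bin/sh
# decode_afr HEX: split a trusted.afr.* value into three pending counters.
# Assumed layout (not stated in this report): 3 x 32-bit big-endian
# counters in the order data, metadata, entry.
decode_afr() {
    hex=${1#0x}                      # drop the leading 0x if present
    if [ "${#hex}" -ne 24 ]; then
        echo "expected 24 hex digits, got ${#hex}" >&2
        return 1
    fi
    data=$(printf '%s' "$hex" | cut -c1-8)
    meta=$(printf '%s' "$hex" | cut -c9-16)
    entry=$(printf '%s' "$hex" | cut -c17-24)
    echo "data=$((0x$data)) metadata=$((0x$meta)) entry=$((0x$entry))"
}

# Value reported for every client xattr on every brick in this report.
decode_afr 0x000000000000000000000000
```

Under that assumption, the all-zero values shown for every brick mean no pending operations are recorded against any replica, which is consistent with the empty "info" and "info split-brain" listings.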


Brick2:-
===============
root@rhs-client12 [Nov-13-2013-17:50:11] >getfattr -d -e hex -m . /rhs/bricks/b1-rep1/hard_link 
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b1-rep1/hard_link
trusted.afr.vol_rep-client-0=0x000000000000000000000000
trusted.afr.vol_rep-client-1=0x000000000000000000000000
trusted.afr.vol_rep-client-2=0x000000000000000000000000
trusted.gfid=0xb1a74bd8393f49a2b9acc2b7af5a0589

root@rhs-client12 [Nov-13-2013-17:50:25] >
root@rhs-client12 [Nov-13-2013-17:50:26] >stat /rhs/bricks/b1-rep1/hard_link
  File: `/rhs/bricks/b1-rep1/hard_link'
  Size: 1048576   	Blocks: 2048       IO Block: 4096   regular file
Device: fd02h/64770d	Inode: 704743238   Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-11-13 11:25:19.584037000 +0000
Modify: 2013-11-13 11:25:19.778028000 +0000
Change: 2013-11-13 12:04:39.455399440 +0000
root@rhs-client12 [Nov-13-2013-17:50:28] >
root@rhs-client12 [Nov-13-2013-17:50:30] >getfattr -d -e hex -m . -h /rhs/bricks/b1-rep1/testdir1/file1 
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b1-rep1/testdir1/file1
trusted.afr.vol_rep-client-0=0x000000000000000000000000
trusted.afr.vol_rep-client-1=0x000000000000000000000000
trusted.afr.vol_rep-client-2=0x000000000000000000000000
trusted.gfid=0x4b28f34d69964bee97bd3797ad0d03d9

root@rhs-client12 [Nov-13-2013-17:50:43] >
root@rhs-client12 [Nov-13-2013-17:50:45] >
root@rhs-client12 [Nov-13-2013-17:50:45] >stat /rhs/bricks/b1-rep1/testdir1/file1
  File: `/rhs/bricks/b1-rep1/testdir1/file1' -> `testdir1/subdir1/'
  Size: 17        	Blocks: 0          IO Block: 4096   symbolic link
Device: fd02h/64770d	Inode: 1476414598  Links: 2
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-11-13 11:31:00.148318769 +0000
Modify: 2013-11-13 11:31:00.148318769 +0000
Change: 2013-11-13 11:50:36.507342556 +0000
root@rhs-client12 [Nov-13-2013-17:51:01] >

root@rhs-client12 [Nov-13-2013-17:51:01] >ls -l /rhs/bricks/b1-rep1/testdir1/file1/
ls: cannot access /rhs/bricks/b1-rep1/testdir1/file1/: No such file or directory

Brick3:-
===============
root@rhs-client13 [Nov-13-2013-17:52:43] >getfattr -d -e hex -m . /rhs/bricks/b1-rep2/hard_link 
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b1-rep2/hard_link
trusted.afr.vol_rep-client-0=0x000000000000000000000000
trusted.afr.vol_rep-client-1=0x000000000000000000000000
trusted.afr.vol_rep-client-2=0x000000000000000000000000
trusted.gfid=0xb1a74bd8393f49a2b9acc2b7af5a0589

root@rhs-client13 [Nov-13-2013-17:52:53] >stat /rhs/bricks/b1-rep2/hard_link
  File: `/rhs/bricks/b1-rep2/hard_link'
  Size: 1048576   	Blocks: 2048       IO Block: 4096   regular file
Device: fd02h/64770d	Inode: 1543505162  Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-11-13 11:25:19.584037000 +0000
Modify: 2013-11-13 11:25:19.778028000 +0000
Change: 2013-11-13 12:04:39.434871867 +0000
root@rhs-client13 [Nov-13-2013-17:52:56] >
root@rhs-client13 [Nov-13-2013-17:53:41] >getfattr -d -e hex -m . -h /rhs/bricks/b1-rep2/testdir1/file1 
getfattr: Removing leading '/' from absolute path names
# file: rhs/bricks/b1-rep2/testdir1/file1
trusted.afr.vol_rep-client-0=0x000000000000000000000000
trusted.afr.vol_rep-client-1=0x000000000000000000000000
trusted.afr.vol_rep-client-2=0x000000000000000000000000
trusted.gfid=0x4b28f34d69964bee97bd3797ad0d03d9

root@rhs-client13 [Nov-13-2013-17:53:53] >
root@rhs-client13 [Nov-13-2013-17:53:55] >stat /rhs/bricks/b1-rep2/testdir1/file1
  File: `/rhs/bricks/b1-rep2/testdir1/file1' -> `testdir1/subdir1/'
  Size: 17        	Blocks: 0          IO Block: 4096   symbolic link
Device: fd02h/64770d	Inode: 1543505167  Links: 2
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-11-13 11:31:00.148318000 +0000
Modify: 2013-11-13 11:31:00.148318000 +0000
Change: 2013-11-13 11:50:36.483104761 +0000
root@rhs-client13 [Nov-13-2013-17:53:58] >
root@rhs-client13 [Nov-13-2013-17:53:59] >ls -l /rhs/bricks/b1-rep2/testdir1/file1/
ls: cannot access /rhs/bricks/b1-rep2/testdir1/file1/: No such file or directory
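
On all three bricks, stat reports "Links: 2" for both hard_link and testdir1/file1, while the mount point reports "Links: 1" for hard_link. A likely explanation, assuming the usual brick layout in which each file also has a gfid handle hard-linked under .glusterfs, is that the second link on the brick is that handle and not a user-visible hard link. The report itself does not confirm this. The sketch below only derives the path where such a handle would be expected from a trusted.gfid value; gfid_handle is a hypothetical helper name.

```shell
#!/bin/sh
# gfid_handle HEX: print the brick-relative path where the gfid handle
# would be expected, assuming the .glusterfs/<aa>/<bb>/<uuid> layout.
gfid_handle() {
    hex=${1#0x}                      # drop the leading 0x if present
    if [ "${#hex}" -ne 32 ]; then
        echo "expected 32 hex digits, got ${#hex}" >&2
        return 1
    fi
    a=$(printf '%s' "$hex" | cut -c1-2)
    b=$(printf '%s' "$hex" | cut -c3-4)
    # Re-insert the dashes of the canonical 8-4-4-4-12 UUID form.
    uuid=$(printf '%s' "$hex" |
        sed 's/^\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)$/\1-\2-\3-\4-\5/')
    echo ".glusterfs/$a/$b/$uuid"
}

# trusted.gfid of hard_link as reported on all three bricks.
gfid_handle 0xb1a74bd8393f49a2b9acc2b7af5a0589
```

Comparing the inode number of that path with the inode of hard_link on the same brick (for example with "stat -c %i") would show whether the handle accounts for the second link.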

From mount point:-
===================
root@rhs-client14 [Nov-13-2013-17:40:18] >pwd
/mnt/gm1

root@rhs-client14 [Nov-13-2013-17:38:36] >ls -l
total 1024
-rw-r--r-- 1 root root 1048576 Nov 13 11:25 hard_link
drwxr-xr-x 3 root root      32 Nov 13 11:31 testdir1

root@rhs-client14 [Nov-13-2013-17:38:40] >stat hard_link
  File: `hard_link'
  Size: 1048576   	Blocks: 2048       IO Block: 131072 regular file
Device: 1dh/29d	Inode: 13379282687187617161  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2013-11-13 11:25:19.584037000 +0000
Modify: 2013-11-13 11:25:19.778028000 +0000
Change: 2013-11-13 12:04:39.454975411 +0000
root@rhs-client14 [Nov-13-2013-17:38:47] >

root@rhs-client14 [Nov-13-2013-17:38:49] >ls -l testdir1/
total 0
lrwxrwxrwx 1 root root 17 Nov 13 11:31 file1 -> testdir1/subdir1/
drwxr-xr-x 3 root root 18 Nov 13 11:29 subdir1


root@rhs-client14 [Nov-13-2013-17:39:02] >readlink testdir1/file1 
testdir1/subdir1/

root@rhs-client14 [Nov-13-2013-17:39:14] >ls -l testdir1/file1
lrwxrwxrwx 1 root root 17 Nov 13 11:31 testdir1/file1 -> testdir1/subdir1/

root@rhs-client14 [Nov-13-2013-17:39:17] >ls -l testdir1/file1/
ls: cannot access testdir1/file1/: No such file or directory
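
The report does not record why the symlink fails to resolve. One thing worth ruling out is how the target given in step 6 resolves on a plain local filesystem, independent of gluster: a relative symlink target is resolved against the directory that contains the link, not against the directory where ln was run. The sketch below recreates the layout in a scratch directory. The file1_rel link and the variable names are hypothetical additions for comparison and are not part of the original steps.

```shell
#!/bin/sh
# Recreate the layout from step 2 and the link from step 6 in a scratch
# directory on a local filesystem (no gluster involved).
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir -p testdir1/subdir1

# Same command as step 6: the target string is stored verbatim in the link
# and later resolved relative to testdir1/, the link's own directory.
ln -s testdir1/subdir1/ testdir1/file1
readlink testdir1/file1
if [ -d testdir1/file1/ ]; then
    link_as_reported=resolves
else
    link_as_reported=dangling
fi

# Hypothetical variant: a target written relative to the link's directory.
ln -s subdir1/ testdir1/file1_rel
if [ -d testdir1/file1_rel/ ]; then
    link_relative=resolves
else
    link_relative=dangling
fi

echo "step-6 link: $link_as_reported; sibling-relative link: $link_relative"

# Clean up the scratch directory.
cd / && rm -rf "$scratch"
```

If the step-6 link is already dangling on a local filesystem, the "No such file or directory" result on the mount and on the bricks would not by itself indicate a self-heal failure.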

Expected results:
=================
Conservative merge of symbolic links and hard links should be successful.

Comment 3 spandura 2013-12-19 09:46:57 UTC
Retested the case on build "glusterfs 3.4.0.49rhs built on Dec 11 2013 08:17:06". The case works fine.

Moving the case to the closed state -> NOTABUG.