Bug 1005474

Summary: DHT : wrong holes and overlaps count in anomalies log message
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Rachana Patel <racpatel>
Component: distribute
Assignee: Nithya Balachandran <nbalacha>
Status: CLOSED DEFERRED
QA Contact: storage-qa-internal <storage-qa-internal>
Severity: medium
Priority: medium
Version: 2.1
CC: kdhananj, mzywusko, nsathyan, rhs-bugs, spalai, vbellur
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Clone Of:
: 1286162
Last Closed: 2015-11-27 12:07:41 UTC
Type: Bug
Bug Blocks: 1002987, 1286162

Description Rachana Patel 2013-09-07 12:08:38 UTC
Description of problem:
DHT : wrong holes and overlaps count in anomalies log message 

Version-Release number of selected component (if applicable):
3.4.0.32rhs-1.el6_4.x86_64

How reproducible:
always

Steps to Reproduce:
1. Create a DHT volume and FUSE-mount it.
2. While one brick is down, create directories and files.

[root@DHT1 1]# gluster v status
Status of volume: testdht
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.37.195:/rhs/brick1/1			N/A	N	435
Brick 10.70.37.195:/rhs/brick1/2			49167	Y	2048
Brick 10.70.37.98:/rhs/brick1/1				49154	Y	29098
NFS Server on localhost					2049	Y	2327
NFS Server on 10.70.37.98				2049	Y	29111
 
There are no active volume tasks

mount:-
[root@rhs-client22 dhttest]# cd up1
[root@rhs-client22 up1]# mkdir down{1..10}
mkdir: cannot create directory `down1': Transport endpoint is not connected
mkdir: cannot create directory `down2': Transport endpoint is not connected
mkdir: cannot create directory `down4': Transport endpoint is not connected
mkdir: cannot create directory `down5': Transport endpoint is not connected
mkdir: cannot create directory `down10': Transport endpoint is not connected
[root@rhs-client22 up1]# cd down1
-bash: cd: down1: No such file or directory
[root@rhs-client22 up1]# cd down3
[root@rhs-client22 down3]# touch down{1..10}
[root@rhs-client22 down3]# cd ..
[root@rhs-client22 up1]# touch fdown{1..10}
touch: cannot touch `fdown6': Transport endpoint is not connected
touch: cannot touch `fdown9': Transport endpoint is not connected


3. Bring that brick back up and, from the mount point, issue a lookup on those directories.
[root@DHT1 1]# gluster v status
Status of volume: testdht
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.37.195:/rhs/brick1/1			49166	Y	435
Brick 10.70.37.195:/rhs/brick1/2			49167	Y	2048
Brick 10.70.37.98:/rhs/brick1/1				49154	Y	29098
NFS Server on localhost					2049	Y	2327
NFS Server on 10.70.37.98				2049	Y	29111
 
There are no active volume tasks

mount:-
[root@rhs-client22 up1]# ls -l
total 0
drwxr-xr-x. 2 root root 139 Sep  7 02:07 down3
drwxr-xr-x. 2 root root  18 Sep  7 02:07 down6
drwxr-xr-x. 2 root root  18 Sep  7 02:07 down7
drwxr-xr-x. 2 root root  18 Sep  7 02:07 down8
drwxr-xr-x. 2 root root  18 Sep  7 02:07 down9
-rw-r--r--. 1 root root   0 Sep  7 04:22 f1
-rw-r--r--. 1 root root   0 Sep  7 04:22 f10
-rw-r--r--. 1 root root   0 Sep  7 04:22 f2
-rw-r--r--. 1 root root   0 Sep  7 04:22 f3
-rw-r--r--. 1 root root   0 Sep  7 04:22 f4
-rw-r--r--. 1 root root   0 Sep  7 04:22 f5
-rw-r--r--. 1 root root   0 Sep  7 04:22 f6
-rw-r--r--. 1 root root   0 Sep  7 04:22 f7
-rw-r--r--. 1 root root   0 Sep  7 04:22 f8
-rw-r--r--. 1 root root   0 Sep  7 04:22 f9
-rw-r--r--. 1 root root   0 Sep  7 04:24 fdown1
-rw-r--r--. 1 root root   0 Sep  7 04:24 fdown10
-rw-r--r--. 1 root root   0 Sep  7 04:24 fdown2
-rw-r--r--. 1 root root   0 Sep  7 04:24 fdown3
-rw-r--r--. 1 root root   0 Sep  7 04:24 fdown4
-rw-r--r--. 1 root root   0 Sep  7 04:24 fdown5
-rw-r--r--. 1 root root   0 Sep  7 04:24 fdown7
-rw-r--r--. 1 root root   0 Sep  7 04:24 fdown8


backend:-

previously down
[root@DHT1 1]# getfattr -d -m . -e hex /rhs/brick1/1/up1/down3
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/1/up1/down3
trusted.gfid=0xc052b59c4d644d0f9002136f83975e10

[root@DHT1 1]# getfattr -d -m . -e hex /rhs/brick1/2/up1/down3
# file: rhs/brick1/2/up1/down3
trusted.gfid=0xc052b59c4d644d0f9002136f83975e10
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff
[root@DHT3 1]# getfattr -d -m . -e hex /rhs/brick1/*/up1/down3
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/1/up1/down3
trusted.gfid=0xc052b59c4d644d0f9002136f83975e10
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe


log says:-
[2013-09-07 08:26:25.271806] I [dht-layout.c:633:dht_layout_normalize] 0-testdht-dht: found anomalies in /up1/down3. holes=1 overlaps=0 missing=1 down=0 misc=0
[2013-09-07 08:26:25.276118] I [dht-layout.c:633:dht_layout_normalize] 0-testdht-dht: found anomalies in /up1/down6. holes=1 overlaps=0 missing=1 down=0 misc=0
[2013-09-07 08:26:25.280470] I [dht-layout.c:633:dht_layout_normalize] 0-testdht-dht: found anomalies in /up1/down7. holes=1 overlaps=0 missing=1 down=0 misc=0
[2013-09-07 08:26:25.284581] I [dht-layout.c:633:dht_layout_normalize] 0-testdht-dht: found anomalies in /up1/down8. holes=1 overlaps=0 missing=1 down=0 misc=0
[2013-09-07 08:26:25.288604] I [dht-layout.c:633:dht_layout_normalize] 0-testdht-dht: found anomalies in /up1/down9. holes=1 overlaps=0 missing=1 down=0 misc=0

--> The holes count is wrong; there are no holes in these layouts.
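
The "no holes" observation can be checked by decoding the two trusted.glusterfs.dht values shown above for /up1/down3. This is a minimal sketch, not GlusterFS code. It assumes the 16-byte xattr is four big-endian 32-bit integers and that the last two are the inclusive start and stop of the hash range; decode_dht_xattr is a hypothetical helper name.

```python
import struct

def decode_dht_xattr(hex_value):
    """Decode a trusted.glusterfs.dht hex string into (start, stop).

    Assumption: the value is four big-endian 32-bit integers and the
    last two are the inclusive start and stop of the hash range.
    """
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    _first, _second, start, stop = struct.unpack(">4I", bytes.fromhex(digits))
    return start, stop

# Values copied from the getfattr output for /up1/down3.
ranges = sorted([
    decode_dht_xattr("0x00000001000000007fffffffffffffff"),  # DHT1 /rhs/brick1/2
    decode_dht_xattr("0x0000000100000000000000007ffffffe"),  # DHT3 /rhs/brick1/1
])

for start, stop in ranges:
    print(f"{start:#010x} - {stop:#010x}")

# The ranges tile the 32-bit hash space if they start at 0, end at
# 0xffffffff, and each range begins right after the previous one ends.
covers_all = (
    ranges[0][0] == 0
    and ranges[-1][1] == 0xFFFFFFFF
    and all(a[1] + 1 == b[0] for a, b in zip(ranges, ranges[1:]))
)
print("covers full hash space:", covers_all)
```

Under that assumption the two bricks that carry a layout cover the whole hash space between them, and the previously down brick simply has no layout xattr, which matches missing=1.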


Case 2:-
4. Bring down a different subvolume.


[root@DHT1 1]# gluster v status
Status of volume: testdht
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick 10.70.37.195:/rhs/brick1/1			49166	Y	2420
Brick 10.70.37.195:/rhs/brick1/2			49167	Y	2048
Brick 10.70.37.98:/rhs/brick1/1				N/A	N	29098
NFS Server on localhost					2049	Y	2433
NFS Server on 10.70.37.98				2049	Y	29174
 
There are no active volume tasks

backend:-
previously down:-
[root@DHT1 1]# getfattr -d -m . -e hex /rhs/brick1/1/up1/
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/1/up1/
trusted.gfid=0xfabc7fc9a0e84ea4845d714deb5b66d7
trusted.glusterfs.dht=0x00000001000000000000000055555554

[root@DHT1 1]# getfattr -d -m . -e hex /rhs/brick1/2/up1/
# file: rhs/brick1/2/up1/
trusted.gfid=0xfabc7fc9a0e84ea4845d714deb5b66d7
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9

currently down:-
[root@DHT3 1]# getfattr -d -m . -e hex /rhs/brick1/*/up1/
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/1/up1/
trusted.gfid=0xfabc7fc9a0e84ea4845d714deb5b66d7
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff


[2013-09-07 08:58:43.238110] I [dht-layout.c:633:dht_layout_normalize] 0-testdht-dht: found anomalies in /up1. holes=1 overlaps=1 missing=0 down=1 misc=0

--> holes=1 is correct, but why overlaps=1? The overlap count is wrong.
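
To illustrate the expected counts for /up1, the sketch below counts gaps and overlaps over the three ranges from the getfattr output above. It is a simplified, independent count and not the logic of dht_layout_normalize in dht-layout.c. The ranges are taken from the last two 32-bit words of each xattr, which is an assumption about the on-disk format, and count_anomalies is a hypothetical helper name.

```python
FULL_MAX = 0xFFFFFFFF  # top of the 32-bit hash space

def count_anomalies(ranges):
    """Return (holes, overlaps) for inclusive (start, stop) ranges
    measured against the full 32-bit hash space."""
    holes = overlaps = 0
    expected = 0  # next hash value that should be covered
    for start, stop in sorted(ranges):
        if start > expected:
            holes += 1      # gap before this range
        elif start < expected:
            overlaps += 1   # this range starts inside an earlier one
        expected = max(expected, stop + 1)
    if expected <= FULL_MAX:
        holes += 1          # gap at the top of the hash space
    return holes, overlaps

# Ranges for /up1, keyed by host prompt and brick path from the report.
up1 = {
    "DHT1 brick1/1": (0x00000000, 0x55555554),
    "DHT1 brick1/2": (0x55555555, 0xAAAAAAA9),
    "DHT3 brick1/1": (0xAAAAAAAA, 0xFFFFFFFF),  # currently down
}

all_up = count_anomalies(up1.values())
one_down = count_anomalies(
    [r for name, r in up1.items() if name != "DHT3 brick1/1"]
)
print("all bricks up:", all_up)
print("DHT3 brick down:", one_down)
```

With all three ranges present this count finds no holes and no overlaps. With the DHT3 range removed it finds one hole and no overlaps, which is the result the reporter expected in place of overlaps=1.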

mount:-
[root@rhs-client22 up1]# ls -lr down3
total 0
-rw-r--r--. 1 root root 0 Sep  7 04:24 down9
-rw-r--r--. 1 root root 0 Sep  7 04:24 down8
-rw-r--r--. 1 root root 0 Sep  7 04:24 down7
-rw-r--r--. 1 root root 0 Sep  7 04:24 down6
-rw-r--r--. 1 root root 0 Sep  7 04:24 down5
-rw-r--r--. 1 root root 0 Sep  7 04:24 down4
-rw-r--r--. 1 root root 0 Sep  7 04:24 down3
-rw-r--r--. 1 root root 0 Sep  7 04:24 down2
-rw-r--r--. 1 root root 0 Sep  7 04:24 down10
-rw-r--r--. 1 root root 0 Sep  7 04:24 down1

log:-
[2013-09-07 08:58:44.979442] I [dht-layout.c:633:dht_layout_normalize] 0-testdht-dht: found anomalies in /up1/down3. holes=2 overlaps=0 missing=0 down=1 misc=0

--> wrong holes count


Actual results:
The anomalies log message reports wrong holes and overlaps counts.

Expected results:


Additional info:

Comment 3 Scott Haines 2013-09-27 17:08:10 UTC
Targeting for 3.0.0 (Denali) release.

Comment 4 Nagaprasad Sathyanarayana 2014-05-06 11:43:41 UTC
Dev ack to 3.0 RHS BZs