Bug 1266359 - folder not listing after fix-layout
Summary: folder not listing after fix-layout
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.7.3
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard: dht-directory-consistency, dht-must-fix
Depends On: 1248393 1483402 1483828
Blocks:
 
Reported: 2015-09-25 06:33 UTC by amudhan83
Modified: 2019-12-16 10:41 UTC (History)
CC List: 8 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-03-08 10:48:24 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:
amudhan83: needinfo-


Attachments
ls of root folder before and after running fix-layout (8.15 KB, text/plain)
2015-09-25 06:33 UTC, amudhan83

Description amudhan83 2015-09-25 06:33:00 UTC
Created attachment 1076877
ls of root folder before and after running fix-layout

Description of problem:
Folders are not listed after adding new bricks and starting a rebalance fix-layout.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. 20 nodes, each with 36 bricks.

2. Create a disperse volume with 8 + 2, adding node{1..10}:/brick1, node{1..10}:/brick2, and so on.

3. Create a folder hierarchy like this:

/root1/subroot1/aaaa/{a..z}/par{1..500}/chil{1..3}
/root1/subroot1/bbbb/{a..z}/par{1..500}/chil{1..3}
/root1/subroot2/aaaa/{a..z}/par{1..500}/chil{1..3}
/root1/subroot2/bbbb/{a..z}/par{1..500}/chil{1..3}
/root2/aaaa/{a..z}/par{1..500}/chil{1..3}
/root3/aaaa/{a..z}/par{1..500}/chil{1..3}
/root3/bbbb/{a..z}/par{1..500}/chil{1..3}
/root3/bbbb/{a..z}/par{1..500}/chil{1..3}

4. In each child folder, create 10 files.

5. Run ls -R and confirm that all folders are listed.

6. Add new bricks to the existing cluster:

add brick node{21..30}:/brick1 node{21..30}:/brick2, and so on

7. Start rebalance or fix-layout.

8. Run ls -R again. Some folders will be missing from the listing.
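
For anyone trying to reproduce this on a smaller scale, the folder hierarchy from step 3 could be generated with something like the sketch below. It is only an illustration, not the script used for this report: the function names, the file names, and the scaled-down parameters in the demo are assumptions, and it should be pointed at a Gluster mount instead of a temporary directory for a real test. The report lists root3/bbbb twice, so only the seven distinct prefixes are used here.

```python
import itertools
import string
import tempfile
from pathlib import Path

# Distinct top-level prefixes taken from step 3 of the report.
PREFIXES = [
    "root1/subroot1/aaaa",
    "root1/subroot1/bbbb",
    "root1/subroot2/aaaa",
    "root1/subroot2/bbbb",
    "root2/aaaa",
    "root3/aaaa",
    "root3/bbbb",
]


def build_tree(mount, letters=string.ascii_lowercase, parents=500, children=3, files=10):
    """Create <prefix>/<letter>/par<N>/chil<M> and `files` empty files in each leaf.

    Returns the number of leaf (chil*) directories created.
    """
    leaves = 0
    for prefix, letter, p, c in itertools.product(
        PREFIXES, letters, range(1, parents + 1), range(1, children + 1)
    ):
        leaf = Path(mount) / prefix / letter / f"par{p}" / f"chil{c}"
        leaf.mkdir(parents=True, exist_ok=True)
        for i in range(files):
            (leaf / f"file{i}").touch()  # hypothetical file names
        leaves += 1
    return leaves


def list_dirs(mount):
    """Return every directory below `mount` as a set of relative paths."""
    root = Path(mount)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()}


if __name__ == "__main__":
    # Scaled-down demo in a temporary directory (not a Gluster mount).
    with tempfile.TemporaryDirectory() as tmp:
        n = build_tree(tmp, letters="ab", parents=2, children=3, files=1)
        print(n, "leaf directories,", len(list_dirs(tmp)), "directories in total")
```

Saving the result of list_dirs() before adding bricks and comparing it with a second call made while fix-layout is running gives the set of folders that have dropped out of the listing.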

Actual results:

Random folders are not listed for some time.

Expected results:

All folders should be listed even while rebalance is running.

Additional info:

Starting fix-layout makes folders unavailable until it completes; some folders take a day to be listed again.

Comment 1 Backer 2015-09-26 06:43:09 UTC
"Folders list before starting fix layout."

 root@root1:~$ date && ls /mnt/gluster/
 Tue Sep 22 13:57:38 IST 2015
 DTS  Incoming  Packages  Prores

root@root1:~$ date && ls /mnt/gluster/Packages/Features/DCI/
 Tue Sep 22 13:57:59 IST 2015
 0  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X Y  Z

 root@root1:~$ date && ls /mnt/gluster/Packages/Features/MPEG/
 Tue Sep 22 13:58:25 IST 2015
 0  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X Y  Z

 "Folders list after starting fix layout."

 root@root1:~$ date && ls /mnt/gluster/
 Tue Sep 22 15:21:08 IST 2015 
 Incoming  Packages  Prores 

***** ls doesn't show the DTS folder, which was listed before running fix-layout *****

 root@root1:~$ date && ls  /mnt/gluster/Packages/Features/DCI/
 Tue Sep 22 15:21:14 IST 2015
 0  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X Y  Z 

 root@root1:~$ date && ls  /mnt/gluster/Packages/Features/MPEG/
 Tue Sep 22 15:21:18 IST 2015
 0  A  C  E  G  H  I  J  K  L  N  P  Q  R  S  T  U  Y

**** Here some sub folders are missing (B,D,F,M,O,V,W,X,Z)**********
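
The missing entries can be double-checked mechanically by diffing the two listings. The snippet below is a small sketch that uses the ls output pasted above as plain strings; the helper name is made up for illustration.

```python
def missing_entries(before, after):
    """Return names present in `before` but absent from `after`, in listing order."""
    seen = set(after.split())
    return [name for name in before.split() if name not in seen]


# Listings copied from the ls output before and after starting fix-layout.
root_before = "DTS  Incoming  Packages  Prores"
root_after = "Incoming  Packages  Prores"
mpeg_before = "0  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X Y  Z"
mpeg_after = "0  A  C  E  G  H  I  J  K  L  N  P  Q  R  S  T  U  Y"

print(missing_entries(root_before, root_after))
# ['DTS']
print(missing_entries(mpeg_before, mpeg_after))
# ['B', 'D', 'F', 'M', 'O', 'V', 'W', 'X', 'Z']
```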

Comment 2 Nithya Balachandran 2015-09-29 08:52:53 UTC
xattr information for the volume brick roots was sent by Amudhan over email:

For DTS :


./hashcompute DTS
Name = DTS, hash = 3885892214, (hex = 0xe79e0276)

Looking at the xattrs set on the brick root on the newly added nodes, the subvol whose range this falls into is qubevaultdr-disperse-97:

From  qubevaultdrdn021:

qubevaultdrdn021
Packages
# file: media/disk26/brick26/
trusted.ec.version=0x00000000000000010000000000000020
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0000000100000000e5ed0937e84bd9cd
trusted.glusterfs.dht.commithash=0x3239353333363538393000
trusted.glusterfs.volume-id=0x2b575b5cdf2e449cabb9c56cec27e609


DTS does not yet exist on this subvolume, so readdirp never returned it. Because this is the hashed subvolume, the entry will not be listed until DTS is created on this subvol by a directory heal operation.
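
The range check described above can be reproduced by hand. The sketch below assumes that the 16-byte trusted.glusterfs.dht value is four big-endian 32-bit integers and that the last two are the start and end of the hash range, which matches how the value is read in this comment; the names given to the first two fields are a guess. The hash of "DTS" is taken from the hashcompute output above, not recomputed.

```python
import struct


def parse_dht_xattr(hex_value):
    """Split a trusted.glusterfs.dht value into four big-endian 32-bit fields.

    Assumed layout: (count, type, range_start, range_stop).
    """
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    return struct.unpack(">IIII", raw)


def in_range(name_hash, start, stop):
    """True if the hash falls inside the inclusive range [start, stop]."""
    return start <= name_hash <= stop


# Value from the brick root on qubevaultdrdn021, as quoted above.
_, _, start, stop = parse_dht_xattr("0x0000000100000000e5ed0937e84bd9cd")
dts_hash = 3885892214  # 0xe79e0276, from ./hashcompute DTS

print(hex(start), hex(stop), in_range(dts_hash, start, stop))
# 0xe5ed0937 0xe84bd9cd True
```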


This happens because fix-layout is performed in a depth-first manner. In a volume with a very large number of directories, it can take a considerable amount of time for some directories to be healed and hence become visible.
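
As a toy illustration of why a depth-first crawl delays some directories, the sketch below walks a small in-memory tree and records the visit order. It is not the rebalance code; the tree is built from directory names mentioned in this bug, and the order of the top-level siblings is an assumption made for the example.

```python
def depth_first_order(tree, path=""):
    """Yield directory paths in the order a depth-first crawl would reach them."""
    for name, children in tree.items():
        full = f"{path}/{name}"
        yield full
        # Descend into the whole subtree before moving to the next sibling.
        yield from depth_first_order(children, full)


# Hypothetical layout: Packages happens to be crawled before DTS.
tree = {
    "Packages": {"Features": {"DCI": {}, "MPEG": {}}},
    "DTS": {},
}

order = list(depth_first_order(tree))
print(order)
# ['/Packages', '/Packages/Features', '/Packages/Features/DCI', '/Packages/Features/MPEG', '/DTS']
print(order.index("/DTS"))
# 4
```

In this model DTS is reached only after everything under Packages has been processed, so with hundreds of thousands of directories under an earlier sibling the wait can be long.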

Comment 3 Niels de Vos 2015-09-29 12:17:48 UTC
Amudhan, Backer, was this resolved with a directory heal as suggested by Nithya?

Comment 4 Backer 2015-09-29 12:38:11 UTC
After the fix-layout finished, we are able to see all the files and folders. It took nearly 48 hours for the fix-layout to complete and for all the folders to be displayed.
So the current implementation of distributed disperse volumes doesn't provide high availability of all folders. Some folders were missing during fix-layout, so we couldn't perform any file operations on them while it ran. This needs to be addressed.

Comment 5 Kaushal 2017-03-08 10:48:24 UTC
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.

