Description of problem:
When a user adds a new brick to a DHT volume, self-heal should create the existing directories on the newly added brick, but it should not fix the layout for them. Currently, self-heal creates the directories on the newly added brick and also fixes the hash layout for a few of them.

Version-Release number of selected component (if applicable):
3.4.0.53rhs-1.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a DHT volume and mount it using FUSE.
2. Create directories from the mount point.
4. Add a new brick to this volume and do not run rebalance.
5. From the mount point, execute commands that look up those directories.
6. Verify the hash layout for those directories on the newly added brick.

Actual results:
Directory hash layout on the newly added brick:

# file: rhs/brick1/b5/new20
trusted.gfid=0x3bcc2d8e106c4b37804ea4dee34566e6
trusted.glusterfs.dht=0x00000001000000000000000033333332 <------------

# file: rhs/brick1/b5/new21
trusted.gfid=0x2bfeeeccc18b4e5bac002bc7e04c23d4
trusted.glusterfs.dht=0x00000001000000000000000000000000

# file: rhs/brick1/b5/new22
trusted.gfid=0xc10feccfcae448dd856d44436b9fd8bc
trusted.glusterfs.dht=0x00000001000000000000000000000000

# file: rhs/brick1/b5/new23
trusted.gfid=0x0afb356f1ea448deb32a211fd6049aad
trusted.glusterfs.dht=0x00000001000000000000000033333332 <--------------

Expected results:
The hash layout for an existing directory should not be fixed by self-heal; rebalance should do it.

Additional info:
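The trusted.glusterfs.dht values above are easier to compare once decoded. The following is a minimal sketch, assuming the value is 16 bytes laid out as four big-endian 32-bit words, with the last two words holding the start and end of the hash range assigned to the directory on that brick. That reading of the on-disk format is an assumption and is not stated in this report; `decode_dht_layout` and `has_hash_range` are hypothetical helper names, and the first two words are left uninterpreted.

```python
import struct


def decode_dht_layout(hex_value):
    """Split a trusted.glusterfs.dht value (getfattr -e hex form) into words.

    Assumption: 16 bytes, four big-endian uint32 words, where the last two
    are the start and stop of the hash range on this brick.
    """
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 16:
        raise ValueError("expected 16 bytes, got %d" % len(raw))
    word0, word1, start, stop = struct.unpack(">4I", raw)
    return {"word0": word0, "word1": word1, "start": start, "stop": stop}


def has_hash_range(hex_value):
    """True when the decoded range is anything other than 0-0."""
    layout = decode_dht_layout(hex_value)
    return not (layout["start"] == 0 and layout["stop"] == 0)


# Values reported above for the newly added brick.
print(has_hash_range("0x00000001000000000000000033333332"))  # new20: True
print(has_hash_range("0x00000001000000000000000000000000"))  # new21: False
```

Under that assumption, new20 and new23 carry the range 0x00000000 to 0x33333332 on the new brick, while new21 and new22 still have the 0-0 range that the reporter expects for every directory until rebalance runs.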
Created attachment 846999: log
Created attachment 847001: log
Tested on the latest Corbett code base from RHS and was able to reproduce the issue. I was also able to reproduce it on the code base from the v3.4.0.33rhs tag, which was shipped (I think) for BigBend. So this does not seem to be a regression between releases.

The issue is consistent when we start with 3 bricks and then add the 4th brick with the mentioned steps.

Raghavendra Talur is also running the same test on BigBend and Corbett to confirm this and to ensure that it is not a regression introduced in Corbett.

Currently the suspicion is that the following commit is causing the issue (not confirmed yet):
commit d5c275864b42ceaa7fdea3610fca2f75fa48526e
cluster/dht: assign layout onto missing directories too

Steps followed:

1) Create a 3 brick setup
   gluster volume create test somari:/tmp/brick_1 somari:/tmp/brick_2 somari:/tmp/brick_3

2) Create 20 dirs in the FUSE mounted "test" volume
   for i in {1..20}; do mkdir /mnt/glusterfs/bug1049766/dir$i; done

3) Check the layout in the backend
   for i in {1..20}; do
     getfattr -d -m . -e hex /tmp/brick_1/bug1049766/dir$i;
     getfattr -d -m . -e hex /tmp/brick_2/bug1049766/dir$i;
     getfattr -d -m . -e hex /tmp/brick_3/bug1049766/dir$i;
     getfattr -d -m . -e hex /tmp/brick_4/bug1049766/dir$i;
     echo -----------------------------------------------;
   done

4) Add a brick to the volume
   gluster volume add-brick test somari:/tmp/brick_4

5) Sanity check (3) again

6) Access each directory
   for i in {1..20}; do ls -l /mnt/glusterfs/bug1049766/dir$i; done

7) Check dirs created on new brick
   ls -l /tmp/brick_4/bug1049766/

8) Repeat (3) and check whether any of the layouts on the newly added 4th brick is non 0-0 (Rachana and I get non 0-0 for some, which is the bug, but I get it for BigBend as well)

Example output of step (8):

-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: tmp/brick_1/bug1049766/dir20
trusted.gfid=0xf7c331f78f0e49ac99baf97ed5a0861f
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc

getfattr: Removing leading '/' from absolute path names
# file: tmp/brick_2/bug1049766/dir20
trusted.gfid=0xf7c331f78f0e49ac99baf97ed5a0861f
trusted.glusterfs.dht=0x0000000100000000bffffffdffffffff

getfattr: Removing leading '/' from absolute path names
# file: tmp/brick_3/bug1049766/dir20
trusted.gfid=0xf7c331f78f0e49ac99baf97ed5a0861f
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe

getfattr: Removing leading '/' from absolute path names
# file: tmp/brick_4/bug1049766/dir20
trusted.gfid=0xf7c331f78f0e49ac99baf97ed5a0861f
trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd
-----------------------------------------------
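The dir20 output from step (8) can be checked mechanically to see whether the four bricks now split the hash space between them. This is a sketch only: it assumes the last two 32-bit words of trusted.glusterfs.dht are the start and stop of the range (an assumption about the format, not something confirmed in this bug), and it treats a 0-0 range as "no range assigned". The function names are hypothetical.

```python
def hash_range(hex_value):
    """Return (start, stop) from a trusted.glusterfs.dht hex string.

    Assumption: the value is 32 hex digits and the last two 8-digit
    groups are the bounds of the hash range.
    """
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    if len(digits) != 32:
        raise ValueError("expected 32 hex digits, got %d" % len(digits))
    return int(digits[16:24], 16), int(digits[24:32], 16)


def covers_full_hash_space(hex_values):
    """True if the non-empty ranges tile 0x00000000..0xffffffff exactly,
    with no gap and no overlap."""
    ranges = sorted(r for r in map(hash_range, hex_values) if r != (0, 0))
    expected_start = 0
    for start, stop in ranges:
        if start != expected_start or stop < start:
            return False
        expected_start = stop + 1
    return expected_start == 0x100000000


# dir20 values copied from the example output of step (8).
dir20 = {
    "brick_1": "0x00000001000000007ffffffebffffffc",
    "brick_2": "0x0000000100000000bffffffdffffffff",
    "brick_3": "0x0000000100000000000000003ffffffe",
    "brick_4": "0x00000001000000003fffffff7ffffffd",
}

print(covers_full_hash_space(dir20.values()))
old_bricks = [dir20["brick_1"], dir20["brick_2"], dir20["brick_3"]]
print(covers_full_hash_space(old_bricks))
```

With these values the four ranges together tile the whole space, while the three original bricks alone leave a hole where the brick_4 range sits. If the format assumption holds, that suggests the layout of dir20 was rewritten across all four bricks, not just created empty on the new one.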
Adding: There is no data loss or functional loss due to this behaviour either, as the new layout is set for some directories and DHT operations continue as required in these directories. Talur is testing with the RPMs shipped for BigBend, in case there is a discrepancy of a commit or two between the shipped and tagged versions.
I tested with rpm build 33 downloaded from https://brewweb.devel.redhat.com/buildinfo?buildID=293658. I got the layout set for all the dirs in the new brick. Here are the results:

getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir1
trusted.gfid=0x8d839427ef244457b21bc8ffd95b843d
trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir2
trusted.gfid=0xf228c0c914234ca888b16b122c296f85
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir3
trusted.gfid=0xa0949cae66c247c3b175e2039a516a6d
trusted.glusterfs.dht=0x0000000100000000bffffffdffffffff
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir4
trusted.gfid=0x161ce29482284e3b8bfda78fe105419b
trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir5
trusted.gfid=0x5e6eba7f3ee648cc80c457dbd6e46ad7
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir6
trusted.gfid=0x99b4c7c6429c4ff383162954142c9881
trusted.glusterfs.dht=0x0000000100000000bffffffdffffffff
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir7
trusted.gfid=0xab0ad221cc9a4e52a3a64544d9910f52
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir8
trusted.gfid=0xca3a161c9ab6457b93ae1768fc94972a
trusted.glusterfs.dht=0x0000000100000000bffffffdffffffff
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir9
trusted.gfid=0xe942ce81c0904b5186e98f3931686ac5
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir10
trusted.gfid=0x3842d2a0f49a42b38a27e7a41b89c811
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir11
trusted.gfid=0xaac88754de4044c3b1ad64f5375fbdee
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir12
trusted.gfid=0x353bde1f25d44328a3b17914b6d8e02e
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir13
trusted.gfid=0xeaa6e91c04054a94aea6cde5d888b765
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir14
trusted.gfid=0x916d38f74b094b7bbac4e9a3cd4882f2
trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir15
trusted.gfid=0x7e09c50f10094932bc6aede2b50c0999
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir16
trusted.gfid=0x13b36161c0f4423bb2b92f9cd4a4709c
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir17
trusted.gfid=0x72ca8e9372664e2d8ffcecc9044af699
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir18
trusted.gfid=0x17150d457db34286bd89c7444c44c680
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir19
trusted.gfid=0x27e2d31e1cf24f46b8f7f99bc40d66d5
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir20
trusted.gfid=0x4a512d89dbdc41b38412b5b0e67a3436
trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd
-----------------------------------------------
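Checking twenty entries by eye is error-prone, so a dump like the one above can be scanned for non 0-0 layouts with a small script. The sketch below is hypothetical tooling, not part of GlusterFS: it parses the text that `getfattr -d -m . -e hex` prints, and it assumes (as an unconfirmed reading of the format) that the last two 32-bit words of trusted.glusterfs.dht are the start and stop of the hash range. The sample text reuses the dir1 and dir5 entries from the results above.

```python
import re

FILE_RE = re.compile(r"^# file: (\S+)$")
DHT_RE = re.compile(r"^trusted\.glusterfs\.dht=0x([0-9a-fA-F]{32})$")


def layouts_by_path(getfattr_text):
    """Map each '# file:' path in a getfattr dump to its (start, stop) range.

    Assumption: the last two 8-digit groups of the dht value are the
    bounds of the hash range.
    """
    result = {}
    current = None
    for line in getfattr_text.splitlines():
        line = line.strip()
        file_match = FILE_RE.match(line)
        if file_match:
            current = file_match.group(1)
            continue
        dht_match = DHT_RE.match(line)
        if dht_match and current is not None:
            digits = dht_match.group(1)
            result[current] = (int(digits[16:24], 16), int(digits[24:32], 16))
    return result


sample = """\
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir1
trusted.gfid=0x8d839427ef244457b21bc8ffd95b843d
trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd
-----------------------------------------------
getfattr: Removing leading '/' from absolute path names
# file: data/gluster/brick4/brick/bug1049766/dir5
trusted.gfid=0x5e6eba7f3ee648cc80c457dbd6e46ad7
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe
"""

layouts = layouts_by_path(sample)
healed = [path for path, rng in layouts.items() if rng != (0, 0)]
print(len(healed), "of", len(layouts), "directories carry a non-zero range")
```

Feeding the full output of step (3) for the new brick into `layouts_by_path` would list every directory whose layout was set without a rebalance.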
Per 1/16 triage.
Per 1/16 triage, removing from the Corbett list.
Cloning this bug in 3.1. To be fixed in a future release.