| Summary: | DHT + rebalance : after rebalance crash many Directory has overlapping hash layout | |||
|---|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Rachana Patel <racpatel> | |
| Component: | distribute | Assignee: | Nithya Balachandran <nbalacha> | |
| Status: | CLOSED DEFERRED | QA Contact: | storage-qa-internal <storage-qa-internal> | |
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 2.1 | CC: | mzywusko, spalai, vbellur | |
| Target Milestone: | --- | |||
| Target Release: | --- | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1286166 | Environment: | |
| Last Closed: | 2015-11-27 12:11:49 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | ||||
| Bug Blocks: | 1286166 | |||
Cloning this to 3.1; to be fixed in a future release.
Description of problem:
The rebalance process crashed on all servers, so it is possible that those rebalance processes were in the middle of fixing the layout (setting a new hash layout) and could not complete on all nodes because of the crash. Even in that case, the number of directories with an overlapping layout cannot be more than the number of rebalance processes. However, we found many directories with an overlapping layout, e.g.:

1)
[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc1
trusted.gfid=0x5c3c20c395304ab1ad59e4969d84ca0b
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick2/f/mvs1/mvetc1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick2/f/mvs1/mvetc1
trusted.gfid=0x5c3c20c395304ab1ad59e4969d84ca0b
trusted.glusterfs.dht=0x0000000100000000bffffffdffffffff

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick4/f/mvs1/mvetc1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick4/f/mvs1/mvetc1
trusted.gfid=0x5c3c20c395304ab1ad59e4969d84ca0b
trusted.glusterfs.dht=0x000000010000000099999999cccccccb <-------------------

[root@7-VM1 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc1
trusted.gfid=0x5c3c20c395304ab1ad59e4969d84ca0b
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe

[root@7-VM3 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc1
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc1
trusted.gfid=0x5c3c20c395304ab1ad59e4969d84ca0b
trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd

2)
[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick4/f/mvs1/mvetc3
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick4/f/mvs1/mvetc3
trusted.gfid=0xc0386717b0a64231aa83d260db2670cd
trusted.glusterfs.dht=0x00000001000000003333333366666665 <---------------------

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick2/f/mvs1/mvetc3
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick2/f/mvs1/mvetc3
trusted.gfid=0xc0386717b0a64231aa83d260db2670cd
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc3
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc3
trusted.gfid=0xc0386717b0a64231aa83d260db2670cd
trusted.glusterfs.dht=0x0000000100000000bffffffdffffffff

[root@7-VM1 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc3
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc3
trusted.gfid=0xc0386717b0a64231aa83d260db2670cd
trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd

[root@7-VM3 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc3
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc3
trusted.gfid=0xc0386717b0a64231aa83d260db2670cd
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc

3)
[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc2
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc2
trusted.gfid=0x71bdf4e4058b4923818dcde827a30b5a
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick2/f/mvs1/mvetc2
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick2/f/mvs1/mvetc2
trusted.gfid=0x71bdf4e4058b4923818dcde827a30b5a
trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick4/f/mvs1/mvetc2
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick4/f/mvs1/mvetc2
trusted.gfid=0x71bdf4e4058b4923818dcde827a30b5a
trusted.glusterfs.dht=0x00000001000000006666666699999998 <-----------------------

[root@7-VM3 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc2
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc2
trusted.gfid=0x71bdf4e4058b4923818dcde827a30b5a
trusted.glusterfs.dht=0x0000000100000000bffffffdffffffff

[root@7-VM1 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc2
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc2
trusted.gfid=0x71bdf4e4058b4923818dcde827a30b5a
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc

4)
[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick4/f/mvs1/mvetc4
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick4/f/mvs1/mvetc4
trusted.gfid=0x200ecbd616694d8180985c3f68d3d86e
trusted.glusterfs.dht=0x0000000100000000ccccccccffffffff <----------------------

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick2/f/mvs1/mvetc4
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick2/f/mvs1/mvetc4
trusted.gfid=0x200ecbd616694d8180985c3f68d3d86e
trusted.glusterfs.dht=0x0000000100000000000000003ffffffe

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc4
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc4
trusted.gfid=0x200ecbd616694d8180985c3f68d3d86e
trusted.glusterfs.dht=0x0000000100000000bffffffdffffffff

[root@7-VM1 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc4
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc4
trusted.gfid=0x200ecbd616694d8180985c3f68d3d86e
trusted.glusterfs.dht=0x00000001000000003fffffff7ffffffd

[root@7-VM3 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc4
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc4
trusted.gfid=0x200ecbd616694d8180985c3f68d3d86e
trusted.glusterfs.dht=0x00000001000000007ffffffebffffffc

5)
[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc5
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc5
trusted.gfid=0xb6b2ff31d7f14a5a94275eeccf637faa
trusted.glusterfs.dht=0x00000001000000000000000000000000

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick2/f/mvs1/mvetc5
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick2/f/mvs1/mvetc5
trusted.gfid=0xb6b2ff31d7f14a5a94275eeccf637faa
trusted.glusterfs.dht=0x00000001000000000000000055555554

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick4/f/mvs1/mvetc5
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick4/f/mvs1/mvetc5
trusted.gfid=0xb6b2ff31d7f14a5a94275eeccf637faa
trusted.glusterfs.dht=0x00000001000000000000000033333332 <------------------

[root@7-VM3 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc5
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc5
trusted.gfid=0xb6b2ff31d7f14a5a94275eeccf637faa
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff

[root@7-VM1 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc5
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc5
trusted.gfid=0xb6b2ff31d7f14a5a94275eeccf637faa
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9

6)
[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc8
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc8
trusted.gfid=0x90036c7183ab40f09b5e21fa795d0619
trusted.glusterfs.dht=0x00000001000000000000000000000000

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick2/f/mvs1/mvetc8
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick2/f/mvs1/mvetc8
trusted.gfid=0x90036c7183ab40f09b5e21fa795d0619
trusted.glusterfs.dht=0x00000001000000000000000055555554

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick4/f/mvs1/mvetc8
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick4/f/mvs1/mvetc8
trusted.gfid=0x90036c7183ab40f09b5e21fa795d0619
trusted.glusterfs.dht=0x00000001000000000000000033333332 <---------------

[root@7-VM1 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc8
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc8
trusted.gfid=0x90036c7183ab40f09b5e21fa795d0619
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9

[root@7-VM3 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc8
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc8
trusted.gfid=0x90036c7183ab40f09b5e21fa795d0619
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff

7)
[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick4/f/mvs1/mvetc9
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick4/f/mvs1/mvetc9
trusted.gfid=0xc767aeb557094f4497556d7f2d7969b8
trusted.glusterfs.dht=0x00000001000000006666666699999998 <----------------

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick4/f/mvs1/mvetc9
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick4/f/mvs1/mvetc9
trusted.gfid=0xc767aeb557094f4497556d7f2d7969b8
trusted.glusterfs.dht=0x00000001000000006666666699999998

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick2/f/mvs1/mvetc9
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick2/f/mvs1/mvetc9
trusted.gfid=0xc767aeb557094f4497556d7f2d7969b8
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff

[root@7-VM4 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc9
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc9
trusted.gfid=0xc767aeb557094f4497556d7f2d7969b8
trusted.glusterfs.dht=0x00000001000000000000000000000000

[root@7-VM3 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc9
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc9
trusted.gfid=0xc767aeb557094f4497556d7f2d7969b8
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9

[root@7-VM1 ~]# getfattr -d -m . -e hex /rhs/brick1/f/mvs1/mvetc9
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/f/mvs1/mvetc9
trusted.gfid=0xc767aeb557094f4497556d7f2d7969b8
trusted.glusterfs.dht=0x00000001000000000000000055555554

There were many directories like this; only a few are listed here to show that the number of directories in this condition was greater than the number of rebalance processes.

Version-Release number of selected component (if applicable):
============================================
3.4.0.44rhs-1.el6rhs.x86_64

How reproducible:
==================
Haven't tried.

Steps to Reproduce:
====================
1. Create and mount a DHT volume. Create data from the mount point (directory depth was 10).
2. Add brick to the volume and start rebalance.
3. While rebalance is in progress, perform rename operations on directories and files.
4. After 44+ hours the rebalance process crashed on all nodes and the rebalance status was 'failed':

[root@7-VM1 core]# gluster volume rebalance flat status
Node           Rebalanced-files   size      scanned   failures   skipped   status   run time in secs
---------      -----------        --------- --------- ---------- --------- -------- ----------------
localhost      832000             13.7GB    5344344   1          228       failed   159836.00
10.70.36.133   1009405            15.7GB    5362837   2          206       failed   159836.00
10.70.36.132   823206             12.9GB    5416604   1          233       failed   159836.00
10.70.36.131   0                  0Bytes    5227829   0          0         failed   159836.00
volume rebalance: flat: success:

5. Verify the hash layout of directories to find overlaps or holes. As mentioned above, overlaps were found for many directories.

Actual results:
Overlap in the hash layout of many directories.

Expected results:
Even though the rebalance process crashed, the number of directories with overlaps/holes caused by in-progress layout fixes should not be more than the number of rebalance processes.

Additional info:
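
The layout verification described in the steps to reproduce (checking each directory for overlaps or holes) can be sketched in Python as follows. This is only an illustrative sketch, not a tool from the report. It assumes that the 16-byte trusted.glusterfs.dht value is four big-endian 32-bit words and that the last two words are the start and end of the brick's hash range, which is consistent with the values shown above. It also assumes that a range of 0 to 0 means no range is assigned to that brick. The function names and the "VM:brick" labels are hypothetical.

```python
from itertools import combinations

HASH_MAX = 0xFFFFFFFF  # top of the 32-bit hash space


def parse_dht(hex_value):
    """Return (start, stop) from a trusted.glusterfs.dht hex string.

    Assumption: words 3 and 4 of the 16-byte value are the range bounds.
    """
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 16:
        raise ValueError("expected a 16-byte layout xattr")
    start = int.from_bytes(raw[8:12], "big")
    stop = int.from_bytes(raw[12:16], "big")
    return start, stop


def _assigned_ranges(layouts):
    """Parse all bricks and drop 0..0 ranges (assumed to mean 'unassigned')."""
    parsed = {brick: parse_dht(value) for brick, value in layouts.items()}
    return {brick: r for brick, r in parsed.items() if r != (0, 0)}


def find_overlaps(layouts):
    """Return pairs of bricks whose hash ranges intersect."""
    ranges = _assigned_ranges(layouts)
    overlaps = []
    for (a, (s1, e1)), (b, (s2, e2)) in combinations(ranges.items(), 2):
        if s1 <= e2 and s2 <= e1:  # closed intervals intersect
            overlaps.append((a, b))
    return overlaps


def find_holes(layouts):
    """Return (start, stop) gaps in 0..HASH_MAX not covered by any brick."""
    holes = []
    next_expected = 0
    for start, stop in sorted(_assigned_ranges(layouts).values()):
        if start > next_expected:
            holes.append((next_expected, start - 1))
        next_expected = max(next_expected, stop + 1)
    if next_expected <= HASH_MAX:
        holes.append((next_expected, HASH_MAX))
    return holes


# Values copied from the mvetc1 example in this report.
mvetc1 = {
    "VM4:brick1": "0x00000001000000007ffffffebffffffc",
    "VM4:brick2": "0x0000000100000000bffffffdffffffff",
    "VM4:brick4": "0x000000010000000099999999cccccccb",
    "VM1:brick1": "0x0000000100000000000000003ffffffe",
    "VM3:brick1": "0x00000001000000003fffffff7ffffffd",
}

print("overlaps:", find_overlaps(mvetc1))
print("holes:", find_holes(mvetc1))
```

For the mvetc1 values, this check flags the VM4 brick4 range as overlapping both the VM4 brick1 and VM4 brick2 ranges and reports no holes, which matches the brick marked with an arrow in the getfattr output.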