| Summary: | DHT:REBALANCE- statfs failures are seen during rebalance | |||
|---|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | shylesh <shmohan> | |
| Component: | distribute | Assignee: | Nithya Balachandran <nbalacha> | |
| Status: | CLOSED DEFERRED | QA Contact: | storage-qa-internal <storage-qa-internal> | |
| Severity: | medium | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 2.1 | CC: | spalai, vbellur | |
| Target Milestone: | --- | |||
| Target Release: | --- | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1286200 1286207 | Environment: | ||
| Last Closed: | 2015-11-27 12:26:51 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | ||||
| Bug Blocks: | 1286200, 1286207 | |||
Description of problem:
While migrating files during rebalance, statfs fails on some files when free space is being calculated. Each such file is counted as 'skipped' in the rebalance status.

Version-Release number of selected component (if applicable):
3.4.0.39rhs-1.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Created a 6x2 distributed-replicate volume.
2. Created a deep directory tree, up to 6 levels deep and 6 wide, using the following script:
------------------------------------------
makedir ()
{
    local depth=$1
    local n=$2
    if [ $depth -eq 0 ]; then
        return
    fi
    for i in `seq 0 $2`
    do
        mkdir $i
        dd if=/dev/urandom of=file.$i bs=512k count=1
    done
    depth=$(($depth - 1))
    for i in `seq 0 $2`
    do
        pushd .
        cd $i
        makedir $depth $2
        popd
    done
}
makedir $1 $2
------------------------------------------
3. Ran add-brick and started rebalance.

Actual results:
Some files are counted in the 'skipped' column of the rebalance status. Further investigation of the logs shows:

"[2013-11-07 11:36:07.527205] E [dht-rebalance.c:357:__dht_check_free_space] 0-dist-rep-dht: failed to get statfs of /1/5/0/0/6/file.6 on dist-rep-replicate-1 (No such file or directory)"

[root@rhs-client4 mnt]# getfattr -d -m . -e text /home/dist-rep3//1/5/0/0/6/file.6
getfattr: Removing leading '/' from absolute path names
# file: home/dist-rep3//1/5/0/0/6/file.6
trusted.gfid="��k�E����GC�m�"
trusted.glusterfs.dht.linkto="dist-rep-replicate-2"
trusted.glusterfs.quota.ed8f09e1-46f2-4e0c-9bd5-5bd6c2f38cd1.contri="\000\000\000\000\000\000\000"
trusted.pgfid.ed8f09e1-46f2-4e0c-9bd5-5bd6c2f38cd1="\000\000\000"

from dist-rep-replicate-2
------------------------
[root@rhs-client9 ~]# getfattr -d -m . -e text /home/dist-rep4//1/5/0/0/6/file.6
getfattr: Removing leading '/' from absolute path names
# file: home/dist-rep4//1/5/0/0/6/file.6
trusted.afr.dist-rep-client-4="\000\000\000\000\000\000\000\000\000\000\000"
trusted.afr.dist-rep-client-5="\000\000\000\000\000\000\000\000\000\000\000"
trusted.gfid="��k�E����GC�m�"
trusted.glusterfs.quota.ed8f09e1-46f2-4e0c-9bd5-5bd6c2f38cd1.contri="\000\000\000\000\00\000"
trusted.pgfid.ed8f09e1-46f2-4e0c-9bd5-5bd6c2f38cd1="\000\000\000"

[root@rhs-client4 mnt]# gluster v rebalance dist-rep status
                               Node  Rebalanced-files     size  scanned  failures  skipped     status  run time in secs
                          ---------  ----------------  -------  -------  --------  -------  ---------  ----------------
                          localhost             15885    7.8GB    59929         0        0  completed           6249.00
 rhs-client9.lab.eng.blr.redhat.com             11332    5.5GB    55572         0        1  completed           6255.00
rhs-client39.lab.eng.blr.redhat.com             11973    5.8GB    57459         0        0  completed           6246.00
volume rebalance: dist-rep: success:

volume info
===========
[root@rhs-client4 mnt]# gluster v info dist-rep

Volume Name: dist-rep
Type: Distributed-Replicate
Volume ID: 4d0c8f97-2e0d-4c1d-9628-898df3de12ed
Status: Started
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep0
Brick2: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep1
Brick3: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep2
Brick4: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep3
Brick5: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep4
Brick6: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep5
Brick7: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep6
Brick8: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep7
Brick9: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep8
Brick10: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep9
Brick11: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep10
Brick12: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep11
Brick13: rhs-client4.lab.eng.blr.redhat.com:/home/dist-rep12
Brick14: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep13
Brick15: rhs-client9.lab.eng.blr.redhat.com:/home/dist-rep14
Brick16: rhs-client39.lab.eng.blr.redhat.com:/home/dist-rep15
Options Reconfigured:
features.quota: on

cluster info
------------
RHS nodes
----------
rhs-client9.lab.eng.blr.redhat.com
rhs-client39.lab.eng.blr.redhat.com
rhs-client4.lab.eng.blr.redhat.com

Mounted on
----------
rhs-client4.lab.eng.blr.redhat.com:/mnt

Attached the sosreports.
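
For anyone triaging a similar run, the two checks described above (finding the statfs error in the rebalance log and spotting nodes with a non-zero 'skipped' count) can be sketched as small shell helpers. This is only a sketch: the function names statfs_failures and skipped_nodes are hypothetical, the log pattern is derived from the single log line quoted in this report, and the status parser assumes the eight-field row layout shown above. Other releases may format either output differently.

```shell
#!/bin/bash

# Print "<path> <subvolume>" for each statfs failure logged by
# __dht_check_free_space. Reads the named log files, or stdin if none
# are given. The message wording is an assumption taken from the one
# log line quoted in this report.
statfs_failures ()
{
    sed -n 's/.*__dht_check_free_space].*failed to get statfs of \(.*\) on \([^ ]*\) (.*/\1 \2/p' "$@"
}

# Print "<node> <skipped>" for each node whose skipped count is above
# zero. Assumes data rows of "gluster v rebalance <vol> status" have
# exactly eight whitespace-separated fields with 'skipped' as the
# sixth, as in the output shown in this report. Header, separator and
# summary lines do not match and are ignored.
skipped_nodes ()
{
    awk 'NF == 8 && $6 ~ /^[0-9]+$/ && $6 + 0 > 0 { print $1, $6 }' "$@"
}

# Example usage (the log file location is an assumption and may differ
# per installation):
#   statfs_failures /var/log/glusterfs/dist-rep-rebalance.log
#   gluster v rebalance dist-rep status | skipped_nodes
```

Run against the output in this report, the first helper would list /1/5/0/0/6/file.6 with dist-rep-replicate-1, and the second would list rhs-client9.lab.eng.blr.redhat.com with a count of 1. The reported paths can then be checked on the bricks with getfattr, as done above.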