Description of problem:
I had a 6x2 volume with quota enabled and a hard limit of 100GB set. I started a Linux kernel untar, then did add-brick followed by rebalance. After some time, I found that the untar process was in "D+" (uninterruptible sleep) state.

[root@rhsauto032 ~]# gluster volume quota dist-rep3 list
                  Path                   Hard-limit  Soft-limit     Used  Available
--------------------------------------------------------------------------------
/                                           100.0GB         90%    4.0GB     96.0GB

Present volume status:

[root@rhsauto033 ~]# gluster volume status dist-rep3
Status of volume: dist-rep3
Gluster process                                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d1r1-3      49167   Y       30085
Brick rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d1r2-3      49152   Y       27893
Brick rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d2r1-3      49152   Y       9693
Brick rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d2r2-3      49152   Y       9538
Brick rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d3r1-3      49168   Y       30096
Brick rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d3r2-3      49153   Y       27904
Brick rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d4r1-3      49153   Y       9704
Brick rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d4r2-3      49153   Y       9549
Brick rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d5r1-3      49169   Y       30107
Brick rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d5r2-3      49154   Y       27915
Brick rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d6r1-3      49154   Y       9715
Brick rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d6r2-3      49154   Y       9560
Brick rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d2r1-3-add  49158   Y       12608
Brick rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d2r2-3-add  49158   Y       12437
NFS Server on localhost                                         2049    Y       30804
Self-heal Daemon on localhost                                   N/A     Y       30811
Quota Daemon on localhost                                       N/A     Y       30818
NFS Server on rhsauto034.lab.eng.blr.redhat.com                 2049    Y       12620
Self-heal Daemon on rhsauto034.lab.eng.blr.redhat.com           N/A     Y       12627
Quota Daemon on rhsauto034.lab.eng.blr.redhat.com               N/A     Y       12634
NFS Server on 10.70.37.7                                        2049    Y       22667
Self-heal Daemon on 10.70.37.7                                  N/A     Y       22674
Quota Daemon on 10.70.37.7                                      N/A     Y       22681
NFS Server on rhsauto035.lab.eng.blr.redhat.com                 2049    Y       12449
Self-heal Daemon on rhsauto035.lab.eng.blr.redhat.com           N/A     Y       12456
Quota Daemon on rhsauto035.lab.eng.blr.redhat.com               N/A     Y       12463

Task            ID                                      Status
----            --                                      ------
Rebalance       5119a3b0-3e9f-479b-8ad9-9e413df4821f    1

Version-Release number of selected component (if applicable):
glusterfs-rdma-3.4.0.20rhsquota2-1.el6rhs.x86_64
glusterfs-3.4.0.20rhsquota1-1.el6.x86_64
glusterfs-server-3.4.0.20rhsquota1-1.el6.x86_64
glusterfs-fuse-3.4.0.20rhsquota1-1.el6.x86_64

How reproducible:
Trying rebalance on this build gives this issue.

Steps to Reproduce:
1. Create a volume of 6x2 type and start it.
2. Enable quota.
3. Set a limit of 100GB.
4. Mount the volume over NFS.
5. Untar the Linux kernel source on the mount point.

Actual results:
[root@rhsauto032 ~]# gluster volume rebalance dist-rep3 status
Node                               Rebalanced-files     size  scanned  failures  skipped       status  run time in secs
---------                          ----------------  -------  -------  --------  -------  -----------  ----------------
localhost                                      2398  347.4MB    16400         0     4330  in progress           1686.00
rhsauto034.lab.eng.blr.redhat.com              3946  595.9MB    11857         0      360  in progress           1686.00
rhsauto035.lab.eng.blr.redhat.com                 0   0Bytes    25434         0        0  in progress           1686.00
rhsauto033.lab.eng.blr.redhat.com                 0   0Bytes    25436         0        0  in progress           1685.00
volume rebalance: dist-rep3: success:

But on the client, the linux untar is actually hung:

[root@rhsauto036 ~]# ps -auxww | grep tar
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
root     19217  0.7  0.0 116012 1228 pts/2  D+  22:30  0:24 tar xvfj /opt/qa/tools/linux-2.6.31.1.tar.bz2
root     19276  0.0  0.0 103244  808 pts/0  S+  23:23  0:00 grep tar

Expected results:
Rebalance should not cause client I/O to hang.

Additional info:
From the client, iptables is not running:

[root@rhsauto036 ~]# service iptables status
iptables: Firewall is not running.
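The reproduction steps correspond roughly to the commands below. A sketch, not an exact transcript: the brick paths and hostnames are taken from the volume status output, but the mount point, NFS options, and tarball path on the client are assumptions based on the ps output.

```shell
# 1. Create a 6x2 distribute-replicate volume (brick layout as in the
#    status output above) and start it
gluster volume create dist-rep3 replica 2 \
    rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d1r1-3 \
    rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d1r2-3 \
    rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d2r1-3 \
    rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d2r2-3 \
    rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d3r1-3 \
    rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d3r2-3 \
    rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d4r1-3 \
    rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d4r2-3 \
    rhsauto032.lab.eng.blr.redhat.com:/rhs/bricks/d5r1-3 \
    rhsauto033.lab.eng.blr.redhat.com:/rhs/bricks/d5r2-3 \
    rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d6r1-3 \
    rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d6r2-3
gluster volume start dist-rep3

# 2-3. Enable quota and set a 100GB limit on the volume root
gluster volume quota dist-rep3 enable
gluster volume quota dist-rep3 limit-usage / 100GB

# 4. On the client: mount over NFS (mount point is hypothetical)
mount -t nfs -o vers=3 rhsauto032.lab.eng.blr.redhat.com:/dist-rep3 /mnt/dist-rep3

# 5. On the client: untar the kernel source on the mount point
cd /mnt/dist-rep3
tar xvfj /opt/qa/tools/linux-2.6.31.1.tar.bz2

# While the untar runs, from a server: add a replica pair, then rebalance
gluster volume add-brick dist-rep3 replica 2 \
    rhsauto034.lab.eng.blr.redhat.com:/rhs/bricks/d2r1-3-add \
    rhsauto035.lab.eng.blr.redhat.com:/rhs/bricks/d2r2-3-add
gluster volume rebalance dist-rep3 start
```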
Could you attach the sosreport when the issue is seen?
Saurabh,

Even I observed tar to be in D+ state, but it eventually completes, so there is no frame loss. Also, if it were truly hung in a syscall, you would not be able to kill it with SIGINT. So I think it's not a bug; I observed the untar to be slow, not hung. Can you please give the verbose option to tar and confirm that it is not a hang in a system call?

regards,
Raghavendra.
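One way to tell a genuine syscall hang from slow I/O is to watch the process state and kernel wait channel under /proc. A minimal sketch; it uses the shell's own PID as a stand-in so the commands are runnable as-is, but you would substitute the tar PID (19217 in the report):

```shell
# Substitute the tar PID here (19217 in the report); $$ is used only so
# this sketch runs as-is.
pid=$$

# Field 3 of /proc/<pid>/stat is the state: 'D' = uninterruptible sleep,
# 'S' = interruptible sleep, 'R' = running.
state=$(awk '{print $3}' "/proc/$pid/stat")
echo "state: $state"

# Kernel function the process is sleeping in, if any ('0' when not blocked).
echo "wchan: $(cat "/proc/$pid/wchan")"
```

A process that stays in 'D' with the same wchan for minutes is genuinely stuck in the kernel; one that alternates between 'D' and 'R'/'S' with a changing wchan is making progress through slow I/O, which matches Raghavendra's observation.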
tar succeeds with Build 3
Though I/O goes into D state, it eventually finishes, so functionally it works. Marking as verified on 3.4.0.33rhs-1.el6rhs.x86_64.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1769.html