On a 16-node setup with an EC volume, I started IOs from 3 different clients. While the IOs were going on, I attached a tier to the volume, and the IOs hung. I tried this twice, and both times the IOs hung. In 3.7.5-17 there used to be a temporary pause (about 5 min) when attach-tier was issued, but in this build (3.7.5-19) the IOs have been hung for more than 2 hours.

Volume create and attach-tier commands:

gluster v create npcvol disperse 12 disperse-data 8 \
    10.70.37.202:/bricks/brick1/npcvol 10.70.37.195:/bricks/brick1/npcvol \
    10.70.35.133:/bricks/brick1/npcvol 10.70.35.239:/bricks/brick1/npcvol \
    10.70.35.225:/bricks/brick1/npcvol 10.70.35.11:/bricks/brick1/npcvol \
    10.70.35.10:/bricks/brick1/npcvol 10.70.35.231:/bricks/brick1/npcvol \
    10.70.35.176:/bricks/brick1/npcvol 10.70.35.232:/bricks/brick1/npcvol \
    10.70.35.173:/bricks/brick1/npcvol 10.70.35.163:/bricks/brick1/npcvol \
    10.70.37.101:/bricks/brick1/npcvol 10.70.37.69:/bricks/brick1/npcvol \
    10.70.37.60:/bricks/brick1/npcvol 10.70.37.120:/bricks/brick1/npcvol \
    10.70.37.202:/bricks/brick2/npcvol 10.70.37.195:/bricks/brick2/npcvol \
    10.70.35.133:/bricks/brick2/npcvol 10.70.35.239:/bricks/brick2/npcvol \
    10.70.35.225:/bricks/brick2/npcvol 10.70.35.11:/bricks/brick2/npcvol \
    10.70.35.10:/bricks/brick2/npcvol 10.70.35.231:/bricks/brick2/npcvol

gluster volume tier npcvol attach rep 2 \
    10.70.35.176:/bricks/brick7/npcvol_hot 10.70.35.232:/bricks/brick7/npcvol_hot \
    10.70.35.173:/bricks/brick7/npcvol_hot 10.70.35.163:/bricks/brick7/npcvol_hot \
    10.70.37.101:/bricks/brick7/npcvol_hot 10.70.37.69:/bricks/brick7/npcvol_hot \
    10.70.37.60:/bricks/brick7/npcvol_hot 10.70.37.120:/bricks/brick7/npcvol_hot \
    10.70.37.195:/bricks/brick7/npcvol_hot 10.70.37.202:/bricks/brick7/npcvol_hot \
    10.70.35.133:/bricks/brick7/npcvol_hot 10.70.35.239:/bricks/brick7/npcvol_hot

##### before attach tier

[root@dhcp37-202 ~]# gluster v status npcvol
Status of volume: npcvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.202:/bricks/brick1/npcvol    49161     0          Y       628
Brick 10.70.37.195:/bricks/brick1/npcvol    49161     0          Y       30704
Brick 10.70.35.133:/bricks/brick1/npcvol    49158     0          Y       24148
Brick 10.70.35.239:/bricks/brick1/npcvol    49158     0          Y       24128
Brick 10.70.35.225:/bricks/brick1/npcvol    49157     0          Y       24467
Brick 10.70.35.11:/bricks/brick1/npcvol     49157     0          Y       24272
Brick 10.70.35.10:/bricks/brick1/npcvol     49160     0          Y       24369
Brick 10.70.35.231:/bricks/brick1/npcvol    49160     0          Y       32189
Brick 10.70.35.176:/bricks/brick1/npcvol    49161     0          Y       1392
Brick 10.70.35.232:/bricks/brick1/npcvol    49161     0          Y       26630
Brick 10.70.35.173:/bricks/brick1/npcvol    49161     0          Y       28493
Brick 10.70.35.163:/bricks/brick1/npcvol    49161     0          Y       28592
Brick 10.70.37.101:/bricks/brick1/npcvol    49161     0          Y       28410
Brick 10.70.37.69:/bricks/brick1/npcvol     49161     0          Y       357
Brick 10.70.37.60:/bricks/brick1/npcvol     49161     0          Y       31071
Brick 10.70.37.120:/bricks/brick1/npcvol    49176     0          Y       1311
Brick 10.70.37.202:/bricks/brick2/npcvol    49162     0          Y       651
Brick 10.70.37.195:/bricks/brick2/npcvol    49162     0          Y       30723
Brick 10.70.35.133:/bricks/brick2/npcvol    49159     0          Y       24167
Brick 10.70.35.239:/bricks/brick2/npcvol    49159     0          Y       24148
Brick 10.70.35.225:/bricks/brick2/npcvol    49158     0          Y       24486
Brick 10.70.35.11:/bricks/brick2/npcvol     49158     0          Y       24291
Brick 10.70.35.10:/bricks/brick2/npcvol     49161     0          Y       24388
Brick 10.70.35.231:/bricks/brick2/npcvol    49161     0          Y       32208
Snapshot Daemon on localhost                49163     0          Y       810
NFS Server on localhost                     2049      0          Y       818
Self-heal Daemon on localhost               N/A       N/A        Y       686
Quota Daemon on localhost                   N/A       N/A        Y       859
Snapshot Daemon on 10.70.37.101             49162     0          Y       28538
NFS Server on 10.70.37.101                  2049      0          Y       28546
Self-heal Daemon on 10.70.37.101            N/A       N/A        Y       28439
Quota Daemon on 10.70.37.101                N/A       N/A        Y       28576
Snapshot Daemon on 10.70.37.195             49163     0          Y       30851
NFS Server on 10.70.37.195                  2049      0          Y       30859
Self-heal Daemon on 10.70.37.195            N/A       N/A        Y       30751
Quota Daemon on 10.70.37.195                N/A       N/A        Y       30889
Snapshot Daemon on 10.70.37.120             49177     0          Y       1438
NFS Server on 10.70.37.120                  2049      0          Y       1446
Self-heal Daemon on 10.70.37.120            N/A       N/A        Y       1339
Quota Daemon on 10.70.37.120                N/A       N/A        Y       1477
Snapshot Daemon on 10.70.37.69              49162     0          Y       492
NFS Server on 10.70.37.69                   2049      0          Y       500
Self-heal Daemon on 10.70.37.69             N/A       N/A        Y       385
Quota Daemon on 10.70.37.69                 N/A       N/A        Y       542
Snapshot Daemon on 10.70.37.60              49162     0          Y       31197
NFS Server on 10.70.37.60                   2049      0          Y       31205
Self-heal Daemon on 10.70.37.60             N/A       N/A        Y       31099
Quota Daemon on 10.70.37.60                 N/A       N/A        Y       31235
Snapshot Daemon on 10.70.35.239             49160     0          Y       24287
NFS Server on 10.70.35.239                  2049      0          Y       24295
Self-heal Daemon on 10.70.35.239            N/A       N/A        Y       24176
Quota Daemon on 10.70.35.239                N/A       N/A        Y       24325
Snapshot Daemon on 10.70.35.231             49162     0          Y       32340
NFS Server on 10.70.35.231                  2049      0          Y       32348
Self-heal Daemon on 10.70.35.231            N/A       N/A        Y       32236
Quota Daemon on 10.70.35.231                N/A       N/A        Y       32389
Snapshot Daemon on 10.70.35.176             49162     0          Y       1535
NFS Server on 10.70.35.176                  2049      0          Y       1545
Self-heal Daemon on 10.70.35.176            N/A       N/A        Y       1420
Quota Daemon on 10.70.35.176                N/A       N/A        Y       1589
Snapshot Daemon on dhcp35-225.lab.eng.blr.redhat.com   49159  0  Y  24623
NFS Server on dhcp35-225.lab.eng.blr.redhat.com        2049   0  Y  24631
Self-heal Daemon on dhcp35-225.lab.eng.blr.redhat.com  N/A  N/A  Y  24514
Quota Daemon on dhcp35-225.lab.eng.blr.redhat.com      N/A  N/A  Y  24661
Snapshot Daemon on 10.70.35.232             49162     0          Y       26759
NFS Server on 10.70.35.232                  2049      0          Y       26767
Self-heal Daemon on 10.70.35.232            N/A       N/A        Y       26658
Quota Daemon on 10.70.35.232                N/A       N/A        Y       26805
Snapshot Daemon on 10.70.35.163             49162     0          Y       28721
NFS Server on 10.70.35.163                  2049      0          Y       28729
Self-heal Daemon on 10.70.35.163            N/A       N/A        Y       28620
Quota Daemon on 10.70.35.163                N/A       N/A        Y       28760
Snapshot Daemon on 10.70.35.11              49159     0          Y       24427
NFS Server on 10.70.35.11                   2049      0          Y       24435
Self-heal Daemon on 10.70.35.11             N/A       N/A        Y       24319
Quota Daemon on 10.70.35.11                 N/A       N/A        Y       24465
Snapshot Daemon on 10.70.35.10              49162     0          Y       24521
NFS Server on 10.70.35.10                   2049      0          Y       24529
Self-heal Daemon on 10.70.35.10             N/A       N/A        Y       24416
Quota Daemon on 10.70.35.10                 N/A       N/A        Y       24560
Snapshot Daemon on 10.70.35.133             49160     0          Y       24314
NFS Server on 10.70.35.133                  2049      0          Y       24322
Self-heal Daemon on 10.70.35.133            N/A       N/A        Y       24203
Quota Daemon on 10.70.35.133                N/A       N/A        Y       24352
Snapshot Daemon on 10.70.35.173             49162     0          Y       28625
NFS Server on 10.70.35.173                  2049      0          Y       28633
Self-heal Daemon on 10.70.35.173            N/A       N/A        Y       28521
Quota Daemon on 10.70.35.173                N/A       N/A        Y       28671

Task Status of Volume npcvol
------------------------------------------------------------------------------
There are no active volume tasks

##### after attach tier

[root@dhcp37-202 ~]# gluster v status npcvol
Status of volume: npcvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick 10.70.35.239:/bricks/brick7/npcvol_hot   49161  0          Y       25252
Brick 10.70.35.133:/bricks/brick7/npcvol_hot   49161  0          Y       25276
Brick 10.70.37.202:/bricks/brick7/npcvol_hot   49164  0          Y       2028
Brick 10.70.37.195:/bricks/brick7/npcvol_hot   49164  0          Y       31793
Brick 10.70.37.120:/bricks/brick7/npcvol_hot   49178  0          Y       2504
Brick 10.70.37.60:/bricks/brick7/npcvol_hot    49163  0          Y       32188
Brick 10.70.37.69:/bricks/brick7/npcvol_hot    49163  0          Y       1548
Brick 10.70.37.101:/bricks/brick7/npcvol_hot   49163  0          Y       29535
Brick 10.70.35.163:/bricks/brick7/npcvol_hot   49163  0          Y       29799
Brick 10.70.35.173:/bricks/brick7/npcvol_hot   49163  0          Y       29669
Brick 10.70.35.232:/bricks/brick7/npcvol_hot   49163  0          Y       27813
Brick 10.70.35.176:/bricks/brick7/npcvol_hot   49163  0          Y       2607
Cold Bricks:
Brick 10.70.37.202:/bricks/brick1/npcvol    49161     0          Y       628
Brick 10.70.37.195:/bricks/brick1/npcvol    49161     0          Y       30704
Brick 10.70.35.133:/bricks/brick1/npcvol    49158     0          Y       24148
Brick 10.70.35.239:/bricks/brick1/npcvol    49158     0          Y       24128
Brick 10.70.35.225:/bricks/brick1/npcvol    49157     0          Y       24467
Brick 10.70.35.11:/bricks/brick1/npcvol     49157     0          Y       24272
Brick 10.70.35.10:/bricks/brick1/npcvol     49160     0          Y       24369
Brick 10.70.35.231:/bricks/brick1/npcvol    49160     0          Y       32189
Brick 10.70.35.176:/bricks/brick1/npcvol    49161     0          Y       1392
Brick 10.70.35.232:/bricks/brick1/npcvol    49161     0          Y       26630
Brick 10.70.35.173:/bricks/brick1/npcvol    49161     0          Y       28493
Brick 10.70.35.163:/bricks/brick1/npcvol    49161     0          Y       28592
Brick 10.70.37.101:/bricks/brick1/npcvol    49161     0          Y       28410
Brick 10.70.37.69:/bricks/brick1/npcvol     49161     0          Y       357
Brick 10.70.37.60:/bricks/brick1/npcvol     49161     0          Y       31071
Brick 10.70.37.120:/bricks/brick1/npcvol    49176     0          Y       1311
Brick 10.70.37.202:/bricks/brick2/npcvol    49162     0          Y       651
Brick 10.70.37.195:/bricks/brick2/npcvol    49162     0          Y       30723
Brick 10.70.35.133:/bricks/brick2/npcvol    49159     0          Y       24167
Brick 10.70.35.239:/bricks/brick2/npcvol    49159     0          Y       24148
Brick 10.70.35.225:/bricks/brick2/npcvol    49158     0          Y       24486
Brick 10.70.35.11:/bricks/brick2/npcvol     49158     0          Y       24291
Brick 10.70.35.10:/bricks/brick2/npcvol     49161     0          Y       24388
Brick 10.70.35.231:/bricks/brick2/npcvol    49161     0          Y       32208
Snapshot Daemon on localhost                49163     0          Y       810
NFS Server on localhost                     2049      0          Y       2048
Self-heal Daemon on localhost               N/A       N/A        Y       2056
Quota Daemon on localhost                   N/A       N/A        Y       2064
Snapshot Daemon on 10.70.37.60              49162     0          Y       31197
NFS Server on 10.70.37.60                   2049      0          Y       32208
Self-heal Daemon on 10.70.37.60             N/A       N/A        Y       32216
Quota Daemon on 10.70.37.60                 N/A       N/A        Y       32224
Snapshot Daemon on 10.70.37.195             49163     0          Y       30851
NFS Server on 10.70.37.195                  2049      0          Y       31813
Self-heal Daemon on 10.70.37.195            N/A       N/A        Y       31821
Quota Daemon on 10.70.37.195                N/A       N/A        Y       31829
Snapshot Daemon on 10.70.37.120             49177     0          Y       1438
NFS Server on 10.70.37.120                  2049      0          Y       2524
Self-heal Daemon on 10.70.37.120            N/A       N/A        Y       2532
Quota Daemon on 10.70.37.120                N/A       N/A        Y       2540
Snapshot Daemon on 10.70.37.101             49162     0          Y       28538
NFS Server on 10.70.37.101                  2049      0          Y       29555
Self-heal Daemon on 10.70.37.101            N/A       N/A        Y       29563
Quota Daemon on 10.70.37.101                N/A       N/A        Y       29571
Snapshot Daemon on 10.70.37.69              49162     0          Y       492
NFS Server on 10.70.37.69                   2049      0          Y       1574
Self-heal Daemon on 10.70.37.69             N/A       N/A        Y       1582
Quota Daemon on 10.70.37.69                 N/A       N/A        Y       1590
Snapshot Daemon on 10.70.35.173             49162     0          Y       28625
NFS Server on 10.70.35.173                  2049      0          Y       29690
Self-heal Daemon on 10.70.35.173            N/A       N/A        Y       29698
Quota Daemon on 10.70.35.173                N/A       N/A        Y       29713
Snapshot Daemon on 10.70.35.231             49162     0          Y       32340
NFS Server on 10.70.35.231                  2049      0          Y       1022
Self-heal Daemon on 10.70.35.231            N/A       N/A        Y       1033
Quota Daemon on 10.70.35.231                N/A       N/A        Y       1043
Snapshot Daemon on 10.70.35.176             49162     0          Y       1535
NFS Server on 10.70.35.176                  2049      0          Y       2627
Self-heal Daemon on 10.70.35.176            N/A       N/A        Y       2635
Quota Daemon on 10.70.35.176                N/A       N/A        Y       2659
Snapshot Daemon on 10.70.35.239             49160     0          Y       24287
NFS Server on 10.70.35.239                  2049      0          Y       25272
Self-heal Daemon on 10.70.35.239            N/A       N/A        Y       25280
Quota Daemon on 10.70.35.239                N/A       N/A        Y       25288
Snapshot Daemon on dhcp35-225.lab.eng.blr.redhat.com   49159  0  Y  24623
NFS Server on dhcp35-225.lab.eng.blr.redhat.com        2049   0  Y  25622
Self-heal Daemon on dhcp35-225.lab.eng.blr.redhat.com  N/A  N/A  Y  25630
Quota Daemon on dhcp35-225.lab.eng.blr.redhat.com      N/A  N/A  Y  25638
Snapshot Daemon on 10.70.35.11              49159     0          Y       24427
NFS Server on 10.70.35.11                   2049      0          Y       25455
Self-heal Daemon on 10.70.35.11             N/A       N/A        Y       25463
Quota Daemon on 10.70.35.11                 N/A       N/A        Y       25471
Snapshot Daemon on 10.70.35.133             49160     0          Y       24314
NFS Server on 10.70.35.133                  2049      0          Y       25296
Self-heal Daemon on 10.70.35.133            N/A       N/A        Y       25304
Quota Daemon on 10.70.35.133                N/A       N/A        Y       25312
Snapshot Daemon on 10.70.35.10              49162     0          Y       24521
NFS Server on 10.70.35.10                   2049      0          Y       25578
Self-heal Daemon on 10.70.35.10             N/A       N/A        Y       25586
Quota Daemon on 10.70.35.10                 N/A       N/A        Y       25594
Snapshot Daemon on 10.70.35.232             49162     0          Y       26759
NFS Server on 10.70.35.232                  2049      0          Y       27833
Self-heal Daemon on 10.70.35.232            N/A       N/A        Y       27841
Quota Daemon on 10.70.35.232                N/A       N/A        Y       27866
Snapshot Daemon on 10.70.35.163             49162     0          Y       28721
NFS Server on 10.70.35.163                  2049      0          Y       29819
Self-heal Daemon on 10.70.35.163            N/A       N/A        Y       29827
Quota Daemon on 10.70.35.163                N/A       N/A        Y       29852

Task Status of Volume npcvol
------------------------------------------------------------------------------
Task                 : Tier migration
ID                   : 524ad8fe-a743-47df-a4e9-edd2db05c60b
Status               : in progress

The following IOs were triggered before attach-tier and were still running while it was issued:

1) client1: created a 300MB file and started copying it to new files:
   for i in {2..50};do cp hlfile.1 hlfile.$i;done
2) client2: created a 50MB file and initiated continuous file renames:
   for i in {2..1000};do cp rename.1 rename.$i;done
3) client3: linux untar
4) copying a 3GB file to create new files in a loop:
   for i in {1..10};do cp File.mkv cheema$i.mkv;done
5) client4: created 10000 zero-byte files and then triggered removal of 5000 files so that it runs during attach-tier:
   [root@rhs-client30 zerobyte]# rm -rf zb{5000..10000}
sosreports of both the clients and the servers are available at:

[nchilaka@rhsqe-repo nchilaka]$ chmod -R 0777 bug.1306194
[nchilaka@rhsqe-repo nchilaka]$ pwd
/home/repo/sosreports/nchilaka
There is a blocking lock held on one of the bricks, which is not released, and all of the other clients are waiting on this lock. We couldn't look into the owner of the lock, because by the time we did, the ping timer had expired and the lock was released; after that the I/Os resumed. We need to find out which client acquired the lock and why it did not release it.
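The behaviour described above can be illustrated with a toy model (hypothetical structures and names; this is not gluster code): one client holds a blocking inodelk that it never releases, other clients queue behind it, and only when the holder's ping timer expires does the server drop its locks and grant the next waiter, at which point I/O resumes.

```python
# Toy model (hypothetical, not gluster code) of a brick-side blocking lock:
# one client holds the lock, others queue; ping-timer expiry on the holder
# releases its locks server-side and unblocks the next waiter.

from collections import deque

class LockTable:
    def __init__(self):
        self.holder = None
        self.waiters = deque()

    def inodelk(self, client):
        if self.holder is None:
            self.holder = client
            return "granted"
        self.waiters.append(client)
        return "blocked"           # the caller's I/O hangs here

    def ping_timeout(self, client):
        # server-side cleanup: drop locks held by the unresponsive client
        if self.holder == client:
            self.holder = self.waiters.popleft() if self.waiters else None

table = LockTable()
table.inodelk("client-A")                   # holder never unlocks
assert table.inodelk("client-B") == "blocked"
table.ping_timeout("client-A")              # ping timer expires on the holder
assert table.holder == "client-B"           # waiter is granted, I/O resumes
```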
When we tried to reproduce the issue, we saw "Stale file handle" errors after attach-tier. When doing RCA using gdb, we found that ESTALE is returned via svc_client (which is enabled by USS). So we disabled USS and re-ran the test. Now the mount points hang. On the server side, the volume got unexported:

[skoduri@skoduri ~]$ showmount -e 10.70.35.225
Export list for 10.70.35.225:
[skoduri@skoduri ~]$

Tracing back from the logs and the code:

[2016-02-11 13:26:02.540565] E [MSGID: 112070] [nfs3.c:896:nfs3_getattr] 0-nfs-nfsv3: Volume is disabled: finalvol
[2016-02-11 13:28:02.600425] E [MSGID: 112070] [nfs3.c:896:nfs3_getattr] 0-nfs-nfsv3: Volume is disabled: finalvol
[2016-02-11 13:28:02.600546] E [rpcsvc.c:565:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully

This message is logged when the volume is not in the nfs->initedxl[] list. That list is updated as part of nfs_startup_subvolume(), which is invoked on notify of GF_EVENT_CHILD_UP. So we suspect the nfs xlator did not receive this event, which left the volume unexported. Attaching the nfs log for further debugging.
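The export-list logic described above can be sketched as follows (hypothetical class and method names, loosely mirroring nfs->initedxl[] and the CHILD_UP notify path; not the actual nfs xlator code): a volume is serviced only after the nfs xlator has seen GF_EVENT_CHILD_UP for it, so a missed event leaves the volume "disabled" exactly as in the log above.

```python
# Sketch (hypothetical names) of why a missed GF_EVENT_CHILD_UP leaves a
# volume unexported: getattr checks membership in the inited-subvolume
# list, which is only populated from the CHILD_UP notify handler.

class NfsXlator:
    def __init__(self):
        self.initedxl = []         # subvolumes for which startup ran

    def notify_child_up(self, volume):
        # nfs_startup_subvolume() runs on this event in the real code path
        if volume not in self.initedxl:
            self.initedxl.append(volume)

    def getattr(self, volume):
        if volume not in self.initedxl:
            return "Volume is disabled: " + volume
        return "OK"

nfs = NfsXlator()
# CHILD_UP for "finalvol" never arrives after the graph switch:
assert nfs.getattr("finalvol") == "Volume is disabled: finalvol"
nfs.notify_child_up("finalvol")    # once the event is finally delivered...
assert nfs.getattr("finalvol") == "OK"
```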
During the nfs graph initialization, we do a lookup on the root. It looks like this lookup is blocked on a lock held by another nfs process. We need to figure out why the nfs server that acquired the lock failed to unlock it.
Rafi reported that stale locks or unlock failures are seen even when the first lookup on root happens. Here is the most likely RCA. Assume "tier-dht" has two dht subvols, "hot-dht" and "cold-dht", and that the stale lock is found on one of the bricks belonging to hot-dht.

1. Lookup on / on tier-dht.
2. The lookup is wound to the hashed subvol - cold-dht - and succeeds.
3. tier-dht figures out that / is a directory and does a lookup on both hot-dht and cold-dht.
4. On hot-dht, some subvols - say c1, c2 - are down. But the lookup still succeeds because other subvols (say c3, c4) are up.
5. The lookup on / succeeds on cold-dht.
6. tier-dht decides it needs to heal the layout of /. (From here, events on cold-dht are skipped as they are irrelevant to this RCA.)
7. tier-dht winds inodelk on hot-dht. hot-dht winds it to the first subvol in the layout list (say c1 in this case). Note that subvols with 0 ranges are stored at the beginning of the list. All the subvols whose lookup failed (say because of ENOTCONN) end up with 0 ranges. The relative order of subvols with 0 ranges is undefined and depends on whose lookup failed first.
8. c1 comes up.
9. hot-dht acquires the lock on c1.
10. tier-dht tries to refresh its layout of /. It winds lookup on hot and cold dhts again.
11. hot-dht sees that the layout's generation number lags behind the current generation number (as c1 came up after the lookup on / completed). It issues a fresh lookup and reconstructs the layout of /. Since c2 is still down, it is pushed to the beginning of the layout's subvol list.
12. tier-dht is done with healing. It issues unlock on hot-dht.
13. hot-dht winds the unlock call to the first subvol in the layout of /, which is now c2.
14. The unlock fails with ENOTCONN, and a stale lock is left on c1.
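The ordering bug in the sequence above can be modeled with a small sketch (hypothetical structures; this only mirrors the list-ordering logic described in the RCA, not the actual dht code): subvols with 0 ranges sit at the head of the layout's subvol list, lock and unlock are always wound to the head of the *current* layout, and a layout refresh between lock and unlock changes which subvol is at the head.

```python
# Minimal model of the RCA above (hypothetical structures, not gluster
# code). A layout keeps subvols with 0 ranges (e.g. ones whose lookup
# failed with ENOTCONN) at the head of its subvol list; inodelk/unlock
# are wound to the first subvol of whatever layout is current.

def build_layout(subvols, down):
    # down subvols get range 0 and are pushed to the front; the relative
    # order of 0-range subvols is undefined (here: insertion order)
    zero = [s for s in subvols if s in down]
    live = [s for s in subvols if s not in down]
    return zero + live

# steps 4 and 7: c1 and c2 are down when / is first looked up,
# so the inodelk is wound to c1 at the head of the layout
layout = build_layout(["c1", "c2", "c3", "c4"], down={"c1", "c2"})
lock_target = layout[0]

# steps 8-11: c1 comes up and the layout is refreshed; only c2 is
# still down, so c2 is now at the head of the list
layout = build_layout(["c1", "c2", "c3", "c4"], down={"c2"})
unlock_target = layout[0]

# steps 13-14: unlock goes to c2 (and fails with ENOTCONN) while the
# lock is still held on c1 - a stale lock is left behind
assert lock_target == "c1" and unlock_target == "c2"
```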
Steps 7 and 8 can be swapped for more clarity; the RCA is still valid.
Yes, I have included this as part of bug 1303045.
Workaround testing: I tested the workaround by restarting the volume using force. The IOs resumed, which means the workaround works, but there is a small side issue that has been discussed, for which bz#1309186 - file creates fail with "failed to open '<filename>': Too many levels of symbolic links" for file create/write when restarting NFS using vol start force - has been raised.
upstream patch : http://review.gluster.org/#/c/13492/
upstream master patch : http://review.gluster.org/#/c/13492/
upstream 3.7 patch    : http://review.gluster.org/#/c/14236/
downstream patch      : https://code.engineering.redhat.com/gerrit/73806
The IO hang during attach tier on an NFS mount has not been seen so far during the regression tests. Moving the bug to verified.
Looks perfect to me.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1240