Description of problem:
IO hangs when a hot tier is attached to a disperse volume. Although IO resumes after a while, the duration of the hang grows with the hot tier configuration (2x2 vs 4x2) and the IO load: with a 2x2 hot tier and a kernel-untar IO load, IO hangs for about 2 minutes on average; with a 4x2 hot tier, it hangs for more than 5 minutes.

Volume Name: krk-vol
Type: Tier
Volume ID: 1bf70c56-56e1-4b9e-aeeb-e94a1fc42a28
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.142:/bricks/brick4/ht1
Brick2: 10.70.37.153:/bricks/brick4/ht1
Brick3: 10.70.37.194:/bricks/brick4/ht1
Brick4: 10.70.37.182:/bricks/brick4/ht1
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick5: 10.70.37.182:/bricks/brick0/v1
Brick6: 10.70.37.194:/bricks/brick0/v1
Brick7: 10.70.37.153:/bricks/brick0/v1
Brick8: 10.70.37.142:/bricks/brick0/v1
Brick9: 10.70.37.114:/bricks/brick0/v1
Brick10: 10.70.37.86:/bricks/brick0/v1
Brick11: 10.70.37.182:/bricks/brick1/v1
Brick12: 10.70.37.194:/bricks/brick1/v1
Brick13: 10.70.37.153:/bricks/brick1/v1
Brick14: 10.70.37.142:/bricks/brick1/v1
Brick15: 10.70.37.114:/bricks/brick1/v1
Brick16: 10.70.37.86:/bricks/brick1/v1
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
ganesha.enable: on
features.cache-invalidation: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable

Version-Release number of selected component (if applicable):
[root@dhcp37-114 ~]# rpm -qa | grep 'gluster'
glusterfs-cli-3.8.3-0.1.git2ea32d9.el7.centos.x86_64
glusterfs-server-3.8.3-0.1.git2ea32d9.el7.centos.x86_64
python-gluster-3.8.3-0.1.git2ea32d9.el7.centos.noarch
glusterfs-client-xlators-3.8.3-0.1.git2ea32d9.el7.centos.x86_64
glusterfs-3.8.3-0.1.git2ea32d9.el7.centos.x86_64
glusterfs-fuse-3.8.3-0.1.git2ea32d9.el7.centos.x86_64
nfs-ganesha-gluster-next.20160813.2f47e8a-1.el7.centos.x86_64
glusterfs-libs-3.8.3-0.1.git2ea32d9.el7.centos.x86_64
glusterfs-api-3.8.3-0.1.git2ea32d9.el7.centos.x86_64
glusterfs-ganesha-3.8.3-0.1.git2ea32d9.el7.centos.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a distributed-disperse volume.
2. Enable quota and set limits.
3. Start IO and attach a hot tier (see the command sketch under Additional info below).

Actual results:
The attach-tier operation succeeds, but IO hangs for a while.

Expected results:
No IO hang should be seen with the attach-tier operation.

Additional info:
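For reference, a minimal command-line sketch of the reproduction steps. The volume name "dd-vol", hostnames "server1".."server6", brick paths, mount point, and quota limit are placeholders, not the exact values from this setup; the attach step uses the gluster 3.8 "volume tier ... attach" syntax:

# create and start a 2 x (4 + 2) distributed-disperse volume
[root@server1 ~]# gluster volume create dd-vol disperse 6 redundancy 2 server{1..6}:/bricks/brick0/v1 server{1..6}:/bricks/brick1/v1
[root@server1 ~]# gluster volume start dd-vol

# enable quota and set a limit on the volume root
[root@server1 ~]# gluster volume quota dd-vol enable
[root@server1 ~]# gluster volume quota dd-vol limit-usage / 100GB

# mount the volume and start IO (e.g. untar a kernel source tree on the mount)
[root@server1 ~]# mount -t glusterfs server1:/dd-vol /mnt/dd-vol

# while IO is running, attach a 2x2 replicated hot tier
[root@server1 ~]# gluster volume tier dd-vol attach replica 2 server{1..4}:/bricks/brick4/ht1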
All 3.8.x bugs are now reported against version 3.8 (without .x). For more information, see http://www.gluster.org/pipermail/gluster-devel/2016-September/050859.html
Could you please collect a packet trace and the logs when the IO hang is seen?
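For example, something along these lines should capture both (assuming the default glusterd port 24007, the usual brick port range, and the default log directory; adjust the interface, port range, and output paths to the setup):

[root@server1 ~]# tcpdump -i any -s 0 -w /tmp/io-hang.pcap port 24007 or portrange 49152-49251
[root@server1 ~]# tar czf /tmp/gluster-logs.tar.gz /var/log/glusterfs/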
This bug is getting closed because the 3.8 version is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.
Clearing stale needinfos.