Bug 1299320

Summary: Detach tier fails to complete on non-local hosts
Product: Red Hat Gluster Storage
Component: tier
Version: rhgs-3.1
Status: CLOSED NOTABUG
Severity: urgent
Priority: urgent
Keywords: ZStream
Whiteboard: tier-attach-detach
Reporter: Nag Pavan Chilakam <nchilaka>
Assignee: Dan Lambright <dlambrig>
QA Contact: Bala Konda Reddy M <bmekala>
CC: kramdoss, nbalacha, nchilaka, rcyriac, rhs-bugs, rkavunga, sankarshan, smohan
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Last Closed: 2016-08-18 12:44:33 UTC

Description Nag Pavan Chilakam 2016-01-18 06:23:19 UTC
Description of problem:
=======================
On a tiered volume holding about 100 GB of data in total, a detach tier was issued. The detach tier status showed as completed on the local node after about 2 hours, but never completed on the other nodes. In fact, not even a single file was shown as migrated on the other nodes. Even after a full day it was still in the same state, and a crash was found.
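For reference, detach progress is reported per node; a minimal way to observe the state described above (the volume name is a placeholder, and older 3.7.x builds use "gluster volume detach-tier" instead of the "tier" subcommand):

# Run from any node in the trusted storage pool; the status output has
# one row per node, so a node with zero migrated files shows up here.
gluster volume tier <volname> detach status
# The detach also appears in the volume's task list:
gluster volume status <volname>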

Version-Release number of selected component (if applicable):
====
3.7.5-14

Comment 2 Nithya Balachandran 2016-01-18 07:26:53 UTC
Please provide the gluster volume info output and sosreports for the systems on which this was seen.
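A minimal sketch of gathering the requested data, assuming access to each affected node (the volume name is a placeholder):

# On any node: volume layout
gluster volume info <volname>
# On each node that took part in the detach: diagnostics bundle
sosreport --batch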

Comment 3 Dan Lambright 2016-01-18 17:22:46 UTC
Need steps to reproduce this problem.

Comment 4 Nag Pavan Chilakam 2016-01-20 11:12:15 UTC
Hi Dan, I have seen this at least two more times on 3.7.5-16

steps (a CLI sketch of these steps follows below):
1) create a distributed-disperse (EC) volume
2) mount the volume on two or more fuse clients
3) enable quota
4) start I/O: on one client untar the linux kernel; on another, run dd in a loop to create about 50 files of ~300 MB each
5) attach a tier
6) if the previous I/O has completed, restart the same I/O in a different directory
7) while I/O is in progress, issue a detach tier start
8) the detach tier status shows as complete on the local node, but in progress on the other nodes.
Also, the volume status task list shows the detach as in progress.
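A sketch of the commands behind these steps; hostnames, brick paths, and file counts are placeholders, and the tier syntax shown is the "gluster volume tier" form (older 3.7.x builds use attach-tier/detach-tier):

# 1) distributed-disperse (EC) volume, e.g. 2 x (4+2), placeholder bricks
gluster volume create tvol disperse 6 redundancy 2 \
    server{1..6}:/bricks/c1 server{1..6}:/bricks/c2
gluster volume start tvol

# 2) fuse-mount on two or more clients
mount -t glusterfs server1:/tvol /mnt/tvol

# 3) enable quota
gluster volume quota tvol enable

# 4) I/O: kernel untar on one client, dd loop on another (~50 files, ~300 MB)
tar -xf linux.tar.xz -C /mnt/tvol/dir1                           # client 1
for i in $(seq 1 50); do
    dd if=/dev/zero of=/mnt/tvol/dir2/file$i bs=1M count=300     # client 2
done

# 5) attach a replicated hot tier
gluster volume tier tvol attach replica 2 server{1..4}:/bricks/hot

# 7) detach while I/O is still running, then watch per-node status
gluster volume tier tvol detach start
gluster volume tier tvol detach status
gluster volume status tvol        # the task list also shows the detach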

Comment 6 Nag Pavan Chilakam 2016-01-20 13:20:21 UTC
sosreports @ 
[nchilaka@rhsqe-repo nchilaka]$ pwd
/home/repo/sosreports/nchilaka/bug.1300301

Comment 7 Mohammed Rafi KC 2016-01-20 16:22:30 UTC
I tried to reproduce the issue using the steps given in comment 4, but could not reproduce it on my setup in two attempts.

My setup:
4 server nodes
2 clients

Volume Name: patchy
Type: Tier
Volume ID: 5641dc88-58eb-44c6-b848-54795b32ed9c
Status: Started
Number of Bricks: 10
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.42.212:/home/brick2/h2
Brick2: 10.70.43.110:/home/brick2/h1
Brick3: 10.70.43.100:/home/brick2/h1
Brick4: 10.70.42.212:/home/brick2/h1
Cold Tier:
Cold Tier Type : Disperse
Number of Bricks: 1 x (4 + 2) = 6
Brick5: 10.70.43.148:/home/brick1/c1
Brick6: 10.70.42.212:/home/brick1/c1
Brick7: 10.70.43.100:/home/brick1/c1
Brick8: 10.70.43.110:/home/brick1/c1
Brick9: 10.70.43.148:/home/brick1/c2
Brick10: 10.70.42.212:/home/brick1/c2

I will try a couple more times.

Comment 8 Nithya Balachandran 2016-01-21 11:01:59 UTC
Additional info here:

The crash reported in this BZ is the same as the one tracked by BZ# 1294774.

Comment 10 Mohammed Rafi KC 2016-02-18 11:31:25 UTC
Based on comment 8, changing the description.

Comment 11 Nag Pavan Chilakam 2016-03-01 13:15:06 UTC
Karthick, can you kindly check whether this is still happening?

Comment 12 krishnaram Karthick 2016-05-18 08:20:28 UTC
The reported issue is not seen in 3.1.3 builds. Validated this in build glusterfs-3.7.9-5.el7rhgs.x86_64.

Comment 13 Dan Lambright 2016-06-07 12:44:01 UTC
Per comment 12, can we close this?

Comment 15 krishnaram Karthick 2020-09-28 02:57:54 UTC
Clearing stale needinfos.