Description of problem:
==============
After running self heal on ECVOL, glustershd ends up consuming 46 GB in two days' time and fills the entire root volume.

Version-Release number of selected component (if applicable):
==============
glusterfs-api-3.7.1-14

Steps to Reproduce:
============
1. Create a 3 x (4+2) disperse volume, create 100K files and untar the Linux kernel.
2. Run a script that brings down two of the bricks, keeps populating data for 30 min, then runs rebalance and self heal, waits for 30 min, and repeats this 100 times (a shell sketch of this loop follows the Additional info below).

Actual results:
===============
Self heal is stuck, reporting "remote operation failed. Path" even though the file exists on the given client; it keeps logging the same messages and ends up filling the root volume.

Expected results:
=============
Self heal should complete.

Additional info:
==================
[root@rhs-client39 glusterfs]# gluster vol status ECVOL4
Status of volume: ECVOL4
Gluster process                                                    TCP Port  RDMA Port  Online  Pid
----------------------------------------------------------------------------------------------------
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick1/ECVOL4       49181     0          Y       28779
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick1/ECVOL4        49189     0          Y       3131
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick2/ECVOL4       49182     0          Y       28787
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick2/ECVOL4        49190     0          Y       3039
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick3/ECVOL4       49183     0          Y       28795
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick3/ECVOL4        49191     0          Y       3111
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick4/ECVOL4       49184     0          Y       28803
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick4/ECVOL4        49192     0          Y       3103
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick5/ECVOL4       49185     0          Y       28811
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick5/ECVOL4        49193     0          Y       3088
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick6/ECVOL4       49186     0          Y       28819
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick6/ECVOL4        49194     0          Y       3130
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick4/ECVOL4_add1  49203     0          Y       6373
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick4/ECVOL4_add1   49211     0          Y       3188
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick5/ECVOL4_add1  49204     0          Y       6391
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick5/ECVOL4_add1   49212     0          Y       3194
Brick rhs-client39.lab.eng.blr.redhat.com:/rhs/brick6/ECVOL4_add1  49205     0          Y       6409
Brick rhs-client9.lab.eng.blr.redhat.com:/rhs/brick6/ECVOL4_add1   49213     0          Y       3199
NFS Server on localhost                                            2049      0          Y       19405
Self-heal Daemon on localhost                                      N/A       N/A        Y       19413
NFS Server on rhs-client9.lab.eng.blr.redhat.com                   N/A       N/A        N       N/A
Self-heal Daemon on rhs-client9.lab.eng.blr.redhat.com             N/A       N/A        Y       5009

Task Status of Volume ECVOL4
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 85b7093c-5175-429d-892f-fbd39cf63876
Status               : in progress
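A minimal shell sketch of the reproduction loop described in the steps above, assumed to run on rhs-client39. The node names are taken from the vol status output, but the brick layout, mount point, file sizes and kernel tarball path are placeholders, not the exact test setup:

#!/bin/bash
# Rough sketch only: brick placement is simplified (hence "force"), and the
# mount point / tarball path are assumptions, not the original test values.

VOL=ECVOL
N1=rhs-client39.lab.eng.blr.redhat.com
N2=rhs-client9.lab.eng.blr.redhat.com
MNT=/mnt/$VOL

# 3 x (4+2) disperse volume = 18 bricks, 6 per disperse set, redundancy 2
gluster volume create $VOL disperse 6 redundancy 2 \
    $N1:/rhs/brick{1..9}/$VOL $N2:/rhs/brick{1..9}/$VOL force
gluster volume start $VOL
mkdir -p $MNT && mount -t glusterfs $N1:/$VOL $MNT

# Initial data set: 100K files plus a kernel untar
for i in $(seq 1 100000); do touch $MNT/file.$i; done
tar -xf /root/linux-kernel.tar -C $MNT    # tarball path is a placeholder

for iter in $(seq 1 100); do
    # Bring down two local bricks by killing their glusterfsd processes
    # (the brick path appears on the glusterfsd command line)
    pkill -f "brick1/$VOL"
    pkill -f "brick2/$VOL"

    # Keep populating data for 30 minutes
    timeout 30m bash -c "while true; do dd if=/dev/urandom of=$MNT/new.$iter.\$RANDOM bs=1M count=1 2>/dev/null; done"

    # Restart the downed bricks, then rebalance and trigger a full heal
    gluster volume start $VOL force
    gluster volume rebalance $VOL start
    gluster volume heal $VOL full

    sleep 1800    # wait another 30 minutes before the next iteration
done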
Logs are available @ /home/repo/sosreports/bug.1264804
Steps to reproduce:
1] Create a 4+2 EC volume, mount it on a client node [fuse] and bring down a brick.
2] Add 6 bricks [2 x (4 + 2)] and bring down a brick.
3] Create 20 files of 1M each and verify heal status.
4] Run rebalance and verify heal status.
5] Add 6 more bricks [3 x (4 + 2)] and run rebalance (see the CLI sketch below).

gluster v heal vol info is displaying correct information; however, it is not displaying info for files which were migrated to the new bricks. Attaching o/p.
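For reference, a hedged CLI sketch of steps 2-5 above; the host names (server1/server2), brick paths and mount point are placeholders, not the original setup:

# Assumes a 1 x (4+2) volume already exists and is mounted at $MNT.
VOL=ECVOL
MNT=/mnt/$VOL

# 2] Grow 1 x (4+2) -> 2 x (4+2): add one more 6-brick disperse set, then down a brick
gluster volume add-brick $VOL server1:/rhs/brick{7..9}/$VOL server2:/rhs/brick{7..9}/$VOL
pkill -f "brick7/$VOL"      # kill the glusterfsd of one brick to take it down

# 3] Create 20 files of 1M each and check heal status
for i in $(seq 1 20); do dd if=/dev/zero of=$MNT/file$i bs=1M count=1; done
gluster volume heal $VOL info

# 4] Run rebalance and re-check heal status
gluster volume rebalance $VOL start
gluster volume rebalance $VOL status
gluster volume heal $VOL info

# 5] Grow 2 x (4+2) -> 3 x (4+2) and rebalance again
gluster volume add-brick $VOL server1:/rhs/brick{10..12}/$VOL server2:/rhs/brick{10..12}/$VOL
gluster volume rebalance $VOL start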
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-0193.html