Description of problem:
-----------------------
When 'volume set' operations are repeated in a loop, glusterd gets OOM killed.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHS 3.0.4 Nightly build ( glusterfs-3.6.0.48-1.el6rhs )

How reproducible:
-----------------
Consistent

Steps to Reproduce:
-------------------
1. Create a 2 node 'Trusted Storage Pool' ( cluster )
2. Create 2 distribute volumes with 2 bricks ( 1 brick per node ) and start them
3. From NODE1, in one loop, keep repeating 'volume set' on one volume ( say vol1 )
4. From NODE2, in another loop, keep repeating 'volume set' on the other volume ( say vol2 )

Actual results:
---------------
After an hour, glusterd got OOM killed.

Expected results:
-----------------
glusterd should not get OOM killed
Created attachment 998243
glusterd statedump file

glusterd statedump file taken while the memory consumption was high. This was taken 10 minutes before glusterd got OOM killed.
Created attachment 998246
sosreport from the machine

sosreport taken from NODE1, where glusterd got OOM killed.
Reproducible test case 1:
-------------------------
0. Create a 2 node cluster
1. Create a distribute volume and start it.
2. Create a shell script as follows:
   while true; do gluster volume set <vol-name1> read-ahead on; done
3. Run the above script in the background.
4. From the RHSS command line, execute the following:
   while true; do gluster volume set <vol-name2> write-behind on; done
5. Monitor the memory consumed by glusterd
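
Step 5 does not say how the memory was monitored. One way to do it is sketched below, assuming a Linux node where `ps` is available. The function names `rss_kb` and `sample_rss` are hypothetical helpers for illustration and are not part of gluster.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not part of gluster.

# Print the resident set size (RSS) of a process, in KiB.
rss_kb() {
    ps -o rss= -p "$1" | tr -d ' '
}

# Print COUNT samples of "<epoch-seconds> <rss-KiB>" for PID,
# sleeping INTERVAL seconds between samples.
sample_rss() {
    local pid=$1 interval=$2 count=$3 i rss
    for ((i = 0; i < count; i++)); do
        rss=$(rss_kb "$pid")
        # Stop if the process has exited (for example, after an OOM kill).
        if [ -z "$rss" ]; then
            echo "process $pid is gone" >&2
            return 1
        fi
        printf '%s %s\n' "$(date +%s)" "$rss"
        sleep "$interval"
    done
}
```

For example, `sample_rss "$(pidof glusterd)" 60 120 > glusterd-rss.log` would record one sample per minute for two hours while the two 'volume set' loops are running.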
Setting a volume option repeatedly in a loop is not a viable use case, nor would any customer do that in a loop for such a prolonged time. Based on that, not raising this bug as a BLOCKER.
Moving back to Post state as this bug doesn't have all the acks to get to the ON_QA state.
As this bug has got all the acks now, moving it to ON_QA
Tested with RHGS 3.1 Nightly build ( glusterfs-3.7.1-6.el6rhs )

The memory consumption of glusterd still keeps growing when performing the operations mentioned in comment 3.

I have captured the memory consumed by glusterd at various points in time:

 PID USER  PR  NI  VIRT  RES   SHR  S  %CPU  %MEM  TIME+     COMMAND
9181 root  20   0  648m  51m   3984 S  20.6   0.6   0:10.72  glusterd
9181 root  20   0 2696m  2.1g  3984 S  53.8  27.0  16:53.95  glusterd
9181 root  20   0 3230m  2.6g  4344 S  43.8  33.4  23:39.32  glusterd
9181 root  20   0 3592m  3.0g  3984 S  50.5  38.0  29:07.59  glusterd
9181 root  20   0 4040m  3.4g  3984 S  54.8  43.6  36:59.16  glusterd
9181 root  20   0 4616m  4.0g  3984 S  47.8  51.2  49:08.89  glusterd

Marking this bug as FailedQA, as the memory usage of glusterd is still growing with volume set operations in a loop.
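
The samples above show steady growth but carry no wall-clock timestamps (TIME+ is accumulated CPU time). If the samples were instead logged with timestamps, the average growth rate could be computed as in the sketch below. It assumes a hypothetical log with one "<epoch-seconds> <rss-KiB>" pair per line; the function name `growth_kb_per_sec` is illustrative only.

```shell
#!/bin/bash
# Hypothetical helper for illustration; not part of gluster.
# Reads "<epoch-seconds> <rss-KiB>" lines from the given files (or stdin)
# and prints the average RSS growth in KiB per second between the first
# and the last sample. Fails if there are fewer than two samples or if
# no time has elapsed between them.
growth_kb_per_sec() {
    awk 'NR == 1 { t0 = $1; r0 = $2 }
         { t1 = $1; r1 = $2 }
         END {
             if (NR >= 2 && t1 > t0)
                 printf "%.2f\n", (r1 - r0) / (t1 - t0)
             else
                 exit 1
         }' "$@"
}
```

A rate that stays positive over a long run, rather than levelling off, would be consistent with a leak rather than a one-time allocation.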
We are not planning to fix this soon, since this is a use case which wouldn't be tried out in a production setup. Considering that, I am closing this bug. Please feel free to reopen if you think otherwise.