Copied from https://bugzilla.redhat.com/show_bug.cgi?id=1400780#c9
From the backtraces of core 1 in bz#1400780 and core 2 in bz#1401160, it is clear that the issue is hit only when ganesha is trying to remove an entry from its LRU list. By default, the LRU limit for ganesha's MD_CACHE is 25000, and in the gfapi layer it is 131072. We suspect the crash occurred because of a race between the removal of an entry in the ganesha layer and in the gluster layer.
I tried to reproduce a similar issue with 3 volumes (two 1x2 and one 1x1) and the number of clients varying from 4 to 7. I also tried lowering the LRU limit to 20 for ganesha and 100 for gluster. But I never hit the crash with ongoing I/O (dd and Linux untar run from different clients). In my setup the I/O ran continuously for at least 4 hours, then errored out with "no space left on the device".
But during cleanup (rm -rf on the same directories from different mounts) I consistently got a crash with a similar backtrace during LRU cleanup. The crashes are more easily reproduced with a lower LRU limit. When I increased the LRU limit to 150000 in ganesha, the crash was not seen (it may still crash eventually).
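The cleanup pattern described above (the same directory tree removed from several mounts at once) can be sketched roughly as follows. This is only an illustrative sketch, not the exact commands used in the test: the helper names populate and concurrent_cleanup are hypothetical, shutil.rmtree stands in for rm -rf, and the list of mount paths is an assumption (in a real run each path would be a separate NFS mount of the same ganesha export).

```python
import shutil
import threading
from pathlib import Path


def populate(root: Path, n_dirs: int = 4, n_files: int = 25) -> None:
    """Create a small file tree (hypothetical stand-in for the dd / untar data)."""
    for d in range(n_dirs):
        sub = root / f"dir{d}"
        sub.mkdir(parents=True, exist_ok=True)
        for f in range(n_files):
            (sub / f"file{f}").write_bytes(b"x" * 1024)


def concurrent_cleanup(mount_views: list[Path], subdir: str) -> None:
    """Remove <mount>/<subdir> from every mount at the same time (like parallel rm -rf)."""
    barrier = threading.Barrier(len(mount_views))

    def worker(mount: Path) -> None:
        # Start all removals together so they overlap as much as possible.
        barrier.wait()
        # ignore_errors=True: other workers delete the same entries underneath us,
        # so "no such file" errors are expected here.
        shutil.rmtree(mount / subdir, ignore_errors=True)

    threads = [threading.Thread(target=worker, args=(m,)) for m in mount_views]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

In an actual reproduction attempt, mount_views would list the client mount points of the same volume, and the ganesha and gluster LRU limits would be lowered beforehand as described above.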
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHEA-2017-0493.html