Created attachment 1014624
core file

Description of problem:
For the export/netgroup feature, a thread runs in the background that fetches/refreshes the contents of the export/netgroup file. This thread is not cleaned up properly.

Version-Release number of selected component (if applicable):
3.7

How reproducible:
rarely

Actual results:
A coredump is generated.

Expected results:
Should not reach such race conditions.

Additional info:
REVIEW: http://review.gluster.org/10250 (nfs : fix for coredump caused by export/netgroup feature in the regression failure) posted (#1) for review on master by jiffin tony Thottan (jthottan)
REVIEW: http://review.gluster.org/10250 (nfs : fix for coredump caused by export/netgroup feature in the regression failure) posted (#2) for review on master by Humble Devassy Chirammal (humble.devassy)
REVIEW: http://review.gluster.org/10250 (nfs : fix for coredump caused by export/netgroup feature in the regression failure) posted (#3) for review on master by Vijay Bellur (vbellur)
From my understanding and talking to Jiffin, the refresh-thread is active while the nfs-server is exiting. This causes a (partial) cleanup of some of the structures, which are later on accessed again, resulting in the segfault.

I see two options to fix this permanently, but need to look into the details to judge what fix is most reasonable:
1. replace the dict_t structures by rcu-lists
2. lock the structures while the refresh thread is running and stop the thread (+wait) on shutting down the nfs-server
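To illustrate option 2, here is a minimal sketch using plain POSIX threads. It is not the GlusterFS code and not the patch under review: `struct auth_cache`, `cache_start()`, `cache_stop()` and `refresh_loop()` are hypothetical names, and the `entries` array only stands in for the parsed export/netgroup data. The idea is that the refresh thread touches the shared data only while holding the lock, and shutdown sets a stop flag, wakes the thread, and waits for it to exit before freeing anything.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for the export/netgroup state (not a GlusterFS type). */
struct auth_cache {
    pthread_mutex_t lock;
    pthread_cond_t  wake;      /* signalled to interrupt the refresh sleep */
    pthread_t       thread;
    bool            stop;      /* set by cache_stop(), read by the thread */
    int            *entries;   /* placeholder for the parsed file contents */
    unsigned        refreshes; /* number of refresh passes that have run */
};

static void *refresh_loop(void *arg)
{
    struct auth_cache *c = arg;

    pthread_mutex_lock(&c->lock);
    while (!c->stop) {
        /* Refresh under the lock, so shutdown cannot free the data mid-pass. */
        c->entries[0]++;
        c->refreshes++;

        /* Sleep for the interval (10 ms here), waking early on shutdown. */
        struct timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_nsec += 10L * 1000 * 1000;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_nsec -= 1000000000L;
            deadline.tv_sec += 1;
        }
        pthread_cond_timedwait(&c->wake, &c->lock, &deadline);
    }
    pthread_mutex_unlock(&c->lock);
    return NULL;
}

/* Allocate the state and start the refresh thread. Returns 0 on success. */
int cache_start(struct auth_cache *c)
{
    c->stop = false;
    c->refreshes = 0;
    c->entries = calloc(1, sizeof(*c->entries));
    if (c->entries == NULL)
        return -1;
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->wake, NULL);
    if (pthread_create(&c->thread, NULL, refresh_loop, c) != 0) {
        pthread_cond_destroy(&c->wake);
        pthread_mutex_destroy(&c->lock);
        free(c->entries);
        c->entries = NULL;
        return -1;
    }
    return 0;
}

/* Stop the thread, wait for it, and only then free the shared state. */
int cache_stop(struct auth_cache *c)
{
    pthread_mutex_lock(&c->lock);
    c->stop = true;
    pthread_cond_signal(&c->wake);
    pthread_mutex_unlock(&c->lock);

    if (pthread_join(c->thread, NULL) != 0)
        return -1;

    /* The thread has exited, so nothing can touch entries any more. */
    free(c->entries);
    c->entries = NULL;
    pthread_cond_destroy(&c->wake);
    pthread_mutex_destroy(&c->lock);
    return 0;
}
```

The ordering in `cache_stop()` is the important part: the join happens before the free, which is the "stop the thread (+wait)" step from option 2. Build with `-pthread`.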
REVIEW: http://review.gluster.org/10250 (nfs : fix for coredump caused by export/netgroup feature in the regression failure) posted (#4) for review on master by jiffin tony Thottan (jthottan)
REVIEW: http://review.gluster.org/10250 (nfs : fix for coredump caused by export/netgroup feature in the regression) posted (#5) for review on master by jiffin tony Thottan (jthottan)
REVIEW: http://review.gluster.org/10250 (nfs : fix for coredump caused by export/netgroup feature in the regression) posted (#6) for review on master by Vijay Bellur (vbellur)
Fix for this bug is already made in a GlusterFS release. The cloned BZ has details of the fix and the release. Hence closing this mainline BZ.
The fix for this BZ is already present in a GlusterFS release. A clone of this BZ has been fixed in a GlusterFS release and closed. Hence closing this mainline BZ as well.
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user