Bug 1238118 - nfs-ganesha: coredump for ganesha process post executing the volume start twice
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: nfs-ganesha
Version: 3.1
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.1.1
Assigned To: Jiffin
QA Contact: Apeksha
Keywords: ZStream
Depends On:
Blocks: 1216951 1251815
Reported: 2015-07-01 05:00 EDT by Saurabh
Modified: 2016-01-19 01:15 EST
CC List: 14 users

See Also:
Fixed In Version: glusterfs-3.7.1-12
Doc Type: Bug Fix
Doc Text:
Previously, when DBus signals were sent multiple times in succession for a volume that was already exported, the NFS-Ganesha service crashed. With this fix, the NFS-Ganesha service does not crash.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-10-05 03:17:20 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
coredump of nfs-ganesha process (4.41 MB, application/x-xz)
2015-07-01 05:00 EDT, Saurabh


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2015:1845
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat Gluster Storage 3.1 update
Last Updated: 2015-10-05 07:06:22 EDT

Description Saurabh 2015-07-01 05:00:21 EDT
Created attachment 1044954
coredump of nfs-ganesha process

Description of problem:
I executed volume start twice and saw an nfs-ganesha coredump.

(gdb) bt
#0  0x00000030ba632625 in raise () from /lib64/libc.so.6
#1  0x00000030ba633e05 in abort () from /lib64/libc.so.6
#2  0x00000030ba62b74e in __assert_fail_base () from /lib64/libc.so.6
#3  0x00000030ba62b810 in __assert_fail () from /lib64/libc.so.6
#4  0x000000000051a2c1 in free_export ()
#5  0x00000000005070b9 in export_init ()
#6  0x0000000000534597 in proc_block ()
#7  0x000000000053526d in load_config_from_node ()
#8  0x000000000051c393 in gsh_export_addexport ()
#9  0x000000000052ed50 in dbus_message_entrypoint ()
#10 0x00000030bda1cefe in ?? () from /lib64/libdbus-1.so.3
#11 0x00000030bda10b4c in dbus_connection_dispatch () from /lib64/libdbus-1.so.3
#12 0x00000030bda10dd9 in ?? () from /lib64/libdbus-1.so.3
#13 0x000000000052f913 in gsh_dbus_thread ()
#14 0x00000030baa07a51 in start_thread () from /lib64/libpthread.so.0
#15 0x00000030ba6e896d in clone () from /lib64/libc.so.6


Version-Release number of selected component (if applicable):
glusterfs-3.7.1-6.el6rhs.x86_64
nfs-ganesha-2.2.0-3.el6rhs.x86_64

How reproducible:
always


Actual results:
Coredump, as shown in the description section.

ganesha.log:
01/07/2015 13:59:32 : epoch 559386bb : nfs11 : ganesha.nfsd-13136[dbus_heartbeat] glusterfs_create_export :FSAL :EVENT :Volume vol2 exported at : '/'
01/07/2015 13:59:37 : epoch 559386bb : nfs11 : ganesha.nfsd-13136[dbus_heartbeat] export_commit_common :CONFIG :CRIT :Pseudo path (/vol2) is a duplicate
01/07/2015 13:59:37 : epoch 559386bb : nfs11 : ganesha.nfsd-13136[dbus_heartbeat] export_commit_common :CONFIG :CRIT :Duplicate export id = 12
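
The two CRIT lines above are what precede the crash. For anyone triaging similar reports, here is a minimal Python sketch that pulls such lines out of a ganesha.log. The regular expression is an assumption derived only from the three sample lines in this report; other ganesha.log lines may use a layout it does not match. The names `LOG_LINE`, `parse_line` and `duplicate_export_errors` are made up for this example.

```python
import re

# Layout assumed from the three sample lines quoted in this bug report:
# date time : epoch <hex> : <host> : <proc>[<thread>] <func> :<COMP> :<LEVEL> :<msg>
LOG_LINE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2}) : "
    r"epoch (?P<epoch>[0-9a-f]+) : (?P<host>\S+) : "
    r"(?P<proc>\S+)\[(?P<thread>[^\]]+)\] "
    r"(?P<func>\S+) :(?P<component>\w+) :(?P<level>\w+) :(?P<msg>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def duplicate_export_errors(lines):
    """Return parsed CRIT records whose message mentions a duplicate."""
    records = []
    for line in lines:
        record = parse_line(line)
        if record and record["level"] == "CRIT" and "duplicate" in record["msg"].lower():
            records.append(record)
    return records


# The three lines quoted in this report.
SAMPLE = [
    "01/07/2015 13:59:32 : epoch 559386bb : nfs11 : ganesha.nfsd-13136[dbus_heartbeat] glusterfs_create_export :FSAL :EVENT :Volume vol2 exported at : '/'",
    "01/07/2015 13:59:37 : epoch 559386bb : nfs11 : ganesha.nfsd-13136[dbus_heartbeat] export_commit_common :CONFIG :CRIT :Pseudo path (/vol2) is a duplicate",
    "01/07/2015 13:59:37 : epoch 559386bb : nfs11 : ganesha.nfsd-13136[dbus_heartbeat] export_commit_common :CONFIG :CRIT :Duplicate export id = 12",
]

if __name__ == "__main__":
    for record in duplicate_export_errors(SAMPLE):
        print(record["time"], record["func"], record["msg"])
```

To scan a real log, pass an open file object instead of `SAMPLE`; lines that do not match the assumed layout are skipped.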


Expected results:
Volume start is not supposed to cause any issue with the nfs-ganesha process.
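
To make the expected behaviour concrete, here is a toy Python model of an export table that rejects a repeated add request with an ordinary error and leaves the existing entry untouched. This is purely illustrative: it is not NFS-Ganesha code and does not describe the actual fix. `ExportRegistry`, `DuplicateExport` and `add_export` are hypothetical names; only the export id (12), the pseudo path (/vol2) and the wording of the two messages are taken from the log in this report.

```python
class DuplicateExport(Exception):
    """Raised when an export id or pseudo path is already registered."""


class ExportRegistry:
    """Hypothetical in-memory export table (not NFS-Ganesha's data structure)."""

    def __init__(self):
        self._by_id = {}
        self._by_pseudo = {}

    def add_export(self, export_id, pseudo_path):
        # Check both keys before changing anything, so a rejected request
        # cannot leave a half-registered export behind.
        if export_id in self._by_id:
            raise DuplicateExport(f"Duplicate export id = {export_id}")
        if pseudo_path in self._by_pseudo:
            raise DuplicateExport(f"Pseudo path ({pseudo_path}) is a duplicate")
        self._by_id[export_id] = pseudo_path
        self._by_pseudo[pseudo_path] = export_id

    def exports(self):
        """Return a copy of the id -> pseudo path mapping."""
        return dict(self._by_id)


if __name__ == "__main__":
    registry = ExportRegistry()
    registry.add_export(12, "/vol2")
    try:
        # Second request for the same volume, as when volume start runs twice.
        registry.add_export(12, "/vol2")
    except DuplicateExport as err:
        print("rejected:", err)
    print(registry.exports())
```

The point of the model is only that the second request fails cleanly and the first export stays in place.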

Additional info:
Comment 2 Apeksha 2015-07-02 08:40:12 EDT
Hit the same backtrace again, but with a different scenario:
(gdb) bt
#0  0x0000003a9dc32625 in raise () from /lib64/libc.so.6
#1  0x0000003a9dc33e05 in abort () from /lib64/libc.so.6
#2  0x0000003a9dc2b74e in __assert_fail_base () from /lib64/libc.so.6
#3  0x0000003a9dc2b810 in __assert_fail () from /lib64/libc.so.6
#4  0x000000000051a2c1 in free_export ()
#5  0x00000000005070b9 in export_init ()
#6  0x0000000000534597 in proc_block ()
#7  0x000000000053526d in load_config_from_node ()
#8  0x000000000051c393 in gsh_export_addexport ()
#9  0x000000000052ed50 in dbus_message_entrypoint ()
#10 0x0000003aa041cefe in ?? () from /lib64/libdbus-1.so.3
#11 0x0000003aa0410b4c in dbus_connection_dispatch () from /lib64/libdbus-1.so.3
#12 0x0000003aa0410dd9 in ?? () from /lib64/libdbus-1.so.3
#13 0x000000000052f913 in gsh_dbus_thread ()
#14 0x0000003a9e007a51 in start_thread () from /lib64/libpthread.so.0
#15 0x0000003a9dce896d in clone () from /lib64/libc.so.6


Steps:
Run an automated test for self-heal:

1. Create a 6x2 dist-rep volume, enable ganesha and mount it
2. While creating some directories/files, kill 1 brick process from each replica pair
3. Allow I/O to complete and start self-heal
4. Self-heal completes successfully
5. Create a new volume again, mount fails
6. Ganesha process crashes on all the nodes


Not raising a new bug since the backtrace is the same.
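
The two backtraces differ in their shared-library load addresses, so a plain text diff will not show them as equal. A minimal Python sketch of comparing them by function name and library only is given below. The frame pattern is an assumption based on the symbol-less `gdb` output quoted in this report (no arguments, no source lines), and `FRAME`, `frames` and `same_call_path` are made-up names. Only frames #0 to #5 of each backtrace are embedded, to keep the example short.

```python
import re

# Matches frames such as:
#   #0  0x00000030ba632625 in raise () from /lib64/libc.so.6
#   #4  0x000000000051a2c1 in free_export ()
FRAME = re.compile(
    r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\S+) \(\)(?: from (\S+))?"
)


def frames(bt_text):
    """Return a list of (function, library) pairs, ignoring addresses."""
    result = []
    for line in bt_text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            result.append((match.group(2), match.group(3)))
    return result


def same_call_path(bt_a, bt_b):
    """True when both backtraces list the same functions in the same order."""
    return frames(bt_a) == frames(bt_b)


# Frames #0-#5 copied from the description and from comment 2.
BT_DESCRIPTION = """\
#0  0x00000030ba632625 in raise () from /lib64/libc.so.6
#1  0x00000030ba633e05 in abort () from /lib64/libc.so.6
#2  0x00000030ba62b74e in __assert_fail_base () from /lib64/libc.so.6
#3  0x00000030ba62b810 in __assert_fail () from /lib64/libc.so.6
#4  0x000000000051a2c1 in free_export ()
#5  0x00000000005070b9 in export_init ()
"""

BT_COMMENT_2 = """\
#0  0x0000003a9dc32625 in raise () from /lib64/libc.so.6
#1  0x0000003a9dc33e05 in abort () from /lib64/libc.so.6
#2  0x0000003a9dc2b74e in __assert_fail_base () from /lib64/libc.so.6
#3  0x0000003a9dc2b810 in __assert_fail () from /lib64/libc.so.6
#4  0x000000000051a2c1 in free_export ()
#5  0x00000000005070b9 in export_init ()
"""

if __name__ == "__main__":
    print(same_call_path(BT_DESCRIPTION, BT_COMMENT_2))
```

Matching call paths only suggest a common failure point; they do not by themselves prove the same root cause.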
Comment 3 Soumya Koduri 2015-07-06 06:01:00 EDT
This issue is fixed as part of bug 1237053.
Comment 4 Meghana 2015-07-06 06:07:31 EDT
This issue is fixed, but the fix works only when SELinux is in permissive mode.
Comment 6 Meghana 2015-07-09 03:28:02 EDT
With the SELinux workaround, Apeksha has been able to verify the bug. Keeping the bug state the same until the SELinux fix is available in a build for RHEL 6.7.
Comment 7 Niels de Vos 2015-07-09 03:41:34 EDT
(In reply to Soumya Koduri from comment #3)
> This issue is fixed as part of bug 1237053

Can this be closed as a duplicate?
Comment 8 Apeksha 2015-07-09 04:26:45 EDT
In enforcing mode, after using the workaround mentioned in bug https://bugzilla.redhat.com/show_bug.cgi?id=1239017, I don't see any AVC showmount errors.
I also tried executing volume force multiple times; the ganesha process didn't crash.
Comment 9 Meghana 2015-07-13 02:11:24 EDT
The fix is already available downstream, but it does not work when SELinux is in enforcing mode. It works fine in permissive mode. The fix will be usable in enforcing mode once the next SELinux build is available.
Comment 10 monti lawrence 2015-07-23 10:45:50 EDT
Doc text is edited. Please sign off to be included in Known Issues.
Comment 11 Soumya Koduri 2015-07-27 05:29:56 EDT
Updated the doc text. Kindly verify the same.
Comment 12 Anjana Suparna Sriram 2015-07-27 14:24:00 EDT
Doc text is edited. Please sign off to be included in Known Issues.
Comment 13 Soumya Koduri 2015-07-28 03:36:32 EDT
Doc text looks good to me.
Comment 15 Rejy M Cyriac 2015-08-10 00:13:28 EDT
Attached to RHGS 3.1 Update 1 (z-stream) Tracker BZ
Comment 16 Meghana 2015-08-13 03:24:21 EDT
Since the SELinux fixes are available, moving it to ON_QA.
Comment 17 Apeksha 2015-08-27 02:57:31 EDT
Verified on glusterfs-3.7.1-12.el7rhgs.x86_64
Comment 19 Divya 2015-09-29 02:02:39 EDT
Jiffin,

Could you review and sign-off the edited doc text?
Comment 20 Jiffin 2015-09-29 02:24:14 EDT
Looks good to me, verified
Comment 22 errata-xmlrpc 2015-10-05 03:17:20 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1845.html
