Bug 1456129
| Summary: | [Ganesha] : Ganesha dumps core on restarts,possible memory corruption. | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Ambarish <asoman> |
| Component: | nfs-ganesha | Assignee: | Soumya Koduri <skoduri> |
| Status: | CLOSED ERRATA | QA Contact: | Ambarish <asoman> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.3 | CC: | amukherj, bturner, jthottan, kkeithle, mbenjamin, rhinduja, rhs-bugs, skoduri, storage-qa-internal |
| Target Milestone: | --- | Keywords: | Regression |
| Target Release: | RHGS 3.3.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | nfs-ganesha-2.4.4-8 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-09-21 04:47:57 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1417151 | | |
I've got the steps to reproduce this:

> While setting up the cluster, keep the volume in stopped state.
> Enable Ganesha.
> Start the volume at the end.

The volume won't be exported on a few nodes. Restart Ganesha, and it will dump a core. I'll fetch whatever logs you need after bumping up the log level. (A command-level sketch of these steps is included after these comments.)

I've set up Ganesha on stopped volumes before. It had bugs in 3.2, which got fixed: https://bugzilla.redhat.com/show_bug.cgi?id=1393526. Marking this as a Regression.

From the bug description, we think showmount was probably issued too quickly, while the export was still being initialized; that may be why an empty list was returned. When nfs-ganesha is stopped at that point, there is still an active export reference, and that results in a double free while unloading the Gluster FSAL.

A fix for the reported crash has been posted upstream for review: https://review.gerrithub.io/#/c/364217/

If the issue of 'showmount' not showing the exports (even after a delay) persists, please file a new bug. Thanks!

Cannot be verified until the Test Blocker (https://bugzilla.redhat.com/show_bug.cgi?id=1460514) is fixed.

Works fine on nfs-ganesha-2.4.4-10. Verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:2779
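A minimal shell sketch of the reproduction sequence quoted above. It assumes a pre-existing volume named testvol (as in the description below) and the standard RHGS 3.x tooling (`gluster nfs-ganesha enable`, `systemctl`); the cluster/pacemaker prerequisites for Ganesha are assumed to be in place already.

```sh
# Reproduction sketch (assumption: volume "testvol" already exists;
# ganesha-ha prerequisites are already configured on all nodes).

gluster volume stop testvol                    # keep the volume in the stopped state
gluster nfs-ganesha enable                     # enable NFS-Ganesha across the cluster
gluster volume set testvol ganesha.enable on   # export the volume via Ganesha
gluster volume start testvol                   # start the volume only at the end

# On each node, check whether the export is actually visible.
showmount -e localhost

# On a node where the export list comes back empty, restart Ganesha;
# per this bug, ganesha.nfsd then aborts with "double free or corruption".
systemctl restart nfs-ganesha
```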
Description of problem:
-----------------------
While freshly building a cluster on my physical machines, I saw that post 'gluster ganesha enable', and exporting my already existing volume via Ganesha, 3/4 nodes did not show any exports via showmount.

In an attempt to get the export, I restarted Ganesha. It dumped a core. Here's the backtrace:

(gdb) bt
#0  0x00007f829b3c21d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f829b3c38c8 in __GI_abort () at abort.c:90
#2  0x00007f829b401f07 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f829b50cb48 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:196
#3  0x00007f829b409503 in malloc_printerr (ar_ptr=0x7f820c000020, ptr=<optimized out>, str=0x7f829b50cbb8 "double free or corruption (fasttop)", action=3) at malloc.c:5013
#4  _int_free (av=0x7f820c000020, p=<optimized out>, have_lock=0) at malloc.c:3835
#5  0x00007f829d82a091 in gsh_free (p=<optimized out>) at /usr/src/debug/nfs-ganesha-2.4.4/src/include/abstract_mem.h:271
#6  unregister_fsal (fsal_hdl=fsal_hdl@entry=0x7f8210ade3d0 <GlusterFS+112>) at /usr/src/debug/nfs-ganesha-2.4.4/src/FSAL/fsal_manager.c:466
#7  0x00007f82108cdfa6 in glusterfs_unload () at /usr/src/debug/nfs-ganesha-2.4.4/src/FSAL/FSAL_GLUSTER/main.c:172
#8  0x00007f829d5e085a in _dl_fini () at dl-fini.c:253
#9  0x00007f829b3c5a49 in __run_exit_handlers (status=status@entry=2, listp=0x7f829b7476c8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:77
#10 0x00007f829b3c5a95 in __GI_exit (status=status@entry=2) at exit.c:99
#11 0x00007f829d8e6335 in Fatal () at /usr/src/debug/nfs-ganesha-2.4.4/src/log/log_functions.c:312
#12 0x00007f829d8e6f68 in display_log_component_level (component=COMPONENT_FSAL, file=0x7f82108d9670 "/builddir/build/BUILD/nfs-ganesha-2.4.4/src/FSAL/FSAL_GLUSTER/main.c", line=181, function=0x7f82108d9800 <__func__.23734> "glusterfs_unload", level=NIV_FATAL, format=<optimized out>, arguments=arguments@entry=0x7f8211adeed0) at /usr/src/debug/nfs-ganesha-2.4.4/src/log/log_functions.c:1514
#13 0x00007f829d8e6ffa in DisplayLogComponentLevel (component=component@entry=COMPONENT_FSAL, file=file@entry=0x7f82108d9670 "/builddir/build/BUILD/nfs-ganesha-2.4.4/src/FSAL/FSAL_GLUSTER/main.c", line=line@entry=181, function=function@entry=0x7f82108d9800 <__func__.23734> "glusterfs_unload", level=level@entry=NIV_FATAL, format=format@entry=0x7f82108d9738 "FSAL Gluster still contains active shares. Dying.. \n") at /usr/src/debug/nfs-ganesha-2.4.4/src/log/log_functions.c:1688
#14 0x00007f82108ce02c in glusterfs_unload () at /usr/src/debug/nfs-ganesha-2.4.4/src/FSAL/FSAL_GLUSTER/main.c:180
#15 0x00007f829d5e54b9 in _dl_close_worker (map=map@entry=0x7f820c0030a0) at dl-close.c:266
#16 0x00007f829d5e603c in _dl_close (_map=0x7f820c0030a0) at dl-close.c:776
#17 0x00007f829d5dfff4 in _dl_catch_error (objname=0x7f81cc007c40, errstring=0x7f81cc007c48, mallocedp=0x7f81cc007c38, operate=0x7f829c936070 <dlclose_doit>, args=0x7f820c0030a0) at dl-error.c:177
#18 0x00007f829c9365bd in _dlerror_run (operate=operate@entry=0x7f829c936070 <dlclose_doit>, args=0x7f820c0030a0) at dlerror.c:163
#19 0x00007f829c93609f in __dlclose (handle=<optimized out>) at dlclose.c:47
#20 0x00007f829d82ddef in unload_fsal (fsal_hdl=0x7f8210ade3d0 <GlusterFS+112>) at /usr/src/debug/nfs-ganesha-2.4.4/src/FSAL/default_methods.c:111
#21 0x00007f829d82f30d in destroy_fsals () at /usr/src/debug/nfs-ganesha-2.4.4/src/FSAL/fsal_destroyer.c:222
#22 0x00007f829d856ebf in do_shutdown () at /usr/src/debug/nfs-ganesha-2.4.4/src/MainNFSD/nfs_admin_thread.c:446
#23 admin_thread (UnusedArg=<optimized out>) at /usr/src/debug/nfs-ganesha-2.4.4/src/MainNFSD/nfs_admin_thread.c:466
#24 0x00007f829bdb5dc5 in start_thread (arg=0x7f8211ae0700) at pthread_create.c:308
#25 0x00007f829b48473d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb)

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
nfs-ganesha-2.4.4-6.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-25.el7rhgs.x86_64

How reproducible:
-----------------
Fairly

Additional info:
----------------
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: b7c40c38-fa47-4e18-b296-1bed9b963bd9
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick0
Brick2: gqas005.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick1
Brick3: gqas006.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick2
Brick4: gqas008.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick3
Options Reconfigured:
ganesha.enable: on
features.cache-invalidation: on
server.allow-insecure: on
performance.stat-prefetch: off
transport.address-family: inet
nfs.disable: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable
[root@gqas013 tmp]#