Bug 1192075

Summary: libgfapi clients hang if glfs_fini is called before glfs_init
Product: [Community] GlusterFS
Reporter: Niels de Vos <ndevos>
Component: libgfapi
Assignee: bugs <bugs>
Status: CLOSED EOL
QA Contact: Sudhir D <sdharane>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.5.3
CC: bharata.rao, bugs, c.affolter, pgurusid, sankarshan, sdharane
Target Milestone: ---
Keywords: Patch, Triaged
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1091335
Environment:
Last Closed: 2016-06-17 15:57:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1091335
Bug Blocks: 1317021

Description Niels de Vos 2015-02-12 15:19:54 UTC
+++ This bug was initially created as a clone of Bug #1091335 +++
+++                                                           +++
+++ Use this bug to backport the fix to release-3.5.          +++

Description of problem:

(Taken from http://lists.gnu.org/archive/html/gluster-devel/2014-04/msg00179.html)

Hi,

In QEMU, we initialize gfapi in the following manner:

********************
glfs = glfs_new(volname);
if (!glfs)
   goto out;
if (glfs_set_volfile_server(glfs, "tcp", server, port) < 0)
   goto out;
if (glfs_set_logging(glfs, logfile, loglevel) < 0)
   goto out;
if (glfs_init(glfs))
   goto out;

...

out:
if (glfs)
   glfs_fini(glfs);
*********************

Now if either glfs_set_volfile_server() or glfs_set_logging() fails, we end up calling glfs_fini(), which eventually hangs in glfs_lock().

#0  0x00007ffff554a595 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffff79d312e in glfs_lock (fs=0x555556331310) at glfs-internal.h:176
#2  0x00007ffff79d5291 in glfs_active_subvol (fs=0x555556331310) at glfs-resolve.c:811
#3  0x00007ffff79c9f23 in glfs_fini (fs=0x555556331310) at glfs.c:753

Note that we haven't done glfs_init() in this failure case.

- Is this failure expected? If so, what is the recommended way of releasing the glfs object?
- Does glfs_fini() depend on glfs_init() having completed successfully?
- Since the QEMU-GlusterFS driver was developed when libgfapi was very new, can Gluster developers review the order of the glfs_* calls we make in QEMU and suggest any changes, improvements, or additions, given that libgfapi has since seen a lot of development?

Regards,
Bharata.



--- Additional comment from Poornima G on 2014-05-02 09:32:22 CEST ---



--- Additional comment from Anand Avati on 2014-05-23 09:38:39 CEST ---

REVIEW: http://review.gluster.org/7857 (glfs_fini: Fix a possible hang in glfs_fini.) posted (#1) for review on master by Poornima G (pgurusid)

--- Additional comment from Anand Avati on 2014-06-02 20:09:11 CEST ---

COMMIT: http://review.gluster.org/7857 committed in master by Anand Avati (avati) 
------
commit a96350fa2b68626b8592d5cbd67405e4d8416cca
Author: Poornima G <pgurusid>
Date:   Fri May 23 12:58:56 2014 +0530

    glfs_fini: Fix a possible hang in glfs_fini.
    
    glfs_fini is called when there is a failure in glfs_new,
    glfs_init etc. If an application sees a failure in glfs_new
    and calls glfs_fini, it will result in hang in glfs_fini.
    
    Fixed the hang.
    
    Change-Id: I80b52cd76d1d7f3fe9a10a91b7226d54176a8982
    BUG: 1091335
    Signed-off-by: Poornima G <pgurusid>
    Reviewed-on: http://review.gluster.org/7857
    Reviewed-by: soumya k <skoduri>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>
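
The diff itself is not quoted here, so the following is only a conceptual sketch of what such a fix amounts to: glfs_fini() must not block on the initialization condition when initialization never completed, e.g. by consulting the init flag before taking the path through glfs_active_subvol(). All names here (fake_glfs, fake_active_subvol, fake_glfs_fini) are illustrative stand-ins, not the actual patched code.

```c
#include <stddef.h>

/* Illustrative stand-ins only -- not the real gfapi structures. */
struct fake_glfs { int init; };
typedef struct { int dummy; } xlator_t;

/* In the buggy path, the real glfs_active_subvol() calls glfs_lock(),
 * which waits unconditionally for fs->init and hangs if glfs_init()
 * never ran. Here we simply refuse instead of blocking. */
static xlator_t *fake_active_subvol(struct fake_glfs *fs)
{
    static xlator_t subvol;
    return fs->init ? &subvol : NULL;
}

/* Sketch of a hang-safe fini: only walk the active-subvolume teardown
 * path when initialization actually completed; release the handle's
 * own resources either way. */
int fake_glfs_fini(struct fake_glfs *fs)
{
    if (fs == NULL)
        return -1;

    if (fs->init) {
        xlator_t *subvol = fake_active_subvol(fs);
        /* ... flush caches, tear down the graph, disconnect ... */
        (void)subvol;
    }
    /* free mutexes, logs, and the fs object itself in both cases */
    return 0;
}
```

The key property is that fake_glfs_fini() returns cleanly whether or not initialization happened, which is the behaviour the commit message describes.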

Comment 1 Niels de Vos 2015-09-15 14:09:55 UTC
No patches that would resolve this bug were submitted in time for the glusterfs-3.5.6 release. This bug report has been moved to the glusterfs-3.5.7 release for tracking; submitting patches/backports is very much appreciated.

Comment 2 Niels de Vos 2016-06-17 15:57:15 UTC
This bug is being closed because the 3.5 release is marked End-Of-Life. There will be no further updates to this version. If you are still facing this issue in a more current release, please open a new bug against a version that still receives bugfixes.