Bug 1417915 - Hangs on 32 bit systems since 3.9.0
Summary: Hangs on 32 bit systems since 3.9.0
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: md-cache
Version: 3.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Niels de Vos
QA Contact:
URL:
Whiteboard:
Depends On: 1417913
Blocks: glusterfs-3.10.0 1422364
 
Reported: 2017-01-31 12:21 UTC by Niels de Vos
Modified: 2017-03-06 17:44 UTC
CC List: 5 users

Fixed In Version: glusterfs-3.10.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1417913
: 1422364 (view as bug list)
Environment:
Last Closed: 2017-02-27 15:31:04 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Niels de Vos 2017-01-31 12:21:48 UTC
+++ This bug was initially created as a clone of Bug #1417913 +++

+++ This bug was initially created as a clone of Bug #1416684 +++

Description of problem:
Since 3.9.0, the command
/usr/sbin/glusterfs --debug --volfile-server=SERVER --volfile-id=VOLID /mount/path
hangs on 32-bit systems.

Additional info:

It hangs in INCREMENT_ATOMIC, which was introduced by this commit:
commit 3cc7f6588c281846f8c590553da03dd16f150e8a
Author: Poornima G <pgurusid>
Date:   Wed Aug 17 12:55:37 2016 +0530

    md-cache: Add cache hit and miss counters
    
    These counters can be accessed either by .meta interface
    or statedump.
    
    From meta: cat on the private file in md-cache directory.
    Eg: cat /mnt/glusterfs/0/.meta/graphs/active/patchy-md-cache/private
    [performance/md-cache.patchy-md-cache]


         if (xdata) {
                 ret = mdc_inode_xatt_get (this, loc->inode, &xattr_rsp);
-                if (ret != 0)
+                if (ret != 0) {
+                        INCREMENT_ATOMIC (conf->mdc_counter.lock,
+                                          conf->mdc_counter.xattr_miss);
                         goto uncached;
+                }
 

Commenting out "!defined(__i386__)" in the INCREMENT_ATOMIC definition:

+#if (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 1)) && !defined(__i386__)
+# define INCREMENT_ATOMIC(lk, op) __sync_add_and_fetch(&op, 1)

fixes the problem.
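
For reference, a minimal sketch of how such a macro is typically structured (an illustration, not the verbatim GlusterFS source): the GCC builtin is used only when the compiler check passes and the build is not i386; the fallback path takes a spinlock, which is why that lock must be initialized with LOCK_INIT before the first increment.

#if (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 1)) && !defined(__i386__)
/* fast path: GCC atomic builtin, no lock needed */
# define INCREMENT_ATOMIC(lk, op) __sync_add_and_fetch (&op, 1)
#else
/* fallback: serialize the increment with a spinlock; this hangs if 'lk'
 * was never initialized via LOCK_INIT (i.e. pthread_spin_init) */
# define INCREMENT_ATOMIC(lk, op)               \
        do {                                    \
                LOCK (&lk);                     \
                op++;                           \
                UNLOCK (&lk);                   \
        } while (0)
#endif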

I see two problems here:
1. an incorrect workaround for INCREMENT_ATOMIC
2. incorrect detection of RHEL 5 32-bit

Backtrace:
#0  0xb7e55362 in pthread_spin_lock () from /lib/libpthread.so.0
(gdb) bt
#0  0xb7e55362 in pthread_spin_lock () from /lib/libpthread.so.0
#1  0xb3541ffd in mdc_lookup (frame=0xb6df2300, this=0xb37165b8, loc=0xb03fd220, xdata=0xb66417b4) at md-cache.c:1085
#2  0xb350e3e5 in io_stats_lookup (frame=0xb6df2218, this=0xb3717680, loc=0xb03fd220, xdata=0xb66417b4) at io-stats.c:2617
#3  0xb7f3c225 in default_lookup (frame=0xb6df2218, this=0xb3702f60, loc=0xb03fd220, xdata=0xb66417b4) at defaults.c:2572
#4  0xb34fb63a in meta_lookup (frame=0xb6df2218, this=0xb3702f60, loc=0xb03fd220, xdata=0xb66417b4) at meta.c:44
#5  0xb63282d1 in fuse_first_lookup (this=0x8089400) at fuse-bridge.c:4262

--- Additional comment from Vitaly Lipatov on 2017-01-26 10:43:20 CET ---

The LOCK hangs because LOCK_INIT was missed, i.e. pthread_spin_init was never called, so the gf_lock_t lock is used while still zero-initialized.
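
To illustrate, here is a standalone sketch (not GlusterFS code): on glibc, a pthread_spinlock_t that is zero-filled instead of being passed to pthread_spin_init() looks like an already-taken lock, so the very first pthread_spin_lock() spins forever. The exact behaviour of an uninitialized spinlock is libc-dependent, but this matches the hang seen in mdc_lookup.

#include <pthread.h>
#include <stdio.h>
#include <string.h>

int
main (void)
{
        pthread_spinlock_t good;
        pthread_spinlock_t bad;

        /* mimic the zero-filled, never-initialized conf->mdc_counter.lock */
        memset (&bad, 0, sizeof (bad));

        pthread_spin_init (&good, PTHREAD_PROCESS_PRIVATE);
        pthread_spin_lock (&good);      /* fine: the lock was initialized */
        pthread_spin_unlock (&good);
        pthread_spin_destroy (&good);
        printf ("initialized lock: no hang\n");

        /* Uncommenting the next line reproduces the hang on glibc,
         * the same way the uninitialized mdc_counter.lock did: */
        /* pthread_spin_lock (&bad); */

        return 0;
}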

--- Additional comment from Vitaly Lipatov on 2017-01-26 11:04:25 CET ---

fix:

--- a/xlators/performance/md-cache/src/md-cache.c
+++ b/xlators/performance/md-cache/src/md-cache.c
@@ -2905,6 +2905,7 @@ init (xlator_t *this)
         GF_OPTION_INIT("cache-invalidation", conf->mdc_invalidation, bool, out);
 
         LOCK_INIT (&conf->lock);
+        LOCK_INIT (&conf->mdc_counter.lock);
         time (&conf->last_child_down);

Comment 1 Worker Ant 2017-02-16 10:11:42 UTC
REVIEW: https://review.gluster.org/16640 (md-cache: initialize mdc_counter.lock) posted (#1) for review on release-3.10 by Niels de Vos (ndevos)

Comment 2 Worker Ant 2017-02-16 17:53:29 UTC
COMMIT: https://review.gluster.org/16640 committed in release-3.10 by Shyamsundar Ranganathan (srangana) 
------
commit baaea76bd93efe2cfdda52ab0b603fec713df455
Author: Niels de Vos <ndevos>
Date:   Thu Feb 16 11:09:05 2017 +0100

    md-cache: initialize mdc_counter.lock
    
    Add the missed LOCK_INIT to fix INCREMENT_ATOMIC on
    conf->mdc_counter.lock when the pthread_spin_* fallback is used.
    
    Cherry picked from commit 22f02d8f1dcdf176744ab1536cb23a5fcd291243:
    > Change-Id: I680bd6f41e3b8a1852ed969bf6794cbf4c1ccdd4
    > BUG: 1417913
    > Signed-off-by: Vitaly Lipatov <lav>
    > Reviewed-on: https://review.gluster.org/16515
    > Reviewed-by: Niels de Vos <ndevos>
    > Tested-by: Niels de Vos <ndevos>
    > Smoke: Gluster Build System <jenkins.org>
    > Reviewed-by: Raghavendra G <rgowdapp>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > Reviewed-by: Poornima G <pgurusid>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: Vijay Bellur <vbellur>
    
    Change-Id: I680bd6f41e3b8a1852ed969bf6794cbf4c1ccdd4
    BUG: 1417915
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: https://review.gluster.org/16640
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Poornima G <pgurusid>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>

Comment 3 Niels de Vos 2017-02-21 10:09:15 UTC
Moving back to MODIFIED: there is no tag (and no "Fixed In Version") yet that contains the change.

Comment 4 Shyamsundar 2017-02-27 15:31:04 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-devel/2017-February/052173.html
[2] https://www.gluster.org/pipermail/gluster-users/

Comment 5 Shyamsundar 2017-03-06 17:44:58 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/

