+++ This bug was initially created as a clone of Bug #1362540 +++

Description of problem:
I was trying to benchmark the libgfapi-python bindings for a filesystem walk (metadata-intensive) workload, to compare against FUSE for the same workload. The program crashes during virtual unmount (fini).

Setup details:

[root@f24 ~]# rpm -qa | grep gluster
glusterfs-3.8.1-1.fc24.x86_64
python-gluster-3.8.1-1.fc24.noarch
glusterfs-libs-3.8.1-1.fc24.x86_64
glusterfs-client-xlators-3.8.1-1.fc24.x86_64
glusterfs-cli-3.8.1-1.fc24.x86_64
glusterfs-fuse-3.8.1-1.fc24.x86_64
glusterfs-server-3.8.1-1.fc24.x86_64
glusterfs-api-3.8.1-1.fc24.x86_64
glusterfs-debuginfo-3.8.1-1.fc24.x86_64

[root@f24 ~]# uname -a
Linux f24 4.6.4-301.fc24.x86_64 #1 SMP Tue Jul 12 11:50:00 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

[root@f24]# gluster volume info

Volume Name: test
Type: Distributed-Replicate
Volume ID: e675d53a-c9b6-468f-bb8f-7101828bec70
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: f24:/export/brick1/data
Brick2: f24:/export/brick2/data
Brick3: f24:/export/brick3/data
Brick4: f24:/export/brick4/data
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet

NOTE: The volume is created with 4 RAM disks (1GB each) as bricks.

[root@f24 ~]# df -h | grep 'ram\|Filesystem'
Filesystem      Size  Used Avail Use% Mounted on
/dev/ram1       971M  272M  700M  28% /export/brick1
/dev/ram2       971M  272M  700M  28% /export/brick2
/dev/ram3       971M  272M  700M  28% /export/brick3
/dev/ram4       971M  272M  700M  28% /export/brick4

[root@f24 ~]# df -ih | grep 'ram\|Filesystem'
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/ram1        489K  349K  140K   72% /export/brick1
/dev/ram2        489K  349K  140K   72% /export/brick2
/dev/ram3        489K  349K  140K   72% /export/brick3
/dev/ram4        489K  349K  140K   72% /export/brick4

How reproducible:
Always and consistently (at least on my Fedora 24 test VM).

Steps to Reproduce:
1. Create a large nested directory tree using a FUSE mount.
2. Unmount the FUSE mount. Steps 1-2 are only needed to generate the initial data.
3. Use the python-libgfapi bindings with the patch at http://review.gluster.org/#/c/14770/. The crash is reproducible without this patch too, but the patch makes the crawling faster.
4. Run the python script that walks the volume using libgfapi (a rough C-API sketch follows this comment):

[root@f24 ~]# ./reproducer.py
Segmentation fault (core dumped)

There *could* be a simpler reproducer too, but I haven't had the time to look further into it.
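For reference, below is a minimal, hypothetical sketch of an equivalent crawl using the libgfapi C API directly. The original reproducer is a Python script using the libgfapi-python bindings and is not reproduced here; since the crash itself happens inside glfs_fini(), a C crawl ending in glfs_fini() should exercise the same path. The volume name "test" and host "f24" are taken from the setup above; the build command is an assumption.

/*
 * crawl.c - hypothetical reproducer sketch (not the original reproducer.py).
 * Build (assumption): gcc -o crawl crawl.c -lgfapi
 */
#include <stdio.h>
#include <string.h>
#include <dirent.h>
#include <sys/stat.h>
#include <glusterfs/api/glfs.h>

/* Recursively readdir + stat every entry: a metadata-intensive walk. */
static void crawl (glfs_t *fs, const char *path)
{
        glfs_fd_t *fd = glfs_opendir (fs, path);
        if (!fd)
                return;

        struct dirent *entry;
        while ((entry = glfs_readdir (fd)) != NULL) {
                if (!strcmp (entry->d_name, ".") || !strcmp (entry->d_name, ".."))
                        continue;

                char child[4096];
                snprintf (child, sizeof (child), "%s/%s",
                          strcmp (path, "/") ? path : "", entry->d_name);

                struct stat st;
                if (glfs_stat (fs, child, &st) == 0 && S_ISDIR (st.st_mode))
                        crawl (fs, child);
        }
        glfs_closedir (fd);
}

int main (void)
{
        glfs_t *fs = glfs_new ("test");
        if (!fs)
                return 1;

        glfs_set_volfile_server (fs, "tcp", "f24", 24007);
        if (glfs_init (fs) != 0)
                return 1;

        crawl (fs, "/");

        /* With a large enough tree, the inode table's lru list grows past
         * lru_limit and this virtual unmount segfaults (see the bt below). */
        glfs_fini (fs);
        return 0;
}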
Excerpt from bt:

#0  list_add_tail (head=0x90, new=0x7fcac00040b8) at list.h:41
41          new->prev = head->prev;
[Current thread is 1 (Thread 0x7fcae3dbb700 (LWP 4165))]
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.23.1-8.fc24.x86_64 keyutils-libs-1.5.9-8.fc24.x86_64 krb5-libs-1.14.1-8.fc24.x86_64 libacl-2.2.52-11.fc24.x86_64 libattr-2.4.47-16.fc24.x86_64 libcom_err-1.42.13-4.fc24.x86_64 libffi-3.1-9.fc24.x86_64 libselinux-2.5-9.fc24.x86_64 libuuid-2.28-3.fc24.x86_64 openssl-libs-1.0.2h-1.fc24.x86_64 pcre-8.39-2.fc24.x86_64 zlib-1.2.8-10.fc24.x86_64
(gdb) bt
#0  list_add_tail (head=0x90, new=0x7fcac00040b8) at list.h:41
#1  list_move_tail (head=0x90, list=0x7fcac00040b8) at list.h:107
#2  __inode_retire (inode=0x7fcac0004030) at inode.c:439
#3  0x00007fcad764100e in inode_table_prune (table=table@entry=0x7fcac0004040) at inode.c:1521
#4  0x00007fcad7642f22 in inode_table_destroy (inode_table=0x7fcac0004040) at inode.c:1808
#5  0x00007fcad7642fee in inode_table_destroy_all (ctx=ctx@entry=0x55f82360b430) at inode.c:1733
#6  0x00007fcad7f3fde6 in pub_glfs_fini (fs=0x55f823537950) at glfs.c:1204

Expected results:
No crash.

Additional info:

[root@f24 ~]# rpm -qa | grep glusterfs-debuginfo
glusterfs-debuginfo-3.8.1-1.fc24.x86_64
[root@f24 ~]# rpm -qa | grep python-debuginfo
python-debuginfo-2.7.12-1.fc24.x86_64

[root@f24 coredump]# ls -lh
total 311M
-rw-r--r--. 1 root root 350M Aug  2 17:31 core.python.0.21e34182be6844658e00bba43a55dfa0.4165.1470139182000000000000
-rw-r-----. 1 root root  22M Aug  2 17:29 core.python.0.21e34182be6844658e00bba43a55dfa0.4165.1470139182000000000000.lz4

I can provide the coredump offline.

--- Additional comment from Prashanth Pai on 2016-08-02 09:24 EDT ---

--- Additional comment from Prashanth Pai on 2016-08-02 09:35:30 EDT ---

Also, I can't seem to reproduce this when the test filesystem tree is small enough.

--- Additional comment from Soumya Koduri on 2016-08-04 06:14:55 EDT ---

I suspect the following could have caused the issue:

In inode_table_destroy(), we first purge all the lru entries, but the lru count is not adjusted accordingly. So when inode_table_prune() is called, if the lru count was larger than the lru limit (as can be seen in the core), we end up accessing invalid memory.

(gdb) f 3
#3  0x00007fcad764100e in inode_table_prune (table=table@entry=0x7fcac0004040) at inode.c:1521
1521                    __inode_retire (entry);
(gdb) p table->lru_size
$4 = 132396
(gdb) p table->lru_limit
$5 = 131072
(gdb) p table->lru
$6 = {next = 0x90, prev = 0xcafecafe}
(gdb) p &&table->lru
A syntax error in expression, near `&&table->lru'.
(gdb) p &table->lru
$7 = (struct list_head *) 0x7fcac00040b8
(gdb)

I will send a fix for it.
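To illustrate the adjustment described above, here is a minimal sketch of the kind of change needed inside inode_table_destroy() in inode.c. This is not the committed patch; the loop shape and the list_entry()/__inode_retire() helpers are assumed from the backtrace. The essential point is only that lru_size must stay in sync with the lru list while it is drained, so the subsequent inode_table_prune() call does not act on a stale count and walk past the now-empty list head.

/* Hypothetical sketch, not the verbatim fix: drain the lru list during
 * inode_table_destroy() while keeping the accounting consistent. */
while (!list_empty (&inode_table->lru)) {
        /* Pop the oldest lru entry and retire it. */
        trav = list_entry (inode_table->lru.next, inode_t, list);
        __inode_retire (trav);
        /* The missing adjustment: decrement the count alongside the list,
         * so inode_table_prune() cannot later loop lru_size - lru_limit
         * times over an already emptied list. */
        inode_table->lru_size--;
}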
REVIEW: http://review.gluster.org/15087 (inode: Adjust lru_size while retiring entries in lru list) posted (#1) for review on master by soumya k (skoduri)
Prashanth, could you please test the posted fix? Thanks!
The fix works and the issue is no longer reproducible. If possible, please backport this to the 3.7 branch as well. Thanks for the quick fix.
COMMIT: http://review.gluster.org/15087 committed in master by Raghavendra G (rgowdapp)
------
commit 567304eaa24c0715f8f5d8ca5c70ac9e2af3d2c8
Author: Soumya Koduri <skoduri>
Date:   Thu Aug 4 16:00:31 2016 +0530

    inode: Adjust lru_size while retiring entries in lru list

    As part of inode_table_destroy(), we first retire entries
    in the lru list, but the lru_size is not adjusted accordingly.
    This may result in an invalid memory reference in
    inode_table_prune() if the lru_size > lru_limit.

    Change-Id: I29ee3c03b0eaa8a118d06dc0cefba85877daf963
    BUG: 1364026
    Signed-off-by: Soumya Koduri <skoduri>
    Reviewed-on: http://review.gluster.org/15087
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
    Reviewed-by: Prashanth Pai <ppai>
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.9.0, please open a new bug report.

glusterfs-3.9.0 has been announced on the Gluster mailinglists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2016-November/029281.html
[2] https://www.gluster.org/pipermail/gluster-users/