+++ This bug was initially created as a clone of Bug #1362540 +++

Description of problem:
I was trying to benchmark the libgfapi-python bindings for a filesystem walk (metadata-intensive) workload, to compare against FUSE for the same workload. The program crashes during virtual unmount (fini).

Setup details:

[root@f24 ~]# rpm -qa | grep gluster
glusterfs-3.8.1-1.fc24.x86_64
python-gluster-3.8.1-1.fc24.noarch
glusterfs-libs-3.8.1-1.fc24.x86_64
glusterfs-client-xlators-3.8.1-1.fc24.x86_64
glusterfs-cli-3.8.1-1.fc24.x86_64
glusterfs-fuse-3.8.1-1.fc24.x86_64
glusterfs-server-3.8.1-1.fc24.x86_64
glusterfs-api-3.8.1-1.fc24.x86_64
glusterfs-debuginfo-3.8.1-1.fc24.x86_64

[root@f24 ~]# uname -a
Linux f24 4.6.4-301.fc24.x86_64 #1 SMP Tue Jul 12 11:50:00 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

[root@f24]# gluster volume info

Volume Name: test
Type: Distributed-Replicate
Volume ID: e675d53a-c9b6-468f-bb8f-7101828bec70
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: f24:/export/brick1/data
Brick2: f24:/export/brick2/data
Brick3: f24:/export/brick3/data
Brick4: f24:/export/brick4/data
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet

NOTE: The volume is created with 4 RAM disks (1GB each) as bricks.

[root@f24 ~]# df -h | grep 'ram\|Filesystem'
Filesystem      Size  Used Avail Use% Mounted on
/dev/ram1       971M  272M  700M  28% /export/brick1
/dev/ram2       971M  272M  700M  28% /export/brick2
/dev/ram3       971M  272M  700M  28% /export/brick3
/dev/ram4       971M  272M  700M  28% /export/brick4

[root@f24 ~]# df -ih | grep 'ram\|Filesystem'
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/ram1        489K  349K  140K   72% /export/brick1
/dev/ram2        489K  349K  140K   72% /export/brick2
/dev/ram3        489K  349K  140K   72% /export/brick3
/dev/ram4        489K  349K  140K   72% /export/brick4

How reproducible:
Always and consistently (at least on my Fedora 24 test VM).

Steps to Reproduce:
1. Create a large nested directory tree using a FUSE mount.
2. Unmount the FUSE mount. Steps 1-2 are only needed to generate the initial data.
3. Use the python-libgfapi bindings with the patch at http://review.gluster.org/#/c/14770/. The crash is reproducible without this patch too, but the patch makes the crawling faster.
4. Run the python script that walks the volume using libgfapi (a rough C-API sketch follows this comment):

[root@f24 ~]# ./reproducer.py
Segmentation fault (core dumped)

There *could* be a simpler reproducer too, but I haven't had the time to look further into it.
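For reference, below is a minimal, hypothetical sketch of an equivalent crawl using the libgfapi C API directly. The original reproducer is a Python script using the libgfapi-python bindings and is not reproduced here; since the crash itself happens inside glfs_fini(), a C crawl ending in glfs_fini() should exercise the same path. The volume name "test" and host "f24" are taken from the setup above; the build command is an assumption.

/*
 * crawl.c - hypothetical reproducer sketch (not the original reproducer.py).
 * Build (assumption): gcc -o crawl crawl.c -lgfapi
 */
#include <stdio.h>
#include <string.h>
#include <dirent.h>
#include <sys/stat.h>
#include <glusterfs/api/glfs.h>

/* Recursively readdir + stat every entry: a metadata-intensive walk. */
static void crawl (glfs_t *fs, const char *path)
{
        glfs_fd_t *fd = glfs_opendir (fs, path);
        if (!fd)
                return;

        struct dirent *entry;
        while ((entry = glfs_readdir (fd)) != NULL) {
                if (!strcmp (entry->d_name, ".") || !strcmp (entry->d_name, ".."))
                        continue;

                char child[4096];
                snprintf (child, sizeof (child), "%s/%s",
                          strcmp (path, "/") ? path : "", entry->d_name);

                struct stat st;
                if (glfs_stat (fs, child, &st) == 0 && S_ISDIR (st.st_mode))
                        crawl (fs, child);
        }
        glfs_closedir (fd);
}

int main (void)
{
        glfs_t *fs = glfs_new ("test");
        if (!fs)
                return 1;

        glfs_set_volfile_server (fs, "tcp", "f24", 24007);
        if (glfs_init (fs) != 0)
                return 1;

        crawl (fs, "/");

        /* With a large enough tree, the inode table's lru list grows past
         * lru_limit and this virtual unmount segfaults (see the bt below). */
        glfs_fini (fs);
        return 0;
}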
Excerpt from bt:

#0  list_add_tail (head=0x90, new=0x7fcac00040b8) at list.h:41
41          new->prev = head->prev;
[Current thread is 1 (Thread 0x7fcae3dbb700 (LWP 4165))]
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.23.1-8.fc24.x86_64 keyutils-libs-1.5.9-8.fc24.x86_64 krb5-libs-1.14.1-8.fc24.x86_64 libacl-2.2.52-11.fc24.x86_64 libattr-2.4.47-16.fc24.x86_64 libcom_err-1.42.13-4.fc24.x86_64 libffi-3.1-9.fc24.x86_64 libselinux-2.5-9.fc24.x86_64 libuuid-2.28-3.fc24.x86_64 openssl-libs-1.0.2h-1.fc24.x86_64 pcre-8.39-2.fc24.x86_64 zlib-1.2.8-10.fc24.x86_64
(gdb) bt
#0  list_add_tail (head=0x90, new=0x7fcac00040b8) at list.h:41
#1  list_move_tail (head=0x90, list=0x7fcac00040b8) at list.h:107
#2  __inode_retire (inode=0x7fcac0004030) at inode.c:439
#3  0x00007fcad764100e in inode_table_prune (table=table@entry=0x7fcac0004040) at inode.c:1521
#4  0x00007fcad7642f22 in inode_table_destroy (inode_table=0x7fcac0004040) at inode.c:1808
#5  0x00007fcad7642fee in inode_table_destroy_all (ctx=ctx@entry=0x55f82360b430) at inode.c:1733
#6  0x00007fcad7f3fde6 in pub_glfs_fini (fs=0x55f823537950) at glfs.c:1204

Expected results:
No crash.

Additional info:

[root@f24 ~]# rpm -qa | grep glusterfs-debuginfo
glusterfs-debuginfo-3.8.1-1.fc24.x86_64
[root@f24 ~]# rpm -qa | grep python-debuginfo
python-debuginfo-2.7.12-1.fc24.x86_64

[root@f24 coredump]# ls -lh
total 311M
-rw-r--r--. 1 root root 350M Aug  2 17:31 core.python.0.21e34182be6844658e00bba43a55dfa0.4165.1470139182000000000000
-rw-r-----. 1 root root  22M Aug  2 17:29 core.python.0.21e34182be6844658e00bba43a55dfa0.4165.1470139182000000000000.lz4

I can provide the coredump offline.

--- Additional comment from Prashanth Pai on 2016-08-02 09:24 EDT ---

--- Additional comment from Prashanth Pai on 2016-08-02 09:35:30 EDT ---

Also, I can't seem to reproduce this when the test filesystem tree is small enough.

--- Additional comment from Soumya Koduri on 2016-08-04 06:14:55 EDT ---

I suspect the following could have caused the issue:

In inode_table_destroy(), we first purge all the lru entries, but the lru count is not adjusted accordingly. So when inode_table_prune() is called, if the lru count was larger than the lru limit (as can be seen in the core), we end up accessing invalid memory.

(gdb) f 3
#3  0x00007fcad764100e in inode_table_prune (table=table@entry=0x7fcac0004040) at inode.c:1521
1521                    __inode_retire (entry);
(gdb) p table->lru_size
$4 = 132396
(gdb) p table->lru_limit
$5 = 131072
(gdb) p table->lru
$6 = {next = 0x90, prev = 0xcafecafe}
(gdb) p &&table->lru
A syntax error in expression, near `&&table->lru'.
(gdb) p &table->lru
$7 = (struct list_head *) 0x7fcac00040b8
(gdb)

I will send a fix for it.
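To illustrate the adjustment described above, here is a minimal sketch of the kind of change needed inside inode_table_destroy() in inode.c. This is not the committed patch; the loop shape and the list_entry()/__inode_retire() helpers are assumed from the backtrace. The essential point is only that lru_size must stay in sync with the lru list while it is drained, so the subsequent inode_table_prune() call does not act on a stale count and walk past the now-empty list head.

/* Hypothetical sketch, not the verbatim fix: drain the lru list during
 * inode_table_destroy() while keeping the accounting consistent. */
while (!list_empty (&inode_table->lru)) {
        /* Pop the oldest lru entry and retire it. */
        trav = list_entry (inode_table->lru.next, inode_t, list);
        __inode_retire (trav);
        /* The missing adjustment: decrement the count alongside the list,
         * so inode_table_prune() cannot later loop lru_size - lru_limit
         * times over an already emptied list. */
        inode_table->lru_size--;
}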
REVIEW: http://review.gluster.org/15087 (inode: Adjust lru_size while retiring entries in lru list) posted (#1) for review on master by soumya k (skoduri)
Prashanth, could you please test the posted fix? Thanks!
The fix works and the issue is no longer reproducible. If possible, please backport this to the 3.7 branch as well. Thanks for the quick fix.
COMMIT: http://review.gluster.org/15087 committed in master by Raghavendra G (rgowdapp)
------
commit 567304eaa24c0715f8f5d8ca5c70ac9e2af3d2c8
Author: Soumya Koduri <skoduri>
Date:   Thu Aug 4 16:00:31 2016 +0530

    inode: Adjust lru_size while retiring entries in lru list

    As part of inode_table_destroy(), we first retire entries
    in the lru list, but the lru_size is not adjusted accordingly.
    This may result in an invalid memory reference in
    inode_table_prune() if the lru_size > lru_limit.

    Change-Id: I29ee3c03b0eaa8a118d06dc0cefba85877daf963
    BUG: 1364026
    Signed-off-by: Soumya Koduri <skoduri>
    Reviewed-on: http://review.gluster.org/15087
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
    Reviewed-by: Prashanth Pai <ppai>
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.9.0, please open a new bug report.

glusterfs-3.9.0 has been announced on the Gluster mailinglists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2016-November/029281.html
[2] https://www.gluster.org/pipermail/gluster-users/