Bug 1659563 - gluster-blockd segfaults because of a null-dereference in shard.so
Summary: gluster-blockd segfaults because of a null-dereference in shard.so
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: sharding
Version: 5
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Niels de Vos
QA Contact: bugs@gluster.org
URL:
Whiteboard:
Depends On:
Blocks: glusterfs-5.3
 
Reported: 2018-12-14 16:47 UTC by Niels de Vos
Modified: 2019-01-22 14:08 UTC
CC List: 4 users

Fixed In Version: glusterfs-5.3
Clone Of:
Environment:
Last Closed: 2019-01-22 14:08:49 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:




Links:
Gluster.org Gerrit 21866 (Open): shard: prevent segfault in shard_unlink_block_inode() (last updated 2018-12-26 16:38:20 UTC)

Description Niels de Vos 2018-12-14 16:47:30 UTC
Description of problem:
Heketi tests have started to fail with the Gluster 5 release. It seems that gluster-blockd occasionally segfaults and then stops handling requests.

Version-Release number of selected component (if applicable):
glusterfs-5.1-1.el7.x86_64

How reproducible:
random, but very often

Steps to Reproduce:
Run the functional tests that are part of heketi:
1. git clone https://github.com/heketi/heketi
2. cd heketi
3. make test-functional

Actual results:
Tests fail; the logs indicate that communication with gluster-blockd failed.

Expected results:
Tests should pass

Additional info:


[root@storage2 ~]# systemctl status gluster-blockd
● gluster-blockd.service - Gluster block storage utility
   Loaded: loaded (/usr/lib/systemd/system/gluster-blockd.service; enabled; vendor preset: disabled)
   Active: failed (Result: signal) since Fri 2018-12-14 15:42:16 UTC; 7min ago 
  Process: 7246 ExecStart=/usr/sbin/gluster-blockd --glfs-lru-count $GB_GLFS_LRU_COUNT --log-level $GB_LOG_LEVEL $GB_EXTRA_ARGS (code=killed, signal=SEGV)
 Main PID: 7246 (code=killed, signal=SEGV)

Dec 14 15:41:40 storage2 systemd[1]: Started Gluster block storage utility.
Dec 14 15:41:41 storage2 gluster-blockd[7246]: Parameter logfile is now '/var/log/gluster-block/gluster-block-configshell.log'.
Dec 14 15:41:41 storage2 gluster-blockd[7246]: Parameter loglevel_file is now 'info'.
Dec 14 15:41:41 storage2 gluster-blockd[7246]: Parameter auto_enable_tpgt is now 'false'.
Dec 14 15:41:41 storage2 gluster-blockd[7246]: Parameter auto_add_default_portal is now 'false'.
Dec 14 15:41:41 storage2 gluster-blockd[7246]: Configuration saved to /etc/target/saveconfig.json
Dec 14 15:42:16 storage2 systemd[1]: gluster-blockd.service: main process exited, code=killed, status=11/SEGV
Dec 14 15:42:16 storage2 systemd[1]: Unit gluster-blockd.service entered failed state.
Dec 14 15:42:16 storage2 systemd[1]: gluster-blockd.service failed.

[root@storage2 ~]# dmesg | grep segf
[  143.199235] glfs_epoll000[7847]: segfault at f0 ip 00007fe5b3ddc9b9 sp 00007fe5beaa6440 error 6 in shard.so[7fe5b3dd3000+2b000]



Core was generated by `/usr/sbin/gluster-blockd --glfs-lru-count 5 --log-level INFO'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fbb9cd639b9 in shard_unlink_block_inode (local=local@entry=0x7fbb80000a78, shard_block_num=<optimized out>) at shard.c:2929
2929                base_ictx->fsync_count--;
(gdb) l
2924            if (ctx->fsync_needed) {
2925                unref_base_inode++;
2926                list_del_init(&ctx->to_fsync_list);
2927                if (base_inode)
2928                    __shard_inode_ctx_get(base_inode, this, &base_ictx);
2929                base_ictx->fsync_count--;
2930            }       
2931        }       
2932        UNLOCK(&inode->lock);
2933        if (base_inode)
(gdb) p *base_ictx 
Cannot access memory at address 0x0


The problem was introduced by commit https://github.com/gluster/glusterfs/commit/02a05da6989f and has so far been fixed only on the master branch with https://github.com/gluster/glusterfs/commit/145e1805 . The second commit needs to be backported to the release-5 branch of glusterfs.
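
For context on the crash: base_ictx is filled in through __shard_inode_ctx_get() only when base_inode is non-NULL, so on the failing path base_ictx is still NULL when base_ictx->fsync_count-- executes. Below is a minimal, self-contained C sketch of that pattern and of the kind of NULL guard the fix applies; the struct and function names are simplified stand-ins for illustration, not the actual shard translator code (see the Gerrit change linked above for the real patch).

/* Simplified stand-in types; NOT the real GlusterFS shard code. */
#include <stdio.h>

struct shard_inode_ctx {
    int fsync_needed;
    int fsync_count;
};

/* The base inode context may legitimately be NULL here, so it must only be
 * dereferenced behind a check. The buggy code decremented fsync_count
 * unconditionally, which is the NULL dereference seen in the backtrace above. */
static void
unlink_block_inode_sketch(struct shard_inode_ctx *ctx,
                          struct shard_inode_ctx *base_ictx /* may be NULL */)
{
    if (ctx->fsync_needed) {
        if (base_ictx)
            base_ictx->fsync_count--;
    }
}

int
main(void)
{
    struct shard_inode_ctx shard = { .fsync_needed = 1, .fsync_count = 0 };

    /* Mimics the failing case: no base inode, hence no base inode context. */
    unlink_block_inode_sketch(&shard, NULL);

    printf("guard skipped the NULL base inode context, no segfault\n");
    return 0;
}

Compiled with any C99 compiler this runs to completion; removing the if (base_ictx) check reproduces the same class of NULL dereference shown in the gdb session above.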

Comment 1 Worker Ant 2018-12-14 16:54:47 UTC
REVIEW: https://review.gluster.org/21866 (shard: prevent segfault in shard_unlink_block_inode()) posted (#1) for review on release-5 by Niels de Vos

Comment 2 Worker Ant 2018-12-26 16:38:19 UTC
REVIEW: https://review.gluster.org/21866 (shard: prevent segfault in shard_unlink_block_inode()) posted (#2) for review on release-5 by Shyamsundar Ranganathan

Comment 3 Shyamsundar 2019-01-09 15:19:59 UTC
(In reply to Worker Ant from comment #2)
> REVIEW: https://review.gluster.org/21866 (shard: prevent segfault in
> shard_unlink_block_inode()) posted (#2) for review on release-5 by
> Shyamsundar Ranganathan

The above patch uses the "Updates" keyword, but there are no pending patches, so is the tag in the commit message correct? Or are we expecting more patches around this?

Comment 4 Niels de Vos 2019-01-09 15:39:04 UTC
(In reply to Shyamsundar from comment #3)
> (In reply to Worker Ant from comment #2)
> > REVIEW: https://review.gluster.org/21866 (shard: prevent segfault in
> > shard_unlink_block_inode()) posted (#2) for review on release-5 by
> > Shyamsundar Ranganathan
> 
> The above patch uses the "Updates" keyword, but there are no pending
> patches, so is the tag in the commit message correct? or are we expecting
> more patches around this?

This is the only patch that I expect is needed. If you prefer Closes: or Fixes: as a tag, feel free to change the commit message :)

Comment 5 Shyamsundar 2019-01-22 14:08:49 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-5.3, please open a new bug report.

glusterfs-5.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-January/000118.html
[2] https://www.gluster.org/pipermail/gluster-users/

