Bug 765550 (GLUSTER-3818) - Attempted access to split-brain file causes segfault in pthread_spin_lock().
Summary: Attempted access to split-brain file causes segfault in pthread_spin_lock().
Keywords:
Status: CLOSED DEFERRED
Alias: GLUSTER-3818
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.2.3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-11-20 00:45 UTC by Jeff Byers
Modified: 2014-12-14 19:40 UTC
CC: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-12-14 19:40:32 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Jeff Byers 2011-11-20 00:45:47 UTC
Attempted access to a split-brain file causes a glusterfs segfault in pthread_spin_lock().

Attempted access to a split-brain file while both glusterfs
nodes of a two-brick replica volume are running results in a
segmentation fault in pthread_spin_lock(), because the FUSE
handlers (e.g. fuse_readv() at fuse-bridge.c:1785, and
fuse_flush() in the first core below) pass an uninitialized
or already-destroyed 'fd' to fd_ref().

1) Using glusterfs 3.2.3, with a simple two-way replica
volume:

    [root@SC-10-10-63-55]# gluster volume info nas-volume-0004

    Volume Name: nas-volume-0004
    Type: Replicate
    Status: Started
    Number of Bricks: 2
    Transport-type: tcp
    Bricks:
    Brick1: 10.10.60.57:/exports/nas-segment-0002/nas-volume-0004
    Brick2: 10.10.60.55:/exports/nas-segment-0001/nas-volume-0004
    Options Reconfigured:
    nfs.rpc-auth-allow: *
    nfs.addr-namelookup: off
    nfs.disable: on

2) Everything is duplex:

    [root@SC-10-10-63-55]# gluster peer status
    Number of Peers: 1

    Hostname: 10.10.60.57
    Uuid: c0efccec-b0a0-a091-e517-003048335418
    State: Peer in Cluster (Connected)

3) On the Samba client, write to the file:

    c:\test>time /t > x:\time.out
    c:\test>time /t >> x:\time.out

    c:\test>type x:\time.out
    08:43 AM
    08:43 AM

4) Now shut down brick 10.10.60.57.

    [root@SC-10-10-63-55]# gluster peer status
    Number of Peers: 1

    Hostname: 10.10.60.57
    Uuid: c0efccec-b0a0-a091-e517-003048335418
    State: Peer in Cluster (Disconnected)

5) Modify file with 10.10.60.57 down:

    c:\test>time /t >> x:\time.out

    c:\test>type x:\time.out
    08:43 AM
    08:43 AM
    09:07 AM

6) We want to cause a split-brain file, so shut down
10.10.60.55, then bring up 10.10.60.57.

7) Modify file with 10.10.60.55 down:

    c:\test>time /t >> x:\time.out

    c:\test>type x:\time.out
    08:43 AM
    08:43 AM
    09:44 AM

8) Now bring 10.10.60.55 up so we are duplex again.

    [root@SC-10-10-63-55]# getfattr --absolute-names -m '.' -d -e hex -R $'/exports/nas-segment-0001/nas-volume-0004/time.out'
    # file: /exports/nas-segment-0001/nas-volume-0004/time.out
    trusted.afr.nas-volume-0004-client-0=0x000000010000000100000000
    trusted.afr.nas-volume-0004-client-1=0x000000000000000000000000
    trusted.gfid=0xc80e81e4961c4b59b7fd54ca4eea1dcd

    [root@SC-10-10-63-57]# getfattr --absolute-names -m '.' -d -e hex -R $'/exports/nas-segment-0002/nas-volume-0004/time.out'
    # file: /exports/nas-segment-0002/nas-volume-0004/time.out
    trusted.afr.nas-volume-0004-client-0=0x000000000000000000000000
    trusted.afr.nas-volume-0004-client-1=0x000000010000000100000000
    trusted.gfid=0xc80e81e4961c4b59b7fd54ca4eea1dcd

9) Now read the split-brain file on the client. This causes
a glusterfs segmentation fault:

[root@SC-10-10-63-55]# gdb /usr/local/sbin/glusterfs  /core.2897
Core was generated by `/usr/local/sbin/glusterfs --log-level=INFO --volfile=/etc/glusterd/vols/nas-vol'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000300100b722 in pthread_spin_lock () from /lib64/libpthread.so.0
(gdb) bt
#0  0x000000300100b722 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x00002af1d11de443 in fd_ref (fd=0x2aaaaca23024) at fd.c:378
#2  0x00002af1d2652925 in fuse_flush (this=0xd655f00, finh=0x2aaab00008c0, msg=0x2aaab00008e8) at fuse-bridge.c:1924
#3  0x00002af1d2657b9b in fuse_thread_proc (data=<value optimized out>) at fuse-bridge.c:3220
#4  0x000000300100673d in start_thread () from /lib64/libpthread.so.0
#5  0x00000030004d44bd in clone () from /lib64/libc.so.6
(gdb) f 1
#1  0x00002af1d11de443 in fd_ref (fd=0x2aaaaca23024) at fd.c:378
378     fd.c: No such file or directory.
        in fd.c
(gdb) p *fd
$1 = {pid = 16833, flags = 32770, refcount = 0, inode_list = {next = 0x2aaaaca23034, prev = 0x2aaaaca23034}, inode = 0xaaaaaaaa, lock = 1,
  _ctx = 0x2aaab0000c70, xl_count = 10}
(gdb) info locals
refed_fd = <value optimized out>
__FUNCTION__ = "fd_ref"

[2011-11-19 10:04:14.378144] I [afr-self-heal-common.c:537:afr_sh_mark_sources] 0-nas-volume-0004-replicate-0: split-brain possible, no source detected
[2011-11-19 10:04:14.402170] I [afr-common.c:811:afr_lookup_done] 0-nas-volume-0004-replicate-0: background  meta-data data self-heal triggered. path: /time.out
[2011-11-19 10:04:14.410382] I [afr-self-heal-common.c:537:afr_sh_mark_sources] 0-nas-volume-0004-replicate-0: split-brain possible, no source detected
[2011-11-19 10:04:14.410409] E [afr-self-heal-metadata.c:518:afr_sh_metadata_fix] 0-nas-volume-0004-replicate-0: Unable to self-heal permissions/ownership of '/time.out' (possible split-brain). Please fix the file on all backend volumes
[2011-11-19 10:04:14.410526] I [afr-self-heal-metadata.c:81:afr_sh_metadata_done] 0-nas-volume-0004-replicate-0: split-brain detected, aborting selfheal of /time.out
[2011-11-19 10:04:14.410545] E [afr-self-heal-common.c:1554:afr_self_heal_completion_cbk] 0-nas-volume-0004-replicate-0: background  meta-data data self-heal failed on /time.out
[2011-11-19 10:04:34.842564] I [afr-self-heal-common.c:537:afr_sh_mark_sources] 0-nas-volume-0004-replicate-0: split-brain possible, no source detected
[2011-11-19 10:04:34.853140] W [afr-common.c:776:afr_lookup_done] 0-nas-volume-0004-replicate-0: split brain detected during lookup of /time.out.
[2011-11-19 10:04:34.853170] I [afr-common.c:811:afr_lookup_done] 0-nas-volume-0004-replicate-0: background  meta-data data self-heal triggered. path: /time.out
[2011-11-19 10:04:34.853539] I [afr-self-heal-common.c:537:afr_sh_mark_sources] 0-nas-volume-0004-replicate-0: split-brain possible, no source detected
[2011-11-19 10:04:34.853559] E [afr-self-heal-metadata.c:518:afr_sh_metadata_fix] 0-nas-volume-0004-replicate-0: Unable to self-heal permissions/ownership of '/time.out' (possible split-brain). Please fix the file on all backend volumes
[2011-11-19 10:04:34.853683] I [afr-self-heal-metadata.c:81:afr_sh_metadata_done] 0-nas-volume-0004-replicate-0: split-brain detected, aborting selfheal of /time.out
[2011-11-19 10:04:34.853700] E [afr-self-heal-common.c:1554:afr_self_heal_completion_cbk] 0-nas-volume-0004-replicate-0: background  meta-data data self-heal failed on /time.out
[2011-11-19 10:04:34.853919] W [afr-open.c:168:afr_open] 0-nas-volume-0004-replicate-0: failed to open as split brain seen, returning EIO
[2011-11-19 10:04:34.853948] W [quick-read.c:3029:qr_lk_helper] 0-nas-volume-0004-quick-read: open failed on path (/time.out) (Input/output error), unwinding lk call
[2011-11-19 10:04:34.853965] W [fuse-bridge.c:2818:fuse_setlk_cbk] 0-glusterfs-fuse: 134: ERR => -1 (Input/output error)
[2011-11-19 10:04:34.855112] W [afr-open.c:168:afr_open] 0-nas-volume-0004-replicate-0: failed to open as split brain seen, returning EIO
[2011-11-19 10:04:34.855133] W [fuse-bridge.c:1751:fuse_readv_cbk] 0-glusterfs-fuse: 135: READ => -1 (Input/output error)
[2011-11-19 10:04:34.855152] E [mem-pool.c:468:mem_put] 0-mem-pool: invalid argument
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2011-11-19 10:04:34
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.2.3
/lib64/libc.so.6[0x30004302d0]
/lib64/libpthread.so.0(pthread_spin_lock+0x2)[0x300100b722]
/usr/local/lib/libglusterfs.so.0(fd_ref+0x23)[0x2af1d11de443]
/usr/local/lib/glusterfs/3.2.3/xlator/mount/fuse.so[0x2af1d2652925]
/usr/local/lib/glusterfs/3.2.3/xlator/mount/fuse.so[0x2af1d2657b9b]
/lib64/libpthread.so.0[0x300100673d]
/lib64/libc.so.6(clone+0x6d)[0x30004d44bd]

10) While the .55 glusterfs is crashed, the client is
allowed to access the file, possibly seeing the wrong
content.

11) Get .55 glusterfs running again.

12) Now access the file directly from the .57 glusterfs
node, and get the same crash:

[root@SC-10-10-63-57]# cat /samba/nas-volume-0004/time.out
08:43 AM
08:43 AM
09:44 AM
[root@SC-10-10-63-57]# cat /samba/nas-volume-0004/time.out
cat: /samba/nas-volume-0004/time.out: Software caused connection abort
cat: /samba/nas-volume-0004/time.out: Transport endpoint is not connected

[root@SC-10-10-63-57]# gdb /usr/local/sbin/glusterfs /core.11316
Core was generated by `/usr/local/sbin/glusterfs --log-level=INFO --volfile=/etc/glusterd/vols/nas-vol'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000300100b722 in pthread_spin_lock () from /lib64/libpthread.so.0
(gdb) bt
#0  0x000000300100b722 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x00002b05c5518443 in fd_ref (fd=0x2aaaaca23024) at fd.c:378
#2  0x00002b05c698cc45 in fuse_readv (this=0x1be1df00, finh=0x2aaab0000900, msg=0x2aaab0000928) at fuse-bridge.c:1785
#3  0x00002b05c6991b9b in fuse_thread_proc (data=<value optimized out>) at fuse-bridge.c:3220
#4  0x000000300100673d in start_thread () from /lib64/libpthread.so.0
#5  0x00000030004d44bd in clone () from /lib64/libc.so.6
(gdb) f 1
#1  0x00002b05c5518443 in fd_ref (fd=0x2aaaaca23024) at fd.c:378
378     fd.c: No such file or directory.
        in fd.c
(gdb) p *fd
$1 = {pid = 25947, flags = 32768, refcount = 0, inode_list = {next = 0x2aaaaca23034, prev = 0x2aaaaca23034}, inode = 0xaaaaaaaa, lock = 1,
  _ctx = 0x2aaab0000900, xl_count = 10}

[2011-11-19 10:14:23.416968] I [client-handshake.c:913:client_setvolume_cbk] 0-nas-volume-0004-client-1: Connected to 10.10.60.55:24013, attached to remote volume '/exports/nas-segment-0001/nas-volume-0004'.
[2011-11-19 10:27:38.73928] I [afr-self-heal-common.c:537:afr_sh_mark_sources] 0-nas-volume-0004-replicate-0: split-brain possible, no source detected
[2011-11-19 10:27:38.80971] I [afr-common.c:811:afr_lookup_done] 0-nas-volume-0004-replicate-0: background  meta-data data self-heal triggered. path: /time.out
[2011-11-19 10:27:38.81358] I [afr-self-heal-common.c:537:afr_sh_mark_sources] 0-nas-volume-0004-replicate-0: split-brain possible, no source detected
[2011-11-19 10:27:38.81399] E [afr-self-heal-metadata.c:518:afr_sh_metadata_fix] 0-nas-volume-0004-replicate-0: Unable to self-heal permissions/ownership of '/time.out' (possible split-brain). Please fix the file on all backend volumes
[2011-11-19 10:27:38.81552] I [afr-self-heal-metadata.c:81:afr_sh_metadata_done] 0-nas-volume-0004-replicate-0: split-brain detected, aborting selfheal of /time.out
[2011-11-19 10:27:38.81569] E [afr-self-heal-common.c:1554:afr_self_heal_completion_cbk] 0-nas-volume-0004-replicate-0: background  meta-data data self-heal failed on /time.out
[2011-11-19 10:27:40.930273] I [afr-self-heal-common.c:537:afr_sh_mark_sources] 0-nas-volume-0004-replicate-0: split-brain possible, no source detected
[2011-11-19 10:27:40.930302] W [afr-common.c:776:afr_lookup_done] 0-nas-volume-0004-replicate-0: split brain detected during lookup of /time.out.
[2011-11-19 10:27:40.930315] I [afr-common.c:811:afr_lookup_done] 0-nas-volume-0004-replicate-0: background  meta-data data self-heal triggered. path: /time.out
[2011-11-19 10:27:40.930669] I [afr-self-heal-common.c:537:afr_sh_mark_sources] 0-nas-volume-0004-replicate-0: split-brain possible, no source detected
[2011-11-19 10:27:40.930683] E [afr-self-heal-metadata.c:518:afr_sh_metadata_fix] 0-nas-volume-0004-replicate-0: Unable to self-heal permissions/ownership of '/time.out' (possible split-brain). Please fix the file on all backend volumes
[2011-11-19 10:27:40.930812] I [afr-self-heal-metadata.c:81:afr_sh_metadata_done] 0-nas-volume-0004-replicate-0: split-brain detected, aborting selfheal of /time.out
[2011-11-19 10:27:40.930828] E [afr-self-heal-common.c:1554:afr_self_heal_completion_cbk] 0-nas-volume-0004-replicate-0: background  meta-data data self-heal failed on /time.out
[2011-11-19 10:27:40.930966] W [afr-open.c:168:afr_open] 0-nas-volume-0004-replicate-0: failed to open as split brain seen, returning EIO
[2011-11-19 10:27:40.930997] W [fuse-bridge.c:1751:fuse_readv_cbk] 0-glusterfs-fuse: 404: READ => -1 (Input/output error)
[2011-11-19 10:27:40.931020] E [mem-pool.c:468:mem_put] 0-mem-pool: invalid argument
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2011-11-19 10:27:40
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.2.3

=====

Summary: My understanding is that a client's attempt to
access a glusterfs split-brain file should never crash the
glusterfs process. The client should be denied access to the
split-brain file, since either copy may be the wrong version;
instead, the process segfaults, and once one of the glusterfs
servers has crashed, access is allowed against the surviving
copy.

Comment 1 Jeff Byers 2012-02-24 14:32:23 UTC
What info do you need?

Comment 2 Niels de Vos 2014-11-27 14:54:43 UTC
The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug.

If there has been no update before 9 December 2014, this bug will be automatically closed.
