+++ This bug was initially created as a clone of Bug #1753592 +++
+++ This bug was initially created as a clone of Bug #1749369 +++

Description of problem:

A segmentation fault occurs when the bricks are nearly 100% full and, in parallel, a truncate of a file is attempted (No space left on device). The prerequisite is that the performance xlators (read-ahead, write-behind, etc.) are enabled, as in the following volume:

[root@ glusterfs]# gluster --glusterd-sock=/d/backends/1/glusterd/gd.sock volume info

Volume Name: patchy
Type: Replicate
Volume ID: 5b79a54d-8207-4d38-a0e2-001adbdd11d8
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 127.1.1.1:/d/backends/1/brick
Brick2: 127.1.1.2:/d/backends/2/brick
Brick3: 127.1.1.3:/d/backends/3/brick
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

The backtrace is like the following:

#0  mem_put (ptr=0x7fa8e006ec40) at mem-pool.c:847
#1  0x00007fa8e94d3fea in FRAME_DESTROY (frame=0x7fa8d0015698) at ../../../../libglusterfs/src/stack.h:173
#2  STACK_DESTROY (stack=0x7fa8d000b238) at ../../../../libglusterfs/src/stack.h:193
#3  fuse_truncate_cbk (frame=<optimized out>, cookie=<optimized out>, this=<optimized out>, op_ret=<optimized out>, op_errno=28, prebuf=<optimized out>, postbuf=0x7fa8e6637530, xdata=0x0) at fuse-bridge.c:813
#4  0x00007fa8e45b5f7d in io_stats_ftruncate_cbk (frame=0x7fa8d0009508, cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28, prebuf=0x7fa8d001cac0, postbuf=0x7fa8e6637530, xdata=0x0) at io-stats.c:2425
#5  0x00007fa8e47d6feb in mdc_ftruncate_cbk (frame=0x7fa8d0015e58, cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28, prebuf=0x7fa8d001cac0, postbuf=0x7fa8e6637530, xdata=0x0) at md-cache.c:1477
#6  0x00007fa8ec29e668 in default_ftruncate_cbk (frame=0x7fa8d000a558, cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28, prebuf=0x7fa8d001cac0, postbuf=0x7fa8e6637530, xdata=0x0) at defaults.c:1375
#7  0x00007fa8e4bfc539 in qr_ftruncate_cbk (frame=0x7fa8d000d288, cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28, prebuf=0x7fa8d001cac0, postbuf=0x7fa8e6637530, xdata=0x0) at quick-read.c:890
#8  0x00007fa8e4e06c48 in ioc_ftruncate_cbk (frame=frame@entry=0x7fa8d000f1e8, cookie=<optimized out>, this=<optimized out>, op_ret=op_ret@entry=-1, op_errno=op_errno@entry=28, prebuf=prebuf@entry=0x7fa8d001cac0, postbuf=postbuf@entry=0x7fa8e6637530, xdata=xdata@entry=0x0) at io-cache.c:1277
#9  0x00007fa8e5022f2a in rda_ftruncate_cbk (frame=0x7fa8d000f088, cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28, prebuf=0x7fa8d001cac0, postbuf=0x7fa8d001cb58, xdata=0x0) at readdir-ahead.c:826
#10 0x00007fa8e522df58 in ra_truncate_cbk (frame=0x7fa8d0015588, cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28, prebuf=0x7fa8d001cac0, postbuf=0x7fa8d001cb58, xdata=0x0) at read-ahead.c:668
#11 0x00007fa8ec22bd43 in call_resume_unwind (stub=stub@entry=0x7fa8d001bf88) at call-stub.c:2348
#12 0x00007fa8ec23a989 in call_unwind_error_keep_stub (stub=0x7fa8d001bf88, op_ret=op_ret@entry=-1, op_errno=28) at call-stub.c:2541
#13 0x00007fa8e544482c in wb_do_winds (wb_inode=wb_inode@entry=0x7fa8e006ec40, tasks=tasks@entry=0x7fa8e6637750) at write-behind.c:1735
#14 0x00007fa8e54448db in wb_process_queue (wb_inode=wb_inode@entry=0x7fa8e006ec40) at write-behind.c:1778
#15 0x00007fa8e5444a5f in wb_fulfill_cbk (frame=frame@entry=0x7fa8e005e808, cookie=<optimized out>, this=<optimized out>, op_ret=op_ret@entry=-1, op_errno=op_errno@entry=28, prebuf=prebuf@entry=0x7fa8e00731e0, postbuf=postbuf@entry=0x7fa8e0073278, xdata=xdata@entry=0x0) at write-behind.c:1105
#16 0x00007fa8e56c5f60 in dht_writev_cbk (frame=frame@entry=0x7fa8e006c0a8, cookie=<optimized out>, this=<optimized out>, op_ret=-1, op_errno=28, prebuf=prebuf@entry=0x7fa8e00731e0, postbuf=postbuf@entry=0x7fa8e0073278, xdata=0x0) at dht-inode-write.c:140
#17 0x00007fa8e592f31e in afr_writev_unwind (frame=frame@entry=0x7fa8e0000b18, this=this@entry=0x7fa8e000f820) at afr-inode-write.c:234
#18 0x00007fa8e592f4fd in afr_transaction_writev_unwind (frame=0x7fa8e005ed78, this=0x7fa8e000f820) at afr-inode-write.c:248
#19 0x00007fa8e5941be2 in afr_transaction_done (frame=0x7fa8e005ed78, this=0x7fa8e000f820) at afr-transaction.c:367
#20 0x00007fa8e59453d3 in afr_unlock_common_cbk (frame=frame@entry=0x7fa8e005ed78, this=this@entry=0x7fa8e000f820, op_ret=op_ret@entry=0, xdata=<optimized out>, op_errno=<optimized out>, cookie=0x2) at afr-lk-common.c:243
#21 0x00007fa8e59461e8 in afr_unlock_inodelk_cbk (frame=0x7fa8e005ed78, cookie=0x2, this=0x7fa8e000f820, op_ret=0, op_errno=<optimized out>, xdata=<optimized out>) at afr-lk-common.c:281
#22 0x00007fa8e5c18f3f in client4_0_finodelk_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=<optimized out>) at client-rpc-fops_v2.c:1398
#23 0x00007fa8ebfe0980 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7fa8e0053800, pollin=pollin@entry=0x7fa8e006e270) at rpc-clnt.c:755
#24 0x00007fa8ebfe0d53 in rpc_clnt_notify (trans=0x7fa8e0053b20, mydata=0x7fa8e0053830, event=<optimized out>, data=0x7fa8e006e270) at rpc-clnt.c:923
#25 0x00007fa8ebfdcd83 in rpc_transport_notify (this=this@entry=0x7fa8e0053b20, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7fa8e006e270) at rpc-transport.c:525
#26 0x00007fa8e6aae2cb in socket_event_poll_in (notify_handled=true, this=0x7fa8e0053b20) at socket.c:2504
#27 socket_event_handler (fd=<optimized out>, idx=<optimized out>, gen=<optimized out>, data=0x7fa8e0053b20, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0) at socket.c:2905
#28 0x00007fa8ec2762d9 in event_dispatch_epoll_handler (event=0x7fa8e6637e90, event_pool=0x18420b0) at event-epoll.c:591
#29 event_dispatch_epoll_worker (data=0x1865220) at event-epoll.c:668
#30 0x00007fa8eb99b182 in start_thread (arg=0x7fa8e6638700) at pthread_create.c:312
#31 0x00007fa8eb6c830d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Program received signal SIGSEGV, Segmentation fault.
mem_put (ptr=0x7fa8e006ec40) at mem-pool.c:869
869             GF_ATOMIC_DEC(hdr->pool->active);
(gdb)

So it appears that, during the stack unwind of the frames following an error response from the brick (No space left on device), frame->local points to a memory location that was allocated not via mem_get but via calloc.
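For context on why this is fatal: pool allocators of this style keep a small header immediately before the pointer they hand out, and mem_put() trusts that header to locate the owning pool. The following is a self-contained toy illustrating the failure mode, not the actual mem-pool.c code; the names (pool_get, plain_calloc, pool_put, obj_hdr) are made up for illustration. When a plain allocation happens to carry the same magic value in its header, the put path passes the magic check and then dereferences a pool pointer that was never set:

#include <stdint.h>
#include <stdlib.h>

#define HDR_MAGIC 0xCAFED00DULL /* shared by both allocators, as in mem-pool.c */

struct pool {
    long active; /* count of outstanding objects */
};

struct obj_hdr {
    uint64_t magic;
    struct pool *pool; /* meaningful only for pool-backed objects */
};

/* pool-backed allocation: the header records the owning pool */
static void *pool_get(struct pool *p, size_t size)
{
    struct obj_hdr *hdr = malloc(sizeof(*hdr) + size);
    hdr->magic = HDR_MAGIC;
    hdr->pool = p;
    p->active++;
    return hdr + 1;
}

/* plain allocation: writes the same magic, but the pool field stays NULL */
static void *plain_calloc(size_t size)
{
    struct obj_hdr *hdr = calloc(1, sizeof(*hdr) + size);
    hdr->magic = HDR_MAGIC;
    return hdr + 1;
}

/* put path: the magic check cannot distinguish the two origins */
static void pool_put(void *ptr)
{
    struct obj_hdr *hdr = (struct obj_hdr *)ptr - 1;
    if (hdr->magic != HDR_MAGIC)
        return;          /* not one of ours; don't touch it */
    hdr->pool->active--; /* SIGSEGV when pool was never set */
}

int main(void)
{
    struct pool p = {0};
    void *ok = pool_get(&p, 64);
    pool_put(ok);                 /* fine: active drops back to 0 */
    void *bad = plain_calloc(64); /* stands in for the calloc'd wb_inode */
    pool_put(bad);                /* crashes, like the trace above */
    return 0;
}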
A breakpoint on __gf_calloc confirms where that allocation comes from:

Breakpoint 1, __gf_calloc (nmemb=nmemb@entry=1, size=size@entry=208, type=type@entry=163, typestr=typestr@entry=0x7fa8e544b892 "gf_wb_mt_wb_inode_t") at mem-pool.c:119
119             return (void *)ptr;
$2034 = (void *) 0x7fa8e006ec40
#0  __gf_calloc (nmemb=nmemb@entry=1, size=size@entry=208, type=type@entry=163, typestr=typestr@entry=0x7fa8e544b892 "gf_wb_mt_wb_inode_t") at mem-pool.c:119
#1  0x00007fa8e5441e7b in __wb_inode_create (this=this@entry=0x7fa8e0013470, inode=inode@entry=0x7fa8e00603a8) at write-behind.c:667
#2  0x00007fa8e5442063 in wb_inode_create (this=0x7fa8e0013470, inode=0x7fa8e00603a8) at write-behind.c:708
#3  0x00007fa8e5444fc2 in wb_writev (frame=frame@entry=0x7fa8e005ed78, this=<optimized out>, fd=fd@entry=0x7fa8d000cb88, vector=vector@entry=0x7fa8d000d0b0, count=count@entry=1, offset=offset@entry=0, flags=flags@entry=32769, iobref=iobref@entry=0x7fa8d00160f0, xdata=xdata@entry=0x0) at write-behind.c:1851
#4  0x00007fa8e5231c78 in ra_writev (frame=frame@entry=0x7fa8e005d2d8, this=0x7fa8e0014f50, fd=fd@entry=0x7fa8d000cb88, vector=vector@entry=0x7fa8d000d0b0, count=count@entry=1, offset=offset@entry=0, flags=flags@entry=32769, iobref=iobref@entry=0x7fa8d00160f0, xdata=xdata@entry=0x0) at read-ahead.c:650
#5  0x00007fa8e501f10a in rda_writev (frame=frame@entry=0x7fa8e0000b18, this=0x7fa8e0016940, fd=fd@entry=0x7fa8d000cb88, vector=vector@entry=0x7fa8d000d0b0, count=count@entry=1, off=off@entry=0, flags=flags@entry=32769, iobref=iobref@entry=0x7fa8d00160f0, xdata=xdata@entry=0x0) at readdir-ahead.c:696
#6  0x00007fa8e4e0b2e6 in ioc_writev (frame=frame@entry=0x7fa8e006c0a8, this=0x7fa8e00188d0, fd=fd@entry=0x7fa8d000cb88, vector=vector@entry=0x7fa8d000d0b0, count=count@entry=1, offset=offset@entry=0, flags=flags@entry=32769, iobref=iobref@entry=0x7fa8d00160f0, xdata=xdata@entry=0x0) at io-cache.c:1233
#7  0x00007fa8e4bfabce in qr_writev (frame=frame@entry=0x7fa8e005e808, this=0x7fa8e001a370, fd=fd@entry=0x7fa8d000cb88, iov=iov@entry=0x7fa8d000d0b0, count=count@entry=1, offset=offset@entry=0, flags=flags@entry=32769, iobref=iobref@entry=0x7fa8d00160f0, xdata=xdata@entry=0x0) at quick-read.c:845
#8  0x00007fa8ec2c2240 in default_writev_resume (frame=0x7fa8d000f088, this=0x7fa8e001bde0, fd=0x7fa8d000cb88, vector=0x7fa8d000d0b0, count=1, off=0, flags=32769, iobref=0x7fa8d00160f0, xdata=0x0) at defaults.c:2274
#9  0x00007fa8ec23a66a in call_resume_wind (stub=<optimized out>) at call-stub.c:2040

The allocation holds the inode entry. The magic number is in place, which misleads mem_put, and the core dump occurs while trying to dereference the pool pointer.
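This matters because of the frame teardown contract visible in frames #0/#1 of the crash backtrace. The sketch below paraphrases the FRAME_DESTROY logic from libglusterfs/src/stack.h with heavily stubbed types (the mem_put stand-in just prints; the real one is the pool free where the crash happens) -- illustrative only, not the verbatim macro:

#include <stdio.h>

/* Stand-in for the pool free in mem-pool.c. */
static void mem_put(void *ptr)
{
    printf("mem_put(%p)\n", ptr);
}

/* Heavily simplified call_frame_t; the real one is in stack.h. */
typedef struct call_frame {
    void *local; /* xlator-private data for this frame */
} call_frame_t;

/* Paraphrase of the FRAME_DESTROY teardown contract: whatever an
 * xlator leaves in frame->local is handed to mem_put(), so it must
 * either come from mem_get() or be cleared before the unwind. */
static void frame_destroy_sketch(call_frame_t *frame)
{
    void *local = frame->local;
    frame->local = NULL;
    mem_put(frame); /* frames themselves are pool-backed */
    if (local)
        mem_put(local); /* a GF_CALLOC'd wb_inode_t breaks this
                         * assumption and crashes, as seen above */
}

int main(void)
{
    call_frame_t frame = {.local = (void *)0x1234};
    frame_destroy_sketch(&frame);
    return 0;
}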
The destroyed frame is always the one whose callback is ra_truncate_cbk, wound from ra_ftruncate, e.g.:

(gdb) p *frame
$9 = {root = 0x7fa1a0000ec8, parent = 0x7fa1a000c098,
  frames = {next = 0x7fa1a000a338, prev = 0x7fa1a000a338}, local = 0x0,
  this = 0x7fa1ac013470, ret = 0x7fa1b2baedf0 <ra_truncate_cbk>, ref_count = 0,
  lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0,
        __nusers = 0, __kind = -1, __spins = 0, __elision = 0,
        __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\377\377\377\377", '\000' <repeats 19 times>,
      __align = 0}},
  cookie = 0x7fa1a000a328, complete = true, op = GF_FOP_FTRUNCATE,
  begin = {tv_sec = 164325, tv_nsec = 696128864},
  end = {tv_sec = 164325, tv_nsec = 698002711},
  wind_from = 0x7fa1b2bb86ff <__FUNCTION__.14980> "ra_ftruncate",
  wind_to = 0x7fa1b2bb8470 "(this->children->xlator)->fops->ftruncate",
  unwind_from = 0x7fa1b9c48e50 <__FUNCTION__.18620> "call_resume_unwind",
  unwind_to = 0x7fa1b2bb7fd7 "ra_truncate_cbk"}
(gdb) q

The inode pointer is copied into frame->local in wb_ftruncate():

2237 wb_ftruncate(call_frame_t *frame, xlator_t *this, fd_t *fd, off_t offset,
2238              dict_t *xdata)
2239 {
2240     wb_inode_t *wb_inode = NULL;
2241     call_stub_t *stub = NULL;
2242     int32_t op_errno = 0;
2243
2244     wb_inode = wb_inode_create(this, fd->inode);
2245     if (!wb_inode) {
2246         op_errno = ENOMEM;
2247         goto unwind;
2248     }
2249
2250     frame->local = wb_inode;
2251
2252     stub = fop_ftruncate_stub(frame, wb_ftruncate_helper, fd, offset, xdata);

as seen in gdb:

Breakpoint 1, wb_ftruncate (frame=frame@entry=0x7f605c00fc18, this=0x7f6060013470, fd=fd@entry=0x7f605c00cbd8, offset=offset@entry=1, xdata=xdata@entry=0x0) at write-behind.c:2239
2239    {
(gdb) p frame
$1 = (call_frame_t *) 0x7f605c00fc18
(gdb) p frame->local
$2 = (void *) 0x0
(gdb) watch frame->local
Watchpoint 2: frame->local
(gdb) c
Continuing.

Watchpoint 2: frame->local

Old value = (void *) 0x0
New value = (void *) 0x7f606006c540
wb_ftruncate (frame=frame@entry=0x7f605c00fc18, this=<optimized out>, fd=fd@entry=0x7f605c00cbd8, offset=offset@entry=1, xdata=xdata@entry=0x0) at write-behind.c:2252
2252        stub = fop_ftruncate_stub(frame, wb_ftruncate_helper, fd, offset, xdata);
(gdb) bt
#0  wb_ftruncate (frame=frame@entry=0x7f605c00fc18, this=<optimized out>, fd=fd@entry=0x7f605c00cbd8, offset=offset@entry=1, xdata=xdata@entry=0x0) at write-behind.c:2252
#1  0x00007f6066c7dfc6 in ra_ftruncate (frame=frame@entry=0x7f605c00a308, this=0x7f6060014f50, fd=fd@entry=0x7f605c00cbd8, offset=offset@entry=1, xdata=xdata@entry=0x0) at read-ahead.c:903
#2  0x00007f6066a69627 in rda_ftruncate (frame=frame@entry=0x7f605c00c148, this=0x7f6060016940, fd=fd@entry=0x7f605c00cbd8, offset=offset@entry=1, xdata=xdata@entry=0x0) at readdir-ahead.c:835
#3  0x00007f6066856b78 in ioc_ftruncate (frame=frame@entry=0x7f605c00c9b8, this=0x7f60600188d0, fd=fd@entry=0x7f605c00cbd8, offset=offset@entry=1, xdata=xdata@entry=0x0) at io-cache.c:1327
#4  0x00007f60666461d1 in qr_ftruncate (frame=frame@entry=0x7f605c00fe98, this=0x7f606001a370, fd=fd@entry=0x7f605c00cbd8, offset=offset@entry=1, xdata=xdata@entry=0x0) at quick-read.c:904
#5  0x00007f606dd0986a in default_ftruncate_resume (frame=0x7f605c0016d8, this=0x7f606001bde0, fd=0x7f605c00cbd8, offset=1, xdata=0x0) at defaults.c:2023

How reproducible:
always

Steps to Reproduce:
- setup a test cluster:

Volume Name: patchy
Type: Replicate
Volume ID: 5b79a54d-8207-4d38-a0e2-001adbdd11d8
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 127.1.1.1:/d/backends/1/brick
Brick2: 127.1.1.2:/d/backends/2/brick
Brick3: 127.1.1.3:/d/backends/3/brick
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

- fill the volume with dummy files, e.g.:

for i in {1..10}; do dd if=/home/epolakis/prom.img of=/mnt/glusterfs/0/smallfile_IB__N$i bs=4096 count=10000; done

- use a test program that truncates a file, like:

#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <memory.h>

#define STR_LEN 1000

int main(void)
{
    char *m_string;
    int fd;
    int ret;
    char fn[] = "/mnt/glusterfs/0/smallfile_N1";
    struct stat st;

    if ((m_string = (char *)malloc(STR_LEN)) == NULL)
        perror("malloc() error");
    else if ((fd = open(fn, O_CREAT | O_WRONLY, S_IRUSR | S_IWUSR)) < 0)
        perror("open error");
    else {
        memset(m_string, '0', STR_LEN);
        if ((ret = write(fd, m_string, STR_LEN)) == -1)
            perror("write() error");
        else {
            if (ftruncate(fd, 1) != 0)
                perror("ftruncate() error");
            else {
                fstat(fd, &st);
                printf("the file has %ld bytes\n", (long)st.st_size);
            }
        }
        close(fd);
    }
    free(m_string);
    return 0;
}

Actual results:
The glusterfs fuse client crashes with a segmentation fault in mem_put().

Expected results:
The operation fails gracefully (No space left on device) and the client does not crash.

Additional info:
If my understanding is correct, the inode that is calloc'ed should not be released until the forget.

--- Additional comment from evangelos on 2019-09-10 10:18:30 UTC ---

Would something like the following be enough?

diff --git a/libglusterfs/src/mem-pool.c b/libglusterfs/src/mem-pool.c
index cff9632..fad6a2c 100644
--- a/libglusterfs/src/mem-pool.c
+++ b/libglusterfs/src/mem-pool.c
@@ -863,6 +863,13 @@ mem_put(void *ptr)
         /* Not one of ours; don't touch it. */
         return;
     }
+
+    if (!hdr->pool_list) {
+        gf_msg_callingfn("mem-pool", GF_LOG_ERROR, EINVAL, LG_MSG_INVALID_ARG,
+                         "invalid argument hdr->pool_list NULL");
+        return;
+    }
+
     pool_list = hdr->pool_list;
     pt_pool = &pool_list->pools[hdr->power_of_two - POOL_SMALLEST];

As far as I understood (but please correct me), this allocation is for the inode, which shouldn't be released unless the kernel sends the forget (rm/unlink, etc.).

--- Additional comment from Worker Ant on 2019-09-19 11:34:42 UTC ---

REVIEW: https://review.gluster.org/23445 (Segmentation fault occurs during truncate) posted (#2) for review on master by Vangelis Polakis

--- Additional comment from Nithya Balachandran on 2019-09-24 05:37:47 UTC ---

Hi,

What version of gluster are you using?

--- Additional comment from evangelos on 2019-09-24 07:54:23 UTC ---

I have seen this issue in 5.6.

--- Additional comment from Nithya Balachandran on 2019-09-25 04:30:33 UTC ---

Thanks. We will take a look and get back to you.

--- Additional comment from Nithya Balachandran on 2019-09-25 13:50:34 UTC ---

Hi,

Thank you for an excellent test. Would you mind if I added it to the regression framework?

I'm afraid the patch provided will not fix the issue here. In most xlators, the local structure is allocated from a mempool, which is why mem_put will usually work. However, the wb xlator does not have a defined local struct of its own. Instead, for the truncate and ftruncate fops, it saves the wb_inode (which is allocated using GF_CALLOC) in frame->local. In the normal code path, this is cleared in the wb_truncate_cbk and wb_ftruncate_cbk functions, and frame->local is set to NULL before unwinding the fop.

The test code, however, does a write followed by a conflicting truncate. As the write fails with ENOSPC, __wb_pick_winds determines that there is a conflict and updates the request op_ret and op_errno for the ftruncate request. wb_ftruncate_cbk is not called, and the request is instead unwound in the wb_do_winds -> call_unwind_error_keep_stub code path. This does not clear frame->local, which causes mem_put to crash.
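For reference, a minimal sketch of the direction the fix described in the reviews below takes, inferred from the patch title ("Clear frame->local on conflict error") and the analysis above; this is an approximation, not the literal merged diff:

/* Approximate shape of the conflict-error unwind in wb_do_winds()
 * (write-behind.c); a sketch, not the verbatim merged change. */
if (req->op_ret < 0) {
    if (req->fop == GF_FOP_TRUNCATE || req->fop == GF_FOP_FTRUNCATE) {
        /* wb_(f)truncate stored the GF_CALLOC'd wb_inode in
         * frame->local. Clear it so that FRAME_DESTROY does not
         * mem_put() memory that never came from mem_get(); the
         * wb_inode stays owned by the inode ctx until forget. */
        req->stub->frame->local = NULL;
    }
    call_unwind_error_keep_stub(req->stub, req->op_ret, req->op_errno);
}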
--- Additional comment from Nithya Balachandran on 2019-09-25 14:27:30 UTC ---

(In reply to evangelos from comment #3)
> I have seen this issue in 5.6

The issue also exists in the master branch.

--- Additional comment from Worker Ant on 2019-09-25 14:29:24 UTC ---

REVIEW: https://review.gluster.org/23485 (perf/write-behind: Clear frame->local on conflict error) posted (#1) for review on master by N Balachandran

--- Additional comment from Worker Ant on 2019-09-26 02:28:52 UTC ---

REVIEW: https://review.gluster.org/23485 (perf/write-behind: Clear frame->local on conflict error) merged (#1) on master by N Balachandran
REVIEW: https://review.gluster.org/23488 (perf/write-behind: Clear frame->local on conflict error) posted (#1) for review on release-7 by N Balachandran
REVIEW: https://review.gluster.org/23488 (perf/write-behind: Clear frame->local on conflict error) merged (#2) on release-7 by Rinku Kothiya