| Summary: | [289c2902d6a81f7a5b6da04c24cc955bd5427178] client crash with segfault at inode_link in inode.c |
|---|---|
| Product: | [Community] GlusterFS |
| Component: | replicate |
| Status: | CLOSED WORKSFORME |
| Severity: | high |
| Priority: | high |
| Version: | mainline |
| Reporter: | Rahul C S <rahulcs> |
| Assignee: | Pranith Kumar K <pkarampu> |
| CC: | gluster-bugs, vbellur, vinaraya |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Doc Type: | Bug Fix |
| Last Closed: | 2012-04-17 13:46:44 UTC |
| Attachments: | glusterfs logs (attachment 546245) |
Comments:

> Could you reproduce this using valgrind?

No, did not get it again. Will try one more time.

Not reproducible anymore. Shwetha has tested this several times.
Created attachment 546245 [details]
glusterfs logs

Description of problem:

The crash happened after running dbench a second time with the xattr option enabled, while a stat had been called on the mount point from another client. A brick process had also been taken down and brought back up before the stat.

```
Core was generated by `/usr/local/sbin/glusterfs --volfile-id=vol --volfile-server=dagobah mount/'.
Program terminated with signal 11, Segmentation fault.
#0  __pthread_mutex_lock (mutex=0x100000000) at pthread_mutex_lock.c:50
50      pthread_mutex_lock.c: No such file or directory.
        in pthread_mutex_lock.c
(gdb) bt
#0  __pthread_mutex_lock (mutex=0x100000000) at pthread_mutex_lock.c:50
#1  0x00007f9a72352c80 in inode_link (inode=0x1905900, parent=0x7f9a595a604c, name=0x7f9a600600f1 "clients",
    iatt=0x7f9a5ac5742c) at ../../../libglusterfs/src/inode.c:812
#2  0x00007f9a703edcac in fuse_entry_cbk (frame=0x7f9a706b4878, cookie=0x7f9a7092adfc, this=0x1895e30, op_ret=0,
    op_errno=22, inode=0x1905900, buf=0x7f9a5ac5742c) at ../../../../../xlators/mount/fuse/src/fuse-bridge.c:195
#3  0x00007f9a703ee1ec in fuse_lookup_cbk (frame=0x7f9a706b4878, cookie=0x7f9a7092adfc, this=0x1895e30, op_ret=0,
    op_errno=22, inode=0x1905900, stat=0x7f9a5ac5742c, dict=0x7f9a5acd404c, postparent=0x7f9a7092adfc)
    at ../../../../../xlators/mount/fuse/src/fuse-bridge.c:280
#4  0x00007f9a6e3285e0 in client3_1_xattrop_cbk (req=0x7f9a5acd404c, iov=0x7f9a5acd408c, count=1,
    myframe=0x7f9a7092adfc) at ../../../../../xlators/protocol/client/src/client3_1-fops.c:1425
#5  0x00007f9a721189c6 in rpc_clnt_handle_reply (clnt=0x7f9a6009c800, pollin=0x190e580)
    at ../../../../rpc/rpc-lib/src/rpc-clnt.c:789
#6  0x00007f9a72118d28 in rpc_clnt_notify (trans=0x7f9a6009cad0, mydata=0x7f9a6009c830,
    event=RPC_TRANSPORT_MSG_RECEIVED, data=0x190e580) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:908
#7  0x00007f9a72114e3d in rpc_transport_notify (this=0x7f9a6009cad0, event=RPC_TRANSPORT_MSG_RECEIVED,
    data=0x190e580) at ../../../../rpc/rpc-lib/src/rpc-transport.c:498
#8  0x00007f9a6ef8a359 in socket_event_poll_in (this=0x7f9a6009cad0)
    at ../../../../../rpc/rpc-transport/socket/src/socket.c:1675
#9  0x00007f9a6ef8a8cd in socket_event_handler (fd=43, idx=23, data=0x7f9a6009cad0, poll_in=1, poll_out=0,
    poll_err=0) at ../../../../../rpc/rpc-transport/socket/src/socket.c:1790
#10 0x00007f9a723694b9 in event_dispatch_epoll_handler (event_pool=0x18952d0, events=0x189aa40, i=0)
    at ../../../libglusterfs/src/event.c:794
#11 0x00007f9a723696d3 in event_dispatch_epoll (event_pool=0x18952d0) at ../../../libglusterfs/src/event.c:856
#12 0x00007f9a72369a45 in event_dispatch (event_pool=0x18952d0) at ../../../libglusterfs/src/event.c:956
#13 0x0000000000407d83 in main (argc=4, argv=0x7fff4c81e6f8) at ../../../glusterfsd/src/glusterfsd.c:1601
(gdb) info thr
  10 Thread 7365  0x00007f9a718174f9 in __libc_readv (fd=5, vector=0x7f9a671d2e50, count=2)
     at ../sysdeps/unix/sysv/linux/readv.c:56
  9  Thread 7549  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  8  Thread 7364  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  7  Thread 7671  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  6  Thread 7363  0x00007f9a71add4bd in nanosleep () at ../sysdeps/unix/syscall-template.S:82
  5  Thread 7658  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  4  Thread 7360  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  3  Thread 7562  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  2  Thread 7359  do_sigwait (set=<value optimized out>, sig=0x7f9a703e3eb8)
     at ../nptl/sysdeps/unix/sysv/linux/../../../../../sysdeps/unix/sysv/linux/sigwait.c:65
* 1  Thread 7358  __pthread_mutex_lock (mutex=0x100000000) at pthread_mutex_lock.c:50
(gdb) f 1
#1  0x00007f9a72352c80 in inode_link (inode=0x1905900, parent=0x7f9a595a604c, name=0x7f9a600600f1 "clients",
    iatt=0x7f9a5ac5742c) at ../../../libglusterfs/src/inode.c:812
812             pthread_mutex_lock (&table->lock);
(gdb) l
807                     return NULL;
808             }
809
810             table = inode->table;
811
812             pthread_mutex_lock (&table->lock);
813             {
814                     linked_inode = __inode_link (inode, parent, name, iatt);
815
816                     if (linked_inode)
(gdb) p *inode
$1 = {table = 0x100000000, gfid = "\002\000\000\000\001\000\000\000p\201\221\001\000\000\000", lock = 25984256,
  nlookup = 26229184, ref = 0, ia_type = IA_INVAL, fd_list = {next = 0x1, prev = 0xbaadf00d},
  dentry_list = {next = 0x0, prev = 0xd1}, hash = {next = 0x9800000057, prev = 0x6008c29000000000},
  list = {next = 0xcafebabe00007f9a, prev = 0x0}, _ctx = 0xd942de7300000001}
(gdb) f 3
#3  0x00007f9a703ee1ec in fuse_lookup_cbk (frame=0x7f9a706b4878, cookie=0x7f9a7092adfc, this=0x1895e30, op_ret=0,
    op_errno=22, inode=0x1905900, stat=0x7f9a5ac5742c, dict=0x7f9a5acd404c, postparent=0x7f9a7092adfc)
    at ../../../../../xlators/mount/fuse/src/fuse-bridge.c:280
280             fuse_entry_cbk (frame, cookie, this, op_ret, op_errno, inode, stat);
(gdb) p *state
$8 = {pool = 0x18953a0, this = 0x1895e30, itable = 0x0, loc = {path = 0x7f9a600600f0 "/clients",
    name = 0x7f9a600600f1 "clients", inode = 0x7f9a595a60e0, parent = 0x7f9a595a604c,
    gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>}, loc2 = {path = 0x0, name = 0x0,
    inode = 0x0, parent = 0x0, gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>},
  finh = 0x7f9a60006860, flags = 0, off = 0, size = 0, nlookup = 0, fd = 0x0, dict = 0x0, name = 0x0,
  is_revalidate = 1 '\001', truncate_needed = _gf_false, lock = 1, lk_owner = 0, resolve = {type = 0, ino = 0,
    gen = 0, par = 0, fd = 0x0, path = 0x7f9a60034170 "/clients", bname = 0x7f9a6000b3b0 "clients",
    gfid = "\275,MY$fI;\275z\202\210\212ϸF", pargfid = '\000' <repeats 15 times>, "\001", resolved = 0x0,
    op_ret = 0, op_errno = 0, deep_loc = {path = 0x0, name = 0x0, inode = 0x0, parent = 0x0,
      gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>}, components = 0x0, comp_count = 0},
  resolve2 = {type = 0, ino = 0, gen = 0, par = 0, fd = 0x0, path = 0x0, bname = 0x0,
    gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>, resolved = 0x0, op_ret = 0,
    op_errno = 0, deep_loc = {path = 0x0, name = 0x0, inode = 0x0, parent = 0x0,
      gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>}, components = 0x0, comp_count = 0},
  loc_now = 0x0, resolve_now = 0x0, resume_fn = 0x7f9a703ee1f3, valid = 0, mask = 0, rdev = 0, mode = 0,
  attr = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {
      suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000',
        exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {
        read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0,
    ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0,
    ia_ctime = 0, ia_ctime_nsec = 0}, lk_lock = {l_type = 0, l_whence = 0, l_start = 0, l_len = 0, l_pid = 0,
    l_owner = 0}, vector = {iov_base = 0x0, iov_len = 0}, gfid = '\000' <repeats 15 times>}
(gdb) p *state->loc.inode
$9 = {table = 0x7f9a68007e30, gfid = "\275,MY$fI;\275z\202\210\212ϸF", lock = 1, nlookup = 0, ref = 43,
  ia_type = IA_IFDIR, fd_list = {next = 0x7f9a595a6110, prev = 0x7f9a595a6110}, dentry_list = {
    next = 0x7f9a5945504c, prev = 0x7f9a5945504c}, hash = {next = 0x7f9a5940c490, prev = 0x7f9a5940c490},
  list = {next = 0x7f9a595a60ac, prev = 0x7f9a595aa0d8}, _ctx = 0x18fac50}
(gdb) p *state->loc.inode->table
$10 = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {
        __prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, hashsize = 14057,
  name = 0x7f9a6803b7e0 "vol/inode", root = 0x7f9a595a604c, xl = 0x7f9a6008c290, lru_limit = 0,
  inode_hash = 0x7f9a59354030, name_hash = 0x7f9a5931d030, active = {next = 0x7f9a595abe54,
    prev = 0x7f9a595a60ac}, active_size = 44, lru = {next = 0x7f9a68007ea8, prev = 0x7f9a68007ea8}, lru_size = 0,
  purge = {next = 0x7f9a68007ec0, prev = 0x7f9a68007ec0}, purge_size = 0, inode_pool = 0x7f9a68076cf0,
  dentry_pool = 0x7f9a680212b0, fd_mem_pool = 0x7f9a68040cc0}
```
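The dumped inode in frame #1 is clearly not a live object: its table pointer is 0x100000000, ref is 0, ia_type is IA_INVAL, and the list fields hold values such as 0xbaadf00d and 0xcafebabe that look like allocator markers rather than real pointers, all consistent with the inode's memory having been freed and recycled while a lookup was still in flight. A minimal stand-alone sketch of that fault pattern follows; the struct definitions are simplified stand-ins, not the real GlusterFS types, and only the 0x100000000 address is taken from the core.

```c
/* Stand-alone illustration (not GlusterFS code) of the fault pattern the
 * dump suggests: a stale inode whose table back-pointer holds garbage,
 * so taking the table lock faults exactly as in frame #0 above. */
#include <pthread.h>

typedef struct inode_table {
        pthread_mutex_t lock;   /* stand-in for the real inode_table_t */
} inode_table_t;

typedef struct inode {
        inode_table_t *table;   /* back-pointer, as in libglusterfs inode.c */
} inode_t;

int
main (void)
{
        /* Simulate a freed-and-recycled inode: the back-pointer no longer
         * references a live table (address copied from the core dump). */
        inode_t stale = { .table = (inode_table_t *) 0x100000000 };

        /* Equivalent of inode.c:812 -- SIGSEGV on the bogus address. */
        pthread_mutex_lock (&stale.table->lock);

        return 0;
}
```

Note that a simple pointer check before the lock could not have caught this case: the stale table pointer is non-NULL garbage, so the fault only surfaces inside pthread_mutex_lock(), which is why the valgrind run asked about in the comments above would be the more promising way to find the original stray free or write.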
After a lot of failed setxattr & getxattr calls, the client log shows the following before the crash:

```
[2011-12-13 18:17:50.013317] W [client3_1-fops.c:899:client3_1_getxattr_cbk] 4-vol-client-5: remote operation failed: No data available. Path: (null)
[2011-12-13 18:17:50.017972] W [client3_1-fops.c:899:client3_1_getxattr_cbk] 4-vol-client-4: remote operation failed: No data available. Path: (null)
[2011-12-13 18:17:50.070850] W [client3_1-fops.c:899:client3_1_getxattr_cbk] 4-vol-client-0: remote operation failed: No data available. Path: (null)
[2011-12-13 18:17:50.075770] W [client3_1-fops.c:899:client3_1_getxattr_cbk] 4-vol-client-1: remote operation failed: No data available. Path: (null)
[2011-12-13 18:17:50.283978] W [client3_1-fops.c:899:client3_1_getxattr_cbk] 4-vol-client-2: remote operation failed: No data available. Path: (null)
[2011-12-13 18:17:50.412274] W [client3_1-fops.c:899:client3_1_getxattr_cbk] 4-vol-client-0: remote operation failed: No data available. Path: (null)
[2011-12-13 18:17:50.415153] W [client3_1-fops.c:899:client3_1_getxattr_cbk] 4-vol-client-1: remote operation failed: No data available. Path: (null)
[2011-12-13 18:17:50.448554] W [client3_1-fops.c:899:client3_1_getxattr_cbk] 4-vol-client-2: remote operation failed: No data available. Path: (null)
[2011-12-13 18:17:50.581526] W [client3_1-fops.c:899:client3_1_getxattr_cbk] 4-vol-client-2: remote operation failed: No data available. Path: (null)
[2011-12-13 18:17:51.614044] W [client3_1-fops.c:899:client3_1_getxattr_cbk] 4-vol-client-2: remote operation failed: No data available. Path: (null)
[2011-12-13 18:17:51.646347] W [client3_1-fops.c:899:client3_1_getxattr_cbk] 4-vol-client-1: remote operation failed: No data available. Path: (null)
[2011-12-13 18:17:51.646820] W [client3_1-fops.c:899:client3_1_getxattr_cbk] 4-vol-client-0: remote operation failed: No data available. Path: (null)
```
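"No data available" in these warnings is strerror(ENODATA), the errno getxattr(2) reports when the requested extended attribute does not exist on that brick, so the messages point at absent xattrs rather than I/O errors. A small stand-alone demonstration (the path and attribute name are made up):

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/xattr.h>

int
main (void)
{
        char    buf[256];
        ssize_t ret;

        /* Query an attribute that does not exist (hypothetical name). */
        ret = getxattr ("/etc/hostname", "user.does-not-exist",
                        buf, sizeof (buf));
        if (ret == -1)
                /* On most Linux filesystems this prints
                 * "getxattr failed: No data available" (ENODATA);
                 * filesystems without xattr support return ENOTSUP instead. */
                printf ("getxattr failed: %s\n", strerror (errno));

        return 0;
}
```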
How reproducible:

Not so frequent.

Steps to Reproduce:
1. Create a distributed-replicate volume (3x2).
2. Enable & disable quota/geo-replication, with stat-prefetch on/off, to force some graph changes.
3. From a fuse client, run `dbench -s -F -S --stat-check 10`.
4. Kill a brick process & bring it back up.
5. From another fuse client, run `find . | xargs stat`.
6. From the first fuse client, run `dbench -s -F -S -x --one-byte-write-fix --stat-check 10`.
7. The client crashes.

Actual results:

The dbench test errors out & the client crashes.

Expected results:

The dbench test should complete.

Additional info:

The logs for the bricks & clients are attached.
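One further note on the 0xbaadf00d and 0xcafebabe values in the $1 dump: allocators and debug builds commonly stamp freed memory, or the headers and trailers of accounting regions, with recognizable magic constants precisely so that stale accesses stand out in a core. GlusterFS's memory-accounting code uses magic values of this kind, though whether that is what overwrote this inode is only a guess. A generic illustration of the poison-on-free technique (the constant's value is copied from the dump; the helper and names are hypothetical):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical poison constant; the value matches what appears in the
 * dumped inode, but this helper is illustrative, not GlusterFS code. */
#define FREE_POISON 0xbaadf00dU

static void
poisoning_free (void *ptr, size_t size)
{
        uint32_t pattern = FREE_POISON;
        size_t   off;

        /* Stamp the object with the marker before releasing it, so any
         * later use of a stale pointer is obvious in a core dump. */
        for (off = 0; off + sizeof (pattern) <= size; off += sizeof (pattern))
                memcpy ((char *) ptr + off, &pattern, sizeof (pattern));

        free (ptr);
}

int
main (void)
{
        struct obj { void *table; uint32_t ref; } *o = malloc (sizeof (*o));

        if (o)
                poisoning_free (o, sizeof (*o));
        /* A subsequent (buggy) dereference of o would now read
         * 0xbaadf00d-filled memory -- the signature seen in $1 above. */
        return 0;
}
```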