Bug 1710744
| Summary: | [FUSE] Endpoint is not connected after "Found anomalies" error | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Pavel Znamensky <kompastver> |
| Component: | io-cache | Assignee: | Raghavendra G <rgowdapp> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 5 | CC: | bugs, csaba, jahernan, nbalacha, ndevos, rgowdapp |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-11-05 12:44:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Pavel Znamensky
2019-05-16 08:32:57 UTC
I have the coredump file, but due to sensitive information I can send it directly to the developers only.

Adding protocol/client MAINTAINERS to the BZ by looking at the backtrace in the bug description. It might not be related to the client xlator, but they can bring it to the attention of other component owners if needed.

Have caught it again. Is there any workaround?

Two requests:

1. By upgrading to a glusterfs-6.x version you won't even hit the 'dht' (or distribute) code in this scenario. So the log should go away, and if the reason for that log is the cause of the crash, the crash should be avoided too.
2. Please send 'thread apply all bt full' from the coredump. That should help us see what caused the problem. (You can send the bt to me directly.)

Amar, thank you for the reply. We're not going to upgrade to glusterfs-6.x in the near future, but we'll keep it in mind. As for the thread backtraces, I've just sent them to you. Thanks.

```
#0  0x00007f4b203fa313 in ?? () from /lib64/libgcc_s.so.1
No symbol table info available.
#1  0x00007f4b53edd42c in __GI___dl_iterate_phdr (callback=0x7f4b203fa280, data=0x7f4b4579f210) at dl-iteratephdr.c:76
        __clframe = {__cancel_routine = <optimized out>, __cancel_arg = 0x0, __do_it = 1, __cancel_type = <optimized out>}
        nloaded = 43
        ns = <optimized out>
        caller = <optimized out>
        l = 0x7f4b40011b60
        info = {dlpi_addr = 139961295089664, dlpi_name = 0x7f4b40011b10 "/usr/lib64/glusterfs/5.5/xlator/performance/write-behind.so", dlpi_phdr = 0x7f4b4746b040, dlpi_phnum = 7, dlpi_adds = 43, dlpi_subs = 0, dlpi_tls_modid = 0, dlpi_tls_data = 0x0}
        ret = 0
#2  0x00007f4b203fabbf in _Unwind_Find_FDE () from /lib64/libgcc_s.so.1
No symbol table info available.
#3  0x00007f4b203f7d2c in ?? () from /lib64/libgcc_s.so.1
No symbol table info available.
#4  0x00007f4b203f8fb9 in _Unwind_Backtrace () from /lib64/libgcc_s.so.1
No symbol table info available.
#5  0x00007f4b53eb4f16 in __GI___backtrace (array=<optimized out>, size=200) at ../sysdeps/x86_64/backtrace.c:107
        arg = {array = 0x7f4b4579f5f0, cfa = 139961264899952, cnt = 6, size = 200}
        once = 6
#6  0x00007f4b559a4620 in _gf_msg_backtrace_nomem () from /lib64/libglusterfs.so.0
No symbol table info available.
#7  0x00007f4b559aebd4 in gf_print_trace () from /lib64/libglusterfs.so.0
No symbol table info available.
#8  <signal handler called>
No locals.
#9  uuid_unpack (in=0x8 <Address 0x8 out of bounds>, uu=uu@entry=0x7f4b457a0730) at libuuid/src/unpack.c:43
        ptr = 0x9 <Address 0x9 out of bounds>
        tmp = <error reading variable tmp (Cannot access memory at address 0x8)>
#10 0x00007f4b54efc606 in uuid_unparse_x (uu=<optimized out>, out=0x7f4b3c002c20 "dbd277a3-a37f-4b84-aba5-164364b7853f", fmt=0x7f4b54efcba0 "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x") at libuuid/src/unparse.c:55
        uuid = {time_low = 0, time_mid = 0, time_hi_and_version = 0, clock_seq = 0, node = "\000\000\000\000\000"}
#11 0x00007f4b559add4c in uuid_utoa () from /lib64/libglusterfs.so.0
No symbol table info available.
#12 0x00007f4b46e39e15 in ioc_open_cbk () from /usr/lib64/glusterfs/5.5/xlator/performance/io-cache.so
No symbol table info available.
#13 0x00007f4b476fca51 in dht_open_cbk () from /usr/lib64/glusterfs/5.5/xlator/cluster/distribute.so
No symbol table info available.
#14 0x00007f4b4796ecea in afr_open_cbk () from /usr/lib64/glusterfs/5.5/xlator/cluster/replicate.so
No symbol table info available.
#15 0x00007f4b47c461dd in client4_0_open_cbk () from /usr/lib64/glusterfs/5.5/xlator/protocol/client.so
No symbol table info available.
#16 0x00007f4b55771030 in rpc_clnt_handle_reply () from /lib64/libgfrpc.so.0
No symbol table info available.
#17 0x00007f4b55771403 in rpc_clnt_notify () from /lib64/libgfrpc.so.0
No symbol table info available.
#18 0x00007f4b5576d2f3 in rpc_transport_notify () from /lib64/libgfrpc.so.0
No symbol table info available.
#19 0x00007f4b48d04106 in socket_event_handler () from /usr/lib64/glusterfs/5.5/rpc-transport/socket.so
No symbol table info available.
#20 0x00007f4b55a08a89 in event_dispatch_epoll_worker () from /lib64/libglusterfs.so.0
No symbol table info available.
#21 0x00007f4b545d7dd5 in start_thread (arg=0x7f4b457a1700) at pthread_create.c:307
        __res = <optimized out>
        pd = 0x7f4b457a1700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139961264903936, -2887800249423137699, 0, 8392704, 0, 139961264903936, 2988967441253102685, 2988930292483585117}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
#22 0x00007f4b53e9eead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
No locals.

Thread 1 (Thread 0x7f4b4866d700 (LWP 10519)):
#0  __GI___pthread_mutex_lock (mutex=0x18) at ../nptl/pthread_mutex_lock.c:65
        type = <optimized out>
        id = <optimized out>
#1  0x00007f4b559cd1e7 in fd_unref () from /lib64/libglusterfs.so.0
No symbol table info available.
#2  0x00007f4b47beb038 in client_local_wipe () from /usr/lib64/glusterfs/5.5/xlator/protocol/client.so
No symbol table info available.
#3  0x00007f4b47c461ed in client4_0_open_cbk () from /usr/lib64/glusterfs/5.5/xlator/protocol/client.so
No symbol table info available.
#4  0x00007f4b55771030 in rpc_clnt_handle_reply () from /lib64/libgfrpc.so.0
No symbol table info available.
#5  0x00007f4b55771403 in rpc_clnt_notify () from /lib64/libgfrpc.so.0
No symbol table info available.
#6  0x00007f4b5576d2f3 in rpc_transport_notify () from /lib64/libgfrpc.so.0
No symbol table info available.
#7  0x00007f4b48d04106 in socket_event_handler () from /usr/lib64/glusterfs/5.5/rpc-transport/socket.so
No symbol table info available.
```

--------------------------------------------------

Looks like an extra fd_unref() (or inode_unref()) in the io-cache xlator?
Looks similar to the crash reported in https://bugzilla.redhat.com/show_bug.cgi?id=1697971#c4. Can you try turning off open-behind and see if you still see the crash?

Nithya, it's quite strange, but there were no errors like this since July. The same version, the same properties. I don't know what has been changed. open-behind is enabled. Let's close the issue. In case the error occurs again, I'll reopen it.

(In reply to Pavel Znamensky from comment #9)
> Nithya, it's quite strange, but there were no errors like this since July.
> The same version, the same properties. I don't know what has been changed.
> open-behind is enabled.
> Let's close the issue. In case the error occurs again, I'll reopen it.

Thanks Pavel. I'm going to close it with WorksForMe. Please reopen if you see it again.
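For reference, the open-behind translator suggested above can be toggled per volume with the standard `gluster volume set` syntax. This is a generic illustration, not a command taken from the report; `myvol` is a placeholder volume name:

```shell
# Disable open-behind on the volume while testing for the crash
gluster volume set myvol performance.open-behind off

# Re-enable it once testing is done
gluster volume set myvol performance.open-behind on

# Check the currently effective value
gluster volume get myvol performance.open-behind
```

The change takes effect on the server side; existing FUSE clients may need a remount to pick up the new graph.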