| Summary: | booster NFS re-exporting distribute volume doesn't respond | ||
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Harshavardhana <fharshav> |
| Component: | booster | Assignee: | Shehjar Tikoo <shehjart> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | low | Docs Contact: | |
| Priority: | low | ||
| Version: | mainline | CC: | amarts, cww, gluster-bugs |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
Some more debugging showed:
(gdb) p *(sp_cache_t *)0x7fc3480042f0
$2 = {table = 0x1837190, expected_offset = 0, lock = 1, miss = 0, hits = 0, ref = 1}
(gdb) fr 0
#0 sp_cache_remove_entry (cache=0x7fc3480042f0, name=<value optimized out>, remove_all=<value optimized out>)
at stat-prefetch.c:390
390 if (this->private == NULL)
(gdb) p *this
Cannot access memory at address 0x100000000
(gdb) l
385 this = THIS;
386
387 if (this == NULL)
388 goto out;
389
390 if (this->private == NULL)
391 goto out;
392
393 priv = this->private;
394
THIS seems to be wrong: gdb reports it as 0x100000000, which is not accessible memory.
This is with ERROR logging:

[2009-12-05 03:03:13] E [socket.c:760:socket_connect_finish] availtvn4-3: connection to 192.168.101.163:10002 failed (Connection refused)
[2009-12-05 03:03:13] E [socket.c:760:socket_connect_finish] availtvn4-4: connection to 192.168.101.163:10002 failed (Connection refused)
[2009-12-05 03:03:13] E [socket.c:760:socket_connect_finish] availtvn4-4: connection to 192.168.101.163:10002 failed (Connection refused)
[2009-12-05 03:05:26] E [socket.c:760:socket_connect_finish] availtvn1-1: connection to 192.168.101.160:10002 failed (Connection refused)
[2009-12-05 03:05:26] E [socket.c:760:socket_connect_finish] availtvn1-1: connection to 192.168.101.160:10002 failed (Connection refused)
[2009-12-05 03:05:26] E [socket.c:760:socket_connect_finish] availtvn1-2: connection to 192.168.101.160:10002 failed (Connection refused)
[2009-12-05 03:05:26] E [socket.c:760:socket_connect_finish] availtvn1-2: connection to 192.168.101.160:10002 failed (Connection refused)
[2009-12-05 03:05:26] E [socket.c:760:socket_connect_finish] availtvn1-3: connection to 192.168.101.160:10002 failed (Connection refused)
[2009-12-05 03:05:26] E [socket.c:760:socket_connect_finish] availtvn1-3: connection to 192.168.101.160:10002 failed (Connection refused)
[2009-12-05 05:02:37] E [booster.c:2807:chdir] booster: chdir failed: Structure needs cleaning

And this is the same configuration with "DEBUG" logging:

[2009-12-05 05:05:54] N [client-protocol.c:6252:client_setvolume_cbk] availtvn2-1: Connected to 192.168.101.161:10002, attached to remote volume 'brick1'.
[2009-12-05 05:05:54] N [client-protocol.c:6252:client_setvolume_cbk] availtvn2-1: Connected to 192.168.101.161:10002, attached to remote volume 'brick1'.
[2009-12-05 05:05:54] N [client-protocol.c:6252:client_setvolume_cbk] availtvn2-2: Connected to 192.168.101.161:10002, attached to remote volume 'brick2'.
[2009-12-05 05:05:54] N [client-protocol.c:6252:client_setvolume_cbk] availtvn2-2: Connected to 192.168.101.161:10002, attached to remote volume 'brick2'.
[2009-12-05 05:05:54] N [client-protocol.c:6252:client_setvolume_cbk] availtvn2-3: Connected to 192.168.101.161:10002, attached to remote volume 'brick3'.
[2009-12-05 05:05:54] N [client-protocol.c:6252:client_setvolume_cbk] availtvn2-3: Connected to 192.168.101.161:10002, attached to remote volume 'brick3'.
[2009-12-05 05:05:54] D [dht-diskusage.c:71:dht_du_info_cbk] distribute: on subvolume 'availtvn2-1': avail_percent is: 99.00 and avail_space is: 13630190026752
[2009-12-05 05:05:54] D [dht-diskusage.c:71:dht_du_info_cbk] distribute: on subvolume 'availtvn2-1': avail_percent is: 99.00 and avail_space is: 13630190026752
[2009-12-05 05:05:54] D [dht-diskusage.c:71:dht_du_info_cbk] distribute: on subvolume 'availtvn2-2': avail_percent is: 99.00 and avail_space is: 14040060981248
[2009-12-05 05:05:54] D [dht-diskusage.c:71:dht_du_info_cbk] distribute: on subvolume 'availtvn2-3': avail_percen

In the first case, with ERROR logging, it did not segfault, but in the second case, with DEBUG logging, it did. Without stat-prefetch, NFS worked fine. It seems that LD_PRELOAD is interfering with some pthreads APIs, and THIS (which is based on pthreads) is getting corrupted.

PATCH: http://patches.gluster.com/patch/2582 in master (THIS: set THIS pointers before forget/release/releasedir callbacks)
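The patch title describes setting THIS before the forget/release/releasedir callbacks run. The following is only a rough sketch of that idea, assuming THIS is a per-thread pointer kept in pthread thread-specific data. The simplified `xlator_t` and the helpers `this_set`, `this_get`, `release_cbk` and `call_release` are hypothetical names made up for illustration; this is not the GlusterFS code or the actual patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a translator object. */
typedef struct xlator {
    const char *name;
    void *priv;          /* translator-private data */
} xlator_t;

static pthread_key_t this_key;
static pthread_once_t this_once = PTHREAD_ONCE_INIT;

/* Create the thread-specific key exactly once per process. */
static void this_key_init(void)
{
    int ret = pthread_key_create(&this_key, NULL);
    assert(ret == 0);
    (void)ret;
}

/* Record the current translator for the calling thread. */
static void this_set(xlator_t *xl)
{
    pthread_once(&this_once, this_key_init);
    pthread_setspecific(this_key, xl);
}

/* Return the calling thread's current translator, or NULL if unset. */
static xlator_t *this_get(void)
{
    pthread_once(&this_once, this_key_init);
    return (xlator_t *)pthread_getspecific(this_key);
}

/* A release-style callback that depends on the per-thread pointer. */
static int release_cbk(void)
{
    xlator_t *cur = this_get();

    if (cur == NULL || cur->priv == NULL)
        return -1;       /* nothing valid to clean up */
    return 0;
}

/* Set the per-thread pointer before the callback, then restore it. */
static int call_release(xlator_t *xl)
{
    xlator_t *saved = this_get();
    int ret;

    this_set(xl);
    ret = release_cbk();
    this_set(saved);
    return ret;
}
```

In this sketch, calling `release_cbk()` directly on a thread that never set the pointer takes the NULL path, while going through `call_release(&xl)` guarantees the callback sees the intended translator and leaves the thread's previous value in place afterwards.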
Backtrace from the gdb attached to unfsd:

#0 sp_cache_remove_entry (cache=0x7fc3480042f0, name=<value optimized out>, remove_all=<value optimized out>) at stat-prefetch.c:390
#1 0x00007fc34e5fc610 in sp_cache_free (cache=0x0) at stat-prefetch.c:462
#2 0x00007fc34e5fc662 in sp_fd_ctx_free (fd_ctx=0x7fc3480055d0) at stat-prefetch.c:530
#3 0x00007fc34e5fc6c8 in sp_release (this=0x182cd80, fd=<value optimized out>) at stat-prefetch.c:3842
#4 0x00000031e102c817 in fd_destroy (fd=<value optimized out>) at fd.c:406
#5 fd_unref (fd=<value optimized out>) at fd.c:448
#6 0x00007fc34fcea10a in glusterfs_closedir (dirfd=0x1837140) at libglusterfsclient.c:5680
#7 0x00007fc34ff14285 in closedir (dh=<value optimized out>) at booster.c:1864
#8 0x000000000040c7a4 in inet_addr ()
#9 0x0000000000406a33 in inet_addr ()
#10 0x00000000004092ba in inet_addr ()
#11 0x0000000000403b69 in inet_addr ()
#12 0x0000003b9290a8a9 in svc_getreq_common_internal () from /lib64/libc.so.6
#13 0x0000003b9290a231 in svc_getreq_poll_internal () from /lib64/libc.so.6
#14 0x000000000040479b in inet_addr ()
#15 0x0000003b9281ea2d in __libc_start_main () from /lib64/libc.so.6
#16 0x0000000000402b79 in inet_addr ()