Bug 1774712 - Brick process crashes after few days
Summary: Brick process crashes after few days
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-11-20 18:55 UTC by Sud
Modified: 2021-03-29 15:18 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-12-26 05:52:01 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:
sudsingh: needinfo+
sudsingh: needinfo+


Description Sud 2019-11-20 18:55:25 UTC
Description of problem:

The brick process crashed after a few days.

Version-Release number of selected component (if applicable): 6.5
Brick Logs:
[2019-11-15 14:25:01.769891] I [glusterfsd-mgmt.c:2019:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2019-11-15 17:33:39.466467] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-test-server: disconnecting connection from CTX_ID:22077ebe-81d2-4afd-8fb2-84b350864033-GRAPH_ID:0-PID:632477-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 17:33:39.466682] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-test-server: Shutting down connection CTX_ID:22077ebe-81d2-4afd-8fb2-84b350864033-GRAPH_ID:0-PID:632477-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 17:33:49.173936] I [addr.c:54:compare_addr_and_update] 0-/var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick: allowed = "*", received addr = "10.180.72.74"
[2019-11-15 17:33:49.173993] I [MSGID: 115029] [server-handshake.c:550:server_setvolume] 0-test-server: accepted client from CTX_ID:8972c9ce-edc9-4e79-94bd-f75919cabe2b-GRAPH_ID:0-PID:43417-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0 (version: 6.5) with subvol /var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick
[2019-11-15 17:33:49.228481] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-test-server: disconnecting connection from CTX_ID:8972c9ce-edc9-4e79-94bd-f75919cabe2b-GRAPH_ID:0-PID:43417-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 17:33:49.228682] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-test-server: Shutting down connection CTX_ID:8972c9ce-edc9-4e79-94bd-f75919cabe2b-GRAPH_ID:0-PID:43417-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 17:33:49.321262] I [addr.c:54:compare_addr_and_update] 0-/var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick: allowed = "*", received addr = "10.180.72.74"
[2019-11-15 17:33:49.321303] I [MSGID: 115029] [server-handshake.c:550:server_setvolume] 0-test-server: accepted client from CTX_ID:8032f65a-c523-4210-bce8-c9da8137c88e-GRAPH_ID:0-PID:43529-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0 (version: 6.5) with subvol /var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick
[2019-11-15 17:45:56.960791] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-test-server: disconnecting connection from CTX_ID:8032f65a-c523-4210-bce8-c9da8137c88e-GRAPH_ID:0-PID:43529-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 17:45:56.960995] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-test-server: Shutting down connection CTX_ID:8032f65a-c523-4210-bce8-c9da8137c88e-GRAPH_ID:0-PID:43529-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 19:55:02.663281] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2019-11-15 19:55:02.665054] I [glusterfsd-mgmt.c:2019:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2019-11-15 19:55:02.894968] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2019-11-15 19:55:02.921734] I [glusterfsd-mgmt.c:2019:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2019-11-15 19:55:03.154098] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2019-11-15 19:55:03.179969] I [glusterfsd-mgmt.c:2019:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing
[2019-11-15 19:55:03.330235] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-test-server: disconnecting connection from CTX_ID:c9e49fb9-ca04-46cb-a722-191e2ea1f3d9-GRAPH_ID:0-PID:646084-HOST:gfs1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 19:55:03.330360] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-test-server: Shutting down connection CTX_ID:c9e49fb9-ca04-46cb-a722-191e2ea1f3d9-GRAPH_ID:0-PID:646084-HOST:gfs1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 19:55:05.354991] I [addr.c:54:compare_addr_and_update] 0-/var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick: allowed = "*", received addr = "10.180.72.141"
[2019-11-15 19:55:05.355022] I [login.c:110:gf_auth] 0-auth/login: allowed user names: b79a4434-e9b8-468b-a5e8-b16f13e51525
[2019-11-15 19:55:05.355047] I [MSGID: 115029] [server-handshake.c:550:server_setvolume] 0-test-server: accepted client from CTX_ID:35a321ce-ad83-44a2-a796-6deead4f95be-GRAPH_ID:0-PID:684492-HOST:gfs1-PC_NAME:test-client-0-RECON_NO:-0 (version: 6.5) with subvol /var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick
[2019-11-15 19:55:05.398300] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-test-server: disconnecting connection from CTX_ID:8494fe30-7113-4760-a559-649e9f9908a0-GRAPH_ID:0-PID:53600-HOST:gfs2-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 19:55:05.398439] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-test-server: Shutting down connection CTX_ID:8494fe30-7113-4760-a559-649e9f9908a0-GRAPH_ID:0-PID:53600-HOST:gfs2-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 19:55:05.399677] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-test-server: disconnecting connection from CTX_ID:0a66757e-6d29-486f-b5ad-2c082cddf990-GRAPH_ID:0-PID:788227-HOST:gfs0-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 19:55:05.399764] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-test-server: Shutting down connection CTX_ID:0a66757e-6d29-486f-b5ad-2c082cddf990-GRAPH_ID:0-PID:788227-HOST:gfs0-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 19:55:07.421749] I [addr.c:54:compare_addr_and_update] 0-/var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick: allowed = "*", received addr = "10.180.72.181"
[2019-11-15 19:55:07.421786] I [login.c:110:gf_auth] 0-auth/login: allowed user names: b79a4434-e9b8-468b-a5e8-b16f13e51525
[2019-11-15 19:55:07.421824] I [MSGID: 115029] [server-handshake.c:550:server_setvolume] 0-test-server: accepted client from CTX_ID:135504a8-877f-43a0-83d8-e9e79c71fa07-GRAPH_ID:0-PID:843611-HOST:gfs0-PC_NAME:test-client-0-RECON_NO:-0 (version: 6.5) with subvol /var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick
[2019-11-15 19:55:07.423987] I [addr.c:54:compare_addr_and_update] 0-/var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick: allowed = "*", received addr = "10.180.72.121"
[2019-11-15 19:55:07.424014] I [login.c:110:gf_auth] 0-auth/login: allowed user names: b79a4434-e9b8-468b-a5e8-b16f13e51525
[2019-11-15 19:55:07.424030] I [MSGID: 115029] [server-handshake.c:550:server_setvolume] 0-test-server: accepted client from CTX_ID:f0c092a8-117b-4169-b159-17f462340638-GRAPH_ID:0-PID:106832-HOST:gfs2-PC_NAME:test-client-0-RECON_NO:-0 (version: 6.5) with subvol /var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick
[2019-11-15 20:01:17.797092] I [addr.c:54:compare_addr_and_update] 0-/var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick: allowed = "*", received addr = "10.180.72.74"
[2019-11-15 20:01:17.797155] I [MSGID: 115029] [server-handshake.c:550:server_setvolume] 0-test-server: accepted client from CTX_ID:063f7a90-4dee-47ca-9bee-9eccb810b83c-GRAPH_ID:0-PID:64959-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0 (version: 6.5) with subvol /var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick
[2019-11-15 20:01:17.828340] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-test-server: disconnecting connection from CTX_ID:063f7a90-4dee-47ca-9bee-9eccb810b83c-GRAPH_ID:0-PID:64959-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 20:01:17.828547] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-test-server: Shutting down connection CTX_ID:063f7a90-4dee-47ca-9bee-9eccb810b83c-GRAPH_ID:0-PID:64959-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 20:01:17.926493] I [addr.c:54:compare_addr_and_update] 0-/var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick: allowed = "*", received addr = "10.180.72.74"
[2019-11-15 20:01:17.926581] I [MSGID: 115029] [server-handshake.c:550:server_setvolume] 0-test-server: accepted client from CTX_ID:6112e3eb-b22f-4fc0-8fb5-28781c5ea22b-GRAPH_ID:0-PID:65092-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0 (version: 6.5) with subvol /var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick
[2019-11-15 20:01:48.269565] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-test-server: disconnecting connection from CTX_ID:6112e3eb-b22f-4fc0-8fb5-28781c5ea22b-GRAPH_ID:0-PID:65092-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 20:01:48.269782] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-test-server: Shutting down connection CTX_ID:6112e3eb-b22f-4fc0-8fb5-28781c5ea22b-GRAPH_ID:0-PID:65092-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 20:02:48.213821] I [addr.c:54:compare_addr_and_update] 0-/var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick: allowed = "*", received addr = "10.180.72.74"
[2019-11-15 20:02:48.213895] I [MSGID: 115029] [server-handshake.c:550:server_setvolume] 0-test-server: accepted client from CTX_ID:5887ed16-8f13-4f2a-ad8e-6fe971c6f51e-GRAPH_ID:0-PID:77965-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0 (version: 6.5) with subvol /var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick
[2019-11-15 20:02:48.256466] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-test-server: disconnecting connection from CTX_ID:5887ed16-8f13-4f2a-ad8e-6fe971c6f51e-GRAPH_ID:0-PID:77965-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 20:02:48.256667] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-test-server: Shutting down connection CTX_ID:5887ed16-8f13-4f2a-ad8e-6fe971c6f51e-GRAPH_ID:0-PID:77965-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0
[2019-11-15 20:02:48.351022] I [addr.c:54:compare_addr_and_update] 0-/var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick: allowed = "*", received addr = "10.180.72.74"
[2019-11-15 20:02:48.351069] I [MSGID: 115029] [server-handshake.c:550:server_setvolume] 0-test-server: accepted client from CTX_ID:37cb0888-7e3a-4bcc-98f4-fb0d6df068d8-GRAPH_ID:0-PID:78096-HOST:storage-datap-1-PC_NAME:test-client-0-RECON_NO:-0 (version: 6.5) with subvol /var/lib/heketi/mounts/vg_1281b98d059c2dbde3b0379fb0b284a6/brick_5419fbc483676eb04e13efaed26f55fb/brick
[2019-11-18 23:35:15.014118] E [MSGID: 101064] [event-epoll.c:618:event_dispatch_epoll_handler] 0-epoll: generation mismatch on idx=9, gen=10, slot->gen=11, slot->fd=16
[2019-11-18 23:35:15.116207] E [socket.c:2252:__socket_read_frag] 0-rpc: wrong MSG-TYPE (9) received from 10.176.0.139:58950
[2019-11-18 23:35:15.255633] E [socket.c:1303:socket_event_poll_err] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x7f62c) [0x7f791f05e62c] -->/usr/lib/x86_64-linux-gnu/glusterfs/6.5/rpc-transport/socket.so(+0x9bbf) [0x7f7919a7abbf] -->/usr/lib/x86_64-linux-gnu/glusterfs/6.5/rpc-transport/socket.so(+0x79cb) [0x7f7919a789cb] ) 0-socket: invalid argument: this->private [Invalid argument]
[2019-11-18 23:35:24.203114] E [MSGID: 101064] [event-epoll.c:618:event_dispatch_epoll_handler] 0-epoll: generation mismatch on idx=9, gen=19, slot->gen=20, slot->fd=16
[2019-11-18 23:35:24.217318] E [MSGID: 101064] [event-epoll.c:618:event_dispatch_epoll_handler] 0-epoll: generation mismatch on idx=9, gen=22, slot->gen=23, slot->fd=16
[2019-11-18 23:35:28.497697] E [socket.c:2252:__socket_read_frag] 0-rpc: wrong MSG-TYPE (-66911361) received from 10.176.0.139:58594
[2019-11-18 23:35:36.326252] E [socket.c:2252:__socket_read_frag] 0-rpc: wrong MSG-TYPE (9109504) received from 10.176.0.139:51188
[2019-11-18 23:38:17.784843] C [rpcsvc.c:1029:rpcsvc_notify] 0-rpcsvc: got MAP_XID event, which should have not come
pending frames:
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2019-11-18 23:40:40
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 6.5
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x224ea)[0x7f791f0014ea]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x2e7)[0x7f791f00bd17]
/lib/x86_64-linux-gnu/libc.so.6(+0x354b0)[0x7f791e3cf4b0]
/usr/lib/x86_64-linux-gnu/glusterfs/6.5/rpc-transport/socket.so(+0x9c04)[0x7f7919a7ac04]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x7f62c)[0x7f791f05e62c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f791e76b6ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f791e4a141d]
---------
.
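The unnamed frames in the crash dump above (for example the socket.so(+0x9c04) frame just below the signal handler) can only be mapped to source lines with debug symbols. As a rough sketch, assuming the frame format shown in this report and that matching debug symbols are installed, the frames could be extracted and turned into addr2line invocations like this. The helper names parse_frames and addr2line_commands are hypothetical and not part of GlusterFS.

```python
import re

# Matches frames in the format printed in the crash dump of this report:
#   /path/to/lib.so(symbol+0xOFF)[0xADDR]   or   /path/to/lib.so(+0xOFF)[0xADDR]
FRAME_RE = re.compile(
    r"^(?P<module>/\S+?)\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frames(text):
    """Return one dict per backtrace frame found in the given log text."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append({
                "module": match.group("module"),
                # An empty symbol means the frame was not resolved to a name.
                "symbol": match.group("symbol") or None,
                "offset": int(match.group("offset"), 16),
                "address": int(match.group("address"), 16),
            })
    return frames


def addr2line_commands(frames):
    """Build addr2line command lines for frames that lack a symbol name.

    Assumption: the +0x... value is an offset into the shared object, which
    addr2line can usually resolve when debug symbols for it are available.
    """
    return [
        f"addr2line -f -C -e {frame['module']} {frame['offset']:#x}"
        for frame in frames
        if frame["symbol"] is None
    ]


if __name__ == "__main__":
    # Two frames copied from the crash dump in this report.
    sample = (
        "/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x2e7)[0x7f791f00bd17]\n"
        "/usr/lib/x86_64-linux-gnu/glusterfs/6.5/rpc-transport/socket.so(+0x9c04)[0x7f7919a7ac04]\n"
    )
    for command in addr2line_commands(parse_frames(sample)):
        print(command)
```

This does not replace a full gdb backtrace from the core dump; it only helps narrow down which function in socket.so the faulting frame belongs to.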

Comment 1 Sanju 2019-11-22 10:03:25 UTC
Please share the output of "bt" and "t a a bt" (gdb shorthand for "thread apply all bt") from the core dump. If possible, please also share the core dump itself with us.

Thanks,
Sanju
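For anyone collecting the requested output, one possible way is to run gdb non-interactively against the core file. The sketch below only assembles the command line. The binary path /usr/sbin/glusterfsd and the core file path are assumptions that will differ per system, and the helper name gdb_backtrace_command is hypothetical.

```python
import shlex


def gdb_backtrace_command(binary, core):
    """Build a gdb command that prints "bt" and "thread apply all bt" and exits.

    binary: path to the crashed executable (assumed here to be glusterfsd).
    core:   path to the core dump written when the brick process crashed.
    """
    return [
        "gdb", "--batch",
        "-ex", "bt",                   # backtrace of the crashing thread
        "-ex", "thread apply all bt",  # backtrace of every thread ("t a a bt")
        binary, core,
    ]


if __name__ == "__main__":
    # Both paths are placeholders; substitute the real locations.
    argv = gdb_backtrace_command("/usr/sbin/glusterfsd", "/path/to/core")
    print(shlex.join(argv))
```

Installing the debug symbol packages for glusterfs before running gdb should make the resulting backtrace considerably more useful.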

Comment 2 Sanju 2019-12-26 05:52:01 UTC
As insufficient data has been provided, closing this bug as INSUFFICIENT_DATA. Please feel free to re-open the bug if you happen to hit this issue again.

