Description of problem: ======================= Had glusterfs-3.8.4-52.4 build, 3node cluster setup, 2*2 volume. Ran a command for enabling bitrot, sleeping for a while and disabling bitrot for the same volume. Bt is given below. Sosreports and the loop command logfile will be copied to http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/<bugnumber> Command that was run: for i in {1..10000}; do echo "===== $i =====" >> bitrot_enable_disable.log; gluster v bitrot vol1 enable |& tee -a bitrot_enable_disable.log; sleep 5; gluster v bitrot vol1 disable |& tee -a bitrot_enable_disable.log ; done Bt: (gdb) bt #0 0x00007f602b5c50ad in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1 #1 0x00007f602bb5f89c in __glusterd_peer_rpc_notify (rpc=rpc@entry=0x7f6018077420, mydata=mydata@entry=0x7f6018001b60, event=event@entry=RPC_CLNT_DISCONNECT, data=data@entry=0x0) at glusterd-handler.c:6300 #2 0x00007f602bb55c4a in glusterd_big_locked_notify (rpc=0x7f6018077420, mydata=0x7f6018001b60, event=RPC_CLNT_DISCONNECT, data=0x0, notify_fn=0x7f602bb5f850 <__glusterd_peer_rpc_notify>) at glusterd-handler.c:70 #3 0x00007f6036e779eb in rpc_clnt_handle_disconnect (conn=0x7f6018077450, clnt=0x7f6018077420) at rpc-clnt.c:892 #4 rpc_clnt_notify (trans=<optimized out>, mydata=0x7f6018077450, event=<optimized out>, data=0x7f601806ffb0) at rpc-clnt.c:955 #5 0x00007f6036e739e3 in rpc_transport_notify (this=this@entry=0x7f601806ffb0, event=event@entry=RPC_TRANSPORT_DISCONNECT, data=data@entry=0x7f601806ffb0) at rpc-transport.c:538 #6 0x00007f6028fc1a4f in socket_event_poll_err (idx=<optimized out>, gen=<optimized out>, this=0x7f601806ffb0) at socket.c:1197 #7 socket_event_handler (fd=23, idx=<optimized out>, gen=<optimized out>, data=0x7f601806ffb0, poll_in=<optimized out>, poll_out=0, poll_err=0) at socket.c:2467 #8 0x00007f6037109846 in event_dispatch_epoll_handler (event=0x7f6026e08e80, event_pool=0x560b3087a860) at event-epoll.c:572 #9 event_dispatch_epoll_worker (data=0x560b3090cc50) at event-epoll.c:648 #10 0x00007f6035f0de25 in start_thread () from /lib64/libpthread.so.0 #11 0x00007f60357da34d in clone () from /lib64/libc.so.6 (gdb) t a a bt Thread 9 (Thread 0x7f6037589780 (LWP 1608)): #0 0x00007f6035f0ef57 in pthread_join () from /lib64/libpthread.so.0 #1 0x00007f6037109be0 in event_dispatch_epoll (event_pool=0x560b3087a860) at event-epoll.c:732 #2 0x0000560b3056bef3 in main (argc=5, argv=<optimized out>) at glusterfsd.c:2489 Thread 8 (Thread 0x7f6026608700 (LWP 15431)): #0 0x00007f60357d47f9 in syscall () from /lib64/libc.so.6 #1 0x00007f602b5c4c14 in call_rcu_thread () from /lib64/liburcu-bp.so.1 #2 0x00007f6035f0de25 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f60357da34d in clone () from /lib64/libc.so.6 Thread 7 (Thread 0x7f602ceb5700 (LWP 1612)): #0 0x00007f6035f11cf2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f60370e7a28 in syncenv_task (proc=proc@entry=0x560b30882940) at syncop.c:603 #2 0x00007f60370e8870 in syncenv_processor (thdata=0x560b30882940) at syncop.c:695 #3 0x00007f6035f0de25 in start_thread () from /lib64/libpthread.so.0 #4 0x00007f60357da34d in clone () from /lib64/libc.so.6 Thread 6 (Thread 0x7f602d6b6700 (LWP 1611)): #0 0x00007f60357a11ad in nanosleep () from /lib64/libc.so.6 #1 0x00007f60357a1044 in sleep () from /lib64/libc.so.6 #2 0x00007f60370d544d in pool_sweeper (arg=<optimized out>) at mem-pool.c:464 #3 0x00007f6035f0de25 in start_thread () from /lib64/libpthread.so.0 #4 0x00007f60357da34d in clone () from /lib64/libc.so.6 Thread 5 (Thread 0x7f602760a700 (LWP 1858)): #0 0x00007f6035f11945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f602bc0b74b in hooks_worker (args=<optimized out>) at glusterd-hooks.c:530 #2 0x00007f6035f0de25 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f60357da34d in clone () from /lib64/libc.so.6 Thread 4 (Thread 0x7f602deb7700 (LWP 1610)): #0 0x00007f602b1284d8 in _fini () from /lib64/liblvm2app.so.2.2 #1 0x00007f6037389b58 in _dl_fini () from /lib64/ld-linux-x86-64.so.2 #2 0x00007f603571aa69 in __run_exit_handlers () from /lib64/libc.so.6 #3 0x00007f603571aab5 in exit () from /lib64/libc.so.6 #4 0x0000560b3056f014 in cleanup_and_exit (signum=15) at glusterfsd.c:1367 #5 0x0000560b3056f115 in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2087 #6 0x00007f6035f0de25 in start_thread () from /lib64/libpthread.so.0 #7 0x00007f60357da34d in clone () from /lib64/libc.so.6 Thread 3 (Thread 0x7f602c6b4700 (LWP 1613)): #0 0x00007f6035f11cf2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f60370e7a28 in syncenv_task (proc=proc@entry=0x560b30882d00) at syncop.c:603 #2 0x00007f60370e8870 in syncenv_processor (thdata=0x560b30882d00) at syncop.c:695 #3 0x00007f6035f0de25 in start_thread () from /lib64/libpthread.so.0 #4 0x00007f60357da34d in clone () from /lib64/libc.so.6 Thread 2 (Thread 0x7f602e6b8700 (LWP 1609)): #0 0x00007f6035f14e4d in nanosleep () from /lib64/libpthread.so.0 #1 0x00007f60370bbf7e in gf_timer_proc (data=0x560b308820f0) at timer.c:176 #2 0x00007f6035f0de25 in start_thread () from /lib64/libpthread.so.0 #3 0x00007f60357da34d in clone () from /lib64/libc.so.6 Thread 1 (Thread 0x7f6026e09700 (LWP 1859)): #0 0x00007f602b5c50ad in rcu_read_lock_bp () from /lib64/liburcu-bp.so.1 #1 0x00007f602bb5f89c in __glusterd_peer_rpc_notify (rpc=rpc@entry=0x7f6018077420, mydata=mydata@entry=0x7f6018001b60, event=event@entry=RPC_CLNT_DISCONNECT, data=data@entry=0x0) at glusterd-handler.c:6300 #2 0x00007f602bb55c4a in glusterd_big_locked_notify (rpc=0x7f6018077420, mydata=0x7f6018001b60, event=RPC_CLNT_DISCONNECT, data=0x0, notify_fn=0x7f602bb5f850 <__glusterd_peer_rpc_notify>) at glusterd-handler.c:70 #3 0x00007f6036e779eb in rpc_clnt_handle_disconnect (conn=0x7f6018077450, clnt=0x7f6018077420) at rpc-clnt.c:892 #4 rpc_clnt_notify (trans=<optimized out>, mydata=0x7f6018077450, event=<optimized out>, data=0x7f601806ffb0) ---Type <return> to continue, or q <return> to quit--- at rpc-clnt.c:955 #5 0x00007f6036e739e3 in rpc_transport_notify (this=this@entry=0x7f601806ffb0, event=event@entry=RPC_TRANSPORT_DISCONNECT, data=data@entry=0x7f601806ffb0) at rpc-transport.c:538 #6 0x00007f6028fc1a4f in socket_event_poll_err (idx=<optimized out>, gen=<optimized out>, this=0x7f601806ffb0) at socket.c:1197 #7 socket_event_handler (fd=23, idx=<optimized out>, gen=<optimized out>, data=0x7f601806ffb0, poll_in=<optimized out>, poll_out=0, poll_err=0) at socket.c:2467 #8 0x00007f6037109846 in event_dispatch_epoll_handler (event=0x7f6026e08e80, event_pool=0x560b3087a860) at event-epoll.c:572 #9 event_dispatch_epoll_worker (data=0x560b3090cc50) at event-epoll.c:648 #10 0x00007f6035f0de25 in start_thread () from /lib64/libpthread.so.0 #11 0x00007f60357da34d in clone () from /lib64/libc.so.6 (gdb) Version-Release number of selected component (if applicable): ============================================================ glusterfs-3.8.4-52.4.el7rhgs.x86_64 How reproducible: ================= Hit it once
[root@dhcp47-37 ~]# gluster v info Volume Name: vol1 Type: Distributed-Replicate Volume ID: 1fe4dfcf-5d36-4a0d-860d-2a1f5352f368 Status: Started Snapshot Count: 0 Number of Bricks: 2 x 2 = 4 Transport-type: tcp Bricks: Brick1: 10.70.47.37:/bricks/brick0/vol1_0 Brick2: 10.70.47.75:/bricks/brick0/vol1_1 Brick3: 10.70.47.37:/bricks/brick1/vol1_2 Brick4: 10.70.47.75:/bricks/brick1/vol1_3 Options Reconfigured: features.scrub-freq: hourly features.scrub: Inactive features.bitrot: off transport.address-family: inet nfs.disable: on cluster.brick-multiplex: enable [root@dhcp47-37 ~]# [root@dhcp47-37 ~]# gluster peer status Number of Peers: 2 Hostname: dhcp47-75.lab.eng.blr.redhat.com Uuid: 51b0fa52-9cbc-4052-a785-9ac890bc1417 State: Peer in Cluster (Connected) Hostname: dhcp47-188.lab.eng.blr.redhat.com Uuid: 9914c2ea-7ecf-463a-9119-c457cc6497bf State: Peer in Cluster (Connected) [root@dhcp47-37 ~]# [root@dhcp47-37 ~]# rpm -qa | grep gluster vdsm-gluster-4.17.33-1.2.el7rhgs.noarch libvirt-daemon-driver-storage-gluster-3.2.0-14.el7_4.5.x86_64 glusterfs-server-3.8.4-52.4.el7rhgs.x86_64 gluster-block-0.2.1-14.el7rhgs.x86_64 glusterfs-fuse-3.8.4-52.4.el7rhgs.x86_64 glusterfs-debuginfo-3.8.4-52.4.el7rhgs.x86_64 gluster-nagios-common-0.2.4-1.el7rhgs.noarch glusterfs-libs-3.8.4-52.4.el7rhgs.x86_64 glusterfs-api-3.8.4-52.4.el7rhgs.x86_64 glusterfs-events-3.8.4-52.4.el7rhgs.x86_64 glusterfs-3.8.4-52.4.el7rhgs.x86_64 glusterfs-cli-3.8.4-52.4.el7rhgs.x86_64 glusterfs-geo-replication-3.8.4-52.4.el7rhgs.x86_64 glusterfs-client-xlators-3.8.4-52.4.el7rhgs.x86_64 glusterfs-rdma-3.8.4-52.4.el7rhgs.x86_64 gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64 python-gluster-3.8.4-52.4.el7rhgs.noarch [root@dhcp47-37 ~]# [root@dhcp47-37 ~]# gluster v status Status of volume: vol1 Gluster process TCP Port RDMA Port Online Pid ------------------------------------------------------------------------------ Brick 10.70.47.37:/bricks/brick0/vol1_0 49152 0 Y 14521 Brick 10.70.47.75:/bricks/brick0/vol1_1 49152 0 Y 19376 Brick 10.70.47.37:/bricks/brick1/vol1_2 49152 0 Y 14521 Brick 10.70.47.75:/bricks/brick1/vol1_3 49152 0 Y 19376 Self-heal Daemon on localhost N/A N/A Y 14554 Self-heal Daemon on dhcp47-188.lab.eng.blr. redhat.com N/A N/A Y 29462 Self-heal Daemon on dhcp47-75.lab.eng.blr.r edhat.com N/A N/A Y 19409 Task Status of Volume vol1 ------------------------------------------------------------------------------ There are no active volume tasks [root@dhcp47-37 ~]# [qe@rhsqe-repo 1530936]$ [qe@rhsqe-repo 1530936]$ hostname rhsqe-repo.lab.eng.blr.redhat.com [qe@rhsqe-repo 1530936]$ [qe@rhsqe-repo 1530936]$ pwd /home/repo/sosreports/1530936 [qe@rhsqe-repo 1530936]$ [qe@rhsqe-repo 1530936]$ ll total 188232 -rwxr-xr-x. 1 qe qe 334372 Jan 4 14:23 bitrot_enable_disable.log -rwxr-xr-x. 1 qe qe 109375488 Jan 4 14:22 core.1608 -rwxr-xr-x. 1 qe qe 27451476 Jan 4 14:22 sosreport-dhcp47-188.lab.eng.blr.redhat.com-20180104014649.tar.xz -rwxr-xr-x. 1 qe qe 26319864 Jan 4 14:22 sosreport-dhcp47-37.lab.eng.blr.redhat.com-20180104014628.tar.xz -rwxr-xr-x. 1 qe qe 29259920 Jan 4 14:22 sosreport-dhcp47-75.lab.eng.blr.redhat.com-20180104014641.tar.xz [qe@rhsqe-repo 1530936]$