Description of problem:
One of the volume bricks is not up after restarting the pod.

sh-4.2# rpm -qa |grep glusterfs
glusterfs-fuse-3.8.4-39.el7rhgs.x86_64
glusterfs-server-3.8.4-39.el7rhgs.x86_64
glusterfs-libs-3.8.4-39.el7rhgs.x86_64
glusterfs-3.8.4-39.el7rhgs.x86_64
glusterfs-api-3.8.4-39.el7rhgs.x86_64
glusterfs-cli-3.8.4-39.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-39.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-39.el7rhgs.x86_64

sh-4.2# gluster v status
Status of volume: heketidbstorage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.35.5:/var/lib/heketi/mounts/v
g_e2c07dc7b49f9f42ca327a3073966364/brick_7a
c173229b1e69e8730be424956846e4/brick        49152     0          Y       524
Brick 192.168.35.2:/var/lib/heketi/mounts/v
g_11f9761be704eb359ffd71c4d76afdf1/brick_04
0ff4a25411b19d2b24d0340bf3e9f8/brick        49152     0          Y       403
Brick 192.168.35.6:/var/lib/heketi/mounts/v
g_1438320d4e7bde3c3d2f232671afd699/brick_1f
0d216ac0f28f1f5b0fbbc4adcdc69b/brick        49152     0          Y       410
Self-heal Daemon on localhost               N/A       N/A        Y       399
Self-heal Daemon on 192.168.35.5            N/A       N/A        Y       514
Self-heal Daemon on 192.168.35.6            N/A       N/A        Y       435
Self-heal Daemon on 192.168.35.4            N/A       N/A        Y       412
Self-heal Daemon on 192.168.35.2            N/A       N/A        Y       412

Task Status of Volume heketidbstorage
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_96d8648ebd2ff7dcb87be3a2587c6246
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.35.6:/var/lib/heketi/mounts/v
g_9ed2ce0c55973e60d0ffc25d0028157f/brick_90
862eaeb677759f02e0809786e8ec70/brick        49153     0          Y       416
Brick 192.168.35.3:/var/lib/heketi/mounts/v
g_803bb877441ad72c82b0ee6113a79511/brick_9a
7f3517851a71e13bc3ff1f3b96f4bc/brick        49152     0          Y       408
Brick 192.168.35.4:/var/lib/heketi/mounts/v
g_799ba6da52ca29cf5404bd96b53fa957/brick_3f
ed005da1f6746d2d299b1e971fcc39/brick        49152     0          Y       403
Brick 192.168.35.6:/var/lib/heketi/mounts/v
g_1438320d4e7bde3c3d2f232671afd699/brick_61
5a892b042f50617e4d228e0971c6ce/brick        49153     0          Y       416
Brick 192.168.35.4:/var/lib/heketi/mounts/v
g_3ae4bb7cd25740f3cd84a15dbb985f48/brick_bb
9eb06eb20f6d570d4d2f6378b14742/brick        49152     0          Y       403
Brick 192.168.35.5:/var/lib/heketi/mounts/v
g_96d6ff00fd2a38ead4c9ef9cd6db935f/brick_1a
6dcabc43a70402dc595ed89d76126f/brick        49153     0          Y       531
Self-heal Daemon on localhost               N/A       N/A        Y       399
Self-heal Daemon on 192.168.35.4            N/A       N/A        Y       412
Self-heal Daemon on 192.168.35.2            N/A       N/A        Y       412
Self-heal Daemon on 192.168.35.5            N/A       N/A        Y       514
Self-heal Daemon on 192.168.35.6            N/A       N/A        Y       435

Task Status of Volume vol_96d8648ebd2ff7dcb87be3a2587c6246
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_bd43b5b5f048c469ee16f80376d73ec3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.35.3:/var/lib/heketi/mounts/v
g_803bb877441ad72c82b0ee6113a79511/brick_80
9331b5ef0e23c3327cf4483258dd25/brick        N/A       N/A        N       N/A
Brick 192.168.35.4:/var/lib/heketi/mounts/v
g_799ba6da52ca29cf5404bd96b53fa957/brick_7d
5d8d5c3074f965d456c6d1fbfbc54d/brick        49153     0          Y       407
Brick 192.168.35.2:/var/lib/heketi/mounts/v
g_11f9761be704eb359ffd71c4d76afdf1/brick_54
dd82e9703966757b5c8f9f332c998b/brick        49152     0          Y       403
Self-heal Daemon on localhost               N/A       N/A        Y       399
Self-heal Daemon on 192.168.35.5            N/A       N/A        Y       514
Self-heal Daemon on 192.168.35.6            N/A       N/A        Y       435
Self-heal Daemon on 192.168.35.2            N/A       N/A        Y       412
Self-heal Daemon on 192.168.35.4            N/A       N/A        Y       412

Task Status of Volume vol_bd43b5b5f048c469ee16f80376d73ec3
------------------------------------------------------------------------------
There are no active volume tasks

sh-4.2# cat glusterd.log|grep -i disconnec
[2017-08-08 11:45:35.612285] I [MSGID: 106004] [glusterd-handler.c:5879:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.35.6> (<e74482f2-b77f-4920-817e-5181cb748cce>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-08-08 11:45:45.700948] E [socket.c:2360:socket_connect_finish] 0-management: connection to 192.168.35.6:24007 failed (Connection refused); disconnecting socket
[2017-08-08 15:21:15.512361] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-08-08 15:21:15.513083] I [MSGID: 106005] [glusterd-handler.c:5687:__glusterd_brick_rpc_notify] 0-management: Brick 192.168.35.3:/var/lib/heketi/mounts/vg_803bb877441ad72c82b0ee6113a79511/brick_809331b5ef0e23c3327cf4483258dd25/brick has disconnected from glusterd.
[2017-08-09 17:43:29.010548] I [MSGID: 106004] [glusterd-handler.c:5879:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.35.4> (<29382453-610e-40f6-b2b3-9ae4edcfde51>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-08-09 17:43:39.504984] E [socket.c:2360:socket_connect_finish] 0-management: connection to 192.168.35.4:24007 failed (Connection refused); disconnecting socket
[2017-08-09 17:45:59.307446] I [MSGID: 106004] [glusterd-handler.c:5879:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.35.2> (<8dbffc32-38b5-460c-a857-cd2303a35c35>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-08-09 17:46:09.793636] E [socket.c:2360:socket_connect_finish] 0-management: connection to 192.168.35.2:24007 failed (Connection refused); disconnecting socket
[2017-08-09 17:48:29.725173] I [MSGID: 106004] [glusterd-handler.c:5879:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.35.6> (<e74482f2-b77f-4920-817e-5181cb748cce>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-08-09 17:48:40.112820] E [socket.c:2360:socket_connect_finish] 0-management: connection to 192.168.35.6:24007 failed (Connection refused); disconnecting socket
[2017-08-09 17:53:41.528247] E [socket.c:2360:socket_connect_finish] 0-management: connection to 192.168.35.5:24007 failed (Connection refused); disconnecting socket
[2017-08-09 17:53:41.528302] I [MSGID: 106004] [glusterd-handler.c:5879:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.35.5> (<849ac512-5d53-47b3-bbe2-7afff39627e7>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-08-09 17:53:43.192552] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-08-09 17:53:43.193180] I [MSGID: 106005] [glusterd-handler.c:5687:__glusterd_brick_rpc_notify] 0-management: Brick 192.168.35.3:/var/lib/heketi/mounts/vg_803bb877441ad72c82b0ee6113a79511/brick_9a7f3517851a71e13bc3ff1f3b96f4bc/brick has disconnected from glusterd.
[2017-08-09 17:53:43.193974] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-08-09 17:53:43.194563] I [MSGID: 106005] [glusterd-handler.c:5687:__glusterd_brick_rpc_notify] 0-management: Brick 192.168.35.3:/var/lib/heketi/mounts/vg_803bb877441ad72c82b0ee6113a79511/brick_809331b5ef0e23c3327cf4483258dd25/brick has disconnected from glusterd.
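
The glusterd log above only shows the brick disconnecting; the reason the brick process failed to come back would be in the brick's own log, which was not captured here. As a pointer (assuming default log locations): each brick writes to /var/log/glusterfs/bricks/, with the log file named after the brick path with slashes replaced by dashes, so for the offline brick above:

sh-4.2# tail -50 /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_803bb877441ad72c82b0ee6113a79511-brick_809331b5ef0e23c3327cf4483258dd25-brick.log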
Version-Release number of selected component (if applicable):
glusterfs-3.8.4-39.el7rhgs (see rpm -qa output above)

How reproducible:

Steps to Reproduce:
1. Restart one of the gluster pods.
2. After the pod is back up, run gluster v status.

Actual results:
One brick of volume vol_bd43b5b5f048c469ee16f80376d73ec3 stays offline (Online "N") after the pod restart.

Expected results:
All bricks come back online after the pod restart.

Additional info:
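A possible interim workaround (a sketch based on standard Gluster CLI behavior, not verified against this bug): "gluster volume start <VOLNAME> force" starts any brick processes of the volume that are not running, without touching bricks that are already online. For the volume with the offline brick above, that would look like:

sh-4.2# gluster volume start vol_bd43b5b5f048c469ee16f80376d73ec3 force
sh-4.2# gluster volume status vol_bd43b5b5f048c469ee16f80376d73ec3

If the force start brings the brick back, self-heal should then catch it up with its replicas.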
Verified in build cns-deploy-5.0.0-23.el7rhgs.x86_64.

When a gluster pod is restarted, the bricks on the pod start and self-heal continues without any issues. Moving the bug to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:2881