Created attachment 1431320 [details]
glusterd log

Description of problem:
After restarting the 'glusterfsd' and 'glusterd' services, 80% of the time, the brick will not come back online. After running 'glusterd --debug' the brick service starts correctly. Log files attached.

I'm noticing these lines, but am unsure on the cause:

[2018-05-04 12:44:01.557104] I [MSGID: 106143] [glusterd-pmap.c:295:pmap_registry_bind] 0-pmap: adding brick /mnt/h1a/data on port 49152
[2018-05-04 12:44:01.558314] I [MSGID: 106144] [glusterd-pmap.c:396:pmap_registry_remove] 0-pmap: removing brick (null) on port 49152
[2018-05-04 12:44:01.558385] W [socket.c:593:__socket_rwv] 0-management: readv on /var/run/gluster/ca39474c70ff91cb2cde5cc5d5551022.socket failed (No data available)

OS: Fedora Server 27
Gluster version: 3.12.9
Two node setup, issue also occurs in a single node cluster.

Version-Release number of selected component (if applicable):
3.12.9

How reproducible:
Intermittent / 80%

Steps to Reproduce:
1. Create volume
2. restart glusterfsd and glusterd services
3. Repeat step 2 until issue occurs

Actual results:
Brick stays offline

Expected results:
Brick should be online

Additional info:
This occurs after reboot too
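
To check how often the pattern quoted in the description occurs across restarts, a small log scan may help. The following is only a sketch: it assumes the glusterd log lines have exactly the layout of the three lines quoted in the report, and the function name find_suspect_removals is made up for illustration. It reports each port that is bound to a brick and then removed with the brick name "(null)". It does not explain why the removal happens.

```python
import re

# Pattern built from the two pmap messages quoted in the report.
# Assumption: other glusterd versions may format these lines differently.
PMAP_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] I \[MSGID: \d+\] "
    r"\[glusterd-pmap\.c:\d+:pmap_registry_(?P<op>bind|remove)\] "
    r"0-pmap: (?:adding|removing) brick (?P<brick>\S+) on port (?P<port>\d+)"
)


def find_suspect_removals(lines):
    """Return (port, bound_brick, removal_timestamp) tuples.

    A tuple is emitted when a port that was previously bound to a brick
    is removed with the brick name "(null)", as seen in the report.
    """
    bound = {}      # port -> brick path from the most recent bind
    suspects = []
    for line in lines:
        match = PMAP_RE.search(line)
        if match is None:
            continue  # not a pmap bind/remove message
        port = int(match.group("port"))
        if match.group("op") == "bind":
            bound[port] = match.group("brick")
        elif match.group("brick") == "(null)" and port in bound:
            suspects.append((port, bound.pop(port), match.group("ts")))
    return suspects
```

The function accepts any iterable of lines, so an open log file can be passed to it directly. The path of the glusterd log depends on the installation and is not given in this report.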
We need the following information to start debugging this problem:
1. Output of 'gluster v get all all', to understand whether brick multiplexing is turned on.
2. The glusterd log and the brick log files (for the bricks that are shown as N/A) from the node where the brick is hosted.
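
For the first item, the relevant setting can be picked out of the captured output with a few lines of Python. This is a rough sketch under stated assumptions: it assumes the output lists one option per line as a name followed by its value, separated by whitespace, and that the option is named cluster.brick-multiplex. The function name brick_multiplex_enabled is hypothetical.

```python
def brick_multiplex_enabled(option_dump):
    """Return True or False for cluster.brick-multiplex, or None if absent.

    option_dump is the captured text of 'gluster v get all all'.
    Assumption: each option is printed as "<name> <value>" on one line.
    """
    for line in option_dump.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "cluster.brick-multiplex":
            # Treat the usual boolean spellings as "enabled".
            return parts[1].lower() in ("on", "enable", "true", "yes", "1")
    return None  # option not found in the captured text
```

A None result means the option line was not found, in which case the raw output should be attached to the bug instead.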
Since we haven't received the information requested in comment 1, I'm closing this bug, as we can't debug this issue with the current set of information. Please feel free to reopen it once you have a reproducer and can provide all the information requested.