Bug 1574298 - During glusterd initialization, brick start always logs an "EPOLLERR - disconnecting now" error
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: x86_64
OS: Linux
medium
urgent
Target Milestone: ---
Assignee: Vishal Pandey
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-05-03 03:15 UTC by George
Modified: 2019-11-07 05:29 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-09-20 09:42:26 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description George 2018-05-03 03:15:45 UTC
Description of problem:
During glusterd initialization, starting the bricks always produces an "EPOLLERR - disconnecting now" message. This error occasionally causes glusterfsd to be started twice for a brick, after which glustershd cannot work normally.

Version-Release number of selected component (if applicable):


How reproducible:

Set up GlusterFS with 4 volumes named "log", "export", "service", and "ccs", each replicated across 2 SNs.
Then start the 4 volumes, and stop glusterd and all related processes/services.

Steps to Reproduce:
1. Start the glusterd service, or run the command "glusterd -p /somewhere/glusterd.pid".
2. Check the glusterd log, "glusterd.log".

Actual results:
[2018-04-15 18:05:59.964454] I [glusterd-utils.c:5928:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/ccs/brick
[2018-04-15 18:06:00.117043] I [glusterd-utils.c:5928:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/export/brick
[2018-04-15 18:06:00.277399] I [glusterd-utils.c:5928:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/log/brick
[2018-04-15 18:06:00.379539] I [glusterd-utils.c:5928:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/mstate/brick
[2018-04-15 18:06:00.481778] I [glusterd-utils.c:5928:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/services/brick
[2018-04-15 18:06:00.651825] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2018-04-15 18:06:00.652182] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/export/brick has disconnected from glusterd.
[2018-04-15 18:06:00.652545] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2018-04-15 18:06:00.652850] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/log/brick has disconnected from glusterd.
[2018-04-15 18:06:00.656017] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2018-04-15 18:06:00.763176] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/mstate/brick has disconnected from glusterd.
[2018-04-15 18:06:00.763845] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/services/brick has disconnected from glusterd.
[2018-04-15 18:06:00.764239] I [glusterd-utils.c:5928:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/log/brick
[2018-04-15 18:06:00.866569] I [glusterd-utils.c:5928:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/mstate/brick
[2018-04-15 18:06:00.763558] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2018-04-15 18:06:00.980737] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/log/brick has disconnected from glusterd.
[2018-04-15 18:06:00.980434] I [socket.c:2478:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2018-04-15 18:06:01.088354] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick sn-0.local:/mnt/bricks/mstate/brick has disconnected from glusterd.
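
To make the pattern in the excerpt above easier to spot in a longer glusterd.log, here is a minimal Python sketch that counts, per brick path, how many "starting a fresh brick process" and "has disconnected from glusterd" messages appear. The function names and regular expressions are illustrative assumptions based only on the message text quoted in this report; they are not part of GlusterFS, and other versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns derived from the log lines quoted in this report (assumption:
# other GlusterFS versions may format these messages differently).
START_RE = re.compile(r"starting a fresh brick process for brick (\S+)")
DISCONNECT_RE = re.compile(r"Brick \S+?:(/\S+) has disconnected from glusterd")


def summarize_brick_events(log_lines):
    """Return (starts, disconnects): Counters keyed by brick path."""
    starts = Counter()
    disconnects = Counter()
    for line in log_lines:
        match = START_RE.search(line)
        if match:
            starts[match.group(1)] += 1
            continue
        match = DISCONNECT_RE.search(line)
        if match:
            disconnects[match.group(1)] += 1
    return starts, disconnects


def bricks_started_more_than_once(log_lines):
    """Return the sorted brick paths that logged more than one start."""
    starts, _ = summarize_brick_events(log_lines)
    return sorted(brick for brick, count in starts.items() if count > 1)


# Three lines taken from the excerpt in this report.
sample = [
    "[2018-04-15 18:06:00.277399] I [glusterd-utils.c:5928:glusterd_brick_start] "
    "0-management: starting a fresh brick process for brick /mnt/bricks/log/brick",
    "[2018-04-15 18:06:00.652850] I [MSGID: 106005] "
    "[glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: "
    "Brick sn-0.local:/mnt/bricks/log/brick has disconnected from glusterd.",
    "[2018-04-15 18:06:00.764239] I [glusterd-utils.c:5928:glusterd_brick_start] "
    "0-management: starting a fresh brick process for brick /mnt/bricks/log/brick",
]
print(bricks_started_more_than_once(sample))
```

To run it against a real log, the lines of glusterd.log can be passed in place of `sample`, for example via `open(path).read().splitlines()`. A brick listed here was only logged as started more than once; whether two glusterfsd processes are actually running still has to be checked separately.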

Expected results:
There should be no "EPOLLERR - disconnecting now" message, and bricks should not be disconnected from glusterd.

Additional info:

Comment 1 Atin Mukherjee 2018-07-02 02:53:45 UTC
Wouldn't https://review.gluster.org/#/c/20197/ fix this problem?

Comment 2 George 2018-07-10 01:14:41 UTC
The root cause of this issue should be different.
The issue is not fixed by the patch above.

There should be no "EPOLLERR - disconnecting now" message when gluster starts. It is a risk: it can cause the glusterfsd for a brick to be started twice, which finally prevents glustershd from working correctly.
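
One way to check for the double-start condition described in this comment is to look for more than one glusterfsd process serving the same brick. The sketch below is a hedged illustration, not a GlusterFS tool: it assumes input lines in "PID COMMAND" form (such as the output of `ps -eo pid,args`) and assumes each brick process carries a `--brick-name <path>` argument on its command line. The function name and the sample process lines are hypothetical. With brick multiplexing enabled, a single process serves several bricks, so this check would not apply as written.

```python
import shlex
from collections import defaultdict


def duplicate_brick_processes(ps_lines):
    """Map brick path -> list of PIDs, for bricks served by more than one glusterfsd.

    Assumes each line is "PID COMMAND ARGS..." and that glusterfsd command
    lines contain "--brick-name <path>" (an assumption; verify on your system).
    """
    pids_by_brick = defaultdict(list)
    for line in ps_lines:
        parts = shlex.split(line)
        # Skip headers, blank lines, and anything that is not glusterfsd.
        if len(parts) < 2 or not parts[1].endswith("glusterfsd"):
            continue
        pid = int(parts[0])
        args = parts[2:]
        for index, arg in enumerate(args):
            if arg == "--brick-name" and index + 1 < len(args):
                pids_by_brick[args[index + 1]].append(pid)
                break
    return {brick: pids for brick, pids in pids_by_brick.items() if len(pids) > 1}


# Hypothetical process listing for illustration only.
example_ps = [
    "  PID COMMAND",
    "  101 /usr/sbin/glusterfsd -s sn-0.local --brick-name /mnt/bricks/log/brick",
    "  102 /usr/sbin/glusterfsd -s sn-0.local --brick-name /mnt/bricks/log/brick",
    "  103 /usr/sbin/glusterfsd -s sn-0.local --brick-name /mnt/bricks/ccs/brick",
    "  104 /usr/sbin/glusterd -p /somewhere/glusterd.pid",
]
print(duplicate_brick_processes(example_ps))
```

An empty result means no brick path appeared on more than one glusterfsd command line in the listing that was supplied.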

Comment 3 Atin Mukherjee 2018-10-05 02:33:39 UTC
Mohit - I think you had a root cause for this problem, which we saw in-house while analyzing an issue on a setup with brick multiplexing configured. Could you update this bug with the root cause once you get some time?

Comment 4 Shyamsundar 2018-10-23 14:54:17 UTC
Release 3.12 has been EOLd and this bug was still found to be in the NEW state, hence moving the version to mainline, to triage the same and take appropriate actions.

Comment 5 Amar Tumballi 2019-06-17 11:04:35 UTC
Any update?

Comment 7 Vishal Pandey 2019-07-22 07:51:33 UTC
On testing with the latest upstream version, I no longer see "0-transport: EPOLLERR - disconnecting now" logs.

Thanks,
Vishal Pandey

Comment 8 Vishal Pandey 2019-07-22 07:53:28 UTC
George, can you try to reproduce this on the latest upstream version?

Thanks,
Vishal Pandey

Comment 9 Vishal Pandey 2019-08-21 09:34:50 UTC
Can we make a decision on this issue?

Comment 10 Vishal Pandey 2019-08-27 07:51:29 UTC
George, can you try to reproduce this on the latest upstream version?

Comment 11 Vishal Pandey 2019-09-10 13:20:39 UTC
@George, can you address the needinfo? Otherwise I will have to close the bug, considering that it's no longer reproducible.

Comment 12 Vishal Pandey 2019-09-18 08:06:13 UTC
@George, can you address the needinfo? Otherwise I will have to close the bug, considering that it's no longer reproducible.

Comment 13 Vishal Pandey 2019-09-20 09:42:26 UTC
As it's no longer reproducible, I'm closing the bug. Please feel free to reopen the bug if the issue persists.

Thanks,
Vishal

