Bug 1427958

Summary: [GSS] Error 0-socket.management: socket_poller XX.XX.XX.XX:YYY failed (Input/output error) during any volume operation
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Abhishek Kumar <abhishku>
Component: core
Assignee: Mohit Agrawal <moagrawa>
Status: CLOSED ERRATA
QA Contact: Bala Konda Reddy M <bmekala>
Severity: medium
Docs Contact:
Priority: unspecified
Version: rhgs-3.2
CC: abhishku, amukherj, asrivast, bkunal, moagrawa, rcyriac, rhinduja, rhs-bugs, rnalakka, storage-qa-internal, vbellur
Target Milestone: ---
Target Release: RHGS 3.3.0
Hardware: x86_64
OS: Linux
Whiteboard: ssl
Fixed In Version: glusterfs-3.8.4-26
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1450559 (view as bug list)
Environment:
Last Closed: 2017-09-21 04:33:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1450559
Bug Blocks: 1417145

Description Abhishek Kumar 2017-03-01 15:19:41 UTC
Description of problem:

Error 0-socket.management: socket_poller XX.XX.XX.XX:YYY failed (Input/output error) during any volume operation.

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-14.el7rhgs.x86_64.rpm

How reproducible:
Every time in customer environment

Steps to Reproduce:
1. Create 100 volumes with SSL enabled on each volume.
2. Afterwards, run any volume operation, such as starting or stopping a volume.
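
The reproduction steps could be scripted roughly as follows. This is a minimal Python sketch, not the reporter's actual procedure: the host name, brick path, and volume name prefix are hypothetical, and it assumes that "ssl enabled" means setting the client.ssl and server.ssl volume options. It only builds and prints the gluster commands; nothing is executed.

```python
import shlex


def build_repro_commands(bricks, count=100, prefix="sslvol"):
    """Build gluster CLI commands that create `count` SSL-enabled volumes.

    `bricks` is a list of (host, base_path) pairs; both are placeholders.
    """
    commands = []
    for i in range(1, count + 1):
        vol = f"{prefix}{i}"
        # One brick directory per volume under each base path.
        brick_args = [f"{host}:{path}/{vol}" for host, path in bricks]
        commands.append(["gluster", "volume", "create", vol, *brick_args])
        # Assumption: SSL on the I/O path is enabled with these two options.
        commands.append(["gluster", "volume", "set", vol, "client.ssl", "on"])
        commands.append(["gluster", "volume", "set", vol, "server.ssl", "on"])
    return commands


def volume_operation_commands(vol):
    """Step 2: start and then stop one volume."""
    return [
        ["gluster", "volume", "start", vol],
        # --mode=script suppresses the interactive confirmation on stop.
        ["gluster", "--mode=script", "volume", "stop", vol],
    ]


# Dry run: print the commands for two volumes instead of executing them.
for cmd in build_repro_commands([("node1", "/bricks/b1")], count=2):
    print(shlex.join(cmd))
for cmd in volume_operation_commands("sslvol1"):
    print(shlex.join(cmd))
```

To actually run the steps, each command list could be passed to subprocess.run on a node of the trusted pool; that part is left out here so the sketch stays side-effect free.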

Actual results:
The following error is logged on the nodes:
Error 0-socket.management: socket_poller XX.XX.XX.XX:YYY failed (Input/output error)
Expected results:
Volume operations should complete without any errors in the logs.
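
One way to check this expectation is to scan the glusterd log after each volume operation. The sketch below is illustrative only: the line layout in the regular expression is an assumption inferred from glusterd log excerpts of the kind quoted in this report, and the function name is hypothetical. It counts messages per severity letter and collects any line that contains the socket_poller failure text.

```python
import re
from collections import Counter

# Assumed layout: [timestamp] LEVEL [MSGID: n] [file:line:function] message
# The MSGID field is optional.
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*(?P<msgid>\d+)\]\s+)?"
    r"\[(?P<loc>[^\]]+)\]\s+"
    r"(?P<rest>.*)$"
)


def summarize(lines):
    """Return (severity counts, lines reporting the socket_poller failure)."""
    levels = Counter()
    poller_errors = []
    for raw in lines:
        line = raw.strip()
        match = LOG_LINE.match(line)
        if match:
            levels[match.group("level")] += 1
        # Plain substring check, so it also works on lines that do not
        # follow the assumed layout.
        if "socket_poller" in line and "failed (Input/output error)" in line:
            poller_errors.append(line)
    return levels, poller_errors


sample = [
    "[2017-04-12 08:01:40.971974] W [rpc-transport.c:350:rpc_transport_load] "
    "0-rpc-transport: 'rdma' initialization failed",
    "[2017-04-12 08:01:40.972074] E [MSGID: 106243] [glusterd.c:1653:init] "
    "0-management: creation of 1 listeners failed, continuing with succeeded transport",
    "Error 0-socket.management: socket_poller XX.XX.XX.XX:YYY failed (Input/output error)",
]
levels, poller_errors = summarize(sample)
print(dict(levels), len(poller_errors))
```

An empty list of socket_poller matches after a start or stop operation would correspond to the expected result.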

Additional info:

Comment 20 Abhishek Kumar 2017-04-18 06:15:16 UTC
Hello Mohit,

Thanks for sharing the patch. 

I shared it with the customer on the 3.8.4-14 build, but they have reported some other issues, shown below.


>>>>>>>>>>>>>>>>>>
[2017-04-12 08:01:40.971933] W [MSGID: 103071] [rdma.c:4590:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
[2017-04-12 08:01:40.971967] W [MSGID: 103055] [rdma.c:4897:init] 0-rdma.management: Failed to initialize IB Device
[2017-04-12 08:01:40.971974] W [rpc-transport.c:350:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
[2017-04-12 08:01:40.972060] W [rpcsvc.c:1617:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
[2017-04-12 08:01:40.972074] E [MSGID: 106243] [glusterd.c:1653:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2017-04-12 08:02:55.639170] W [dict.c:1243:dict_foreach_match] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0x5e84d) [0x7fa4ba16684d] -->/lib64/libglusterfs.so.0(dict_foreach+0x18) [0x7fa4c59ae7b8] -->/lib64/libglusterfs.so.0(dict_foreach_match+0xe3) [0x7fa4c59ae683] ) 0-dict: dict|match|action is NULL [Invalid argument]


>>>>>>>>>>>>>>>>>>

I also tried it in my test lab and saw some listener errors reported there as well. In addition, some core dumps were generated.

It would be helpful if you could look at the cores and error logs that I have shared with you.


Regards,
Abhishek Kumar

Comment 21 Atin Mukherjee 2017-04-18 07:18:04 UTC
(In reply to Abhishek Kumar from comment #20)
> Hello Mohit,
> 
> Thanks for sharing the patch. 
> 
> I have shared the same with cu on 3.8.4-14 build but they have reported some
> other issues like below.
> 
> 
> >>>>>>>>>>>>>>>>>>
> [2017-04-12 08:01:40.971933] W [MSGID: 103071]
> [rdma.c:4590:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
> channel creation failed [No such device]
> [2017-04-12 08:01:40.971967] W [MSGID: 103055] [rdma.c:4897:init]
> 0-rdma.management: Failed to initialize IB Device
> [2017-04-12 08:01:40.971974] W [rpc-transport.c:350:rpc_transport_load]
> 0-rpc-transport: 'rdma' initialization failed
> [2017-04-12 08:01:40.972060] W [rpcsvc.c:1617:rpcsvc_create_listener]
> 0-rpc-service: cannot create listener, initing the transport failed
> [2017-04-12 08:01:40.972074] E [MSGID: 106243] [glusterd.c:1653:init]
> 0-management: creation of 1 listeners failed, continuing with succeeded
> transport
> [2017-04-12 08:02:55.639170] W [dict.c:1243:dict_foreach_match]
> (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0x5e84d)
> [0x7fa4ba16684d] -->/lib64/libglusterfs.so.0(dict_foreach+0x18)
> [0x7fa4c59ae7b8] -->/lib64/libglusterfs.so.0(dict_foreach_match+0xe3)
> [0x7fa4c59ae683] ) 0-dict: dict|match|action is NULL [Invalid argument]

I don't see any functionality impact with these logs, is there any? 

> 
> 
> >>>>>>>>>>>>>>>>>>
> 
> I tried it with my test labs also. I also got some listeners error reported
> in my labs. Along with that I have also faced some core generation.
> 
> It would be helpful if you look on the cores and error logs which I have
> shared with you.
> 
> 
> Regards,
> Abhishek Kumar

Comment 24 Mohit Agrawal 2017-05-13 04:44:06 UTC
The patch has been posted upstream: https://review.gluster.org/#/c/17280/1

Comment 25 Atin Mukherjee 2017-05-18 15:21:05 UTC
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/106256/

Comment 32 Alok 2017-07-12 08:33:31 UTC
Approved for an accelerated fix. Fix to be carried in the next downstream release to avoid regression.

Comment 38 Bala Konda Reddy M 2017-07-25 10:30:42 UTC
BUILD : 3.8.4-35


Followed the steps mentioned in the description.
No glusterd or glustershd crash is seen, and the SSL-related error messages mentioned in the bug are not seen. Hence marking it as verified in 3.3.0.

Moving to verified

Comment 41 errata-xmlrpc 2017-09-21 04:33:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774
