+++ This bug was initially created as a clone of Bug #1752330 +++ +++ This bug was initially created as a clone of Bug #1732443 +++ Description of problem: ======================= I am seeing below error message in shd.log when I start a new volume(or may be as part of shd graph regeneration event). The message is not so helpful for an admin to understand what it means and hence can be a bit annoying [2019-07-23 12:09:53.168761] E [MSGID: 101172] [events.c:89:_gf_event] 0-distrep-replicate-2: inet_pton failed with return code 0 [Invalid argument] ########### [2019-07-23 12:09:53.154339] I [rpc-clnt.c:2035:rpc_clnt_reconfig] 0-distrep-client-7: changing port to 49153 (from 0) [2019-07-23 12:09:53.154388] I [socket.c:871:__socket_shutdown] 0-distrep-client-7: intentional socket shutdown(14) [2019-07-23 12:09:53.154870] I [rpc-clnt.c:2035:rpc_clnt_reconfig] 0-shd1-client-2: changing port to 49154 (from 0) [2019-07-23 12:09:53.155279] I [socket.c:871:__socket_shutdown] 0-shd1-client-2: intentional socket shutdown(24) [2019-07-23 12:09:53.160810] I [rpc-clnt.c:2035:rpc_clnt_reconfig] 0-shd2-client-0: changing port to 49155 (from 0) [2019-07-23 12:09:53.160885] I [socket.c:871:__socket_shutdown] 0-shd2-client-0: intentional socket shutdown(16) [2019-07-23 12:09:53.161088] I [rpc-clnt.c:2035:rpc_clnt_reconfig] 0-shd2-client-1: changing port to 49155 (from 0) [2019-07-23 12:09:53.161208] I [socket.c:871:__socket_shutdown] 0-shd2-client-1: intentional socket shutdown(26) [2019-07-23 12:09:53.165383] I [rpc-clnt.c:2035:rpc_clnt_reconfig] 0-shd2-client-2: changing port to 49155 (from 0) [2019-07-23 12:09:53.165715] I [socket.c:871:__socket_shutdown] 0-shd2-client-2: intentional socket shutdown(22) [2019-07-23 12:09:53.168402] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-distrep-client-6: Connected to distrep-client-6, attached to remote volume '/gluster/brick2/distrep-sv3'. [2019-07-23 12:09:53.168455] I [MSGID: 108002] [afr-common.c:5589:afr_notify] 0-distrep-replicate-2: Client-quorum is met [2019-07-23 12:09:53.168761] E [MSGID: 101172] [events.c:89:_gf_event] 0-distrep-replicate-2: inet_pton failed with return code 0 [Invalid argument] [2019-07-23 12:09:53.172515] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-shd1-client-1: Connected to shd1-client-1, attached to remote volume '/gluster/brick1/shd1'. [2019-07-23 12:09:53.172559] I [MSGID: 108002] [afr-common.c:5589:afr_notify] 0-shd1-replicate-0: Client-quorum is met [2019-07-23 12:09:53.173251] E [MSGID: 101172] [events.c:89:_gf_event] 0-shd1-replicate-0: inet_pton failed with return code 0 [Invalid argument] [2019-07-23 12:09:53.173591] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-distrep-client-7: Connected to distrep-client-7, attached to remote volume '/gluster/brick2/distrep-sv3'. [2019-07-23 12:09:53.175599] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-shd1-client-2: Connected to shd1-client-2, attached to remote volume '/gluster/brick1/shd1'. [2019-07-23 12:09:53.179349] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-shd2-client-1: Connected to shd2-client-1, attached to remote volume '/gluster/brick1/shd2'. [2019-07-23 12:09:53.179391] I [MSGID: 108005] [afr-common.c:5235:__afr_handle_child_up_event] 0-shd2-replicate-0: Subvolume 'shd2-client-1' came back up; going online. [2019-07-23 12:09:53.179656] E [MSGID: 101172] [events.c:89:_gf_event] 0-shd2-replicate-0: inet_pton failed with return code 0 [Invalid argument] [2019-07-23 12:09:53.180373] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-shd2-client-0: Connected to shd2-client-0, attached to remote volume '/gluster/brick1/shd2'. [2019-07-23 12:09:53.180422] I [MSGID: 108002] [afr-common.c:5589:afr_notify] 0-shd2-replicate-0: Client-quorum is met [2019-07-23 12:09:53.180698] E [MSGID: 101172] [events.c:89:_gf_event] 0-shd2-replicate-0: inet_pton failed with return code 0 [Invalid argument] [2019-07-23 12:09:53.181257] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-shd2-client-2: Connected to shd2-client-2, attached to remote volume '/gluster/brick1/shd2'. Version-Release number of selected component (if applicable): ============== 6.0.8 (shd-mux is reverted) How reproducible: ================ always Steps to Reproduce: 1.create a 2x3 volume and start it 2. create a new 1x3 volume and start it Actual results: ============= glustershd.log has above error message logged when we start a new volume Expected results: =================== either the message can be removed if it is not of much value, or else it can be rephrased to something more meaningful Additional info: ============== not attaching logs as it is straightforward to reproduce --- Additional comment from Vivek Das on 2019-09-10 09:47:33 UTC --- This is marked blocker based on 'Regression' keyword. I am seeing the error messages in samba client logs as well. We have not seen functional impact as of now. But would like to know what introduced this in rhgs-3.5.0. An error message is what we need to avoid to be seen by customers. Sunil your take on this please. --- Additional comment from Atin Mukherjee on 2019-09-12 02:33:11 UTC --- From _gf_event () : /* Socket Configurations */ server.sin_family = AF_INET; server.sin_port = htons(EVENT_PORT); ret = inet_pton(server.sin_family, host, &server.sin_addr); if (ret <= 0) { gf_msg(this->name, GF_LOG_ERROR, EINVAL, LG_MSG_INVALID_ARG, "inet_pton failed with return code %d", ret); goto out; } RETURN VALUE inet_pton() returns 1 on success (network address was successfully converted). 0 is returned if src does not contain a character string representing a valid network address in the specified address family. If af does not contain a valid address family, -1 is returned and errno is set to EAFNOSUPPORT. [2019-07-23 12:09:53.168761] E [MSGID: 101172] [events.c:89:_gf_event] 0-distrep-replicate-2: inet_pton failed with return code 0 [Invalid argument] So based on the above log it looks like in certain cases host is NULL and that's why we're bailing out. This seems like a wrong functionality? Aravinda - could you take a look at this? --- Additional comment from Aravinda VK on 2019-09-12 05:47:06 UTC --- (In reply to Atin Mukherjee from comment #5) > From _gf_event () : > > /* Socket Configurations */ > > server.sin_family = AF_INET; > > server.sin_port = htons(EVENT_PORT); > > ret = inet_pton(server.sin_family, host, &server.sin_addr); > > if (ret <= 0) { > > gf_msg(this->name, GF_LOG_ERROR, EINVAL, LG_MSG_INVALID_ARG, > > "inet_pton failed with return code %d", ret); > > goto out; > > } > > > RETURN VALUE > inet_pton() returns 1 on success (network address was successfully > converted). 0 is returned if src does not contain a character string > representing a valid network address in the specified address family. > If af does not contain a valid address family, -1 is returned and > errno is set to EAFNOSUPPORT. > > [2019-07-23 12:09:53.168761] E [MSGID: 101172] [events.c:89:_gf_event] > 0-distrep-replicate-2: inet_pton failed with return code 0 [Invalid argument] > > So based on the above log it looks like in certain cases host is NULL and > that's why we're bailing out. This seems like a wrong functionality? > > Aravinda - could you take a look at this? I will look into this issue today. --- Additional comment from Aravinda VK on 2019-09-12 07:26:20 UTC --- `-s localhost` passed in the argument is converted to `::1`. But `server.sin_family = AF_INET;` is used instead of `AF_INET6`. Due to this `inet_pton` is failing. Example `ps ax | grep shd` 13449 ? Ssl 0:00 /usr/sbin/glusterfs -s localhost --volfile-id shd/gv1 -p /var/run/gluster/shd/gv1/gv1-shd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/5d6ae69815eaaafd.socket --xlator-option *replicate*.node-uuid=855ed18b-5c6f-4889-b523-5b7864d2a842 --process-name glustershd --client-pid=-6 I will explore and update the proper fix to handle this issue. --- Additional comment from Atin Mukherjee on 2019-09-16 02:29:07 UTC --- (In reply to Aravinda VK from comment #7) > `-s localhost` passed in the argument is converted to `::1`. But > `server.sin_family = AF_INET;` is used instead of `AF_INET6`. Due to this > `inet_pton` is failing. > > Example `ps ax | grep shd` > > 13449 ? Ssl 0:00 /usr/sbin/glusterfs -s localhost --volfile-id > shd/gv1 -p /var/run/gluster/shd/gv1/gv1-shd.pid -l > /var/log/glusterfs/glustershd.log -S > /var/run/gluster/5d6ae69815eaaafd.socket --xlator-option > *replicate*.node-uuid=855ed18b-5c6f-4889-b523-5b7864d2a842 --process-name > glustershd --client-pid=-6 > > I will explore and update the proper fix to handle this issue. Aravinda - IIUC, the impact would be that we'd miss to generate event in such cases given gf_event API is bailing out, correct? And if so then this should be a blocker?
REVIEW: https://review.gluster.org/24180 (eventsapi: Set IPv4/IPv6 family based on input IP) posted (#1) for review on release-7 by Xavi Hernandez
This bug is moved to https://github.com/gluster/glusterfs/issues/1031, and will be tracked there from now on. Visit GitHub issues URL for further details
REVIEW: https://review.gluster.org/24180 (eventsapi: Set IPv4/IPv6 family based on input IP) merged (#2) on release-7 by Rinku Kothiya