Bug 761742 (GLUSTER-10)

Summary: [ glusterfs 2.0.0 ] - Repeated Log messages with invalid remote-host in protocol/client
Product: [Community] GlusterFS
Reporter: Gururaj K <guru>
Component: protocol
Assignee: Anush Shetty <anush>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: medium
Docs Contact:
Priority: low
Version: 2.0.2
CC: amarts, anush, gluster-bugs, krishna, pavan, vijay
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: RTP Mount Type: ---

Description Gururaj K 2009-06-12 10:12:49 UTC
* Observed when using the following spec file (on the Milpitas cluster):

  volume remote1
    type protocol/client
    option transport-type tcp
    option remote-host n1
    option remote-subvolume p3
  end-volume


* The following log messages repeat approximately every 10 seconds:

[2009-06-11 23:52:54] E [common-utils.c:102:gf_resolve_ip6] resolver: getaddrinfo failed (Name or service not known)
[2009-06-11 23:52:54] E [name.c:242:af_inet_client_get_remote_sockaddr] remote1: DNS resolution failed on host n1
[2009-06-11 23:52:54] E [afr.c:2223:notify] replicate: All subvolumes are down. Going offline until atleast one of them comes back up.
[2009-06-11 23:52:54] E [common-utils.c:102:gf_resolve_ip6] resolver: getaddrinfo failed (Name or service not known)
[2009-06-11 23:52:54] E [name.c:242:af_inet_client_get_remote_sockaddr] remote1: DNS resolution failed on host n1
[2009-06-11 23:52:54] E [afr.c:2223:notify] replicate: All subvolumes are down. Going offline until atleast one of them comes back up.
[2009-06-11 23:52:54] E [common-utils.c:102:gf_resolve_ip6] resolver: getaddrinfo failed (Name or service not known)
[2009-06-11 23:52:54] E [name.c:242:af_inet_client_get_remote_sockaddr] remote2: DNS resolution failed on host n4
[2009-06-11 23:52:54] E [afr.c:2223:notify] replicate: All subvolumes are down. Going offline until atleast one of them comes back up.
[2009-06-11 23:52:54] E [common-utils.c:102:gf_resolve_ip6] resolver: getaddrinfo failed (Name or service not known)
[2009-06-11 23:52:54] E [name.c:242:af_inet_client_get_remote_sockaddr] remote2: DNS resolution failed on host n4
[2009-06-11 23:52:54] E [afr.c:2223:notify] replicate: All subvolumes are down. Going offline until atleast one of them comes back up.
[2009-06-11 23:53:04] E [common-utils.c:102:gf_resolve_ip6] resolver: getaddrinfo failed (Name or service not known)
[2009-06-11 23:53:04] E [name.c:242:af_inet_client_get_remote_sockaddr] remote1: DNS resolution failed on host n1
[2009-06-11 23:53:04] E [afr.c:2223:notify] replicate: All subvolumes are down. Going offline until atleast one of them comes back up.
[2009-06-11 23:53:04] E [common-utils.c:102:gf_resolve_ip6] resolver: getaddrinfo failed (Name or service not known)
[2009-06-11 23:53:04] E [name.c:242:af_inet_client_get_remote_sockaddr] remote1: DNS resolution failed on host n1
[2009-06-11 23:53:04] E [afr.c:2223:notify] replicate: All subvolumes are down. Going offline until atleast one of them comes back up.
[2009-06-11 23:53:04] E [common-utils.c:102:gf_resolve_ip6] resolver: getaddrinfo failed (Name or service not known)
[2009-06-11 23:53:04] E [name.c:242:af_inet_client_get_remote_sockaddr] remote2: DNS resolution failed on host n4
[2009-06-11 23:53:04] E [afr.c:2223:notify] replicate: All subvolumes are down. Going offline until atleast one of them comes back up.
[2009-06-11 23:53:04] E [common-utils.c:102:gf_resolve_ip6] resolver: getaddrinfo failed (Name or service not known)
[2009-06-11 23:53:04] E [name.c:242:af_inet_client_get_remote_sockaddr] remote2: DNS resolution failed on host n4
[2009-06-11 23:53:04] E [afr.c:2223:notify] replicate: All subvolumes are down. Going offline until atleast one of them comes back up.




It would be better if we could print the error logs only once.

Comment 1 Basavanagowda Kanur 2009-07-23 05:44:42 UTC
(In reply to comment #0)

These messages are logged on every reconnect attempt. We can suppress the repeats so that the failure is logged in one place only, either in gf_resolve_ip6 or in af_inet_client_get_remote_sockaddr.
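
One way to read the suggestion above is to report a resolution failure once per failure streak and stay quiet until a reconnect succeeds. The sketch below is only an illustration of that idea under stated assumptions: `conn_log_state`, `report_resolve_result` and `simulate_failed_reconnects` are hypothetical names, not GlusterFS structures or functions, and `fprintf` stands in for the real logging call.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-connection state; not a GlusterFS structure. */
struct conn_log_state {
    bool resolve_failure_logged; /* true once the current failure streak was reported */
};

/* Report the outcome of one resolution attempt.
 * Returns 1 if a line was written, 0 if it was suppressed. */
int report_resolve_result(struct conn_log_state *st, const char *name,
                          const char *host, bool ok)
{
    if (ok) {
        /* Success ends the streak, so the next failure is logged again. */
        st->resolve_failure_logged = false;
        return 0;
    }
    if (st->resolve_failure_logged)
        return 0; /* same streak, already reported */

    st->resolve_failure_logged = true;
    fprintf(stderr, "E %s: DNS resolution failed on host %s\n", name, host);
    return 1;
}

/* Simulate `attempts` consecutive failed reconnects and count the lines logged. */
int simulate_failed_reconnects(int attempts)
{
    struct conn_log_state st = { false };
    int lines = 0;

    for (int i = 0; i < attempts; i++)
        lines += report_resolve_result(&st, "remote1", "n1", false);
    return lines;
}
```

With this approach, any number of back-to-back failures for the same connection produces a single line, and a later failure after a successful reconnect is reported again.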

Comment 2 Anush Shetty 2009-08-11 03:58:16 UTC
(In reply to comment #1)

As of 2.0.6rc4, the following messages are printed repeatedly:

[2009-08-10 23:56:54] E [common-utils.c:102:gf_resolve_ip6] resolver: getaddrinfo failed (Name or service not known)
[2009-08-10 23:56:54] E [name.c:242:af_inet_client_get_remote_sockaddr] remote1: DNS resolution failed on host localhost1

Comment 3 Pavan Vilas Sondur 2010-01-19 08:25:01 UTC
Need to verify whether this has already been fixed by the log cleanup.

Comment 4 Anush Shetty 2010-04-27 08:33:30 UTC
This is still seen with 3.0.5rc1.

Comment 5 Amar Tumballi 2010-05-04 08:01:49 UTC
*** Bug 867 has been marked as a duplicate of this bug. ***

Comment 6 Anand Avati 2010-05-10 01:41:53 UTC
PATCH: http://patches.gluster.com/patch/3242 in master (Adding GF_LOG_OCCASIONALLY to prevent repeated log messages)

Comment 7 Anand Avati 2010-05-11 14:10:14 UTC
PATCH: http://patches.gluster.com/patch/3250 in release-3.0 (Adding GF_LOG_OCCASIONALLY to prevent repeated log messages)
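
The patch titles indicate that the fix logs these messages only occasionally instead of on every retry. The patch contents are not shown in this report, so the following is only a sketch of counter-based throttling in that spirit. `LOG_INTERVAL`, `SHOULD_LOG_OCCASIONALLY` and `log_resolve_failures` are hypothetical names, the interval of 10 is an arbitrary assumption, and `fprintf` stands in for the real logging call.

```c
#include <stdio.h>

/* Hypothetical interval: emit the 1st, 11th, 21st, ... occurrence. */
#define LOG_INTERVAL 10

/* Evaluates to 1 when this occurrence should be logged, then bumps the counter. */
#define SHOULD_LOG_OCCASIONALLY(counter) (((counter)++ % LOG_INTERVAL) == 0)

/* Simulate `failures` consecutive resolution failures and
 * return how many log lines were actually written. */
int log_resolve_failures(int failures)
{
    unsigned int counter = 0; /* would live in per-connection state in practice */
    int lines = 0;

    for (int i = 0; i < failures; i++) {
        if (SHOULD_LOG_OCCASIONALLY(counter)) {
            fprintf(stderr, "E remote1: DNS resolution failed on host n1\n");
            lines++;
        }
    }
    return lines;
}
```

Compared with logging strictly once, a counter keeps the first failure visible immediately and still leaves a periodic reminder in the log while the host stays unresolvable.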