Bug 517399

Summary: [RFE] - ccsd does not log what IPs are used for binding/sending packets to.
Product: Red Hat Enterprise Linux 5
Component: cman
Version: 5.3
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: low
Target Milestone: rc
Target Release: ---
Keywords: FutureFeature
Reporter: Eduardo Damato <edamato>
Assignee: Christine Caulfield <ccaulfie>
QA Contact: Cluster QE <mspqa-list>
CC: cluster-maint, edamato, jcastillo, jkortus, rohara
Fixed In Version: cman-2.0.115-27.el5.src.rpm
Doc Type: Enhancement
Last Closed: 2010-03-30 08:41:01 UTC
Bug Blocks: 557292
Attachments:
  patch implementing socket printing. (flags: none)
  new patch for eval (flags: none)

Description Eduardo Damato 2009-08-13 18:39:02 UTC
Description of problem:

Currently, when ccsd fails because it cannot bind to an address or cannot send a packet, it logs only the error and not the socket that failed.

This causes a supportability problem, because it can be very difficult to tell whether the failure is specific to IPv4 or IPv6, or whether it is a routing, broadcast, interface, or even protocol problem.

Version-Release number of selected component (if applicable):

ccsd 2.0.98, part of 2.0.98-1.el5_3.7

How reproducible:

every time

Steps to Reproduce:
1. Trigger a bind or sendto error.
  
Actual results:

ccsd shows the error but not what socket was involved.

Expected results:

ccsd should log the sockets it could not bind to or sendto.

Additional info:

Attached a proposed patch.

Comment 1 Eduardo Damato 2009-08-13 18:42:29 UTC
Created attachment 357357 [details]
patch implementing socket printing.

***Preliminary patch***

Patch that logs the protocol and address of the failing socket. It has been tested and produces the following output:

Aug 13 19:07:53 pe1950-1 ccsd[20974]: Starting ccsd 2.0.98:
Aug 13 19:07:53 pe1950-1 ccsd[20974]:  Built: Aug 13 2009 13:44:02
Aug 13 19:07:53 pe1950-1 ccsd[20974]:  Copyright (C) Red Hat, Inc.  2004  All rights reserved.
Aug 13 19:07:53 pe1950-1 ccsd[20974]: Unable to bind socket addr: 127.0.0.1, port: 22211, proto: 2: Cannot assign requested address

Aug 13 19:08:18 pe1950-1 ccsd[21172]: Starting ccsd 2.0.98:
Aug 13 19:08:18 pe1950-1 ccsd[21172]:  Built: Aug 13 2009 13:44:02
Aug 13 19:08:18 pe1950-1 ccsd[21172]:  Copyright (C) Red Hat, Inc.  2004  All rights reserved.
Aug 13 19:08:18 pe1950-1 ccsd[21172]: cluster.conf (cluster name = p1p1p1p1, version = 700) found.
Aug 13 19:08:18 pe1950-1 ccsd[21172]: Unable to perform sendto addr: 255.255.255.255, port: 22467, proto: 2: Network is unreachable

Aug 13 19:19:26 pe1950-1 ccsd[23429]: Starting ccsd 2.0.98:
Aug 13 19:19:26 pe1950-1 ccsd[23429]:  Built: Aug 13 2009 13:44:02
Aug 13 19:19:26 pe1950-1 ccsd[23429]:  Copyright (C) Red Hat, Inc.  2004  All rights reserved.
Aug 13 19:19:26 pe1950-1 ccsd[23429]: Unable to bind socket addr: ::1, port: 22211, scope: 0: Cannot assign requested address


Aug 13 19:19:53 pe1950-1 ccsd[23684]: Starting ccsd 2.0.98:
Aug 13 19:19:53 pe1950-1 ccsd[23684]:  Built: Aug 13 2009 13:44:02
Aug 13 19:19:53 pe1950-1 ccsd[23684]:  Copyright (C) Red Hat, Inc.  2004  All rights reserved.
Aug 13 19:19:53 pe1950-1 ccsd[23684]: Unable to add to membership: No such device
Aug 13 19:19:53 pe1950-1 ccsd[23684]: cluster.conf (cluster name = p1p1p1p1, version = 700) found.
Aug 13 19:19:53 pe1950-1 ccsd[23684]: Unable to perform sendto addr: ff02::3:1, port: 22467, scope: 0: Network is unreachable

Comment 2 Ryan O'Hara 2009-08-13 18:56:03 UTC
The patch looks good to me. The one thing I would add is a check to see if inet_ntop returns NULL, in which case addrbuf would be NULL and the printf would choke.
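
For illustration, a minimal sketch of the kind of check suggested here. This is not code from the attached patch; the helper name format_sockaddr and its return convention are assumptions made for the example. It only shows that the result of inet_ntop is tested before the buffer is used in a printf-style call.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper, not taken from the attached patch: formats an
 * IPv4 or IPv6 socket address as text in buf. Returns 0 on success and
 * -1 if the family is unsupported or inet_ntop() fails (errno is set). */
static int format_sockaddr(const struct sockaddr *sa, char *buf, size_t len)
{
    const void *src;

    if (sa->sa_family == AF_INET) {
        src = &((const struct sockaddr_in *)sa)->sin_addr;
    } else if (sa->sa_family == AF_INET6) {
        src = &((const struct sockaddr_in6 *)sa)->sin6_addr;
    } else {
        errno = EAFNOSUPPORT;
        return -1;
    }

    /* inet_ntop() returns NULL on failure, so its result must be
     * checked before buf is handed to a "%s" conversion. */
    if (inet_ntop(sa->sa_family, src, buf, (socklen_t)len) == NULL)
        return -1;

    return 0;
}
```

A caller would size buf as INET6_ADDRSTRLEN bytes so that either family fits, and print the address only when the helper returns 0.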

Comment 3 Eduardo Damato 2009-08-13 19:34:36 UTC
Created attachment 357366 [details]
new patch for eval

Comment 4 Ryan O'Hara 2009-08-14 15:37:18 UTC
That patch looks good. Note that if inet_ntop returns NULL, it should also set errno. That might be of some use, but I think this patch is fine either way.

Comment 5 Christine Caulfield 2009-08-26 07:24:51 UTC
I don't like the identical "inet_ntop: NULL pointer" errors. If you do get a NULL from inet_ntop then the error reporting is actually worse than it was before, because you can't tell if the root cause was bind or sendto, or whether it was IPv4 or IPv6.
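
One way to address this concern is sketched below. It is an assumption-laden example, not the code that was eventually committed: the function name describe_sock_failure, the message layout, and the "<unprintable>" placeholder are all hypothetical. The point is that the operation and address family are always part of the message, even when inet_ntop fails.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: writes a message such as
 *   "Unable to bind addr: ::1, port: 22211, family: IPv6: <reason>"
 * into out. op is the failed operation ("bind" or "sendto") and err is
 * the errno value saved immediately after that call failed. */
static void describe_sock_failure(const char *op, const struct sockaddr *sa,
                                  int err, char *out, size_t outlen)
{
    char addr[INET6_ADDRSTRLEN] = "<unprintable>";
    const char *family = "unknown";
    unsigned port = 0;
    const void *src = NULL;

    if (sa->sa_family == AF_INET) {
        const struct sockaddr_in *s4 = (const struct sockaddr_in *)sa;
        family = "IPv4";
        port = ntohs(s4->sin_port);
        src = &s4->sin_addr;
    } else if (sa->sa_family == AF_INET6) {
        const struct sockaddr_in6 *s6 = (const struct sockaddr_in6 *)sa;
        family = "IPv6";
        port = ntohs(s6->sin6_port);
        src = &s6->sin6_addr;
    }

    /* If the address cannot be converted, restore the placeholder but
     * still report the operation, port and family. */
    if (src != NULL &&
        inet_ntop(sa->sa_family, src, addr, sizeof(addr)) == NULL)
        snprintf(addr, sizeof(addr), "%s", "<unprintable>");

    snprintf(out, outlen, "Unable to %s addr: %s, port: %u, family: %s: %s",
             op, addr, port, family, strerror(err));
}
```

With this shape, a conversion failure degrades only the address field, so bind and sendto failures on IPv4 and IPv6 remain distinguishable in the log.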

Comment 7 Christine Caulfield 2009-10-22 09:14:21 UTC
Ryan: can you do an improved patch please? At the moment it potentially makes the error messages worse rather than better.

Comment 8 Christine Caulfield 2009-12-21 17:17:18 UTC
commit 70a541bd76cfb45d7c97ad47d984e5ace9dbea98
Author: Christine Caulfield <ccaulfie>
Date:   Mon Dec 21 17:13:59 2009 +0000

    ccsd: Improve error messages from ccsd
    
    Resolves rhbz#517399

Comment 11 Jaroslav Kortus 2010-03-15 18:04:31 UTC
# run nc -l 50008 and then ccsd:

Mar 15 13:00:06 z2 ccsd[12604]: Starting ccsd 2.0.115: 
Mar 15 13:00:06 z2 ccsd[12604]:  Built: Mar 10 2010 05:40:38 
Mar 15 13:00:06 z2 ccsd[12604]:  Copyright (C) Red Hat, Inc.  2004  All rights reserved. 
Mar 15 13:00:06 z2 ccsd[12604]: Unable to bind to socket. 
[root@z2 ~]# rpm -q cman
cman-2.0.115-33.el5

The patch seems to be there, so is this yet another code path that needs improvement?

Comment 12 Lon Hohberger 2010-03-15 20:17:25 UTC
  if (bind(ccsd_fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
    log_err("Unable to bind to socket.\n");
    close(ccsd_fd);
    exit(EXIT_FAILURE);
  }

This is cnx_mgr.c:400.  The other bind() calls are indeed fixed.
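
As a rough sketch only, this is how a bind() call of this shape could report the address and port it failed on. It is not the change made in ccsd; the wrapper name bind_ipv4_verbose and the message text are assumptions, and the message is written to a caller-supplied buffer rather than passed to log_err so the example stays self-contained.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical wrapper, not the actual ccsd fix: binds fd to sin and,
 * on failure, writes a message naming the address and port into msg.
 * Returns 0 on success, -1 on failure with errno left as bind() set it. */
static int bind_ipv4_verbose(int fd, const struct sockaddr_in *sin,
                             char *msg, size_t msglen)
{
    char addr[INET_ADDRSTRLEN];
    int saved_errno;

    if (bind(fd, (const struct sockaddr *)sin, sizeof(*sin)) == 0)
        return 0;

    /* Save errno first: the calls below may overwrite it. */
    saved_errno = errno;

    if (inet_ntop(AF_INET, &sin->sin_addr, addr, sizeof(addr)) == NULL)
        snprintf(addr, sizeof(addr), "%s", "<unknown>");

    snprintf(msg, msglen, "Unable to bind to socket addr: %s, port: %u: %s",
             addr, (unsigned)ntohs(sin->sin_port), strerror(saved_errno));

    errno = saved_errno;
    return -1;
}
```

In the reproduction from comment 11, where nc already listens on port 50008, a message built this way would name that port together with the reason reported by bind().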

Comment 14 Jaroslav Kortus 2010-03-16 11:48:20 UTC
Last missing piece logged as new bug 573996. Marking this one as verified.

Comment 16 errata-xmlrpc 2010-03-30 08:41:01 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0266.html