Bug 474297 - [RFE] (with patch) cluster network connectivity checker
Summary: [RFE] (with patch) cluster network connectivity checker
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Christine Caulfield
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-12-03 07:33 UTC by David Ash
Modified: 2011-01-25 19:33 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-01-25 19:33:24 UTC
Target Upstream Version:
Embargoed:


Attachments
cluster network connectivity checker python script (4.03 KB, text/x-python)
2008-12-03 07:33 UTC, David Ash

Description David Ash 2008-12-03 07:33:18 UTC
Created attachment 325493
cluster network connectivity checker python script

Description of problem:

This is an RFE for a simple script that runs a basic network connectivity sanity check once all cluster services are up and running on every node in the cluster.

I wrote an example script in Python (I originally wrote it as a bash script, but the XML parsing was too ugly).
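
To illustrate why Python suits the XML parsing step, here is a minimal sketch of reading node names out of a cluster configuration. It is not the attached clunetstat.py; the function names, the sample XML, and the default path /etc/cluster/cluster.conf are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed default location of the cluster configuration; adjust as needed.
DEFAULT_CONF = "/etc/cluster/cluster.conf"


def node_names_from_xml(xml_text):
    """Return the name attribute of every <clusternode> element, in order."""
    root = ET.fromstring(xml_text)
    return [n.get("name") for n in root.iter("clusternode") if n.get("name")]


def node_names(path=DEFAULT_CONF):
    """Read a cluster.conf file from disk and return its node names."""
    with open(path) as f:
        return node_names_from_xml(f.read())


# Hypothetical two-node configuration, used only to exercise the parser.
SAMPLE = """<cluster name="demo" config_version="1">
  <clusternodes>
    <clusternode name="node1.example.com" nodeid="1"/>
    <clusternode name="node2.example.com" nodeid="2"/>
  </clusternodes>
</cluster>"""

if __name__ == "__main__":
    for name in node_names_from_xml(SAMPLE):
        print(name)
```

The resulting list of names is what a connectivity checker would then loop over, probing each node in turn.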


Version-Release number of selected component (if applicable):

Written against and tested on various RHEL 5 Cluster Suite machines.


How reproducible:

After starting all appropriate cluster services, run clunetstat.py.


Steps to Reproduce: (on all nodes)
1. # service cman start
2. # service rgmanager start
3. # clunetstat.py
  
Actual results:


Expected results:


Additional info:

This Python script depends on netcat (nc). I don't think we should add it as a hard dependency in the RPM, since I'd rather not add dependencies that aren't strictly necessary; it can stay a loose dependency instead. The script simply exits with an error if netcat isn't installed, which I think is good enough.
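
The "loose dependency" behaviour described above could look roughly like the sketch below. This is not taken from the attached script: the function names are hypothetical, and it uses shutil.which, which needs a modern Python 3 rather than the Python shipped with RHEL 5.

```python
import shutil
import sys


def find_netcat(candidates=("nc", "netcat")):
    """Return the path of the first candidate binary found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def require_netcat(candidates=("nc", "netcat")):
    """Return the netcat path, or exit with an error message if it is missing."""
    path = find_netcat(candidates)
    if path is None:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit("error: netcat (nc) not found on PATH; install it to run this check")
    return path
```

Calling require_netcat() once at startup keeps the failure early and obvious without the RPM having to declare the dependency.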

Comment 1 Christine Caulfield 2008-12-10 13:45:36 UTC
I'm not sure how much use this is as it stands. It checks that everything is running - but if it's running then you already know that. It also assumes you are running all the subsystems, which isn't always true.

A far more useful variant would check the network ports BEFORE the cluster software is started. That could be used to diagnose potential problems, which I think are more common.
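
One narrow piece of such a pre-start check can be sketched as follows: confirming that nothing else on the local node already holds a TCP port the cluster software will need. This is only an illustrative sketch with a hypothetical function name; it says nothing about firewalls or reachability between nodes, which a full pre-start diagnostic would also have to cover.

```python
import socket


def tcp_port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port) right now.

    Binding to ports below 1024 needs root; a permission error is reported
    as False here, the same as a port that is already in use.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        return True
    except OSError:
        return False
    finally:
        s.close()
```

The socket is closed immediately after the bind attempt, so the check leaves the port untouched for the cluster daemons that start afterwards.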

Also, I'm less than happy about random applications poking into sockets (e.g. the DLM's) that they have no business poking around in. If you want to check that a port is open, there are neater and less intrusive ways of doing it (see nmap, for instance).

On top of that, the script hard-codes a lot of things that can be changed on a real system, e.g. the port number that OpenAIS runs on, the port number the DLM runs on (the DLM can also use SCTP, though we don't support it), and the location of cluster.conf.
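
A sketch of how those hard-coded values could become overridable options is shown below. The option names are hypothetical, and the defaults (5405 for OpenAIS, 21064 for the DLM, /etc/cluster/cluster.conf) are assumed typical RHEL 5 values that should be verified against the actual installation.

```python
import argparse

# Assumed defaults; none of these are guaranteed to match a given cluster.
DEFAULTS = {
    "conf": "/etc/cluster/cluster.conf",
    "openais_port": 5405,  # UDP
    "dlm_port": 21064,     # TCP
}


def parse_args(argv=None):
    """Parse command-line options, falling back to the assumed defaults."""
    p = argparse.ArgumentParser(description="Cluster network connectivity check")
    p.add_argument("--conf", default=DEFAULTS["conf"],
                   help="path to cluster.conf")
    p.add_argument("--openais-port", type=int, default=DEFAULTS["openais_port"],
                   help="UDP port OpenAIS is configured to use")
    p.add_argument("--dlm-port", type=int, default=DEFAULTS["dlm_port"],
                   help="TCP port the DLM is configured to use")
    return p.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(args.conf, args.openais_port, args.dlm_port)
```

A more thorough version would read the configured ports from cluster.conf itself where they are set there, rather than relying on command-line overrides.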

Comment 2 David Ash 2008-12-10 21:57:40 UTC
I thought it might pick up possible issues like split brain.  I might rewrite this with nmap then.

