Bug 474297 - [RFE] (with patch) cluster network connectivity checker
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Assigned To: Christine Caulfield
QA Contact: Cluster QE
Reported: 2008-12-03 02:33 EST by David Ash
Modified: 2011-01-25 14:33 EST (History)
Doc Type: Bug Fix
Last Closed: 2011-01-25 14:33:24 EST

Attachments
cluster network connectivity checker python script (4.03 KB, text/x-python)
2008-12-03 02:33 EST, David Ash

Description David Ash 2008-12-03 02:33:18 EST
Created attachment 325493
cluster network connectivity checker python script

Description of problem:

This is an RFE for a simple script that checks network connectivity as a basic sanity check once all application cluster services are up and running on all nodes in the cluster.

I wrote an example script in Python (I originally wrote it in bash, but the XML parsing was too ugly).

Version-Release number of selected component (if applicable):

Written against and tested on various Cluster Suite installations on RHEL 5 machines.

How reproducible:

After starting all appropriate cluster services, run clunetstat.py.

Steps to Reproduce: (on all nodes)
1. # service cman start
2. # service rgmanager start
3. # clunetstat.py
Actual results:

Expected results:

Additional info:

This Python script depends on netcat (nc). I don't think we should add a hard dependency on it in the RPM, as I don't like adding dependencies that aren't strictly necessary; I'd rather leave it as a kind of loose dependency. The script simply exits with an error if netcat isn't installed, which I see as good enough.
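
For illustration only, the "loose dependency" behaviour described above could look roughly like the sketch below. It is not taken from the attached script: the function name `require_tool` is hypothetical, and it is written for current Python 3 rather than the Python shipped with RHEL 5.

```python
import shutil
import sys


def require_tool(name="nc"):
    """Return the full path of `name`, or exit with an error if it is missing.

    Hypothetical helper: treats netcat as a loose (runtime-checked)
    dependency instead of an RPM-level requirement.
    """
    path = shutil.which(name)  # searches PATH, returns None if not found
    if path is None:
        sys.exit("error: %s not found in PATH; please install it" % name)
    return path


# Typical use at the top of the script:
#     nc_path = require_tool("nc")
```

Passing a string to `sys.exit` prints it to stderr and exits with status 1, so a missing netcat shows up as a plain error message rather than a traceback.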
Comment 1 Christine Caulfield 2008-12-10 08:45:36 EST
I'm not sure how much use this is as it stands. It checks that everything is running - but if it's running then you already know that. It also assumes you are running all the subsystems, which isn't always true.

A far more useful variant would check the network ports BEFORE the cluster software is started. That could be used to diagnose potential problems, which I think are the more common case.
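
One possible reading of such a pre-start check, sketched here only as an assumption about what it might involve, is to confirm that the ports the cluster software will need are not already taken on the local node. The helper name `port_is_free` is hypothetical, and which ports to test is left to the caller. The sketch does not cover firewall rules or reachability between nodes.

```python
import socket


def port_is_free(port, kind=socket.SOCK_STREAM, host=""):
    """Return True if `port` can be bound locally, False if it is in use.

    Hypothetical helper: it binds and immediately closes, so it never
    talks to a running service. Use socket.SOCK_DGRAM for UDP ports.
    """
    sock = socket.socket(socket.AF_INET, kind)
    try:
        sock.bind((host, port))
    except OSError:
        return False
    finally:
        sock.close()
    return True
```

A caller could run this for each port it cares about, for example `port_is_free(21064)` for a TCP port or `port_is_free(5405, socket.SOCK_DGRAM)` for a UDP port, and report any that are already occupied before starting cman.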

Also, I'm less than happy about random applications poking into sockets (e.g. the DLM's) that they have no business poking around in. If you want to check that a port is open, there are neater and less intrusive ways of doing it (see nmap, for instance).

On top of that, a lot of things are hard-coded in there that can actually be changed, e.g. the port number that OpenAIS runs on, the port number the DLM runs on (the DLM can also use SCTP, though we don't support it), and the location of cluster.conf.
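
A minimal sketch of how the hard-coded values mentioned above could become overridable, assuming the script only needs the node names from cluster.conf. The option names (`--conf`, `--ais-port`, `--dlm-port`) and function names are hypothetical and not from the attached script. The defaults shown (/etc/cluster/cluster.conf, 5405, 21064) are the usual RHEL 5 values and should be treated as assumptions for any given installation.

```python
import argparse
import xml.etree.ElementTree as ET


def parse_args(argv=None):
    """Command-line options replacing hard-coded paths and ports."""
    parser = argparse.ArgumentParser(description="cluster connectivity check")
    parser.add_argument("--conf", default="/etc/cluster/cluster.conf",
                        help="location of cluster.conf (assumed default)")
    parser.add_argument("--ais-port", type=int, default=5405,
                        help="OpenAIS port (assumed default)")
    parser.add_argument("--dlm-port", type=int, default=21064,
                        help="DLM port (assumed default)")
    return parser.parse_args(argv)


def node_names(conf_path):
    """Return the name attribute of every <clusternode> in cluster.conf."""
    root = ET.parse(conf_path).getroot()
    return [node.get("name") for node in root.iter("clusternode")]
```

With this shape, `node_names(parse_args().conf)` yields the nodes to check, and a site running OpenAIS or the DLM on non-default ports can pass them on the command line instead of editing the script.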
Comment 2 David Ash 2008-12-10 16:57:40 EST
I thought it might pick up possible issues like split brain.  I might rewrite this with nmap then.
