Bug 156149 - switching from DLM to GULM config cause errors when trying to grab MGMT status
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: redhat-config-cluster
Version: 4
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Jim Parsons
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-04-27 21:08 UTC by Corey Marthaler
Modified: 2009-04-16 19:51 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-05-24 22:11:56 UTC
Embargoed:



Description Corey Marthaler 2005-04-27 21:08:41 UTC
Description of problem:
I had a running DLM cluster, went to the tool and switched to GULM, and this caused
the MGMT status daemon to start reporting errors:

"A problem was encountered when attempting to get
information about the nodes in the cluster. The following error
messages was received from cman_tool: Failed to connect 
to localhost:core (::ffff:127.0.0.1 40040) Connection refused
In src/gulm_tool.c322 (1.0-0.pre27) death by:
Failed to connect to server"

Version-Release number of selected component (if applicable):
-38

Comment 1 Jim Parsons 2005-05-03 19:53:49 UTC
Fixed in 0.9.48

Comment 2 Corey Marthaler 2005-05-04 16:33:35 UTC
running 0.9.48, I'm still seeing the exact same cman errors.

Comment 3 Jim Parsons 2005-05-09 21:42:03 UTC
Had a minor typo causing problems - I believe this is solved now in 0.9.51-1.0

Comment 4 Corey Marthaler 2005-05-10 16:50:02 UTC
nope, same problem in -51.

Comment 5 Corey Marthaler 2005-05-19 19:59:53 UTC
This error now shows up anytime the GUI is started on a GULM cluster, even if it
wasn't switched over from dlm. This was working in earlier versions but has
regressed. Also these messages appear in the syslog when the GUI starts.

May 19 10:47:52 morph-04 gconfd (root-7841): starting (version 2.8.1), pid 7841
user 'root'
May 19 10:47:52 morph-04 gconfd (root-7841): Resolved address
"xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only configuration
source at position 0
May 19 10:47:52 morph-04 gconfd (root-7841): Resolved address
"xml:readwrite:/root/.gconf" to a writable configuration source at position 1
May 19 10:47:52 morph-04 gconfd (root-7841): Resolved address
"xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only configuration source
at position 2
May 19 10:47:54 morph-04 lock_gulmd_core[7065]: "Magma::7842" is logged out. fd:11


Comment 6 Jim Parsons 2005-05-19 22:48:29 UTC
This is happening because the call to 'gulm_tool nodelist localhost:core' is
timing out intermittently. I make this call about every 20 seconds to refresh
the node list, and now and then it will begin returning a (-1) exit code and
printing "Command timed out" to stderr. CC'ing tilstra on this for his insight.
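
A minimal sketch of how a caller could tolerate the intermittent failure described above, keeping the last good node list rather than surfacing an error. This is an illustration only, not the code used in the tool: the function names, the 10-second timeout, and the "stale" flag are assumptions; only the command line itself comes from this report.

```python
import subprocess

# Command line quoted in this report; everything else here is hypothetical.
NODELIST_CMD = ["gulm_tool", "nodelist", "localhost:core"]


def run_nodelist(cmd=NODELIST_CMD, timeout=10):
    """Run the command and return (ok, stdout, stderr) without raising."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Missing binary or a hang: report failure instead of crashing.
        return False, "", str(exc)
    return proc.returncode == 0, proc.stdout, proc.stderr


def refresh(previous, runner=run_nodelist):
    """Return (output, stale).

    On success the fresh output is returned with stale=False.
    On failure the previous output is kept and marked stale=True.
    """
    ok, out, _err = runner()
    if ok:
        return out, False
    return previous, True
```

A periodic caller (for example, one firing every 20 seconds as in the comment above) could call `refresh(last_output)` and show a "status may be out of date" hint when the second value is true.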

Comment 7 michael conrad tadpol tilstra 2005-05-20 13:04:59 UTC
Add Network2 to the verbosity, then read /var/log/messages from the gulm server
node.  It should have messages about the connection attempts.  Might have some
clues there.

Comment 8 michael conrad tadpol tilstra 2005-05-20 17:31:17 UTC
Try running 'gulm_tool getstats 127.0.0.1'.
DNS is being too slow, so gulm_tool is timing out before it can resolve the
name.  Why your machines are doing DNS queries for 'localhost', I don't know.
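
To check whether name resolution is the slow step, one could time the lookup for 'localhost' against the literal address. The sketch below is a generic diagnostic, not part of gulm_tool; the helper name is hypothetical, and the port is simply the one shown in the error message in the description.

```python
import socket
import time


def time_resolution(host, port=40040):
    """Return seconds spent resolving host, or None if resolution fails."""
    start = time.monotonic()
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return None
    return time.monotonic() - start
```

Comparing `time_resolution("localhost")` with `time_resolution("127.0.0.1")` on the affected node would show whether the name lookup accounts for the delay; a literal address should not need a resolver query at all.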


Comment 9 Jim Parsons 2005-05-20 17:50:22 UTC
well.....I cannot rectify the tendency for this to happen - but I *can* have the
UI deal with it when it does...and it does now; kinda elegantly I think! :-)

Check out 0.9.54-1.0

Comment 10 Corey Marthaler 2005-05-24 22:11:56 UTC
fix verified in -60.

