Red Hat Bugzilla – Bug 211715
luci-0.8-20.el5 and ricci-0.8-20.el5 - cman service is not started on cluster nodes after cluster.conf is copied to nodes
Last modified: 2009-04-16 18:33:23 EDT
Description of problem:
luci-0.8-20.el5 and ricci-0.8-20.el5 - cman service is not started on cluster nodes
Version-Release number of selected component (if applicable):
luci-0.8-20.el5 and ricci-0.8-20.el5
Steps to Reproduce:
1. Create a new cluster via conga (luci/ricci).

Actual results:
The cman service is not automatically started on the cluster's nodes, so the
cluster's creation is not completed. No error is logged in /var/log/messages.

Expected results:
The cman service should start on the cluster's nodes.
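The symptom above can be verified by hand on each node. A minimal diagnostic sketch, assuming a RHEL 5 node with the standard `service` and `chkconfig` tools; the `check` helper is a hypothetical wrapper added here so the commands degrade gracefully on hosts where a tool is missing:

```shell
#!/bin/sh
# Hypothetical helper: run a command if it exists, otherwise note the skip.
check() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "skipped: $1 not available on this host"
    fi
}

# Is cman running right now?
check service cman status

# Is cman enabled for the current runlevels?
check chkconfig --list cman

# Any trace of cman/ccsd in the system log? (the bug reports no error logged)
check grep -i -e cman -e ccsd /var/log/messages || true
```

Running this on an affected node should show cman stopped and, per the report, no matching entries in /var/log/messages.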
Proposed beta2 blocker for conga.
Accepted as a Beta Blocker.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux release. Product Management has requested further review
of this request by Red Hat Engineering. This request is not yet committed for
inclusion in release.
FYI - Ryan McCabe indicated that he would have updated RPMs on Monday, where
the luci+ricci processes would generate logging information to help debug this
problem.
bz# 211715 and bz #211375 describe related, but not identical problems.
In bz# 211715, on a test cluster comprised of VMWare images, the cluster.conf
file is successfully distributed to each of the cluster's nodes and the nodes
are rebooted. After the reboot, the cman (ccsd) process is not running, so the
cluster cannot be started. This behavior looks to be consistent.
bz #211375 describes a problem where, with a physical set of nodes all on the
same subnet, the nodes never receive a cluster.conf file. This behavior also
looks to be consistent.
We are not certified on a cluster of VMWare images...there are SO MANY things
that can go wrong with VMs...can't we re-run this test on a cluster of physical
machines? I have some machines that you can use.
On the second bug - Is there a machine I can log in to and look at this problem?
This behavior is not consistent - the HP rep used conga to create a 2 node
cluster today without incident. This problem is peculiar to the setup. Were the
RPMs for cluster suite already installed? Or were you having conga pull them
down from RHN?
On the first bug - 211715 - I tried VMWare images as a stop-gap measure only -
until I could access a set of physical machines.
On the second bug - YES - the machines are in Dean's test cluster - I'll get you
the names/passwords ASAP. The cluster RPMs were installed before the test.
I am closing the first bug (this bug) as it addresses the VMWare cluster and we
are not supported on VMWare - even on rhel4 cluster suite. I will add comments
to the other bug, 211375.