Bug 236276 - rgmanager fails to start any <vm ...> service
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: rgmanager
Version: 5.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Lon Hohberger
QA Contact: Cluster QE
Keywords: Reopened
Duplicates: 236279
Reported: 2007-04-12 16:06 EDT by Scott Bachmann
Modified: 2009-04-16 18:37 EDT
CC: 2 users

Doc Type: Bug Fix
Last Closed: 2008-04-16 23:35:42 EDT


Attachments
Cluster configuration using <vm ...> (1.75 KB, text/plain), 2007-04-12 16:06 EDT, Scott Bachmann
Output of test run using clusvcadm (2.35 KB, text/plain), 2007-04-12 17:20 EDT, Scott Bachmann
Description Scott Bachmann 2007-04-12 16:06:17 EDT
Description of problem:
The resource group manager fails to start any "vm" service.

Version-Release number of selected component (if applicable):
rgmanager-2.0.23 / CVS

How reproducible:
Always

Steps to Reproduce:
1. Create a cluster.conf that has a <vm ...> child of a <service ...>
2. Attempt to start the service with clusvcadm
Actual results:
Resource group manager fails to start or stop the vm service.

Expected results:
Resource group manager would properly start, stop and monitor a vm service.

Additional info:
The file /usr/share/cluster/service.sh fails to list "vm" as a possible 
child.  After adding <child type="vm" ...>, the resource group manager was 
able to properly start, stop and monitor the vm service.
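For illustration, the workaround described above amounts to adding a child entry to the resource agent metadata in /usr/share/cluster/service.sh; the exact start/stop ordering levels shown here are illustrative assumptions, not taken from the report:

```xml
<!-- Sketch of the workaround: declare "vm" as a permitted child type
     in service.sh's <special tag="rgmanager"> metadata block.
     The start/stop level values below are assumptions. -->
<child type="vm" start="6" stop="4"/>
```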
Comment 1 Scott Bachmann 2007-04-12 16:06:17 EDT
Created attachment 152504 [details]
Cluster configuration using <vm ...>
Comment 2 Scott Bachmann 2007-04-12 16:11:27 EDT
*** Bug 236279 has been marked as a duplicate of this bug. ***
Comment 3 Lon Hohberger 2007-04-12 16:44:14 EDT
It works for me; what do your logs look like?  It sounds like cluster.conf
didn't get updated correctly on a particular node or something.

<child> tags are not a requirement; if unspecified, any unlisted resource is
started after all defined child resource types.  So, if you added a file system
to your service, the <vm> instance would be started after the file system.

What does rg_test say for your config?

[root@asuka resources]# /usr/sbin/rg_test test /etc/cluster/cluster.conf
Running in test mode.
Loaded 18 resource rules
=== Resources List ===
Resource type: service [INLINE]
Instances: 1/1
Agent: service.sh
Attributes:
  name = test [ primary unique required ]

Resource type: vm [INLINE]
Instances: 1/1
Agent: vm.sh
Attributes:
  name = foo [ primary ]

=== Resource Tree ===
service {
  name = "test";
  vm {
    name = "foo";
  }
}
[root@asuka resources]# /usr/sbin/rg_test test /etc/cluster/cluster.conf start
service test
Running in test mode.
Starting test...
# xm command line: foo restart="never"
Error: Unable to open config file: foo
... (rest of xm errors) ...
Failed to start test



As a side note, <vm> instances are not meant to be encapsulated in <service>
blocks (doing so prevents live migration, which will be fixed in the next
errata).  You can put them at the same level as services (with failover domains
/ restart policies / etc. if you want).
Comment 4 Lon Hohberger 2007-04-12 16:49:01 EDT
        <rm>
                <failoverdomains>
                        <failoverdomain name="XenSrvA" ordered="1" restricted="1">
                                <failoverdomainnode name="sys-a" priority="1"/>
                                <failoverdomainnode name="sys-b" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <service name="site-a" domain="XenSrvA" autostart="0">
                      <vm name="website_a"/>
                </service>
        </rm>

FWIW, you could have done:

        <rm>
                <failoverdomains>
                        <failoverdomain name="XenSrvA" ordered="1">
                                <failoverdomainnode name="sys-a" priority="1"/>
                                <failoverdomainnode name="sys-b" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <vm name="website_a" domain="XenSrvA" autostart="0"/>
        </rm>

However, your configuration *should* work.
Comment 5 Scott Bachmann 2007-04-12 17:19:01 EDT
The output of rg_test is similar to yours, as expected.

[root@sys-a cluster]# rg_test test /etc/cluster/cluster.conf
Running in test mode.
Loaded 17 resource rules
=== Resources List ===
Resource type: service [INLINE]
Instances: 1/1
Agent: service.sh
Attributes:
  name = site-a [ primary unique required ]
  domain = XenSrvA
  autostart = 1

Resource type: vm [INLINE]
Instances: 1/1
Agent: vm.sh
Attributes:
  name = website_a [ primary ]

=== Resource Tree ===
service {
  name = "site-a";
  domain = "XenSrvA";
  autostart = "1";
  vm {
    name = "website_a";
  }
}

And I'm able to both start and stop the service using rg_test, as you
showed.  However, I'm unable to control the service with clusvcadm: enable
and disable both return Success.  I modified vm.sh to echo its calling
argument (start, stop, etc.) to a log file, and the log shows only meta-data
requests.  No start or stop.
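The instrumentation Scott describes can be sketched as a few lines near the top of the agent; the log path and date format here are illustrative assumptions, not from the report:

```shell
#!/bin/sh
# Sketch of the debugging trick described above: log every action
# rgmanager passes to a resource agent before dispatching it.
# /tmp/vm-agent.log is an assumed location, not from the report.
LOG=/tmp/vm-agent.log
echo "$(date '+%b %d %T') vm.sh called with: $1" >> "$LOG"

case "$1" in
    start|stop|status|meta-data)
        # ... the agent's original handling of the action continues here ...
        ;;
esac
```

Comparing that log against /var/log/messages shows whether rgmanager ever dispatched the action to the agent, or only ever queried its meta-data.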

Comment 6 Scott Bachmann 2007-04-12 17:20:55 EDT
Created attachment 152507 [details]
Output of test run using clusvcadm
Comment 7 Lon Hohberger 2007-04-12 17:39:38 EDT
That's really strange; it works here (despite not having an actual VM, rgmanager
does try to start/stop it as it should):

Apr 12 16:39:34 asuka clurgmgrd[11869]: <notice> Starting stopped service service:test
Apr 12 16:39:34 asuka clurgmgrd[11869]: <notice> start on vm "foo" returned 1 (generic error)
Apr 12 16:39:34 asuka clurgmgrd[11869]: <warning> #68: Failed to start service:test; return value: 1
Apr 12 16:39:34 asuka clurgmgrd[11869]: <notice> Stopping service service:test
Apr 12 16:39:39 asuka clurgmgrd[11869]: <notice> Service service:test is recovering
Apr 12 16:39:40 asuka clurgmgrd[11869]: <warning> #71: Relocating failed service service:test
Apr 12 16:39:40 asuka clurgmgrd[11869]: <notice> Stopping service service:test
Apr 12 16:39:45 asuka clurgmgrd[11869]: <notice> Service service:test is stopped

I'll keep this around until I can reproduce it.

Does putting <vm> at the top level work for you?
Comment 8 Lon Hohberger 2007-04-12 17:45:12 EDT
If not, run with the configuration that does work until we figure out why
the other one doesn't.
Comment 9 Scott Bachmann 2007-04-13 09:10:49 EDT
Putting <vm> at the top level also fails.  I'll go back and reinstall the 
system from scratch, and follow up on this when I have more information.  
Since I may not be running the version you have, what's the latest suggested 
version, CVS?
Comment 10 Lon Hohberger 2007-04-13 15:32:49 EDT
I was testing on 2.0.23
Comment 11 Scott Bachmann 2007-04-16 10:15:52 EDT
After reinstalling and testing with 2.0.23, I was able to use <vm>.  I think
I misremembered whether I had tested it with 2.0.23 while working out a few
issues with the quorum disk (which I see have already been fixed in CVS).
The CVS version (RHEL5 branch) still fails, so I'll need to take another
look at this or just wait for the next set of official updates.  Thanks for
your time!
Comment 12 Nate Straz 2007-12-13 12:19:04 EST
Moving all RHCS ver 5 bugs to RHEL 5 so we can remove RHCS v5 which never existed.
Comment 13 Lon Hohberger 2009-02-12 14:02:24 EST
Another data point: SELinux up to and including RHEL 5.3 can prevent rgmanager from starting VMs even when rg_test works.

We will be trying to resolve this in RHEL 5.4.
