Bug 236276
| Field | Value |
|---|---|
| Summary: | rgmanager fails to start any `<vm ...>` service |
| Product: | Red Hat Enterprise Linux 5 |
| Component: | rgmanager |
| Version: | 5.0 |
| Status: | CLOSED NOTABUG |
| Severity: | medium |
| Priority: | medium |
| Hardware: | All |
| OS: | Linux |
| Reporter: | Scott Bachmann <bachmann> |
| Assignee: | Lon Hohberger <lhh> |
| QA Contact: | Cluster QE <mspqa-list> |
| CC: | capel, cluster-maint |
| Keywords: | Reopened |
| Doc Type: | Bug Fix |
| Last Closed: | 2008-04-17 03:35:42 UTC |
Description
Scott Bachmann, 2007-04-12 20:06:17 UTC

Created attachment 152504 [details]
Cluster configuration using `<vm ...>`
*** Bug 236279 has been marked as a duplicate of this bug. ***

It works for me; what do your logs look like? It sounds like cluster.conf didn't get updated correctly on a particular node or something.

`<child>` tags are not a requirement; if unspecified, any unlisted resource is started after all defined child resource types. So, if you added a file system to your service, the `<vm>` instance would be started after the file system.

What does rg_test say for your config?

```
[root@asuka resources]# /usr/sbin/rg_test test /etc/cluster/cluster.conf
Running in test mode.
Loaded 18 resource rules
=== Resources List ===
Resource type: service [INLINE]
Instances: 1/1
Agent: service.sh
Attributes:
  name = test [ primary unique required ]

Resource type: vm [INLINE]
Instances: 1/1
Agent: vm.sh
Attributes:
  name = foo [ primary ]

=== Resource Tree ===
service {
  name = "test";
  vm {
    name = "foo";
  }
}

[root@asuka resources]# /usr/sbin/rg_test test /etc/cluster/cluster.conf start service test
Running in test mode.
Starting test...
# xm command line: foo restart="never"
Error: Unable to open config file: foo
... (rest of xm errors) ...
Failed to start test
```

As a side note, `<vm>` instances are not meant to be encapsulated in `<service>` blocks (doing so will prevent live migration; this will be fixed in the next errata). You can put them on the same level as services (with failover domains, restart policies, etc. if you want).
```xml
<rm>
  <failoverdomains>
    <failoverdomain name="XenSrvA" ordered="1" restricted="1">
      <failoverdomainnode name="sys-a" priority="1"/>
      <failoverdomainnode name="sys-b" priority="2"/>
    </failoverdomain>
  </failoverdomains>
  <service name="site-a" domain="XenSrvA" autostart="0">
    <vm name="website_a"/>
  </service>
</rm>
```

FWIW, you could have done:

```xml
<rm>
  <failoverdomains>
    <failoverdomain name="XenSrvA" ordered="1">
      <failoverdomainnode name="sys-a" priority="1"/>
      <failoverdomainnode name="sys-b" priority="2"/>
    </failoverdomain>
  </failoverdomains>
  <vm name="website_a" domain="XenSrvA" autostart="0"/>
</rm>
```

However, your configuration *should* work.

The output of rg_test is similar to yours, as expected:

```
[root@sys-a cluster]# rg_test test /etc/cluster/cluster.conf
Running in test mode.
Loaded 17 resource rules
=== Resources List ===
Resource type: service [INLINE]
Instances: 1/1
Agent: service.sh
Attributes:
  name = site-a [ primary unique required ]
  domain = XenSrvA
  autostart = 1

Resource type: vm [INLINE]
Instances: 1/1
Agent: vm.sh
Attributes:
  name = website_a [ primary ]

=== Resource Tree ===
service {
  name = "site-a";
  domain = "XenSrvA";
  autostart = "1";
  vm {
    name = "website_a";
  }
}
```

And I'm able to both start and stop the service using rg_test test, as you showed. However, I'm unable to control the service from clusvcadm. Enable and disable return Success. I modified vm.sh to echo the calling argument (start, stop, etc.) to a log file, and the log file only shows meta-data requests; no start or stop.

Created attachment 152507 [details]
Output of test run using clusvcadm
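The vm.sh instrumentation described above (echoing the calling argument to a log file) can be sketched roughly as follows. The log path, the environment-variable override, and the idea of wrapping rather than editing the agent are assumptions for illustration, not the reporter's exact change:

```shell
#!/bin/sh
# Hypothetical trace shim for an rgmanager resource agent: record every
# action ("start", "stop", "status", "meta-data", ...) the agent is
# invoked with, then hand off to the real script.
# The LOG path and the ".real" rename are assumptions for illustration.
LOG="${VM_TRACE_LOG:-/tmp/vm-agent-calls.log}"
printf '%s vm.sh called with: %s\n' "$(date '+%b %d %T')" "$*" >> "$LOG"
# exec /usr/share/cluster/vm.sh.real "$@"   # uncomment after renaming the real agent
```

With a shim like this in place of vm.sh, a working `clusvcadm -e` should produce "start" entries in the log; in the reporter's case only "meta-data" requests appeared.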
That's really strange; my logs work (despite not having an actual VM, it does try to start/stop it as it should):

```
Apr 12 16:39:34 asuka clurgmgrd[11869]: <notice> Starting stopped service service:test
Apr 12 16:39:34 asuka clurgmgrd[11869]: <notice> start on vm "foo" returned 1 (generic error)
Apr 12 16:39:34 asuka clurgmgrd[11869]: <warning> #68: Failed to start service:test; return value: 1
Apr 12 16:39:34 asuka clurgmgrd[11869]: <notice> Stopping service service:test
Apr 12 16:39:39 asuka clurgmgrd[11869]: <notice> Service service:test is recovering
Apr 12 16:39:40 asuka clurgmgrd[11869]: <warning> #71: Relocating failed service service:test
Apr 12 16:39:40 asuka clurgmgrd[11869]: <notice> Stopping service service:test
Apr 12 16:39:45 asuka clurgmgrd[11869]: <notice> Service service:test is stopped
```

I'll keep this around until I can reproduce it. Does putting `<vm>` at the top level work for you? If not, run with the configuration that works for you until we figure out why it's not working for you.

Putting `<vm>` at the top level also fails. I'll go back and reinstall the system from scratch, and follow up on this when I have more information. Since I may not be running the version you have, what's the latest suggested version, CVS?

I was testing on 2.0.23.

After reinstalling and testing with 2.0.23, I was able to use `<vm>`. I think I remembered incorrectly whether I had tested it with 2.0.23 while working out a few issues with the quorum disk (which I see have already been fixed in CVS). The CVS version (RHEL5 branch) still fails, so I'll need to take another look at this or just wait for the next set of official updates. Thanks for your time!

Moving all RHCS ver 5 bugs to RHEL 5 so we can remove RHCS v5, which never existed.

Another data point: SELinux up to and including RHEL 5.3 can prevent rgmanager from starting VMs, even when rg_test works. We will be trying to resolve this in RHEL 5.4.
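The SELinux data point above can be checked with standard RHEL audit tooling. A hedged diagnostic sketch follows; whether `getenforce` and `ausearch` are installed depends on the system, and the `vm:website_a` service name is taken from this report:

```shell
#!/bin/sh
# Diagnostic sketch: is SELinux confining rgmanager's attempts to run xm?
# Guarded so the script degrades gracefully when the tools are absent.
if command -v getenforce >/dev/null 2>&1; then
    echo "SELinux mode: $(getenforce)"    # Enforcing / Permissive / Disabled
else
    echo "SELinux tools not installed"
fi
# Recent AVC denials mentioning xm, if the audit log is readable (run as root):
#   ausearch -m avc -c xm | tail
# Temporary confirmation only (not a fix):
#   setenforce 0 && clusvcadm -e vm:website_a
```

If denials appear while the service fails only under rgmanager (and rg_test, run interactively, succeeds), that matches the behavior described in the final comment.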