Bug 223200
Summary: persistence of disabled cluster services

| Field | Value | Field | Value |
|---|---|---|---|
| Product: | [Retired] Red Hat Cluster Suite | Reporter: | Michael Hagmann <michael.hagmann> |
| Component: | rgmanager | Assignee: | Lon Hohberger <lhh> |
| Status: | CLOSED WONTFIX | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | CC: | cluster-maint, grimme, hlawatschek |
| Version: | 4 | Keywords: | Reopened |
| Target Milestone: | --- | Target Release: | --- |
| Hardware: | All | OS: | All |
| Whiteboard: | | Fixed In Version: | |
| Doc Type: | Bug Fix | Doc Text: | |
| Story Points: | --- | Clone Of: | |
| Environment: | | Last Closed: | 2007-07-31 18:07:26 UTC |
| Type: | --- | Regression: | --- |
| Mount Type: | --- | Documentation: | --- |
| CRM: | | Verified Versions: | |
| Category: | --- | oVirt Team: | --- |
| RHEL 7.3 requirements from Atomic Host: | | Cloudforms Team: | --- |
| Target Upstream Version: | | Embargoed: | |
Description — Michael Hagmann, 2007-01-18 12:25:51 UTC
There's an autostart flag for this purpose; set it in the configuration: `<service name="foo" autostart="0"/>`. The service will then not autostart. Dynamic states (e.g. disabled, started, stopped) are not persistent in RHEL4 or RHEL5. Side note: it might be fine to have Conga or system-config-cluster change this flag on the fly during a 'disable' operation, if desired.

More explanation: given that there is no coherent shared-state persistence like there was in RHCS3 (e.g. using a shared disk), making persistent states "work" correctly across cluster outages is not possible without introducing new requirements into the system and/or breaking backward compatibility (i.e. rolling upgrade). Example:

- 3 nodes, 1 service running on node 2
- Stop node 1 (node 1 thinks the service is enabled)
- Disable the service (nodes 2 and 3 think it is disabled)
- Stop nodes 2 and 3
- Start nodes 1 and 2

Node 1 now thinks the service is enabled; node 2 thinks it is disabled. Unless the clocks are synchronized (which is not currently a requirement for RHCS), the correct state of the service cannot be determined. Sometime in the future, service states will likely be stored in AIS checkpoints (also not persistent across cluster outages). With the caveat that the state will sometimes be wrong after cluster transitions, it is possible to implement.

---

Ok, got it, but we are quite used to persistent states for cluster services. We run a lot of SAP HA clusters on TruCluster and are in the process of migrating them to Linux, and this feature is quite important for us. Do you think there is a way to re-enable the persistence, or some kind of workaround? BTW: for operational processes and maintenance it is also very important to be able to disable a service and be sure it won't come back, even after a cluster restart.

Regards, Mike
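The 3-node failure sequence above can be sketched as a toy simulation (node and service names are illustrative, not from rgmanager): each node persists its own last-seen service state, and a node that is down during a 'disable' keeps the stale 'enabled' state, so the surviving views disagree after restart.

```python
# Toy model of the scenario above: each node persists its own last-seen
# service state, and a node that is down misses any state updates.
# Names ("node1", "foo") are illustrative placeholders.

class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.state = {"foo": "enabled"}  # locally persisted view

    def observe(self, service, state):
        if self.up:  # a down node misses the update
            self.state[service] = state

nodes = {n: Node(n) for n in ("node1", "node2", "node3")}

def broadcast(service, state):
    for node in nodes.values():
        node.observe(service, state)

nodes["node1"].up = False                      # stop node 1
broadcast("foo", "disabled")                   # disable: nodes 2 and 3 record it
nodes["node2"].up = nodes["node3"].up = False  # stop nodes 2 and 3
nodes["node1"].up = nodes["node2"].up = True   # start nodes 1 and 2

views = {n.name: n.state["foo"] for n in nodes.values() if n.up}
print(views)  # node1 reports 'enabled', node2 reports 'disabled'
```

Without a trusted ordering of the two writes (i.e. synchronized clocks or a shared state store), neither view can be proven correct, which is the comment's point.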
It's possible to make it work, but there's no way to guarantee the correctness of the persistent "disabled" state across cluster transitions without introducing additional requirements (notably time synchronization and the use of a shared partition on the SAN to store the states, which is what I think TruCluster does, but I could be mistaken). Even so, it is not difficult to make it work in most cases; all nodes can simply record which services are disabled and mark them disabled on startup. However, there will be cases where a disagreement between states has a 50/50 chance of putting the service in the wrong state.

---

Ok, that sounds great. I think we already have most of that in place: NTP is running and we also have a "shared root" cluster (like TruCluster). That means that if you use a place in the root filesystem to record the state, all nodes have access to it. Do we need some additional software (patches or new software packages), or is this only a configuration change?

thx, mike

---

This will require a configuration change. I am now implementing this feature. One way to design it: use callbacks in vf_init() *if and only if* cluster.conf has /cluster/rm/@state_path set:

- On init, read the state prior to calling rg_init.
- If the state is failed or disabled, switch from stopped to the correct state in init_rg (?) or somewhere nearby.
- All other states should be cleared (unlink the file) on init.

This somewhat coincides with another feature I implemented a while ago but never applied, which mirrors resource-tree states on disk. Both this feature and the one noted in comment #8 could be rather destabilizing. Also, because the states are not guaranteed to be consistent across cluster transitions, it might be better to try to make the UI and/or clusvcadm handle this. For example, it would be trivial to add 'reconfig' support to the 'service' resource agent, thereby eliminating the need to restart the service as a result of a reconfiguration.
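A cluster.conf sketch of the proposed design might look like the following. This is a hypothetical fragment, not shipped configuration: the `state_path` attribute name comes from the comment's `/cluster/rm/@state_path` XPath, and the directory value is purely illustrative.

```xml
<cluster name="example" config_version="1">
  <!-- Hypothetical: state_path is the attribute proposed above
       (/cluster/rm/@state_path); the directory shown is illustrative.
       If set, rgmanager would reload failed/disabled states from here
       on init and clear all other recorded states. -->
  <rm state_path="/var/lib/cluster/rgmanager-states">
    <service name="foo" autostart="0"/>
  </rm>
</cluster>
```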
This would then allow a user to flip the 'autostart' flag as part of a more robust 'disable' operation without affecting the service state.

---

Reconfiguration support is going out with 5.1; you can now change the autostart flag without the service bouncing as a result. So, in order to have the state persist across transitions, one can:

- set autostart to 0
- disable the service

As this currently stands, we are unlikely to put the requested feature into RHEL4. Porting the reconfiguration flags to RHEL4 may be possible, however.
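On 5.1, the two-step workaround described above might look like this (reusing the `foo` service name from the earlier example; a sketch, not a tested configuration):

```xml
<!-- Step 1: mark the service as not auto-starting. With 5.1 reconfig
     support, changing this flag no longer bounces a running service. -->
<service name="foo" autostart="0"/>
```

Step 2 is then to disable the running instance, e.g. with `clusvcadm -d foo` (the usual rgmanager disable operation). After a full cluster restart the service stays down, because nothing autostarts it, which approximates a persistent "disabled" state.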