Note: This bug is displayed in read-only format because
the product is no longer active in Red Hat Bugzilla.
Description of problem:
We have three Percona servers running in master/master mode, with a floating VIP in front. This percona_vip is managed by pacemaker & corosync for HA purposes, using the RA "IPaddr2", provider "heartbeat", and standard "ocf". How can we make this RA run a user-defined monitor script?
A `monitor_script` parameter for the monitor operation exists in a derived/older version of the IPaddr2 script, which lets us define the path of a monitor script. However, it doesn't work well with the `start-delay` setting: in our tests, the monitor script was executed immediately after the resource restarted, even though we had set the monitor operation's `start-delay` to 400 seconds.
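For reference, the resource was configured roughly like this (pcs syntax; the IP address, netmask, and timings below are illustrative placeholders, not our real values, and the derived agent additionally accepted its `monitor_script` path as a parameter):

```sh
# Illustrative sketch only -- address, netmask, and timings are placeholders.
pcs resource create percona_vip ocf:heartbeat:IPaddr2 \
    ip=192.0.2.100 cidr_netmask=24 \
    op monitor interval=30s start-delay=400s
```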
Version-Release number of selected component (if applicable):
pacemaker-libs-1.1.13-10.el7_2.4.x86_64
pacemaker-cluster-libs-1.1.13-10.el7_2.4.x86_64
pacemaker-cli-1.1.13-10.el7_2.4.x86_64
pacemaker-1.1.13-10.el7_2.4.x86_64
How reproducible:
easy to reproduce
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Additional info:
I'm not sure there's a way currently to do what you want.
The monitor_scripts parameter, as far as I know, is only implemented for VirtualDomain resources. While it would be possible to implement something similar in any resource agent, it's probably not the best place to do that sort of thing.
Also, most resource agents execute a monitor as part of the start and stop operations. Setting a start-delay in pacemaker won't affect what the resource agent does internally.
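As a sketch of why start-delay doesn't help here: a typical OCF start action looks roughly like the following (generic illustrative shell with stub functions, not the actual IPaddr2 source). The agent polls its own monitor until it passes before returning from start, and that internal check is not governed by Pacemaker's start-delay, which only delays the first *recurring* monitor.

```sh
# Stubs standing in for a real agent's logic (illustrative only).
OCF_SUCCESS=0
bring_resource_up() { RESOURCE_UP=1; }
my_monitor() { [ "${RESOURCE_UP:-0}" -eq 1 ]; }

my_start() {
    bring_resource_up
    # Per OCF convention, start does not return until the agent's own
    # monitor passes; Pacemaker's start-delay has no effect on this loop.
    while ! my_monitor; do sleep 1; done
    return $OCF_SUCCESS
}
```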
Pacemaker does offer a generic capability similar to monitor_scripts that can work with any resource agent. I've never used it myself, but here's my understanding:
There is a resource meta-attribute "container" that was originally designed for use with nagios checks. Pacemaker supports the "nagios" class of resources (as opposed to "ocf", "systemd", etc.) to execute nagios checks. Nagios checks are essentially just a monitor; pacemaker implements start and stop basically as null operations.
When a resource has the "container" meta-attribute, the resource is started, stopped, and monitored normally, but if the monitor fails, the resource specified by "container" is recovered (rather than the resource itself). Also, the resource will be colocated with the container resource and ordered relative to it.
The intended use case was to have a resource that creates a virtual guest (such as VirtualDomain or Xen) and a nagios check for a service that runs inside that virtual guest. The nagios check would be configured with "container" set to the guest resource. The cluster would start the guest, then "start" the nagios resource. The nagios check would run as a normal recurring monitor, and if it failed, the guest resource would be recovered.
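A sketch of that intended use case might look like this (resource names and the libvirt config path are hypothetical; check_http is one of the standard nagios plugins):

```sh
# Hypothetical guest plus nagios check; names and paths are placeholders.
pcs resource create web_guest ocf:heartbeat:VirtualDomain \
    config=/etc/libvirt/qemu/web_guest.xml
pcs resource create web_check nagios:check_http \
    meta container=web_guest
```

If web_check's monitor fails, the cluster recovers web_guest rather than web_check itself.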
You could potentially use that capability here. You could write either a custom nagios check, or a custom OCF resource, implementing your extended monitor, and set its "container" meta-attribute to the percona_vip resource.
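Here's a minimal sketch of such a custom OCF-style check. The `check_percona` function is a placeholder for whatever health probe you run through the VIP, and the exit codes are hard-coded for illustration (a real agent would source ocf-shellfuncs and also implement meta-data and validate-all):

```sh
#!/bin/sh
# Sketch of a custom check agent; OCF exit codes hard-coded here
# (a real agent would source ocf-shellfuncs instead).
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

# Placeholder health check -- replace with your real probe through the VIP.
check_percona() { true; }

percona_check() {
    case "$1" in
        start|stop)
            # Nothing to manage directly; on monitor failure, recovery is
            # delegated to the resource named in the "container" meta-attribute.
            return $OCF_SUCCESS ;;
        monitor)
            check_percona && return $OCF_SUCCESS || return $OCF_ERR_GENERIC ;;
        *)
            return $OCF_ERR_GENERIC ;;
    esac
}

# Real agent dispatch (skipped when no action argument is given):
if [ -n "$1" ]; then
    percona_check "$1"
    exit $?
fi
```

You would then configure it with something like `pcs resource create percona_check ocf:custom:percona-check meta container=percona_vip op monitor interval=30s`, where the `ocf:custom` provider path is an assumption about where you install the agent.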
Due to limited development resources and the existing workaround of using the "container" meta-attribute, we will not implement anything new for this.
If you have any questions or encounter any problems when trying the "container" approach, let me know.