Bug 2121852
| Summary: | During a rolling upgrade, monitor operations are not being communicated between nodes as expected. |
|---|---|
| Product: | Red Hat Enterprise Linux 8 |
| Component: | pacemaker |
| Version: | 8.6 |
| Status: | CLOSED ERRATA |
| Severity: | urgent |
| Priority: | urgent |
| Reporter: | Chad Newsom <cnewsom> |
| Assignee: | Ken Gaillot <kgaillot> |
| QA Contact: | cluster-qe <cluster-qe> |
| Docs Contact: | Steven J. Levine <slevine> |
| CC: | cfeist, cluster-maint, jobaker, kgaillot, mjuricek, msmazova, nwahl, sbradley, slevine |
| Keywords: | Regression, Triaged, ZStream |
| Target Milestone: | rc |
| Target Release: | 8.8 |
| Target Upstream Version: | 2.1.5 |
| Fixed In Version: | pacemaker-2.1.5-1.el8 |
| Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified |
| OS: | Linux |
| Type: | Bug |
| Doc Type: | Bug Fix |
| Clones: | 2125344, 2125587, 2125588 |
| Bug Blocks: | 2125344, 2125587, 2125588 |
| Last Closed: | 2023-05-16 08:35:22 UTC |

Doc Text:

OCF resource agent metadata actions can now call `crm_node` without causing unexpected fencing

As of RHEL 8.5, the controller executed OCF resource agent metadata actions synchronously, and `crm_node` queries were handled as controller requests. As a result, if an agent's metadata action called `crm_node`, it blocked the controller for 30 seconds until the action timed out. This could cause other actions to fail and the node to be fenced.

With this fix, the controller performs metadata actions asynchronously, so an OCF resource agent metadata action can call `crm_node` without issue.
Description
Chad Newsom, 2022-08-26 20:39:40 UTC
There are two issues being reported here:

1. The probe operations are initiated in parallel by the DC (running pacemaker-2.1.0-8.el8), but they appear to be executed in a roughly serial manner on node A (it looks like two at a time).
2. The metadata actions are timing out, which should not happen.

More of the relevant logs are below. Each failed metadata call times out after the hard-coded 30-second timeout. I omitted the "timed out after 30000 ms" messages below, but they're in pacemaker.log.

```
# # # Node B (DC, pacemaker-2.1.0) # # #

# The DC (node B) initiates probes for all resources on node A
Aug 26 14:30:35 nodeB pacemaker-controld [50166] (te_rsc_command) notice: Initiating monitor operation nodeA-ilo_monitor_0 on nodeA | action 28
Aug 26 14:30:35 nodeB pacemaker-controld [50166] (te_rsc_command) notice: Initiating monitor operation nodeB-ilo_monitor_0 on nodeA | action 29
Aug 26 14:30:35 nodeB pacemaker-controld [50166] (te_rsc_command) notice: Initiating monitor operation srdf_pruledb_PRD_monitor_0 on nodeA | action 30
Aug 26 14:30:35 nodeB pacemaker-controld [50166] (te_rsc_command) notice: Initiating monitor operation srdf_pruledbarc_PRD_monitor_0 on nodeA | action 31
Aug 26 14:30:35 nodeB pacemaker-controld [50166] (te_rsc_command) notice: Initiating monitor operation srdf_pruledbredo_PRD_monitor_0 on nodeA | action 32
Aug 26 14:30:35 nodeB pacemaker-controld [50166] (te_rsc_command) notice: Initiating monitor operation srdf_pruledbfra_PRD_monitor_0 on nodeA | action 33
Aug 26 14:30:35 nodeB pacemaker-controld [50166] (te_rsc_command) notice: Initiating monitor operation srdf_pruledbgg_PRD_monitor_0 on nodeA | action 34
Aug 26 14:30:35 nodeB pacemaker-controld [50166] (te_rsc_command) notice: Initiating monitor operation vg_pruledbredo_PRD_monitor_0 on nodeA | action 35
Aug 26 14:30:35 nodeB pacemaker-controld [50166] (te_rsc_command) notice: Initiating monitor operation vg_pruledbarc_PRD_monitor_0 on nodeA | action 36
... and so on ...

# We quickly start seeing lost actions. This means node B hasn't received any result (success or failure) from node A
# within (monitor timeout + 60-second cluster delay).
Aug 26 14:31:55 nodeB pacemaker-controld [50166] (action_timer_callback) error: Node nodeA did not send monitor result (via controller) within 80000ms (action timeout plus cluster-delay)
Aug 26 14:31:55 nodeB pacemaker-controld [50166] (log_synapse_action) error: [Action 30]: In-flight resource op srdf_pruledb_PRD_monitor_0 on nodeA (priority: 0, waiting: (null))
Aug 26 14:31:55 nodeB pacemaker-controld [50166] (abort_transition_graph) notice: Transition 6 aborted: Action lost | source=action_timer_callback:690 complete=false
Aug 26 14:31:55 nodeB pacemaker-controld [50166] (controld_record_action_timeout) warning: rsc_op 30: srdf_pruledb_PRD_monitor_0 on nodeA timed out

# The problem report refers to this one:
Aug 26 14:32:35 nodeB pacemaker-controld [50166] (action_timer_callback) error: Node nodeA did not send monitor result (via controller) within 120000ms (action timeout plus cluster-delay)
Aug 26 14:32:35 nodeB pacemaker-controld [50166] (log_synapse_action) error: [Action 55]: In-flight resource op lsnr_pruledbprd_monitor_0 on nodeA (priority: 0, waiting: (null))
Aug 26 14:32:35 nodeB pacemaker-controld [50166] (abort_transition_graph) info: Transition 6 aborted: Action lost | source=action_timer_callback:690 complete=false
Aug 26 14:32:35 nodeB pacemaker-controld [50166] (controld_record_action_timeout) warning: rsc_op 55: lsnr_pruledbprd_monitor_0 on nodeA timed out

# # # Node A (pacemaker-2.1.2) # # #

# probe running for srdf_pruledb_PRD
Aug 26 14:30:35 nodeA pacemaker-controld [47747] (do_lrm_rsc_op) notice: Requesting local execution of probe operation for srdf_pruledb_PRD on nodeA | transition_key=30:6:7:57303c79-0b90-48d2-b386-87a9f518edd3 op_key=srdf_pruledb_PRD_monitor_0
Aug 26 14:31:05 nodeA pacemaker-controld [47747] (controld_get_rsc_metadata) warning: Failed to get metadata for srdf_pruledb_PRD (ocf:sky:emc_srdf): Input/output error

# probes running for srdf_pruledb_PRD and srdf_pruledbarc_PRD
Aug 26 14:31:05 nodeA pacemaker-controld [47747] (do_lrm_rsc_op) notice: Requesting local execution of probe operation for srdf_pruledbarc_PRD on nodeA | transition_key=31:6:7:57303c79-0b90-48d2-b386-87a9f518edd3 op_key=srdf_pruledbarc_PRD_monitor_0
Aug 26 14:31:35 nodeA pacemaker-controld [47747] (controld_get_rsc_metadata) warning: Failed to get metadata for srdf_pruledbarc_PRD (ocf:sky:emc_srdf): Input/output error
Aug 26 14:32:05 nodeA pacemaker-controld [47747] (controld_get_rsc_metadata) warning: Failed to get metadata for srdf_pruledb_PRD (ocf:sky:emc_srdf): Input/output error
Aug 26 14:32:35 nodeA pacemaker-controld [47747] (controld_get_rsc_metadata) warning: Failed to get metadata for srdf_pruledbarc_PRD (ocf:sky:emc_srdf): Input/output error
# It seems we give up on srdf_pruledb_PRD and srdf_pruledbarc_PRD; we never see them again

# probe running for srdf_pruledbredo_PRD
Aug 26 14:32:35 nodeA pacemaker-controld [47747] (do_lrm_rsc_op) notice: Requesting local execution of probe operation for srdf_pruledbredo_PRD on nodeA | transition_key=32:6:7:57303c79-0b90-48d2-b386-87a9f518edd3 op_key=srdf_pruledbredo_PRD_monitor_0
Aug 26 14:33:05 nodeA pacemaker-controld [47747] (controld_get_rsc_metadata) warning: Failed to get metadata for srdf_pruledbredo_PRD (ocf:sky:emc_srdf): Input/output error

# probes running for srdf_pruledbredo_PRD and srdf_pruledbfra_PRD
Aug 26 14:33:05 nodeA pacemaker-controld [47747] (do_lrm_rsc_op) notice: Requesting local execution of probe operation for srdf_pruledbfra_PRD on nodeA | transition_key=33:6:7:57303c79-0b90-48d2-b386-87a9f518edd3 op_key=srdf_pruledbfra_PRD_monitor_0
Aug 26 14:33:05 emc_srdf(srdf_pruledbredo_PRD)[48554]: INFO: Validate check of symclipath value /opt/emc/SYMCLI/bin ok
Aug 26 14:33:05 emc_srdf(srdf_pruledbredo_PRD)[48554]: INFO: Validate check of VG pruledbredo_PRD ok
Aug 26 14:33:05 emc_srdf(srdf_pruledbredo_PRD)[48554]: INFO: vg_sysid = nodeB
Aug 26 14:33:05 emc_srdf(srdf_pruledbredo_PRD)[48554]: INFO: Volume group pruledbredo_PRD not active on this host
Aug 26 14:33:05 emc_srdf(srdf_pruledbredo_PRD)[48554]: INFO: Not running on this host
Aug 26 14:33:35 nodeA pacemaker-controld [47747] (controld_get_rsc_metadata) warning: Failed to get metadata for srdf_pruledbfra_PRD (ocf:sky:emc_srdf): Input/output error
Aug 26 14:34:05 nodeA pacemaker-controld [47747] (controld_get_rsc_metadata) warning: Failed to get metadata for srdf_pruledbredo_PRD (ocf:sky:emc_srdf): Input/output error
Aug 26 14:34:35 nodeA pacemaker-controld [47747] (controld_get_rsc_metadata) warning: Failed to get metadata for srdf_pruledbfra_PRD (ocf:sky:emc_srdf): Input/output error
# It seems we give up on srdf_pruledbredo_PRD and srdf_pruledbfra_PRD; we never see them again

# probe running for srdf_pruledbgg_PRD
Aug 26 14:34:35 nodeA pacemaker-controld [47747] (do_lrm_rsc_op) notice: Requesting local execution of probe operation for srdf_pruledbgg_PRD on nodeA | transition_key=34:6:7:57303c79-0b90-48d2-b386-87a9f518edd3 op_key=srdf_pruledbgg_PRD_monitor_0
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (controld_get_rsc_metadata) warning: Failed to get metadata for srdf_pruledbgg_PRD (ocf:sky:emc_srdf): Input/output error
Aug 26 14:35:05 emc_srdf(srdf_pruledbgg_PRD)[48951]: INFO: Validate check of symclipath value /opt/emc/SYMCLI/bin ok
Aug 26 14:35:05 emc_srdf(srdf_pruledbgg_PRD)[48951]: INFO: Validate check of VG pruledbgg_PRD ok
Aug 26 14:35:05 emc_srdf(srdf_pruledbgg_PRD)[48951]: INFO: vg_sysid = nodeA
Aug 26 14:35:05 emc_srdf(srdf_pruledbgg_PRD)[48951]: INFO: Volume group pruledbgg_PRD not active on this host
Aug 26 14:35:05 emc_srdf(srdf_pruledbgg_PRD)[48951]: INFO: Not running on this host

# probes running for srdf_pruledbgg_PRD and vg_pruledbredo_PRD
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (do_lrm_rsc_op) notice: Requesting local execution of probe operation for vg_pruledbredo_PRD on nodeA | transition_key=35:6:7:57303c79-0b90-48d2-b386-87a9f518edd3 op_key=vg_pruledbredo_PRD_monitor_0
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (wait_for_sync_result) info: LVM-activate_meta-data_0[48952] exited with status 0
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (do_lrm_rsc_op) notice: Requesting local execution of probe operation for vg_pruledbarc_PRD on nodeA | transition_key=36:6:7:57303c79-0b90-48d2-b386-87a9f518edd3 op_key=vg_pruledbarc_PRD_monitor_0
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (do_lrm_rsc_op) notice: Requesting local execution of probe operation for vg_pruledb_PRD on nodeA | transition_key=37:6:7:57303c79-0b90-48d2-b386-87a9f518edd3 op_key=vg_pruledb_PRD_monitor_0
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (do_lrm_rsc_op) notice: Requesting local execution of probe operation for vg_pruledbgg_PRD on nodeA | transition_key=38:6:7:57303c79-0b90-48d2-b386-87a9f518edd3 op_key=vg_pruledbgg_PRD_monitor_0
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (do_lrm_rsc_op) notice: Requesting local execution of probe operation for vg_pruledbfra_PRD on nodeA | transition_key=39:6:7:57303c79-0b90-48d2-b386-87a9f518edd3 op_key=vg_pruledbfra_PRD_monitor_0
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (do_lrm_rsc_op) notice: Requesting local execution of probe operation for fs_pruledbprd on nodeA | transition_key=40:6:7:57303c79-0b90-48d2-b386-87a9f518edd3 op_key=fs_pruledbprd_monitor_0
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (wait_for_sync_result) info: Filesystem_meta-data_0[48983] exited with status 0
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (log_executor_event) notice: Result of probe operation for vg_pruledbredo_PRD on nodeA: not running | CIB update 27, graph action confirmed; call=33 key=vg_pruledbredo_PRD_monitor_0 rc=7
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (log_executor_event) notice: Result of probe operation for vg_pruledbarc_PRD on nodeA: not running | CIB update 28, graph action confirmed; call=37 key=vg_pruledbarc_PRD_monitor_0 rc=7
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (log_executor_event) notice: Result of probe operation for vg_pruledb_PRD on nodeA: not running | CIB update 29, graph action confirmed; call=41 key=vg_pruledb_PRD_monitor_0 rc=7
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (log_executor_event) notice: Result of probe operation for vg_pruledbgg_PRD on nodeA: not running | CIB update 30, graph action confirmed; call=45 key=vg_pruledbgg_PRD_monitor_0 rc=7
Aug 26 14:35:05 nodeA pacemaker-controld [47747] (log_executor_event) notice: Result of probe operation for vg_pruledbfra_PRD on nodeA: not running | CIB update 31, graph action confirmed; call=49 key=vg_pruledbfra_PRD_monitor_0 rc=7

# LVM and Filesystem probes are finishing quickly and may be running in parallel, more than two at a time...
# ... maybe metadata actions are throttled but the probes themselves aren't? Not sure

# logs continue in a similar fashion
```

-----

My key takeaways:

1. Something does seem to be serialized by the executor on node A. I don't know whether the probes are inherently serialized, or whether they're forced to bottleneck if two of them have a pending metadata action. I'm skeptical that they're inherently serialized, since a bunch of LVM and Filesystem probes appeared to run simultaneously. Regardless, the DC (node B) initiated them all at once and started their timers simultaneously. This might be related to BZ 1858462, and it should be investigated further. We can keep this BZ open for now to see if it's the same thing or something different.

2. These are custom resource agents, and it now seems obvious that the metadata actions are doing things that they're not supposed to. Note messages like the following:

```
Aug 26 14:33:05 emc_srdf(srdf_pruledbredo_PRD)[48554]: INFO: Validate check of symclipath value /opt/emc/SYMCLI/bin ok
Aug 26 14:33:05 emc_srdf(srdf_pruledbredo_PRD)[48554]: INFO: Validate check of VG pruledbredo_PRD ok
Aug 26 14:33:05 emc_srdf(srdf_pruledbredo_PRD)[48554]: INFO: vg_sysid = nodeB
Aug 26 14:33:05 emc_srdf(srdf_pruledbredo_PRD)[48554]: INFO: Volume group pruledbredo_PRD not active on this host
Aug 26 14:33:05 emc_srdf(srdf_pruledbredo_PRD)[48554]: INFO: Not running on this host
```

That suggests a bug in the resource agent. A resource agent's metadata action shouldn't perform validation. It especially shouldn't perform host-specific validation (like checking whether a volume group is active); such host-specific validation checks should be reserved for the validate action with OCF_CHECK_LEVEL=10. I only showed logs from ocf:sky:emc_srdf, but the ocf:sky:CrossSiteIP resource agent shows similar behavior.
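For contrast, here is a minimal sketch of a dispatch shape that keeps the metadata action side-effect free; everything in it (the helper, the `OCF_RESKEY_vg` parameter, the `vgs` check) is a hypothetical stand-in rather than the customer's code:

```sh
#!/bin/sh
# Minimal sketch, not the customer's agent: answer meta-data before any
# validation runs, so the action stays cheap and works even where the
# cluster is down. Helper names and the example parameter are hypothetical.
: "${OCF_FUNCTIONS_DIR:=${OCF_ROOT:=/usr/lib/ocf}/lib/heartbeat}"
. "${OCF_FUNCTIONS_DIR}/ocf-shellfuncs"

print_metadata() {
    cat <<'EOF'
<?xml version="1.0"?>
<resource-agent name="example" version="0.1">
  <version>1.0</version>
  <!-- static text only: no crm_node, no host inspection -->
</resource-agent>
EOF
}

case "$1" in
    meta-data) print_metadata; exit "$OCF_SUCCESS" ;;
esac

# Anything below this point may inspect the host or the cluster.
validate() {
    # Host-independent checks always run (required parameters, binaries).
    [ -n "$OCF_RESKEY_vg" ] || exit "$OCF_ERR_CONFIGURED"
    # Host-specific checks (a stand-in for the agent's VG activity checks)
    # only run at OCF_CHECK_LEVEL=10, per the OCF spec's check levels.
    if [ "${OCF_CHECK_LEVEL:-0}" -ge 10 ]; then
        vgs "$OCF_RESKEY_vg" >/dev/null 2>&1 || exit "$OCF_ERR_GENERIC"
    fi
}

case "$1" in
    validate-all)       validate ;;
    start|stop|monitor) validate ;;  # the real action logic would follow
    *)                  exit "$OCF_ERR_UNIMPLEMENTED" ;;
esac
exit "$OCF_SUCCESS"
```

With this shape, a hung cluster query can still break start/stop/monitor, but meta-data never blocks on it.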
Further reading:

- https://github.com/ClusterLabs/OCF-spec/blob/main/ra/1.1/resource-agent-api.md#mandatory-actions (what the metadata action should do)
- https://github.com/ClusterLabs/OCF-spec/blob/main/ra/1.1/resource-agent-api.md#check-levels (the validate action with OCF_CHECK_LEVEL=10)

Follow-up: We need to try to reproduce this and establish whether this is a change between 2.1.0 and 2.1.2, or an issue that was already present and happened to appear during the rolling upgrade because the metadata action took so long.

-----

The actions aren't being serialized; meta-data actions are always synchronous, which blocks anything else from being done. Also, meta-data actions consume the timeout of the probe they're being run for. (Bug 1850506 would fix both of these issues.)

Meta-data actions calling validation is suboptimal but relatively common. It generally indicates that the agent calls validation first, before even looking at the action.

A more important question is why the meta-data actions are timing out; they aren't logging for more than a few seconds, but it appears the agent isn't exiting. I'm guessing these are just the system logs? The pacemaker.log might have more detail about why the actions aren't completing.

-----

(In reply to Ken Gaillot from comment #3)

> The actions aren't being serialized; meta-data actions are always
> synchronous, which blocks anything else from being done. Also, meta-data
> actions consume the timeout of the probe they're being run for. (Bug 1850506
> would fix both of these issues.)

Right, I saw that meta-data actions are always synchronous and figured they would consume the probe timeout. I missed that these resources were in a group, so what was less clear to me was why at most two metadata actions (as part of probe actions) are running at a time. Two will run simultaneously, and another one won't start until we give up on one of those. That mostly makes sense now that I see they're grouped... although I would expect rsc2 to wait for rsc1 to time out, instead of going in batches of "probe rsc1 and rsc2, fail, probe rsc3 and rsc4, fail, ...".

> Meta-data actions calling validation is suboptimal but relatively common. It
> generally indicates that the agent calls validation first, before even
> looking at the action.
>
> A more important question is why the meta-data actions are timing out; they
> aren't logging for more than a few seconds, but it appears the agent isn't
> exiting.

That's what I expected to find. However, the customer gave us the resource agent script, and I don't see anything calling validate in the meta-data action. I attached the RA privately to this BZ. They also said those validation and "vg_sysid =" outputs are from the monitor operation. It looks that way to me too. So it's as though the metadata action isn't completing until the whole probe is complete. (This RA doesn't distinguish between probes and recurring monitors.)

> I'm guessing these are just the system logs? The pacemaker.log might have
> more detail about why the actions aren't completing.

I'll upload the pacemaker.log files.

Side note on the customer's situation: since their RA's metadata depends on the output of `crm_node -l`, it adds two new required attributes to the metadata any time a node joins and runs the meta-data action...
Observations (see the sketch at the end of this comment for the first two):

* The agent (via get_local_param() in sky-common.sh) assumes that uname -n is the local node name, but that's not necessarily true. It would be better to use crm_node -n.
* The agent metadata includes <version>0.1</version>. The version element is intended for the version of the OCF standard that the metadata complies with (1.0 would be appropriate for the existing agent code), rather than the version of the agent itself, which would be specified like <resource-agent name="emc_srdf" version="0.1">. Recent versions of pcs require a known OCF version, so this could cause problems down the road.
* The metadata action runs crm_node -l (via get_node_list()) to generate the node-name-dependent parameters. This makes sense for what the agent needs, but it is problematic. Metadata should ideally be obtainable even if the cluster is not running, or on a machine that is not part of the cluster (for example, a host running a GUI front end to the cluster). I don't have any suggestions, but ideally the node names would be parameter values somehow and not part of parameter names.
* The crm_node command might be what's hanging the metadata. That could be verified by replacing crm_node -l with just echoing the node names in a test cluster. If that is the problem, adding -VVVVVV to the crm_node command (in a test cluster) would help narrow down what's wrong.
* I suspect the monitor result timeouts are an artifact of the metadata calls being synchronous, and not a problem with the monitor itself. nodeA records the timeout in the CIB immediately after nodeB's timer pops. Increasing the monitor timeout by 5 seconds might avoid that particular log error, though it will still be recorded as a timeout, and it's possible stop would still fail and lead to fencing.

Chad, can you pass along the comments about the agent code and monitor timeout? Would it be possible to reproduce this in a test cluster so we can see if crm_node is the problem?

The fix for Bug 1850506 would give a much better recovery from metadata failure, and would probably avoid the monitor/stop failures and fencing here. We might close this as a duplicate of that. But figuring out why the metadata calls are timing out is the most important thing now.
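To make the first two observations concrete, a hedged stand-in for the agent's node-name lookup and metadata header (the real get_local_param() lives in the customer's private sky-common.sh, so this is illustrative only):

```sh
# Hypothetical stand-in for get_local_param()'s node-name lookup.
# Ask the cluster for this node's name instead of assuming uname -n
# matches it; fall back to uname -n only if crm_node is unavailable
# (the fallback is this sketch's assumption, not the agent's behavior).
local_node_name() {
    crm_node -n 2>/dev/null || uname -n
}

# Metadata versioning: <version> carries the OCF standard version,
# while the agent's own version goes on the resource-agent element:
#
#   <resource-agent name="emc_srdf" version="0.1">
#     <version>1.0</version>
#     ...
#   </resource-agent>
```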
-----

Now that I can look at this again, it's trivial to reproduce. Add `crm_node -l >/dev/null 2>&1` to the beginning of ocf:pacemaker:Dummy's meta_data() function. Then create a Dummy resource.

Environment: Pacemaker 2.1.4 (from an upstream main source build) on RHEL 9. (I have only one node online, for simplicity.)

```
# grep -A1 '^meta_data()' /usr/lib/ocf/resource.d/pacemaker/Dummy
meta_data() {
    crm_node -l >/dev/null 2>&1
```

```
# date && pcs resource create dummy1 ocf:pacemaker:Dummy
Tue Sep 6 05:24:35 PM PDT 2022
```
Logs:

```
Sep 06 17:24:35.650 fastvm-rhel-9-0-42 pacemaker-controld [1859] (do_state_transition) notice: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
Sep 06 17:24:35.653 fastvm-rhel-9-0-42 pacemaker-schedulerd[1469] (determine_online_status) info: Node fastvm-rhel-9-0-42 is online
Sep 06 17:24:35.653 fastvm-rhel-9-0-42 pacemaker-schedulerd[1469] (log_list_item) notice: Actions: Start xvm ( fastvm-rhel-9-0-42 ) due to no quorum (blocked)
Sep 06 17:24:35.653 fastvm-rhel-9-0-42 pacemaker-schedulerd[1469] (log_list_item) notice: Actions: Start dummy1 ( fastvm-rhel-9-0-42 ) due to no quorum (blocked)
Sep 06 17:24:35.653 fastvm-rhel-9-0-42 pacemaker-schedulerd[1469] (pcmk__log_transition_summary) notice: Calculated transition 10, saving inputs in /var/lib/pacemaker/pengine/pe-input-123.bz2
Sep 06 17:24:35.664 fastvm-rhel-9-0-42 pacemaker-controld [1859] (do_state_transition) info: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_reply
Sep 06 17:24:35.664 fastvm-rhel-9-0-42 pacemaker-controld [1859] (do_te_invoke) info: Processing graph 10 (ref=pe_calc-pacemaker-controld-1662510275-10) derived from /var/lib/pacemaker/pengine/pe-input-123.bz2
Sep 06 17:24:35.664 fastvm-rhel-9-0-42 pacemaker-controld [1859] (execute_rsc_action) notice: Initiating monitor operation dummy1_monitor_0 locally on fastvm-rhel-9-0-42 | action 1
Sep 06 17:24:35.664 fastvm-rhel-9-0-42 pacemaker-controld [1859] (do_lrm_rsc_op) notice: Requesting local execution of probe operation for dummy1 on fastvm-rhel-9-0-42 | transition_key=1:10:7:4902cc63-ad7b-4837-a7c4-5f556ddfb178 op_key=dummy1_monitor_0
Sep 06 17:24:39.219 fastvm-rhel-9-0-42 pacemakerd [1464] (pcmk__ipc_is_authentic_process_active) info: Could not connect to crmd IPC: timeout
Sep 06 17:24:39.219 fastvm-rhel-9-0-42 pacemakerd [1464] (check_next_subdaemon) notice: pacemaker-controld[1859] is unresponsive to ipc after 1 tries
Sep 06 17:24:40.656 fastvm-rhel-9-0-42 pacemaker-based [1465] (cib_process_ping) info: Reporting our current digest to fastvm-rhel-9-0-42: 386ca5445db08dabcb6e4ab8e1a85557 for 0.110.0 (0x558946d9bd50 0)
Sep 06 17:24:46.220 fastvm-rhel-9-0-42 pacemakerd [1464] (pcmk__ipc_is_authentic_process_active) info: Could not connect to crmd IPC: timeout
Sep 06 17:24:46.220 fastvm-rhel-9-0-42 pacemakerd [1464] (check_next_subdaemon) notice: pacemaker-controld[1859] is unresponsive to ipc after 2 tries
Sep 06 17:24:53.220 fastvm-rhel-9-0-42 pacemakerd [1464] (pcmk__ipc_is_authentic_process_active) info: Could not connect to crmd IPC: timeout
Sep 06 17:24:53.220 fastvm-rhel-9-0-42 pacemakerd [1464] (check_next_subdaemon) notice: pacemaker-controld[1859] is unresponsive to ipc after 3 tries
Sep 06 17:25:00.221 fastvm-rhel-9-0-42 pacemakerd [1464] (pcmk__ipc_is_authentic_process_active) info: Could not connect to crmd IPC: timeout
Sep 06 17:25:00.221 fastvm-rhel-9-0-42 pacemakerd [1464] (check_next_subdaemon) notice: pacemaker-controld[1859] is unresponsive to ipc after 4 tries
Sep 06 17:25:05.695 fastvm-rhel-9-0-42 pacemaker-controld [1859] (wait_for_sync_result) info: Dummy_meta-data_0[1922] timed out after 30000ms
Sep 06 17:25:05.696 fastvm-rhel-9-0-42 pacemaker-controld [1859] (lrmd_api_get_metadata_params) error: Failed to retrieve meta-data for ocf:pacemaker:Dummy
Sep 06 17:25:05.697 fastvm-rhel-9-0-42 pacemaker-controld [1859] (controld_get_rsc_metadata) warning: Failed to get metadata for dummy1 (ocf:pacemaker:Dummy): Input/output error
Sep 06 17:25:05.698 fastvm-rhel-9-0-42 pacemaker-based [1465] (cib_process_request) info: Forwarding cib_modify operation for section status to all (origin=local/crmd/31)
Sep 06 17:25:05.699 fastvm-rhel-9-0-42 pacemaker-based [1465] (cib_perform_op) info: Diff: --- 0.110.0 2
Sep 06 17:25:05.699 fastvm-rhel-9-0-42 pacemaker-based [1465] (cib_perform_op) info: Diff: +++ 0.110.1 (null)
Sep 06 17:25:05.699 fastvm-rhel-9-0-42 pacemaker-based [1465] (cib_perform_op) info: + /cib: @num_updates=1
Sep 06 17:25:05.699 fastvm-rhel-9-0-42 pacemaker-based [1465] (cib_perform_op) info: ++ /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources: <lrm_resource id="dummy1" type="Dummy" class="ocf" provider="pacemaker"/>
Sep 06 17:25:05.699 fastvm-rhel-9-0-42 pacemaker-based [1465] (cib_perform_op) info: ++ <lrm_rsc_op id="dummy1_last_0" operation_key="dummy1_monitor_0" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.16.0" transition-key="1:10:7:4902cc63-ad7b-4837-a7c4-5f556ddfb178" transition-magic="-1:193;1:10:7:4902cc63-ad7b-4837-a7c4-5f556ddfb178" exit-reason="" on_node="fastvm-rhel-9-0-42" call-id="-1" rc-code="193" op-status="-1" interval="0" last-rc-change="1662510275" exec-time="0" queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
Sep 06 17:25:05.699 fastvm-rhel-9-0-42 pacemaker-based [1465] (cib_perform_op) info: ++ </lrm_resource>
Sep 06 17:25:05.700 fastvm-rhel-9-0-42 pacemaker-based [1465] (cib_process_request) info: Completed cib_modify operation for section status: OK (rc=0, origin=fastvm-rhel-9-0-42/crmd/31, version=0.110.1)
Sep 06 17:25:05.702 fastvm-rhel-9-0-42 pacemaker-controld [1859] (pcmk__procfs_pid_of) info: Found pacemaker-based active as process 1465
Sep 06 17:25:05.702 fastvm-rhel-9-0-42 pacemaker-controld [1859] (throttle_send_command) info: New throttle mode: negligible load (was undetermined)
Sep 06 17:25:10.704 fastvm-rhel-9-0-42 pacemaker-based [1465] (cib_process_ping) info: Reporting our current digest to fastvm-rhel-9-0-42: 7d0fddd9c0de27d6137b64a850415c60 for 0.110.1 (0x558946d9bd50 0)
Sep 06 17:25:13.221 fastvm-rhel-9-0-42 pacemakerd [1464] (pcmk__ipc_is_authentic_process_active) info: Could not connect to crmd IPC: timeout
Sep 06 17:25:13.221 fastvm-rhel-9-0-42 pacemakerd [1464] (check_next_subdaemon) notice: pacemaker-controld[1859] is unresponsive to ipc after 1 tries
Sep 06 17:25:20.220 fastvm-rhel-9-0-42 pacemakerd [1464] (pcmk__ipc_is_authentic_process_active) info: Could not connect to crmd IPC: timeout
Sep 06 17:25:20.220 fastvm-rhel-9-0-42 pacemakerd [1464] (check_next_subdaemon) notice: pacemaker-controld[1859] is unresponsive to ipc after 2 tries
Sep 06 17:25:27.220 fastvm-rhel-9-0-42 pacemakerd [1464] (pcmk__ipc_is_authentic_process_active) info: Could not connect to crmd IPC: timeout
Sep 06 17:25:27.220 fastvm-rhel-9-0-42 pacemakerd [1464] (check_next_subdaemon) notice: pacemaker-controld[1859] is unresponsive to ipc after 3 tries
Sep 06 17:25:34.220 fastvm-rhel-9-0-42 pacemakerd [1464] (pcmk__ipc_is_authentic_process_active) info: Could not connect to crmd IPC: timeout
Sep 06 17:25:34.220 fastvm-rhel-9-0-42 pacemakerd [1464] (check_next_subdaemon) notice: pacemaker-controld[1859] is unresponsive to ipc after 4 tries
Sep 06 17:25:35.731 fastvm-rhel-9-0-42 pacemaker-controld [1859] (wait_for_sync_result) info: Dummy_meta-data_0[1938] timed out after 30000ms
Sep 06 17:25:35.732 fastvm-rhel-9-0-42 pacemaker-controld [1859] (lrmd_api_get_metadata_params) error: Failed to retrieve meta-data for ocf:pacemaker:Dummy
Sep 06 17:25:35.732 fastvm-rhel-9-0-42 pacemaker-controld [1859] (controld_get_rsc_metadata) warning: Failed to get metadata for dummy1 (ocf:pacemaker:Dummy): Input/output error
Sep 06 17:25:35.732 fastvm-rhel-9-0-42 pacemaker-controld [1859] (log_executor_event) notice: Result of probe operation for dummy1 on fastvm-rhel-9-0-42: not running | CIB update 32, graph action confirmed; call=39 key=dummy1_monitor_0 rc=7
```
-----

No further information is needed from the customer.

Since upstream Pacemaker release 2.1.0 (RHEL 8.5 and 9.0 beta), crm_node -l gets the node list via the controller, rather than querying Corosync directly. Since metadata calls are executed synchronously in the controller, they block all other activity by the controller. Thus, if the metadata action runs crm_node, crm_node will be unable to connect to the controller, and the metadata action will hang until it times out.

The long-term solution in Bug 1850506 will be better, but for a quicker fix, I'm thinking we can make the metadata call asynchronous without the rest of that solution. The metadata call would still eat into the probe timeout, but since the controller could handle the crm_node connection, the action can proceed and should complete quickly.

A workaround would be to avoid calling crm_node for the metadata action.
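A rough sketch of that workaround for an agent shaped like this one; get_node_list() mirrors the customer's helper in name only, and the fallback path is a made-up stand-in:

```sh
# Hypothetical rework of a helper like the customer's get_node_list():
# never call crm_node while answering meta-data, because on the affected
# versions crm_node routes through the controller, which is blocked
# waiting on this very metadata action.
get_node_list() {
    action=$1
    if [ "$action" = "meta-data" ]; then
        # Cluster-independent fallback; /etc/sky/nodes is a made-up path
        # standing in for wherever node names could be kept locally.
        cat /etc/sky/nodes 2>/dev/null
        return 0
    fi
    # crm_node -l prints one "<id> <name> <state>" line per node.
    crm_node -l | awk '{ print $2 }'
}
```

The fix that actually shipped takes the other route: the controller runs metadata actions asynchronously as of pacemaker-2.1.5, so agents themselves do not need a guard like this.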
QA Reproducer:

* Configure a cluster.
* Edit or copy /usr/lib/ocf/resource.d/pacemaker/Dummy and add "crm_node -l" to the metadata action.
* Create a resource using Dummy (or the copy).

Actual results: The cluster is hung for 30 seconds and then the controller logs a metadata action timeout.

Expected results: The cluster proceeds normally.

-----

Fixed in upstream main branch as of commit bc852fe3

-----

Tested using reproducer from Comment 12 and Comment 15.

After the fix:

> [root@virt-507 ~]# rpm -q pacemaker
> pacemaker-2.1.5-4.el8.x86_64

Configure a cluster:

> [root@virt-507 ~]# pcs status
> Cluster name: STSRHTS29018
> Status of pacemakerd: 'Pacemaker is running' (last updated 2022-12-19 13:51:03 +01:00)
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-507 (version 2.1.5-4.el8-a3f44794f94) - partition with quorum
>   * Last updated: Mon Dec 19 13:51:03 2022
>   * Last change: Mon Dec 19 13:24:06 2022 by root via cibadmin on virt-507
>   * 2 nodes configured
>   * 2 resource instances configured
> Node List:
>   * Online: [ virt-507 virt-508 ]
> Full List of Resources:
>   * fence-virt-507 (stonith:fence_xvm): Started virt-507
>   * fence-virt-508 (stonith:fence_xvm): Started virt-508
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled

Edit /usr/lib/ocf/resource.d/pacemaker/Dummy and add "crm_node -l" to the metadata action:

> [root@virt-507 ~]# grep -A1 '^meta_data()' /usr/lib/ocf/resource.d/pacemaker/Dummy
> meta_data() {
>     crm_node -l >/dev/null 2>&1

Create a resource using Dummy:

> [root@virt-507 ~]# date && pcs resource create dummy2 ocf:pacemaker:Dummy
> Mon 19 Dec 13:57:07 CET 2022
> [root@virt-507 ~]# pcs resource
>   * dummy2 (ocf::pacemaker:Dummy): Started virt-507
> [root@virt-507 ~]# tail -f /var/log/messages
> Dec 19 13:57:08 virt-507 pacemaker-controld[101660]: notice: State transition S_IDLE -> S_POLICY_ENGINE
> Dec 19 13:57:09 virt-507 pacemaker-schedulerd[101659]: notice: Actions: Start dummy2 ( virt-507 )
> Dec 19 13:57:09 virt-507 pacemaker-schedulerd[101659]: notice: Calculated transition 5, saving inputs in /var/lib/pacemaker/pengine/pe-input-4.bz2
> Dec 19 13:57:09 virt-507 pacemaker-controld[101660]: notice: Initiating monitor operation dummy2_monitor_0 on virt-508
> Dec 19 13:57:09 virt-507 pacemaker-controld[101660]: notice: Initiating monitor operation dummy2_monitor_0 locally on virt-507
> Dec 19 13:57:09 virt-507 pacemaker-controld[101660]: notice: Requesting local execution of probe operation for dummy2 on virt-507
> Dec 19 13:57:09 virt-507 pacemaker-controld[101660]: notice: Result of probe operation for dummy2 on virt-507: not running
> Dec 19 13:57:09 virt-507 pacemaker-controld[101660]: notice: Initiating start operation dummy2_start_0 locally on virt-507
> Dec 19 13:57:09 virt-507 pacemaker-controld[101660]: notice: Requesting local execution of start operation for dummy2 on virt-507
> Dec 19 13:57:09 virt-507 pacemaker-controld[101660]: notice: Result of start operation for dummy2 on virt-507: ok
> Dec 19 13:57:09 virt-507 pacemaker-controld[101660]: notice: Initiating monitor operation dummy2_monitor_10000 locally on virt-507
> Dec 19 13:57:09 virt-507 pacemaker-controld[101660]: notice: Requesting local execution of monitor operation for dummy2 on virt-507
> Dec 19 13:57:09 virt-507 pacemaker-controld[101660]: notice: Result of monitor operation for dummy2 on virt-507: ok
> Dec 19 13:57:09 virt-507 pacemaker-controld[101660]: notice: Transition 5 (Complete=4, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-4.bz2): Complete

Result: The cluster proceeds normally.

Marking VERIFIED in pacemaker-2.1.5-4.el8.

-----

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2818