Bug 1790009
| Summary: | before-acquire-handler does not run at other site after ticket release [RHEL 8] |
|---|---|
| Product: | Red Hat Enterprise Linux 8 |
| Component: | booth |
| Version: | 8.1 |
| Hardware: | All |
| OS: | Linux |
| Status: | ASSIGNED |
| Severity: | high |
| Priority: | medium |
| Target Milestone: | rc |
| Target Release: | 8.2 |
| Reporter: | Reid Wahl <nwahl> |
| Assignee: | Jan Friesse <jfriesse> |
| QA Contact: | cluster-qe <cluster-qe> |
| CC: | cluster-maint, jobaker, jss, sbradley |
| Keywords: | Triaged |
| Type: | Bug |
| Clones: | 1790118 (view as bug list) |
| Bug Blocks: | 1790118 |
Description (Comment 0): Reid Wahl, 2020-01-11 04:26:27 UTC
Comment 1 (Jan Pokorný [poki]):

Thanks for the report; this behaviour is counter-intuitive, indeed. Let's separate two mutually-amplifying phenomena here:

1. Back then, the handlers (executable predicates) were executed synchronously, which appears more intuitive than what we have now:

   - https://github.com/ClusterLabs/booth/commit/7c30fb4c8580f9784d75bc4dc27223c63491d3df
   - https://github.com/ClusterLabs/booth/commit/5575f95d032116c93d13eb84d71f80604ad6d90d

2. The scenario at hand: the leader refrains from its role for a particular ticket, here because of a violation of the "shall it be held like that, to begin with" predicate.

* * *

Ad 1.: This is likely well-meant and based on the assumption that detection of a precondition failure will never be timely (it is subject to polling at renewal-freq periods), so the system must already exercise some tolerance to non-swiftness; hence, blocking (possibly forever) until the verdict is known is not all that justifiable when it can be acted upon later on, while the state keeps unfolding without blocking.

* * *

Ad 2.: I tried to crank up an easy patch that might possibly impose immediate on-failover rechecking via the configured executable predicate [1].

Scratch build that will disappear in a while:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=25742150

Would be grateful if you can give it a test drive, Reid.

While there still may (and likely will) be an intermittent flip of the ticket location, it shall settle to the actually expected state (ticket without leader, i.e., not granted anywhere) in the order of lower seconds, as long as the script finishes "immediately" (and it shall in this case).

* * *

[1] this patch:

```
commit 1183161e359dc5aed4dee2b7ee16cdae4d532d38
Author: Jan Pokorný <jpokorny>
Date:   Mon Jan 13 20:59:04 2020 +0100

    problem minimizing: ticket failover re-evaluated "ASAP"

    Related: https://bugzilla.redhat.com/show_bug.cgi?id=1790009

    Signed-off-by: Jan Pokorný <jpokorny>

diff --git a/src/raft.c b/src/raft.c
index 462fc3b..7ae043f 100644
--- a/src/raft.c
+++ b/src/raft.c
@@ -146,8 +146,14 @@ static void won_elections(struct ticket_config *tk)
 	save_committed_tkt(tk);

+#if 1
+	/* actually, only schedule election if before-acquire-handler fails,
+	   the point of this is to cause wakeup and re-evaluation */
+	schedule_election(tk, OR_LOCAL_FAIL);
+#else
 	ticket_broadcast(tk, OP_HEARTBEAT, OP_ACK, RLT_SUCCESS, 0);
 	tk->ticket_updated = 0;
+#endif
 }
```

Comment 2 (Jan Pokorný [poki]):

> it shall settle to the actually expected state (ticket without leader,
> i.e., not granted anywhere) in the order of lower seconds, as long as
> the script finishes "immediately" (and it shall in this case).

...as opposed to after the "renewal-freq" period (here, 60 seconds), as can be observed before the change, from [comment 0]:

> Jan 10 20:08:38 fastvm-rhel-8-0-33 boothd-site[20210]: [info] apacheticket (Lead/20/119999): granted successfully here
> [...]
> Jan 10 20:09:38 fastvm-rhel-8-0-33 boothd-site[20210]: [warning] apacheticket (Lead/20/59808): handler "/tmp/before-acquire-handler.sh" failed: exit code 1

Comment 3 (Jan Pokorný [poki]):

> Scratch build that will disappear in a while:
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=25742150

also, conveniently:
http://brew-task-repos.usersys.redhat.com/repos/scratch/jpokorny/booth/1.0/6.poki.f2d38ce.git.el8_1/booth-1.0-6.poki.f2d38ce.git.el8_1-scratch.repo
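The handler discussed above acts as a go/no-go predicate: booth runs the configured before-acquire-handler and treats exit status 0 as permission to acquire (or keep) the ticket, while any nonzero status produces the "we are not allowed to acquire ticket" refusal seen in the logs. The actual /tmp/before-acquire-handler.sh used in these tests is not shown in this thread; a minimal, hypothetical stand-in exercising the same contract could be:

```sh
#!/bin/sh
# Hypothetical before-acquire-handler (the real script from this report
# is not shown in the thread). Contract: exit 0 = booth may acquire the
# ticket here; nonzero = booth logs 'handler ... failed: exit code N'
# and refuses to acquire. The flag-file path is invented for illustration.
if [ -e /run/site-safe-to-activate ]; then
    exit 0
fi
exit 1
```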
Comment 4 (Reid Wahl, in reply to Jan Pokorný [poki] from comment #1):

> Scratch build that will disappear in a while:
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=25742150
>
> Would be grateful if you can give it a test drive, Reid.
>
> While there still may (and likely will) be an intermittent flip of
> the ticket location, it shall settle to the actually expected state
> (ticket without leader, i.e., not granted anywhere) in the order
> of lower seconds, as long as the script finishes "immediately"
> (and it shall in this case).

The test run went well. As you can see below, site 2 acquired the ticket for approximately one second, after which the handler failed.

Jan 13 13:18:54 fastvm-rhel-8-0-33 boothd-site[12872]: [info] apacheticket (Fllw/22/58742): 192.168.22.71 wants to give the ticket away (ticket release)
Jan 13 13:18:54 fastvm-rhel-8-0-33 boothd-site[12872]: [info] setting crm_ticket attributes successful
Jan 13 13:18:55 fastvm-rhel-8-0-33 boothd-site[12872]: [info] apacheticket (Fllw/23/0): starting new election (term=23)
Jan 13 13:18:55 fastvm-rhel-8-0-33 boothd-site[12872]: [info] apacheticket (Lead/23/119999): granted successfully here
Jan 13 13:18:56 fastvm-rhel-8-0-33 boothd-site[12872]: [warning] apacheticket (Lead/23/119363): handler "/tmp/before-acquire-handler.sh" failed: exit code 1
Jan 13 13:18:56 fastvm-rhel-8-0-33 boothd-site[12872]: [warning] apacheticket (Lead/23/119362): we are not allowed to acquire ticket
Jan 13 13:18:56 fastvm-rhel-8-0-33 boothd-site[12872]: [info] setting crm_ticket attributes successful

This is certainly an improvement, as it stops site 2 in its tracks almost immediately. I can have the customer test the RPMs for his use case, if you would like.

For his use case, and especially for the general case, I'm not sure this will suffice. (Perhaps you only mean this patch as a testing/mitigation step anyway.) One possibility that comes to mind: a cluster resource may fail a stop operation after a probe, triggering fencing, because its underlying physical or software resource is missing (e.g., a physical LUN is not presented, or an app script's symlink is removed until the site is activated). One could reasonably argue that a resource prone to such behavior should be designed better, but we're all flawed humans. For this and perhaps other situations that we may not be able to anticipate, it seems we would still want the handler to run prior to ticket acquisition.

Comment 5 (Jan Pokorný [poki]):

> For his use case, and especially for the general case, I'm not sure
> this will suffice.

Oh, I totally see the practical weakness of the guarantee, arising primarily from what I mentioned as 1. above. The first instinct was to close gap 2., and that was rather successful per your confirmation, thanks.

> (Perhaps you only mean this patch as a testing/mitigation step
> anyway.)

Yep.

I was looking around some more: could you please try configuring "require-after = 5" (or so; a rule of thumb perhaps being "expected time to execute the script(s) + 5 s") for the ticket at hand? There is some good hope it might actually be the thing that would postpone the grant (true, an antithesis to timely place-ticket-anywhere resolution, but OTOH making it safe, at least in theory, with regard to these asynchronous can-move-there predicate evaluations; as we know, HA solutions are built on fitting tweaks to trade-offs exactly like this). It'd be an utter win if we could make do with just that easy patch and extra discretion on the configuration side.
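For orientation, the option being proposed belongs in the ticket section of booth.conf. The full configuration from this report is not shown; a sketch assembling the values mentioned so far (the handler path from the logs, the 60-second renewal-freq from comment 2, and the proposed 5-second delay, whose correct spelling turns out to be "acquire-after" per the next comment) might look like:

```
# Sketch only; site, arbitrator, and transport settings are omitted,
# and anything beyond the values quoted in this thread is an assumption.
ticket="apacheticket"
    before-acquire-handler = /tmp/before-acquire-handler.sh
    renewal-freq = 60
    acquire-after = 5
```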
"acquire-after", indeed :) (In reply to Jan Pokorný [poki] from comment #5) > I was looking around some more, could you please try configuring > "require-after = 5" (or so, rule of thumb perhaps being "expected > time to execute the script/s + 5 s") for the ticket at hand, please? No luck with acquire-after = 5 :( Jan 14 09:41:41 fastvm-rhel-8-0-33 boothd-site[15814]: [info] apacheticket (Fllw/23/58734): 192.168.22.71 wants to give the ticket away (ticket release) Jan 14 09:41:41 fastvm-rhel-8-0-33 boothd-site[15814]: [info] setting crm_ticket attributes successful Jan 14 09:41:41 fastvm-rhel-8-0-33 boothd-site[15814]: [info] apacheticket (Fllw/24/0): starting new election (term=24) Jan 14 09:41:41 fastvm-rhel-8-0-33 boothd-site[15814]: [info] apacheticket (Lead/24/119999): granted successfully here Jan 14 09:41:42 fastvm-rhel-8-0-33 boothd-site[15814]: [warning] apacheticket (Lead/24/119293): handler "/tmp/before-acquire-handler.sh" failed: exit code 1 Jan 14 09:41:42 fastvm-rhel-8-0-33 boothd-site[15814]: [warning] apacheticket (Lead/24/119293): we are not allowed to acquire ticket Jan 14 09:41:42 fastvm-rhel-8-0-33 boothd-site[15814]: [info] setting crm_ticket attributes successful Jan 14 09:41:52 fastvm-rhel-8-0-33 boothd-site[15814]: [warning] apacheticket (Init/24/0): no answers to our VtFr request to step down (try #1), we are alone Then I bumped it up to 30, with no change. For this run, I also removed `timeout = 10` from my booth.conf in case that was interfering. Jan 14 09:45:26 fastvm-rhel-8-0-33 boothd-site[17675]: [info] apacheticket (Fllw/24/58722): 192.168.22.71 wants to give the ticket away (ticket release) Jan 14 09:45:26 fastvm-rhel-8-0-33 boothd-site[17675]: [info] setting crm_ticket attributes successful Jan 14 09:45:26 fastvm-rhel-8-0-33 boothd-site[17675]: [info] apacheticket (Fllw/25/0): starting new election (term=25) Jan 14 09:45:26 fastvm-rhel-8-0-33 boothd-site[17675]: [info] apacheticket (Lead/25/119999): granted successfully here Jan 14 09:45:27 fastvm-rhel-8-0-33 boothd-site[17675]: [warning] apacheticket (Lead/25/118871): handler "/tmp/before-acquire-handler.sh" failed: exit code 1 Jan 14 09:45:27 fastvm-rhel-8-0-33 boothd-site[17675]: [warning] apacheticket (Lead/25/118871): we are not allowed to acquire ticket Jan 14 09:45:27 fastvm-rhel-8-0-33 boothd-site[17675]: [info] setting crm_ticket attributes successful Jan 14 09:45:32 fastvm-rhel-8-0-33 boothd-site[17675]: [warning] apacheticket (Init/25/0): no answers to our VtFr request to step down (try #1), we are alone It was worth a shot, and I'm willing to test other parameters if you identify anything you think would help. Maybe implementing something like this client option within the source for this auto-acquisition? Or could be totally irrelevant :) -C wait for ticket commit to CIB: The client waits for the ticket commit to CIB (only for grant requests). If one or more sites are unreachable, this takes the ticket expire time (plus, if defined, the acquire-after time). Ried, no problem, thanks for trying. One more question though, since this is something not exposed in the logs directly: was it just declaratively asserting ticket for a bit, or was that information reflected on the pacemaker side as well? Asking since it wouldn't matter what boothd logged, then. Just the tangible effect is what counts. (In reply to Jan Pokorný [poki] from comment #8) > Ried, no problem, thanks for trying. 
Comment 8 (Jan Pokorný [poki]):

Reid, no problem, thanks for trying.

One more question, though, since this is something not exposed in the logs directly: was booth just declaratively asserting the ticket for a bit, or was that information reflected on the Pacemaker side as well? Asking since it wouldn't matter what boothd logged, then; just the tangible effect is what counts.

Comment 9 (Reid Wahl, in reply to Jan Pokorný [poki] from comment #8):

> was booth just declaratively asserting the ticket for a bit,
> or was that information reflected on the Pacemaker side as well?

This is the entirety of what Pacemaker logged on fastvm-rhel-8-0-33 at site 2 (the node that acquired the ticket in booth):

Time wasn't synced with chrony on site 2, so I ran another test. Here are the logs:

Site 1, /var/log/pacemaker/pacemaker.log:

Jan 14 12:12:45 fastvm-rhel-8-0-23 booth: [31463]: info: apacheticket (Init/25/0): granting ticket
Jan 14 12:12:45 fastvm-rhel-8-0-23 booth: [31463]: info: apacheticket (Init/25/0): starting new election (term=25)
Jan 14 12:12:45 fastvm-rhel-8-0-23 booth: [31463]: info: apacheticket (Lead/25/119999): granted successfully here
Jan 14 12:12:45 fastvm-rhel-8-0-23 crm_ticket [31681] (crm_log_args) notice: Invoked: crm_ticket -t apacheticket -S owner -v 1950506022
...
Jan 14 12:12:46 fastvm-rhel-8-0-23 crm_ticket [31693] (crm_log_args) notice: Invoked: crm_ticket -t apacheticket -S SAFE_TO_ACTIVATE -v 0
...
Jan 14 12:13:46 fastvm-rhel-8-0-23 booth: [31463]: warning: apacheticket (Lead/25/59826): handler "/tmp/before-acquire-handler.sh" failed: exit code 1
Jan 14 12:13:46 fastvm-rhel-8-0-23 booth: [31463]: warning: apacheticket (Lead/25/59826): we are not allowed to acquire ticket
Jan 14 12:13:46 fastvm-rhel-8-0-23 crm_ticket [32190] (crm_log_args) notice: Invoked: crm_ticket -t apacheticket -S owner -v 0
...
Jan 14 12:13:46 fastvm-rhel-8-0-23 booth: [31463]: info: setting crm_ticket attributes successful
...
Jan 14 12:13:46 fastvm-rhel-8-0-23 crm_ticket [32193] (crm_log_args) notice: Invoked: crm_ticket -t apacheticket -r --force
...
Jan 14 12:13:46 fastvm-rhel-8-0-23 crm_ticket [32193] (modify_ticket_state) debug: Update <ticket_state id="apacheticket" SAFE_TO_ACTIVATE="0" owner="0" expires="1579032826" term="25" granted="false" last-granted="1579032765"/>
...
Jan 14 12:13:47 fastvm-rhel-8-0-23 booth: [31463]: info: apacheticket (Fllw/26/0): 192.168.22.81 votes for none, ignoring (duplicate ticket release?)
Jan 14 12:13:52 fastvm-rhel-8-0-23 booth: [31463]: info: apacheticket (Fllw/26/0): 192.168.22.81 votes for none, ignoring (duplicate ticket release?)
...
Jan 14 12:13:56 fastvm-rhel-8-0-23 booth: [31463]: info: apacheticket (Fllw/26/0): starting new election (term=26)
Jan 14 12:13:56 fastvm-rhel-8-0-23 booth: [31463]: info: apacheticket (Lead/26/119999): granted successfully here
Jan 14 12:13:56 fastvm-rhel-8-0-23 booth: [31463]: warning: apacheticket (Lead/26/119567): handler "/tmp/before-acquire-handler.sh" failed: exit code 1
Jan 14 12:13:56 fastvm-rhel-8-0-23 booth: [31463]: warning: apacheticket (Lead/26/119567): we are not allowed to acquire ticket
Jan 14 12:13:56 fastvm-rhel-8-0-23 crm_ticket [32268] (crm_log_args) notice: Invoked: crm_ticket -t apacheticket -S owner -v 0
...
Jan 14 12:13:56 fastvm-rhel-8-0-23 booth: [31463]: info: setting crm_ticket attributes successful
...
Jan 14 12:13:57 fastvm-rhel-8-0-23 booth: [31463]: info: apacheticket (Init/26/0): candidate status lost, ignoring VtFr from 192.168.22.81

Site 2, /var/log/pacemaker/pacemaker.log:

Jan 14 12:12:45 fastvm-rhel-8-0-33 pacemaker-based [12641] (cib_perform_op) info: + /cib/status/tickets/ticket_state[@id='apacheticket']: @owner=1950506022
Jan 14 12:12:45 fastvm-rhel-8-0-33 pacemaker-based [12641] (cib_perform_op) info: + /cib/status/tickets/ticket_state[@id='apacheticket']: @expires=1579032885
Jan 14 12:13:46 fastvm-rhel-8-0-33 pacemaker-based [12641] (cib_perform_op) info: + /cib/status/tickets/ticket_state[@id='apacheticket']: @owner=0
Jan 14 12:13:46 fastvm-rhel-8-0-33 pacemaker-based [12641] (cib_perform_op) info: + /cib/status/tickets/ticket_state[@id='apacheticket']: @expires=1579032826
Jan 14 12:13:47 fastvm-rhel-8-0-33 pacemaker-based [12641] (cib_perform_op) info: + /cib/status/tickets/ticket_state[@id='apacheticket']: @expires=1579032827
Jan 14 12:13:47 fastvm-rhel-8-0-33 pacemaker-based [12641] (cib_perform_op) info: + /cib/status/tickets/ticket_state[@id='apacheticket']: @term=26
Jan 14 12:13:47 fastvm-rhel-8-0-33 pacemaker-based [12641] (cib_process_request) info: Completed cib_modify operation for section status: OK (rc=0, origin=fastvm-rhel-8-0-33/crm_ticket/4, version=0.98.10)
Jan 14 12:13:47 fastvm-rhel-8-0-33 pacemaker-based [12641] (cib_process_request) info: Forwarding cib_modify operation for section status to all (origin=local/crm_ticket/4)
Jan 14 12:13:47 fastvm-rhel-8-0-33 pacemaker-based [12641] (cib_process_request) info: Completed cib_modify operation for section status: OK (rc=0, origin=fastvm-rhel-8-0-33/crm_ticket/4, version=0.98.10)
Jan 14 12:13:52 fastvm-rhel-8-0-33 pacemaker-based [12641] (cib_process_ping) info: Reporting our current digest to fastvm-rhel-8-0-34: 99874a5333cc60295ba99081fd28e4ac for 0.98.10 (0x558dcc339010 0)

/var/log/messages:

Jan 14 12:12:45 fastvm-rhel-8-0-33 boothd-site[9789]: [info] setting crm_ticket attributes successful
Jan 14 12:13:46 fastvm-rhel-8-0-33 boothd-site[9789]: [info] apacheticket (Fllw/25/58738): 192.168.22.71 wants to give the ticket away (ticket release)
Jan 14 12:13:46 fastvm-rhel-8-0-33 boothd-site[9789]: [info] setting crm_ticket attributes successful
Jan 14 12:13:46 fastvm-rhel-8-0-33 boothd-site[9789]: [info] apacheticket (Fllw/26/0): starting new election (term=26)
Jan 14 12:13:46 fastvm-rhel-8-0-33 boothd-site[9789]: [info] apacheticket (Lead/26/119999): granted successfully here
Jan 14 12:13:47 fastvm-rhel-8-0-33 boothd-site[9789]: [warning] apacheticket (Lead/26/119485): handler "/tmp/before-acquire-handler.sh" failed: exit code 1
Jan 14 12:13:47 fastvm-rhel-8-0-33 boothd-site[9789]: [warning] apacheticket (Lead/26/119485): we are not allowed to acquire ticket
Jan 14 12:13:47 fastvm-rhel-8-0-33 boothd-site[9789]: [info] setting crm_ticket attributes successful
Jan 14 12:13:52 fastvm-rhel-8-0-33 boothd-site[9789]: [warning] apacheticket (Init/26/0): no answers to our VtFr request to step down (try #1), we are alone
Jan 14 12:13:56 fastvm-rhel-8-0-33 boothd-site[9789]: [warning] apacheticket (Init/26/0): 192.168.22.71 voted previously for 192.168.22.81 and now wants to vote for 192.168.22.71 (ignored)
Jan 14 12:13:56 fastvm-rhel-8-0-33 boothd-site[9789]: [info] apacheticket (Init/26/0): 192.168.22.71 votes for none, ignoring (duplicate ticket release?)
Jan 14 12:13:57 fastvm-rhel-8-0-33 boothd-site[9789]: [warning] apacheticket (Init/26/0): no answers to our VtFr request to step down (try #2), we are alone
Jan 14 12:14:01 fastvm-rhel-8-0-33 boothd-site[9789]: [info] apacheticket (Init/26/0): 192.168.22.71 votes for none, ignoring (duplicate ticket release?)
Jan 14 12:14:02 fastvm-rhel-8-0-33 boothd-site[9789]: [warning] apacheticket (Init/26/0): no answers to our VtFr request to step down (try #3), we are alone
Jan 14 12:14:06 fastvm-rhel-8-0-33 boothd-site[9789]: [info] apacheticket (Init/26/0): 192.168.22.71 votes for none, ignoring (duplicate ticket release?)
Jan 14 12:14:07 fastvm-rhel-8-0-33 boothd-site[9789]: [warning] apacheticket (Init/26/0): no answers to our VtFr request to step down (try #4), we are alone
Jan 14 12:14:11 fastvm-rhel-8-0-33 boothd-site[9789]: [info] apacheticket (Init/26/0): 192.168.22.71 votes for none, ignoring (duplicate ticket release?)
Jan 14 12:14:12 fastvm-rhel-8-0-33 boothd-site[9789]: [info] apacheticket (Init/26/0): giving up on sending retries
Jan 14 12:14:16 fastvm-rhel-8-0-33 boothd-site[9789]: [info] apacheticket (Init/26/0): 192.168.22.71 votes for none, ignoring (duplicate ticket release?)

Make of that what you will. As far as resources are concerned, I see no indication, either in node 33's logs or in the schedulerd logs of node 34 (the standby'd DC node), that the ticket-constrained resource tried to start or was even allowed to start. All the schedulerd messages continued to say "cannot run anywhere."

Comment 10 (Jan Pokorný [poki]):

At first sight, this looks positive to me, and it might be the gist of the proper solution... assuming the pre-commit delay is configured in a way that it will reliably cover[*] the natural delay until the before-acquire script or scripts finish.

[*] "reliably cover" is solely the liability and responsibility of the users, indeed, since only they have an idea how long the execution of their own executables will take.
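As a concrete illustration of that rule of thumb (the numbers here are hypothetical, not from this report): if the slowest before-acquire script can take up to 10 seconds under adverse conditions, the "expected time to execute the script(s) + 5 s" guidance from comment 5 would suggest something like:

```
# Hypothetical sizing: ~10 s worst-case handler runtime + 5 s margin.
acquire-after = 15
```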
Comment 11 (Reid Wahl, in reply to Jan Pokorný [poki] from comment #10):

> ...assuming the pre-commit delay is configured in a way that it will
> reliably cover[*] the natural delay until the before-acquire script
> or scripts finish.
>
> [*] "reliably cover" is solely the liability and responsibility of the
> users, indeed, since only they have an idea how long the execution
> of their own executables will take.

I agree, as long as the script does not take an abnormally long time on any particular execution. It's not uncommon for users to be surprised by a script taking longer to complete than it otherwise should when something is going wrong, either on that system or elsewhere in the environment (e.g., storage or networking). I'd hate for that to end with a node getting fenced due to a stop failure after being improperly granted a ticket, when it would otherwise have remained online.

This is a big improvement already and is likely to work in a great many situations -- perhaps it will even work in *ALL* situations, since Pacemaker does not appear to probe resources prematurely.

I have to admit I don't understand the potential problem with a more synchronous check of a given ticket's before-acquire-handler, as described in comment #1, provided there is some timeout value after which we give up. In general, I'd prefer that something work correctly with as little user discretion and tuning as possible, lest users miss documentation or introduce a mistake.

poki++

*** Bug 1761298 has been marked as a duplicate of this bug. ***