Bug 1389209
| Summary: | [TechPreview] add support for managing qdevice heuristics | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Tomas Jelinek <tojeline> | ||||
| Component: | pcs | Assignee: | Tomas Jelinek <tojeline> | ||||
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 7.3 | CC: | cfeist, cluster-maint, idevat, jfriesse, jobaker, jpokorny, omular, rsteiger, tojeline | ||||
| Target Milestone: | rc | Keywords: | TechPreview | ||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | pcs-0.9.162-1.el7 | Doc Type: | No Doc Update | ||||
| Doc Text: |
Use documentation from corosync bug 1413573.
|
Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2018-04-10 15:37:49 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | 1413573 | ||||||
| Bug Blocks: | 1280348 | ||||||
| Attachments: |
|
||||||
|
Description
Tomas Jelinek
2016-10-27 08:05:42 UTC
Tomáši, for the sake of user's convenience, my hopes are that one will be able to specify commands to execute without artificial key qualifiers like "exec_foo=test -f /var/run/.ultimatelock", but the rest of the command-line not in -*, --* or exec_*=* (like mere "test -f /var/run/.ultimatelock") will be interpreted directly as the heuristic commands, for which the respective exec_* keys will get assigned automaticall by pcs, like it is done for various in-CIB identifiers already. Nonetheless, exposing these keys later on will be needed so that one can selectively drop particular commands. re [comment 5]: Actually, replace "exec_*=*" above with "{exec_*,mode,interval,...}=*", you get the point. (In reply to Jan Pokorný from comment #5) > Tomáši, > > for the sake of user's convenience, my hopes are that one will be > able to specify commands to execute without artificial key qualifiers > like "exec_foo=test -f /var/run/.ultimatelock", but the rest of > the command-line not in -*, --* or exec_*=* (like mere > "test -f /var/run/.ultimatelock") will be interpreted directly > as the heuristic commands, for which the respective exec_* keys > will get assigned automaticall by pcs, like it is done for > various in-CIB identifiers already. > > Nonetheless, exposing these keys later on will be needed so that > one can selectively drop particular commands. They are not artificial. It is how corosync does it and documents it, so pcs can refer to corosync manpage for explanation. Users need to refer to commands using those keys anyway for updating and removing them. Your proposal would make the add command syntax inconsistent with the update command syntax. Moreover it would make it harder to extend the command in the future. Where only to start: 1. they are indeed artificial from user's POV, they bring nothing semantically important, this is just a syntax hack to make it fit the key-value way of configuring corosync, it could have been sort of an array just as well (with index being the key then) 2. yes, they need to refer to them *only* when updating/removing, but you have the same situation at various places when configuring CIB already (default way of creating location constraints will not allow you to provide a custom identifier for that constraint, for instance), but you can get them easily if their listing does the right job of stating them along 3. "update" syntax is rather a nonsense here, you'll want to add a new command instead of the existing one? fine, drop the old one, add new one... 4. you basically say you'll rather sacrifice user's convenience in favour of easy implementation, "nice" Note that the situation would be different if the commands themselves had some properties attached to them, which is not the case, though. re [comment 8]: > 3. "update" syntax is rather a nonsense here, you'll want to add > a new command instead of the existing one? fine, drop the old > one, add new one... Apparently, that's a bad piece of advice as the command to remove may be the last one (which pcs should guard against properly using the --force convenience?), so a fixed recipe: add new one, drop old one, done. re [comment 11]: Actually, for the sake of highly desirably atomicity (no extra reloads and related complications), this idea seems best to me for situations like that: pcs quorum device heuristics [for <model>] replace exec_<id> \ [exec_<id>...] with <raw-command-or-exec-assignment> \ [<raw-command-or-exec-assignment>...] (In reply to Jan Pokorný from comment #8) > Where only to start: > > 1. they are indeed artificial from user's POV, they bring nothing > semantically important, this is just a syntax hack to make it I must partly agree and partly disagree. "exec_" part brings nothing, that's for sure. But the name after the underscore gives user ability to name heuristics command as they like and (not yet implemented) refer to it by these name. So in future qdevice-tool will display more detailed statistics like (this is example, precise format yet to define): Exec Result (5 sec ago) Result (10 sec ago) Result (25 sec ago) ---- ------------------ ------------------- ------------------- test Pass Pass Fail foo Fail Pass Unknown ping Unknown Pass Unknown ... So how exactly should be the name part autogenerated? > fit the key-value way of configuring corosync, it could have > been sort of an array just as well (with index being the key > then) What would not allow user to have short, easy to remember name of long command and it is exactly the reason why it's not implemented that way. Actually from my POV this is much cleaner then what is happening with nodelist (what I would call syntax hack). > So in future qdevice-tool will display more detailed statistics like (this is > example, precise format yet to define): > > Exec Result (5 sec ago) Result (10 sec ago) Result (25 sec ago) > ---- ------------------ ------------------- ------------------- > test Pass Pass Fail > foo Fail Pass Unknown > ping Unknown Pass Unknown Ah, I see, this was an untold story so far. Still, some would prefer the command-line shown in full, especially when checking the output only in case of troubles, it would help to uncover stale ping target no longer and what not right away. > So how exactly should be the name part autogenerated? Excerpt from the command's digest, for instance. My point was, just as noone is interested in the index column in the table of a relational database (such as on insert ~ definition of heuristic command) unless there's a need to update/drop or some other place need to refer to it unambiguously, except the inambiguity constraint is likely to be relaxed here as defining two exactly same heuristics does make no sense and should be marked an error by the configuration tool. Then, if you'd need to drop particular heuristic command, you could either provide it in full (style of iproute2's address removal from the interface) or look the "index" up by the means of detailed list command. Current design requires one to provide the "index" unconditionally (yes, it should be able to do so), and I am asking for the convenient way be available as well, just as it currently exists with various CIB configuring commands likewise allowing you to avoid the identifiers that need to be present under the hood, but in basic scenarios, they are just usability killing burden on the user, like if the overall configuration wasn't hard to grasp already. Created attachment 1351610 [details]
proposed fix + tests
Updated commands:
* adding a quorum device:
pcs quorum device add [<generic options>] model <device model> [<model options>] [heuristics <heuristics options including exec_ list>]
* changing a quorum device:
pcs quorum device update [<generic options>] [model <model options>] [heuristics <heuristics options including exec_ list>]
* displaying configuration:
pcs quorum config
New commands:
* remove all heuristics settings:
pcs quorum device heuristics remove
(In reply to Jan Pokorný from comment #8) > 4. you basically say you'll rather sacrifice user's convenience > in favour of easy implementation, "nice" No, I do not. (In reply to Jan Pokorný from comment #12) > re [comment 11]: > > pcs quorum device heuristics [for <model>] replace exec_<id> \ > [exec_<id>...] with <raw-command-or-exec-assignment> \ > [<raw-command-or-exec-assignment>...] So you do not like pcs quorum device update heuristics exec_<id>=<command> because it has this exec_<id> part. Now your proposed command has it as well. On top of that it presents a whole new syntax inconsistent with the rest of pcs. And how do you cover adding, removing and changing heuristics and other quorum device settings in one command / step with this new command? Because that is what the current command does. What I see as a possible improvement: pcs quorum device add [<generic options>] model <device model> [<model options>] [heuristics <heuristics options> exec <id>=<command> [<id>=<command>]...] which is not that different. And it may cause a confusion due to discrepancy with corosync manpages. Being able to find documentation and map it to the command is from my point of view more important than saving some typing. (In reply to Jan Pokorný from comment #14) > > So in future qdevice-tool will display more detailed statistics like (this is > > example, precise format yet to define): > > > > Exec Result (5 sec ago) Result (10 sec ago) Result (25 sec ago) > > ---- ------------------ ------------------- ------------------- > > test Pass Pass Fail > > foo Fail Pass Unknown > > ping Unknown Pass Unknown > > Ah, I see, this was an untold story so far. > > Still, some would prefer the command-line shown in full, especially when That is expected to be somewhere. > checking the output only in case of troubles, it would help to uncover > stale ping target no longer and what not right away. Output of command is nice to have for sure. > > > So how exactly should be the name part autogenerated? > > Excerpt from the command's digest, for instance. Ugh. Digest may be ok for computer but I don't see it as a good name for "user" to use. I just don't believe there is really majority of people who likes names like 24f77cd2b40d037670548cc08031c36b0269b6c6dc6cc5a9d78be2193cae181a495fb15307c099b8fb6cf352a4cce86276193550b5ece69ce1468997e35281a1 (this is real command, are you able to find out what it is?) more than ping (or ping_localhost). > > > My point was, just as noone is interested in the index column in the table of > a relational database (such as on insert ~ definition of heuristic command) > unless there's a need to update/drop or some other place need to refer to it > unambiguously, except the inambiguity constraint is likely to be relaxed here > as defining two exactly same heuristics does make no sense and should be > marked > an error by the configuration tool. Then, if you'd need to drop particular > heuristic command, you could either provide it in full (style of iproute2's > address removal from the interface) or look the "index" up by the means of > detailed list command. Full command is not that great idea because what if there are two equal commands (what of course doesn't make too much sense as long as command is not doing some kind of locking to be able to do it's internal ordering = magic)? > > Current design requires one to provide the "index" unconditionally (yes, it > should be able to do so), and I am asking for the convenient way be available > as well, just as it currently exists with various CIB configuring commands > likewise allowing you to avoid the identifiers that need to be present under > the hood, but in basic scenarios, they are just usability killing burden > on the user, like if the overall configuration wasn't hard to grasp already. In [comment 17]: > (In reply to Jan Pokorný from comment #8) >> 4. you basically say you'll rather sacrifice user's convenience >> in favour of easy implementation, "nice" >> > No, I do not. That's debatable for simple use cases. * * * > So you do not like [...] because it has this exec_<id> part. Now your > proposed command has it as well. Not sure I am understood, in comment [14] I claim it's OK to have it supported, but that alone is not sufficient for the least effort configuration: > Current design requires one to provide the "index" unconditionally > (yes, it should be able to do so), and I am asking for the convenient > way be available as well, * * * > On top of that it presents a whole new syntax inconsistent with the > rest of pcs. Sure, simply justified with the fact that while you can batch the changes in CIB easily (though there's a room for simplification, at least for the purpose of coherent scripting: [bug 1359057]), hence making desired "replace foo with bar" automic from the perspective of committed changes, there's no such a provision for corosync configuration. For that reason, you'd preferably be able to do that in one sweep, and then it's OK to bring the new configuration syntax pattern in. > And how do you cover adding, removing and changing heuristics and > other quorum device settings in one command / step with this new > command? Because that is what the current command does. Ok, I admit I thought you were originally talking about an update with the granularity of particular heuristic command in [comment 7]. It would indeed have been a heavy overkill assuming there are no real properties of the heuristic command, but also one of the very few reasons to make users face the artificial identifier in the first place. The current update command could as well follow the idea from [comment 12]: pcs quorum device update [<generic options>] [model <model options>] [heuristics <heuristics options>] [{ <raw-command-or-exec-assignment> | drop <raw-command-or-exec-assignment> [<raw-command-or-exec-assignment>...] | replace <raw-command-or-exec-assignment> [<raw-command-or-exec-assignment>...] with <raw-command-or-exec-assignment> [<raw-command-or-exec-assignment>...]}] where <raw-command-or-exec-assignment> encapsulates { exec_<id>=<command> | command } Example: pcs quorum device update heuristics replace 'ping -qc1 lokal' \ with 'ping -qc1 remotal' 'ping -qc1 kontejneral' with the obvious meaning. First branch of heuristics options is used merely for adding new, the second one for removing existing, and replace-with one for combining the former. Note that this first branch could be fully compatible with with "pcs quorum device add", and it's obvious one cannot drop nor replace when initially adding the device as such, and it should not ever happen that "drop" or "replace" (verbatim) are the intended programs to run -- for systemd enabled systems using stock corosync-qdevice.service, it would actually mean "/drop" and "/replace" executables, respectively, and any sane admin hardly ever reaches that. Ditto for "/exec_*=*" possible ambiguity... * * * > What I see as a possible improvement That doesn't touch the principial problem I raised, and agreed, this would not be for the better. re [comment 18]: >>> So how exactly should be the name part autogenerated? >> >> Excerpt from the command's digest, for instance. > > Ugh. Digest may be ok for computer but I don't see it as a good name > for "user" to use. I just don't believe there is really majority of > people who likes names like > 24f77cd2b40d037670548cc08031c36b0269b6c6dc6cc5a9d78be2193cae181a495fb15307c099b8fb6cf352a4cce86276193550b5ece69ce1468997e35281a1 Ugh. "Excerpt" word must have been missed, just a tiny part of the real digest was meant, and it can be base64-encoded form of the raw digest bytes. Hence rather something like exec_yOhm (for "/usr/bin/ping -qc1 localhost"). > (this is real command, are you able to find out what it is?) more than > ping (or ping_localhost). That's what I'd use detailed heuristic commands listing for, but would rather prefer if raw commands where the default when facing the user in the first place :) Even in the logs it would be preferable to have the commands shown in full, for auditability reasons (to know what got executed when, as the configuration can change in time while perhaps the identifiers can stay the same). (FTR, this is where docker/moby friendliness excel -- need to uniquely identify the running container instance? we won't bother the user to crack her head figuring out more or less transient identifier, we'll just combine two easy-to-remember wordlist items, adjective + well known scientist IIRC, and the user will then figure out this artificial identifier easily when required as an input to other commands; IMHO, we should strive for these simplifications in user interface as well where it makes sense, getting beyond throwing the minuscule internal details on the user at any opportunity) * * * >> except the inambiguity constraint is likely to be relaxed here as >> defining two exactly same heuristics does make no sense and should be >> marked an error by the configuration tool. Then, if you'd need to >> drop particular heuristic command, you could either provide it in >> full (style of iproute2's address removal from the interface) or look >> the "index" up by the means of detailed list command. > > Full command is not that great idea because what if there are two > equal commands Again, it was likely missed that exactly the same commands defined as heuristics should be actively prevented on creation, and the replace/drop logic should be ready to perform a deduplication when such a case witnessed (e.g. coming from a manual edit). My reasoning behind that is the documention, corosync-qdevice(8), does not specify (amongst other things) how exactly the commands are to be run, so one can expect both parallel and sequential executions (latter in either deterministic or nedeterministic way), so one cannot rely on any precise timing, each heuristic command should be idempotent to each other, and running two identical idempotent commands (possibly in interleave) makes no practical sense if a single failure is enough to claim it a global failure for the particular node (is it like this?). One is still free to exercise a serialized set of actions by enclosing them into a standalone script to be run as this command, though. > (what of course doesn't make too much sense as long as > command is not doing some kind of locking to be able to do it's > internal ordering = magic)? Why would anyone go this crazy if you get the serialization automatically for commands in the non-forking flow of a shell script? Looks like this whole thing needs some reality-checking. re [comment 19]: Missed an iteration in "add" case in the proposed syntax, after fix: pcs quorum device update [<generic options>] [model <model options>] [heuristics <heuristics options>] [{ <raw-command-or-exec-assignment> [<raw-command-or-exec-assignment>...] | drop <raw-command-or-exec-assignment> [<raw-command-or-exec-assignment>...] | replace <raw-command-or-exec-assignment> [<raw-command-or-exec-assignment>...] with <raw-command-or-exec-assignment> [<raw-command-or-exec-assignment>...]}] re [comment 18]: > (In reply to Jan Pokorný from comment #14) >>> So in future qdevice-tool will display more detailed statistics like (this is >>> example, precise format yet to define): >>> >>> Exec Result (5 sec ago) Result (10 sec ago) Result (25 sec ago) >>> ---- ------------------ ------------------- ------------------- >>> test Pass Pass Fail >>> foo Fail Pass Unknown >>> ping Unknown Pass Unknown >> >> Ah, I see, this was an untold story so far. >> >> Still, some would prefer the command-line shown in full, especially when > > That is expected to be somewhere. You are now talking about this key-derived identifier, right? I'd say, not necessarily, definitely not necessary for only a single command configured. >>> checking the output only in case of troubles, it would help to uncover >>> stale ping target no longer and what not right away. > > Output of command is nice to have for sure. Oh, I meant checking output of qdevice-tool (or the like). Also noticed that one could, in theory, use "heuristics_use_execvp=on" in options passed directly to corosync-qdevice daemon, which restores the system(3)-like behaviour wrt. $PATH traversal, and would pose a risk of legitimately entered unqualified (and only unqualified) drop/replace/with/exec_*=* binaries being incidentally misinterpreted in the processing of the proposed syntax [comment 21], but that would just mangle the heuristic commands configured and could be prevented: - first and foremost, by not changing that "advanced setting", because it "shouldn't be generally used because most of the options are not safe to change" as advised in the man page - by sticking with using exec_foo=bar syntax - by keeping the sane full-qualification of the executables to run, otherwise one can get screwed up easily when adding something into /usr/local/{s,}bin (in a default systemd-inited deployment) anyway - by testing the procedure on the staging cluster first, more so when already skating on thin ice (when enabling that option on) re [comment 21]: Last fix, s/raw-command-or-exec-assignment/raw-command-or-exec-id/ at some places, hence hopefully a final proposal: pcs quorum device update [<generic options>] [model <model options>] [heuristics <heuristics options>] [{ <raw-command-or-exec-assignment> [<raw-command-or-exec-assignment>...] | drop <raw-command-or-exec-id> [<raw-command-or-exec-id>...] | replace <raw-command-or-exec-id> [<raw-command-or-exec-id>...] with <raw-command-or-exec-assignment> [<raw-command-or-exec-assignment>...]}] (In reply to Jan Pokorný from comment #20) > re [comment 18]: > > >>> So how exactly should be the name part autogenerated? > >> > >> Excerpt from the command's digest, for instance. > > > > Ugh. Digest may be ok for computer but I don't see it as a good name > > for "user" to use. I just don't believe there is really majority of > > people who likes names like > > 24f77cd2b40d037670548cc08031c36b0269b6c6dc6cc5a9d78be2193cae181a495fb15307c099b8fb6cf352a4cce86276193550b5ece69ce1468997e35281a1 > > Ugh. "Excerpt" word must have been missed, just a tiny part of the real > digest was meant, and it can be base64-encoded form of the raw digest > bytes. Hence rather something like exec_yOhm (for "/usr/bin/ping -qc1 > localhost"). Still don't think exec_yOhm is better than exec_ping_localhost. > > > (this is real command, are you able to find out what it is?) more than > > ping (or ping_localhost). > > That's what I'd use detailed heuristic commands listing for, but would > rather prefer if raw commands where the default when facing the user in > the first place :) > > Even in the logs it would be preferable to have the commands shown in > full, for auditability reasons (to know what got executed when, as the > configuration can change in time while perhaps the identifiers can stay > the same). > > > (FTR, this is where docker/moby friendliness excel -- need to uniquely > identify the running container instance? we won't bother the user to > crack her head figuring out more or less transient identifier, we'll > just combine two easy-to-remember wordlist items, adjective + well > known scientist IIRC, and the user will then figure out this artificial > identifier easily when required as an input to other commands; > IMHO, we should strive for these simplifications in user interface > as well where it makes sense, getting beyond throwing the minuscule > internal details on the user at any opportunity) > > * * * > > >> except the inambiguity constraint is likely to be relaxed here as > >> defining two exactly same heuristics does make no sense and should be > >> marked an error by the configuration tool. Then, if you'd need to > >> drop particular heuristic command, you could either provide it in > >> full (style of iproute2's address removal from the interface) or look > >> the "index" up by the means of detailed list command. > > > > Full command is not that great idea because what if there are two > > equal commands > > Again, it was likely missed that exactly the same commands defined > as heuristics should be actively prevented on creation, and the > replace/drop logic should be ready to perform a deduplication when > such a case witnessed (e.g. coming from a manual edit). I don't see any reason for doing this. > > My reasoning behind that is the documention, corosync-qdevice(8), does > not specify (amongst other things) how exactly the commands are to be > run, so one can expect both parallel and sequential executions (latter > in either deterministic or nedeterministic way), so one cannot rely on Yep, agree, this is missing bit of information. I'll add such node. > any precise timing, each heuristic command should be idempotent to each > other, and running two identical idempotent commands (possibly in > interleave) makes no practical sense if a single failure is enough to > claim it a global failure for the particular node (is it like this?). > One is still free to exercise a serialized set of actions by enclosing > them into a standalone script to be run as this command, though. > > > (what of course doesn't make too much sense as long as > > command is not doing some kind of locking to be able to do it's > > internal ordering = magic)? > > Why would anyone go this crazy if you get the serialization > automatically for commands in the non-forking flow of a shell script? I don't know and I don't care. I'm giving user the tool. If they use tool to cut their finger, it's their problem. > > > Looks like this whole thing needs some reality-checking. Last sentence is really only part where I can fully agree. Let customers/gss/qa test it, and will see what they think. It's TechPreview so we still have space to do incompatible changes. re [comment 25]: > Yep, agree, this is missing bit of information. I'll add such node. Appreciated! If the user is to make any more sophisticated sequences, "the mode of executing" is indeed missing. On the other hand, it might be good to consider, if for instance the commands cannot be run in parallel (if not already) in the future and put a future-anticipating warning about the expected non-interference amongst the commands. What's also missing is the most important bit -- how will the commands influence the decisions (what is failure and if single failure wins). > I don't know and I don't care. I'm giving user the tool. Yes, it's understandable from "I am myself curious what use cases will the userbase come up with" (as opposed to "our mission is to guide the user through the cluster configuration management", which is the only surface I am scratching here) POV. > Let customers/gss/qa test it, and will see what they think. > It's TechPreview so we still have space to do incompatible > changes. Except this development mode is upside-down, no upstream first that could have something to say on the usability before it gets settled, and once established syntax is really hard to change. (In reply to Jan Pokorný from comment #26) > re [comment 25]: > > > Yep, agree, this is missing bit of information. I'll add such node. > > Appreciated! If the user is to make any more sophisticated sequences, > "the mode of executing" is indeed missing. On the other hand, it might > be good to consider, if for instance the commands cannot be run in > parallel (if not already) in the future and put a future-anticipating > warning about the expected non-interference amongst the commands. > > What's also missing is the most important bit -- how will the commands > influence the decisions (what is failure and if single failure wins). From the man page: " When all commands finish successfully (their return error code is zero) on time, heuristics have passed, otherwise they have failed. " How exactly it influences the algorithm is described in "algorithm" section. > > > I don't know and I don't care. I'm giving user the tool. > > Yes, it's understandable from "I am myself curious what use cases > will the userbase come up with" (as opposed to "our mission is to > guide the user through the cluster configuration management", which > is the only surface I am scratching here) POV. Nice try, but just pure fallacy. > > > Let customers/gss/qa test it, and will see what they think. > > It's TechPreview so we still have space to do incompatible > > changes. > > Except this development mode is upside-down, no upstream first that > could have something to say on the usability before it gets settled, > and once established syntax is really hard to change. Actually not. When I say customers, I mean not only downstream customers (so RHEL/Fedora) but also upstream customers (= users of corosync) Re [comment 27] > " > When all commands finish successfully > (their return error code is zero) on time, heuristics have passed, otherwise > they have failed. > " Sorry, must have overlooked it this time (was looking just at exec_NAME now). Have seen that before. > How exactly it influences the algorithm is described in "algorithm" > section. Fair enough and good to know. The other missing bit is circumstances under which the command gets executed -- are environment variables passed over, what are the privileges of the process launched, stuff like that. >>> I don't know and I don't care. I'm giving user the tool. >> >> Yes, it's understandable from "I am myself curious what use cases >> will the userbase come up with" (as opposed to "our mission is to >> guide the user through the cluster configuration management", which >> is the only surface I am scratching here) POV. > > Nice try, but just pure fallacy. I think configuration tool paying attention to what is being configured is the Right Thing, otherwise it's the fallacy to think the tool will do substantially better (in form of some assistance) than manual edit + propagate process. Re [comment 27] > " > When all commands finish successfully > (their return error code is zero) on time, heuristics have passed, otherwise > they have failed. > " Sorry, must have overlooked it this time (was looking just at exec_NAME now). Have seen that before. > How exactly it influences the algorithm is described in "algorithm" > section. Fair enough and good to know. The other missing bit is circumstances under which the command gets executed -- are environment variables passed over, what are the privileges of the process launched, stuff like that. >>> I don't know and I don't care. I'm giving user the tool. >> >> Yes, it's understandable from "I am myself curious what use cases >> will the userbase come up with" (as opposed to "our mission is to >> guide the user through the cluster configuration management", which >> is the only surface I am scratching here) POV. > > Nice try, but just pure fallacy. I think configuration tool paying attention to what is being configured is the Right Thing, otherwise it's the fallacy to think the tool will do substantially better (in form of some assistance) than manual edit + propagate process. Re [comment 27] > " > When all commands finish successfully > (their return error code is zero) on time, heuristics have passed, otherwise > they have failed. > " Sorry, must have overlooked it this time (was looking just at exec_NAME now). Have seen that before. > How exactly it influences the algorithm is described in "algorithm" > section. Fair enough and good to know. The other missing bit is circumstances under which the command gets executed -- are environment variables passed over, what are the privileges of the process launched, stuff like that. >>> I don't know and I don't care. I'm giving user the tool. >> >> Yes, it's understandable from "I am myself curious what use cases >> will the userbase come up with" (as opposed to "our mission is to >> guide the user through the cluster configuration management", which >> is the only surface I am scratching here) POV. > > Nice try, but just pure fallacy. I think configuration tool actually ghpaying attention to what is being configured is the Right Thing, otherwise it's the fallacy to think this configurator will do substantially better (in form of some assistance) than manual edit + propagate process. >>> Let customers/gss/qa test it, and will see what they think. >>> It's TechPreview so we still have space to do incompatible >>> changes. >> >> Except this development mode is upside-down, no upstream first that >> could have something to say on the usability before it gets settled, >> and once established syntax is really hard to change. > > Actually not. When I say customers, I mean not only downstream customers (so > RHEL/Fedora) but also upstream customers (= users of corosync) But corosync doesn't need to worry about incompatible changes that much, as the configuration tool can abstract a lot from the actual low-level form. To summarize my points from above discussion concisely for posterity (my last entry here): I claim the point of the configuration tool is to make that very act substantially easier and less error-prone than doing it either by hand or with a trivial custom automation. As I identified some points that could be done and are not (below), I revealed that there's very little value added with the tool on top of providing unified facade also for configuring this new subdomain of the overall space: 1. identifiers for the heuristic commands are pedantically required even when user could do without them easily in this case, especially when configuring just a single heuristic command (see final proposal of the syntax in [comment 24] + ambiguity discussing bottom of [comment ]9) + heuristics_use_execvp discussing [comment 23]) 2. no checking for nonsensical redundancy in heuristic commands and preventing them (unless overridden with --force?, see [comment 20], section starting "except the inambiguity") 3. no guarding for fully qualified executables being specified (unless overridden with --force?; corollary of [comment 19], "First branch" part, and [comment 23]) 4. no action (informative and/or effective drop of heuristics part) happening upon target state becoming "no commands defined" -- user should at least be informed no heuristic commands remain (are configured initially), hence heuristics as such are out of any quorum equations (for now) (in my now rather subjective view, heuristics is a weak entity, i.e., no constituting commands = no heuristics block at all, on creation/update, which would also mandate to have at least a single command given when configuring heuristics from scratch, but that's also reasonable in my eyes) All these could be uncovered with proper analytical review. And without such an assistance (based on intimate knowledge of the subject under control) provided, the tool remains just a dummy forwarder of the master's orders. I admit it's able to do remarkably better at other places, but here, it's simply not cutting it (and sadly, I don't believe QEs would ever figure out these, or attempted to reach any domain-specific critical thinking for that matter, on their own, but that's another story). After fix: [root@rhel75-node1 ~]# rpm -q pcs pcs-0.9.162-1.el7.x86_64 > create quorum device [root@rhel75-node1 ~]# pcs qdevice setup model net --start Quorum device 'net' initialized Starting quorum device... quorum device started > add quorum device with heuristics into the cluster [root@rhel75-node1 ~]# pcs quorum device add model net host=rhel75-node1 algorithm=ffsplit heuristics 'exec_ping=/usr/bin/ping -1 -c 1 "127.0.0.1"' Setting up qdevice certificates on nodes... rhel75-node1: Succeeded rhel75-node2: Succeeded Enabling corosync-qdevice... rhel75-node1: not enabling corosync-qdevice: corosync is not enabled rhel75-node2: not enabling corosync-qdevice: corosync is not enabled Sending updated corosync.conf to nodes... rhel75-node1: Succeeded rhel75-node2: Succeeded Corosync configuration reloaded Starting corosync-qdevice... rhel75-node1: corosync-qdevice started rhel75-node2: corosync-qdevice started > verify that it has been setup correctly [root@rhel75-node2 ~]# pcs quorum config Options: Device: votes: 1 Model: net algorithm: ffsplit host: rhel75-node1 Heuristics: exec_ping: /usr/bin/ping -1 -c 1 "127.0.0.1" > update heuristics options [root@rhel75-node2 ~]# pcs quorum device update heuristics exec_ping= "exec_ls= /usr/bin/test -f /tmp/test" Sending updated corosync.conf to nodes... rhel75-node1: Succeeded rhel75-node2: Succeeded Corosync configuration reloaded Reloading qdevice configuration on nodes... rhel75-node2: corosync-qdevice stopped rhel75-node1: corosync-qdevice stopped rhel75-node2: corosync-qdevice started rhel75-node1: corosync-qdevice started > verify changes [root@rhel75-node1 ~]# pcs config Cluster Name: rhel75-cluster Corosync Nodes: rhel75-node1 rhel75-node2 Pacemaker Nodes: rhel75-node1 rhel75-node2 Resources: Stonith Devices: Resource: fencing (class=stonith type=fence_xvm) Operations: monitor interval=60s (fencing-monitor-interval-60s) Fencing Levels: Location Constraints: Ordering Constraints: Colocation Constraints: Ticket Constraints: Alerts: No alerts defined Resources Defaults: No defaults set Operations Defaults: No defaults set Cluster Properties: cluster-infrastructure: corosync cluster-name: rhel75-cluster dc-version: 1.1.18-5.el7-1a4ef7d180 have-watchdog: false Quorum: Options: Device: votes: 1 Model: net algorithm: ffsplit host: rhel75-node1 Heuristics: exec_ls: /usr/bin/test -f /tmp/test > remove qdevice heuristics [root@rhel75-node1 ~]# pcs quorum device heuristics remove Sending updated corosync.conf to nodes... rhel75-node1: Succeeded rhel75-node2: Succeeded Corosync configuration reloaded Reloading qdevice configuration on nodes... rhel75-node2: corosync-qdevice stopped rhel75-node1: corosync-qdevice stopped rhel75-node2: corosync-qdevice started rhel75-node1: corosync-qdevice started [root@rhel75-node1 ~]# pcs quorum config Options: Device: votes: 1 Model: net algorithm: ffsplit host: rhel75-node1 (In reply to Jan Pokorný from comment #29) > 4. no action (informative and/or effective drop of heuristics part) > happening upon target state becoming "no commands defined" > -- user should at least be informed no heuristic commands remain > (are configured initially) Really? # pcs quorum device add model net host=rh74-node3 algorithm=ffsplit heuristics mode=on Warning: No exec_NAME options are specified, so heuristics are effectively disabled {snip} # pcs quorum device update heuristics mode=sync Warning: No exec_NAME options are specified, so heuristics are effectively disabled {snip} re [comment 32]: Sorry, you are right and I am wrong here, though I have surely tested the command, likely the eyes did not spot what they should. Warnings are already present. But this is an interesting behaviour, related to the possibility to have something executed despite configuring something entirely else, for sure: $ pcs quorum device update model \ "host=$(printf \ "localhost\n}\nheuristics {\nexec_bar: /usr/bin/sh -c 'echo reboot>>/root/.profile||:'\n") A bit worrisome enablement of unintended propagation, possibly from higher management layer, likely pre-existing, but only now with new interpretation of heuristics in corosync becoming serious. As serious as asking for quick fix. Anyway, when I temporarily violated the silence already, re [comment 29], item 1.: - s/bottom of [comment ]9)/bottom of [comment 19]/ - insert a subidea of having automatic identifier generated as exec_<limited substring of the command's basename>_<short hash> Howgh. (In reply to Jan Pokorný from comment #33) > But this is an interesting behaviour, related to the possibility > to have something executed despite configuring something entirely else, > for sure: > > $ pcs quorum device update model \ > "host=$(printf \ > "localhost\n}\nheuristics {\nexec_bar: /usr/bin/sh -c 'echo > reboot>>/root/.profile||:'\n") > > A bit worrisome enablement of unintended propagation, possibly from > higher management layer, likely pre-existing, but only now with new > interpretation of heuristics in corosync becoming serious. > As serious as asking for quick fix. This is indeed an issue. However, anybody who can set "host" can set "exec_bar" as well. So even once checks for {}:\n characters are implemented in pcs, it will be still possible to do: $ pcs quorum device update heuristics "exec_bar=/usr/bin/sh -c 'echo reboot>>/root/.profile||:'" Filed as bz1515193. [comment 35]: While I agree this lack of input sanitization/command injection is not CVE-worthy, for the reasons mentioned in the first sentence, it would be shortsighted to dismiss an apparent risk[*1][*2], so my hopes are the prioritization will reflect that, preferably this is reconciled in the same point version as when the support for heuristics is delivered. [*1] Because pcs does not support any kind of access control granularity beside pacemaker config/CIB proper, administrators may resort to conditionally allow auxiliary operators configure some aspects of corosync.conf and preventing others through pcs by the means of sudo + wildcard white/black-listing in sudoers -- and pcs allowing for such configuration injections will, in turn, subvert the reasonable and security and sanity assumptions. Which is bad. [*2] Higher levels of configuration management building upon pcs may likewise count of pcs being secure and sane to guarantee their own safety. Hence, pcs needs to be robust to cater all such use cases gracefully. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0866 |