Bug 1517834
| Summary: | corosync.rng doesn't have all the options available. | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Raul Mahiques <rmahique> |
| Component: | clufter | Assignee: | Jan Pokorný [poki] <jpokorny> |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 7.4 | CC: | aherr, brubisch, cfeist, jfriesse, jpokorny, kgaillot, mlisik |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | clufter-0.77.0-2.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-04-10 18:48:04 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1122832 | ||
(In reply to Raul Mahiques from comment #0) typo: s/pacemaker/corosync/ > Description of problem: > /usr/lib/python2.7/site-package/clufter/formats/corosync/corosync.rng > doesn't have all the options available for pacemaker therefore commands such > as pcs config export pcs-commands > fail because it can't validate the config. > > Version-Release number of selected component (if applicable): > clufter-common-0.76.0-1.el7 > > How reproducible: > Always > > Steps to Reproduce: > 1. Add for example: > nodelist { > node { > name: xyz > ..... > > 2. sync the cluster config: pcs sync > 3. The cluster will take the new name, it's a valid parameter, but then if > we want to see the commands that could be used to configure the cluster with > such configuration we will get an error: > pcs config export pcs-commands-verbose > > > > > Actual results: > The validation fails. > > Expected results: > Validation should work > > Additional info: I believe that's not an issue with clufter, but rather with pacemaker expecting that a particular configuration key in corosync tracking of the settings, which was introduced in https://github.com/ClusterLabs/pacemaker/commit/8511b3e118682d43e28765a47bca678ed3a53096#diff-677c4887fedde67f7205539b2231bf95R888 Corosync doesn't appear to be using the particular key in any way, and it's strange for higher level tool to hijack proper configuration management of the lower level tool, especially since pacemaker has configuration management on its own. Hence I believe the respective recommendation item should be dropped from https://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-node-name.html Using ring0_addr key, which also gets effectively used by corosync, should be enough. and might also avoid unnecessary ambiguities. Adding corosync and pacemaker maintainers for their inputs. Anyway, the workaround is to append --force switch to
"pcs config export pcs-commands{,-verbose}" commands
(or add --nocheck switch when running clufter pcs2pcscmd directly).
@poki: Not sure what exact input you want, so I will try to explain my stand point. corosync.conf is parsed and stored into cmap. Cmap is designed to contain various corosync "internal" data (parsed configuration, auto-generated items, statistics, ...) together with "user" data. We could do very strict checking of all options in the corosync.conf and refuse config file with unknown option. But to make this work we would need to do the same checking also in cmap itself because of consistency. So cmap would then loose ability to contain "user" data. So as you asked, corosync.conf itself is "conceptually expected to be extended arbitrarily" to some extend. On the other hand, I must fully agree pcs shouldn't store (or require) this "user" data. And pacemaker should ether use ring0_addr or we can standardize name (what looks like neat feature anyway). FYI, regarding the name option in particular, pacemaker needs it to connect a node name to a corosync ID when the node name is not the uname and ring0_addr is an IP address, which is a fairly common situation. But I think the central point is that corosync does allow arbitrary options specified by client applications, so it's just not correct to consider these invalid. Wouldn't it make sense then to dedicate extra label, call it "private", to be usable either as a direct key or a subtree root in arbitrary location within the existing public hierarchy used by corosync? Because now, if corosync decides to use nodelist.node.name for its own purpose (sure it has liberty to do so), pacemaker (or whatever other corosync user happening to bet on the unlucky name is basically screwed, marking this a bad design. Indeed, in this case the bilateral agreement [comment 6] would be optimal (even when settled rather late), so take [comment 8] merely as a suggestion for a clean namespace separation in general, then. Clufter (and perhaps also corosync proper might if there's an interest to ressurect https://github.com/corosync/corosync/commits/topic-relax-conf-gen) would certainly benefit from a generic expressability by the means of RelaxNG schema and sadly, static-prefix cannot be expressed, just a full-blown label (like suggested "private") or "anything" (which would work for the elements/attribute of the "private" subtree). re [comment 6]: > [...] or we can standardize name (what looks like neat feature anyway) Ok, I was rather excessively dismissive of this kind of unannounced namespace squatting by external party from corosync's POV (would be limited if there was a right way to integrate, something akin to "private" subtrees per above), but as also explained in [comment 7], identification-guiding fallback otherwise not interfering with the messaging layer itself may come indispensable. -> https://github.com/corosync/corosync/pull/286 When that PR gets resolved, I'll extend corosync.rng carried by clufter respectively. Round trip is complete now with this commit scheduled for the nearest future point release: https://pagure.io/clufter/c/4b406662507b0f1d1343f468a849adab2a243398?branch=next I mean, this commit (fixed a typo in commit message): https://pagure.io/clufter/c/ed02b8477e7d32791020d5f253138d3a60a4de5c?branch=next Actually also modified ring0_addr description per what's been changed in corosync.conf(5) along: https://pagure.io/clufter/c/a75e1456f11725b7a58bc81148a6d6403b2530d2?branch=next To test this:
$ clufter --dist=rhel,7.5 pcs2pcscmd -gqs - {cib2pcscmd.in_format.void_file} <<EOF
totem {
cluster_name: test
transport: udpu
version: 2
}
nodelist {
node {
name: arpeggione
nodeid: 1
ring0_addr: 191.168.121.1
}
node {
nodeid: 2
name: balalaika
ring0_addr: 191.168.121.2
}
node {
name: cythara
nodeid: 3
ring0_addr: 191.168.121.3
}
}
quorum {
provider: corosync_votequorum
}
EOF
BEFORE (something like):
> WARNING:clufter.format:Invalid as per RNG file
> `/usr/lib/python2.7/site-package/clufter/formats/corosync/corosync.rng'
> WARNING:clufter.format:None of the validation attempts succeeded with
> validator spec: /usr/lib/python2.7/site-package/clufter/formats/corosync/corosync.rng
> coroxml-needle: Validation: 0:0:Extra element nodelist in interleave, 0:0:
> Element corosync failed to validate content
AFTER (something like):
> #!/bin/sh
> # sequence generated on 2017-12-01 14:56:19 with: clufter 0.77.1a
> # invoked as: ['./run-dev', '--dist=rhel,7.5', 'pcs2pcscmd', '-gqs',
> # '-', '{cib2pcscmd.in_format.void_file}']
> # targeting system: ('linux', 'redhat', '7.5')
> # using interpreter: CPython 2.7.14
> pcs cluster auth 191.168.121.1 191.168.121.2 191.168.121.3 <> /dev/tty
> pcs cluster setup --name test \
> 191.168.121.1 191.168.121.2 191.168.121.3 --transport udpu
> pcs cluster start --all --wait=60
> pcs cluster cib tmp-cib.xml
> cp tmp-cib.xml tmp-cib.xml.deltasrc
> pcs cluster cib-push tmp-cib.xml diff-against=tmp-cib.xml.deltasrc
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1009 |
Description of problem: /usr/lib/python2.7/site-package/clufter/formats/corosync/corosync.rng doesn't have all the options available for pacemaker therefore commands such as pcs config export pcs-commands fail because it can't validate the config. Version-Release number of selected component (if applicable): clufter-common-0.76.0-1.el7 How reproducible: Always Steps to Reproduce: 1. Add for example: nodelist { node { name: xyz ..... 2. sync the cluster config: pcs sync 3. The cluster will take the new name, it's a valid parameter, but then if we want to see the commands that could be used to configure the cluster with such configuration we will get an error: pcs config export pcs-commands-verbose Actual results: The validation fails. Expected results: Validation should work Additional info: