Bug 1305130 - pcs does not effectively check configured vs. supported features (incl. "cluster cib-upgrade" considerations)
pcs does not effectively check configured vs. supported features (incl. "clus...
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pcs (Show other bugs)
7.2
Unspecified Unspecified
unspecified Severity unspecified
: rc
: ---
Assigned To: Tomas Jelinek
cluster-qe@redhat.com
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2016-02-05 13:13 EST by Jan Pokorný
Modified: 2016-11-01 07:40 EDT (History)
6 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-01 07:40:06 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Jan Pokorný 2016-02-05 13:13:22 EST
Pcs should become validation-schema-aware.

This does not necessarily imply validation per se, it may be just
schema version that is used.

Or just Pacemaker version might be enough (except for item 3.), as it
implies the features on its own (and in full, as opposed to schema),
so I should rather call the requested feature
"pcs should be multi-version-aware".


That means:

0. preferrably make pcs handle all the plain configuration expressions
   implied by the latest pacemaker version this pcs is supposed to manage
   to the full extent (it's entirely expected that configuration
   possibilities will increase in time much beyond that, this is about
   point-in-time/features sync), currently it's lacking
   (e.g., [bug 1305049])

1. for existing cluster, track latest version supported by all the nodes,
   for cluster in preparation, resort to minimal default (1.2?) until
   the versions are known

2. do not allow configuration not supported by the current version
   to be requested by the user (unless overridden) incl. explanation
   (backward compatibility)

3. for cib-upgrade, become smarter: do not allow upgrade if nodes do not
   have a single highest version of the schema in common (at least per
   the major version?)

+ possibly combine 2. + 3. -> if the requested feature is not supported by
  the current schema but cib-upgrade would solve it AND is feasible, then
  ask whether to proceed (or suggest it as a first step)

  - or possibly offer it at the first occasion when feasible cib-upgrade
    is detected?  (that may cause annoyances, though)


This is very raw RFE, I am aware, but the longer the version flexibility
is ignored, the worse.
Comment 2 Ken Gaillot 2016-07-11 12:15:10 EDT
FYI the general approach so far has been to run cib-upgrade automatically when someone configures a feature that requires it.

Also, it's a valid use case for someone to use pcs to edit a file, even if that file wouldn't be supported by the installed pacemaker version. They might be preparing for an upgrade, or modifying a configuration from one cluster to use on another. So, I would never deny an unsupported command, but maybe print a warning.

This may be out of scope, but a nice related feature would be the ability to run xmllint on a CIB XML file. Something like "pcs cluster cib-validate <filename> [--config]". It would check the validate-with value, then run "xmllint --noout --relaxng" with the appropriate pacemaker RNG schema and the supplied file. This would enable a workflow like dump cib -> edit cib -> validate -> push. The advantage over just doing cib-push and getting a pass/fail is that xmllint reports *why* a file doesn't validate. The advantage over letting the user run xmllint directly would be choosing the schema file automatically, and handling --config (probably by dumping the live CIB and replacing the configuration section).

That does have some challenges, such as a dependency on xmllint, the need to parse validate-with, and the need to dump the live CIB to handle --config. So, it's understandable if that feature is rejected, especially if few customers edit the CIB directly. But it would make my life easier. ;-)
Comment 3 Jan Pokorný 2016-07-19 09:46:58 EDT
re [comment 2]

> The advantage over just doing cib-push and getting a pass/fail
> is that xmllint reports *why* a file doesn't validate.

Jokes aside, it turns out xmllint/libxml2 is damn poor in reporting
the reasons for validation failure.

That's also why I introduced jing (the "real" root-cause-analysing
validator) at least to optional channel in RHEL 7: [bug 908010].
Comment 4 Jan Pokorný 2016-07-19 12:23:18 EDT
re [comment 3]:

Not to speak about deficiencies when checking data types, here's one
example I've been historically involved with:

https://mail.gnome.org/archives/xml/2016-March/msg00012.html
Comment 5 Tomas Jelinek 2016-10-27 10:37:39 EDT
I did some investigation and with Jan's great help (thanks!) I figured out how the CIB upgrade process works in pacemaker:
1. Pacemaker keeps track of what pacemaker version is running on each node.
2. Pacemaker elects the DC in such a way that DC is always the node with the oldest pacemaker version in a cluster.
3. When the "cibadmin --upgrade" command is run, the request is sent to the DC.
4. DC bumps the CIB schema version to the newest version supported by that very DC and tells other nodes about the change.
5. As a result, the CIB never gets upgraded to a schema version which is not supported by all nodes.

If the CIB upgrade is requested on a file, then there is no communication in a cluster. CIB schema version simply gets bumped to the newest version supported by that particular node.


(In reply to Jan Pokorný from comment #0)
> 0. preferrably make pcs handle all the plain configuration expressions
>    implied by the latest pacemaker version this pcs is supposed to manage
>    to the full extent (it's entirely expected that configuration
>    possibilities will increase in time much beyond that, this is about
>    point-in-time/features sync), currently it's lacking
>    (e.g., [bug 1305049])
This is ongoing, there is a separate bz for each missing feature. I do not see any point in tracking them all in here.

> 1. for existing cluster, track latest version supported by all the nodes,
>    for cluster in preparation, resort to minimal default (1.2?) until
>    the versions are known
This is done in pacemaker, see above.

> 2. do not allow configuration not supported by the current version
>    to be requested by the user (unless overridden) incl. explanation
>    (backward compatibility)
Pcs already does this.
If a feature not supported by the current CIB schema is requested, pcs tries to bump the schema version. If that fails, pcs exits with an error saying the feature is not supported by pacemaker and a pacemaker upgrade is required to use it.
If for some commands / features this is not true, please file a bug with specific reproducer.

> 3. for cib-upgrade, become smarter: do not allow upgrade if nodes do not
>    have a single highest version of the schema in common (at least per
>    the major version?)
This is done in pacemaker, see above.


(In reply to Ken Gaillot from comment #2)
> FYI the general approach so far has been to run cib-upgrade automatically
> when someone configures a feature that requires it.
Correct. I was talking to Chris some time ago and he confirmed this is still the path we want to follow.

> This may be out of scope, but a nice related feature would be the ability to
> run xmllint on a CIB XML file. Something like "pcs cluster cib-validate
> <filename> [--config]". It would check the validate-with value, then run
> "xmllint --noout --relaxng" with the appropriate pacemaker RNG schema and
> the supplied file. This would enable a workflow like dump cib -> edit cib ->
> validate -> push. The advantage over just doing cib-push and getting a
> pass/fail is that xmllint reports *why* a file doesn't validate. The
> advantage over letting the user run xmllint directly would be choosing the
> schema file automatically, and handling --config (probably by dumping the
> live CIB and replacing the configuration section).
> 
> That does have some challenges, such as a dependency on xmllint, the need to
> parse validate-with, and the need to dump the live CIB to handle --config.
> So, it's understandable if that feature is rejected, especially if few
> customers edit the CIB directly. But it would make my life easier. ;-)

How about the existing "pcs cluster verify" command which runs crm_verify behind the scenes? Just be aware of bz1213946.
Comment 6 Ken Gaillot 2016-10-27 15:09:25 EDT
(In reply to Tomas Jelinek from comment #5)
> (In reply to Ken Gaillot from comment #2)
> > This may be out of scope, but a nice related feature would be the ability to
> > run xmllint on a CIB XML file. Something like "pcs cluster cib-validate
> 
> How about the existing "pcs cluster verify" command which runs crm_verify
> behind the scenes? Just be aware of bz1213946.

I can't believe I forgot about that. Yes, sorry, ignore my rambling :-)
Comment 7 Tomas Jelinek 2016-11-01 07:40:06 EDT
As explained in previous comments the functionality requested here is already implemented in either pacemaker or pcs. I'm closing this.

Note You need to log in before you can comment on or make changes to this bug.