Bug 1213946

Summary: "pcs cluster verify" does not work correctly with the optional filename argument
Product: Red Hat Enterprise Linux 7
Component: pcs
Version: 7.1
Status: CLOSED ERRATA
Severity: unspecified
Priority: medium
Reporter: Patrik Hagara <phagara>
Assignee: Ivan Devat <idevat>
QA Contact: cluster-qe <cluster-qe>
CC: cfeist, cluster-maint, idevat, rhayden, rsteiger, sochotni, tojeline
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Fixed In Version: pcs-0.9.160-1.el7
Last Closed: 2018-04-10 15:37:49 UTC
Type: Bug
Attachments: proposed fix (flags: none)

Description Patrik Hagara 2015-04-21 15:21:09 UTC
Description of problem:
`pcs cluster verify ~/my_cib.xml` queries the CIB of the running cluster instead of reading the provided file.

The help output of `pcs cluster --help` states that:
    verify [-V] [filename]
        Checks the pacemaker configuration (cib) for syntax and common
        conceptual errors.  If no filename is specified the check is
        performmed on the currently running cluster.  If '-V' is used
        more verbose output will be printed


Version-Release number of selected component (if applicable):
pcs-0.9.137-13.el7.x86_64


How reproducible:
always


Steps to Reproduce:
1. Run `pcs cluster verify ~/my_cib.xml` against an existing CIB file.


Actual results:
[root@virt-016 ~]# pcs cluster verify -V ~/my_cib.xml
Live CIB query failed: Transport endpoint is not connected

Error: unable to get cib
Error: unable to get cib


Expected results:
pcs reads and validates the CIB from the provided file.


Additional info:
none

Comment 6 rhayden 2016-03-16 21:59:36 UTC
I can confirm that this issue occurs on RHEL 7.2 as well.

[root]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.2 (Maipo)
[root]# pcs cluster cib /tmp/cib.out --config
[root]# pcs cluster verify -V /tmp/cib.out
[root]# echo $?
0
[root]# pcs cluster stop --all
node2: Stopping Cluster (pacemaker)...
node1: Stopping Cluster (pacemaker)...
node1: Stopping Cluster (corosync)...
node2: Stopping Cluster (corosync)...
[root]# pcs cluster verify -V /tmp/cib.out
Live CIB query failed: Transport endpoint is not connected

Error: unable to get cib
Error: unable to get cib
[root]# echo $?
1

Comment 7 Tomas Jelinek 2016-10-27 14:34:41 UTC
As a workaround you can use the "-f" pcs option and omit the filename argument.
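
For scripted use, the workaround can be wrapped in a small helper. The following is a minimal sketch, not part of pcs: the function name `build_verify_command` is hypothetical, and the argument order mirrors the `pcs cluster verify -f <file>` form used elsewhere in this report.

```python
import shlex


def build_verify_command(cib_path, verbose=False):
    """Build the argv for the workaround: the CIB file is passed via -f only.

    Hypothetical helper for illustration; it is not shipped with pcs.
    """
    argv = ["pcs", "cluster", "verify"]
    if verbose:
        argv.append("-V")
    # The file goes to -f; no positional filename argument is given,
    # because the affected versions do not handle it correctly.
    argv += ["-f", cib_path]
    return argv


def format_command(argv):
    """Render the argv as a shell-quoted string for logging."""
    return shlex.join(argv)
```

The resulting list can be handed to `subprocess.run(argv)` on a node where pcs is installed; the return code then reflects the verification result.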

Comment 9 Ivan Devat 2017-08-08 13:12:40 UTC
`pcs cluster verify` does not ignore the filename argument completely. There are two verifications inside this command and the filename argument is ignored only in the second verification.

> The first verification produces no output when the file is valid. The second verification fails because the live CIB cannot be loaded.

[vm-rhel72-1 ~] $ pcs cluster status
Error: cluster is not currently running on this node
[vm-rhel72-1 ~] $ pcs cluster verify cib.good.xml
Error: unable to get cib

> The first verification produces console output when the file is invalid. The second verification fails because the live CIB cannot be loaded.

[vm-rhel72-1 ~] $ pcs cluster verify cib.bad.xml
cib.bad.xml:43: element primitive: Relax-NG validity error : ID webserver redefined
cib.bad.xml:43: element primitive: Relax-NG validity error : Invalid sequence in interleave
cib.bad.xml:43: element primitive: Relax-NG validity error : Element primitive failed to validate content
cib.bad.xml:16: element primitive: Relax-NG validity error : Element resources has extra content: primitive
Errors found during check: config not valid
  -V may provide more details

Error: unable to get cib

> The workaround with -f works slightly differently when the CIB file is not valid. The second verification now fails because an invalid CIB cannot be loaded from the file.

[vm-rhel72-1 ~] $ pcs cluster verify -f cib.bad.xml
crm_verify: Connection to local file 'cib.bad.xml' failed: Update does not conform to the configured schema
Live CIB query failed: Update does not conform to the configured schema

Error: unable to get cib

> And it is possible to have an ambiguous filename specification:
[vm-rhel72-1 ~] $ pcs cluster verify cib.bad.xml -f cib.good.xml
cib.bad.xml:43: element primitive: Relax-NG validity error : ID webserver redefined
cib.bad.xml:43: element primitive: Relax-NG validity error : Invalid sequence in interleave
cib.bad.xml:43: element primitive: Relax-NG validity error : Element primitive failed to validate content
cib.bad.xml:16: element primitive: Relax-NG validity error : Element resources has extra content: primitive
Errors found during check: config not valid
  -V may provide more details

[vm-rhel72-1 ~] $ echo $?
0

Comment 10 Ivan Devat 2017-09-20 14:59:14 UTC
Created attachment 1328486
proposed fix

Comment 11 Ivan Devat 2017-10-11 08:22:18 UTC
After Fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.160-1.el7.x86_64

[vm-rhel72-1 ~] $ cat cib.good.xml
<cib epoch="559" num_updates="0" admin_epoch="0" validate-with="pacemaker-1.2" crm_feature_set="3.0.12" update-origin="rh7-3" update-client="crmd" cib-last-written="Thu Aug 23 16:49:17 2012" have-quorum="0" dc-uuid="2">
  <configuration>
    <crm_config/>
    <nodes/>
    <resources>
      <primitive class="ocf" id="R" provider="heartbeat" type="Dummy">
        <operations>
          <op id="R-monitor-interval-10" interval="10" name="monitor" timeout="20"/>
        </operations>
      </primitive>
      <primitive class="stonith" id="xvm-fencing" type="fence_xvm">
        <instance_attributes id="xvm-fencing-instance_attributes">
          <nvpair id="xvm-fencing-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="n1, n2"/>
        </instance_attributes>
        <operations>
          <op id="xvm-fencing-monitor-interval-60s" interval="60s" name="monitor"/>
        </operations>
      </primitive>
    </resources>
    <constraints/>
  </configuration>
  <status/>
</cib>
[vm-rhel72-1 ~] $ cp cib.good.xml cib.bad.xml
[vm-rhel72-1 ~] $ sed -i 's/id="R-monitor-interval-10"/id="R"/' cib.bad.xml

> valid cib

[vm-rhel72-1 ~] $ pcs cluster verify cib.good.xml
[vm-rhel72-1 ~] $ echo $?
0
[vm-rhel72-1 ~] $ pcs cluster verify -f cib.good.xml
[vm-rhel72-1 ~] $ echo $?
0
[vm-rhel72-1 ~] $ pcs cluster verify -f cib.good.xml cib.good.xml
Warning: File '/root/cib.good.xml' specified twice
[vm-rhel72-1 ~] $ echo $?
0
[vm-rhel72-1 ~] $ pcs cluster verify -f cib.good.xml cib.bad.xml
Error: Ambiguous cib filename specification: 'cib.bad.xml' vs  -f 'cib.good.xml'
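
The filename handling shown above (positional argument versus `-f`) can be modelled roughly as follows. This is a sketch inferred from the observed output only, not the code from the attached fix: the names `resolve_cib_file` and `AmbiguousCibFile` are hypothetical, and comparing absolute paths is an assumption based on the absolute path printed in the warning.

```python
import os


class AmbiguousCibFile(Exception):
    """Raised when the positional filename and -f name different files."""


def resolve_cib_file(positional=None, dash_f=None):
    """Return (path, warnings); path is None when the live CIB should be used."""
    warnings = []
    if positional is None and dash_f is None:
        return None, warnings
    if positional is None:
        return os.path.abspath(dash_f), warnings
    if dash_f is None:
        return os.path.abspath(positional), warnings

    # Both were given: accept only if they point at the same file.
    pos_abs = os.path.abspath(positional)
    f_abs = os.path.abspath(dash_f)
    if pos_abs == f_abs:
        warnings.append("File '{0}' specified twice".format(pos_abs))
        return pos_abs, warnings
    raise AmbiguousCibFile(
        "Ambiguous cib filename specification: '{0}' vs  -f '{1}'".format(
            positional, dash_f
        )
    )
```

Rejecting two different files outright avoids the pre-fix situation in which one verification read one file and the other verification read a different source.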

> invalid cib

[vm-rhel72-1 ~] $ pcs cluster verify cib.bad.xml
Error: invalid cib:
/tmp/tmpXWBTk5.pcs:8: element op: Relax-NG validity error : ID R redefined
Relax-NG validity error : Extra element operations in interleave
/tmp/tmpXWBTk5.pcs:7: element operations: Relax-NG validity error : Element primitive failed to validate content
/tmp/tmpXWBTk5.pcs:6: element primitive: Relax-NG validity error : Element resources has extra content: primitive
Errors found during check: config not valid
  -V may provide more details
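
The "ID R redefined" error arises because the `sed` edit gives the `op` element the same `id` as its parent `primitive`, and the CIB schema requires `id` values to be unique. A minimal standalone check for that one condition, assuming only the Python standard library (the function `duplicate_ids` is hypothetical and far less thorough than the Relax-NG validation pcs performs):

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_ids(cib_xml):
    """Return the sorted list of id values used by more than one element."""
    root = ET.fromstring(cib_xml)
    counts = Counter(
        el.get("id") for el in root.iter() if el.get("id") is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


# A fragment modelled on cib.good.xml from this comment.
GOOD_FRAGMENT = """<resources>
  <primitive class="ocf" id="R" provider="heartbeat" type="Dummy">
    <operations>
      <op id="R-monitor-interval-10" interval="10" name="monitor" timeout="20"/>
    </operations>
  </primitive>
</resources>"""

# Same substitution as the sed command used to create cib.bad.xml.
BAD_FRAGMENT = GOOD_FRAGMENT.replace('id="R-monitor-interval-10"', 'id="R"')
```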

> bad fencing topology

[vm-rhel72-1 ~] $ cp cib.good.xml cib.bad.fence.topology.xml
[vm-rhel72-1 ~] $ sed -i 's#</resources>#</resources><fencing-topology><fencing-level devices="FX" index="2" target="node1" id="fl-node1-2"/></fencing-topology>#' cib.bad.fence.topology.xml
[vm-rhel72-1 ~] $ pcs cluster verify cib.bad.fence.topology.xml
Error: Stonith resource(s) 'FX' do not exist
Error: Node 'node1' does not appear to exist in configuration
[vm-rhel72-1 ~] $ echo $?
1
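
The two fencing topology errors above correspond to reference checks that can be sketched as below. This is an illustrative approximation, not the pcs implementation: the function `check_fencing_topology` is hypothetical, and it assumes that nodes are listed as `<node uname="...">` elements and that `devices` is a comma-separated list.

```python
import xml.etree.ElementTree as ET


def check_fencing_topology(cib_xml):
    """Return error strings for fencing levels with dangling references."""
    root = ET.fromstring(cib_xml)
    stonith_ids = {
        p.get("id") for p in root.iter("primitive") if p.get("class") == "stonith"
    }
    node_names = {n.get("uname") for n in root.iter("node")}

    errors = []
    for level in root.iter("fencing-level"):
        devices = [d.strip() for d in level.get("devices", "").split(",") if d.strip()]
        missing = sorted(d for d in devices if d not in stonith_ids)
        if missing:
            errors.append(
                "Stonith resource(s) '{0}' do not exist".format("', '".join(missing))
            )
        target = level.get("target")
        if target is not None and target not in node_names:
            errors.append(
                "Node '{0}' does not appear to exist in configuration".format(target)
            )
    return errors


# Reduced version of cib.bad.fence.topology.xml from this comment.
SAMPLE = """<cib><configuration>
  <nodes/>
  <resources>
    <primitive class="stonith" id="xvm-fencing" type="fence_xvm"/>
  </resources>
  <fencing-topology>
    <fencing-level devices="FX" index="2" target="node1" id="fl-node1-2"/>
  </fencing-topology>
</configuration></cib>"""
```

Calling `check_fencing_topology(SAMPLE)` reports both the missing device `FX` and the unknown node `node1`, matching the two errors in the transcript.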

Comment 16 errata-xmlrpc 2018-04-10 15:37:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0866