Bug 1317573

Summary: coroparse does not understand the config hierarchy
Product: Red Hat Enterprise Linux 7 Reporter: Christine Caulfield <ccaulfie>
Component: corosyncAssignee: Jan Friesse <jfriesse>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.3CC: ccaulfie, cfeist, cluster-maint, jkortus, tojeline
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: corosync-2.3.6-1.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-04 06:49:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Christine Caulfield 2016-03-14 15:16:25 UTC
Description of problem:

The config parsing code code in coroparse is a state machine, whereas the format of the config file itself is a hierarchy. If an unknown section is entered into the file, the parser can get confused and end up in the wrong part of the hierarchy. 

The way this is 'handled' is for coroparse to have to know every possible thing that can ever go into the config file regardless of whether it needs a custom data type or not. If a single one is missed (or an unsupported one added by accident either by an inexpert user or simply a typo) then coroparse can lose track of where it is in the file state and key entries can be entered into cmap without the correct datatype. meaning they also will not be pulled out again when requested with the right datatype as those types will not match

Version-Release number of selected component (if applicable):
Spotted in 7.2 but it's been there since coroparse.c

How reproducible:
Every time

Steps to Reproduce:
1. Set up a corosync.conf with a section similar to the following:
quorum {
    provider: corosync_votequorum
    expected_votes: 3
    oops {
    }
    wait_for_all: 1
}

2. start corosync
3. corosync-cmapctl | grep quorum

Actual results:
you'll see that quorum.wait_for_all has the type (str) instead of the expected (u8)


Expected results:
wait_for_all (and anything else after an unknown section) should have the correct datatype


Additional info:

Comment 2 Christine Caulfield 2016-04-25 13:33:57 UTC
commit aab55a004bb12ebe78db341dc56759dfe710c1b2
Author: Christine Caulfield <ccaulfie>
Date:   Thu Apr 21 13:47:47 2016 +0100

    parser: Make config file parser more hierarchy aware

Comment 3 Christine Caulfield 2016-06-07 13:06:39 UTC
Also:

commit f2a1fcc5bf2f2bb7e3f89b1304f100bee677dd9b
Author: Christine Caulfield <ccaulfie>
Date:   Fri May 27 11:39:52 2016 +0100

    logconfig: Fix logging reload disabling logfiles

Comment 7 Jaroslav Kortus 2016-09-14 17:21:19 UTC
with valid pcs config:
[root@virt-247 ~]# corosync-cmapctl  | grep -E '^totem'
totem.cluster_name (str) = STSRHTS13333
totem.interface.0.bindnetaddr (str) = virt-247
totem.interface.0.mcastaddr (str) = ff15::296d
totem.interface.0.mcastport (u16) = 5405
totem.ip_version (str) = ipv6
totem.secauth (str) = off
totem.transport (str) = udp
totem.version (u32) = 2

with:
totem { 
    oops { 
    } 
    version: 2
    secauth: off
    cluster_name: STSRHTS13333
    transport: udp
    ip_version: ipv6

}
[root@virt-247 ~]# corosync-cmapctl  | grep -E '^totem'
totem.cluster_name (str) = STSRHTS13333
totem.interface.0.bindnetaddr (str) = virt-247
totem.interface.0.mcastaddr (str) = ff15::296d
totem.interface.0.mcastport (u16) = 5405
totem.ip_version (str) = ipv6
totem.secauth (str) = off
totem.transport (str) = udp
totem.version (u32) = 2


Looks good.
Nesting blocks works as expected, nesting them with conflicting names works as expected too (assuming that (str) is correct default):
totem { 
    oops { 
	oops: 1
    } 
    oops {
        oops_backup: 2
        oops: 4
    }
    version: 2
    secauth: off
    cluster_name: STSRHTS13333
    transport: udp
    ip_version: ipv6

}
[root@virt-247 ~]# corosync-cmapctl  | grep -E '^totem'
totem.cluster_name (str) = STSRHTS13333
totem.interface.0.bindnetaddr (str) = virt-247
totem.interface.0.mcastaddr (str) = ff15::296d
totem.interface.0.mcastport (u16) = 5405
totem.ip_version (str) = ipv6
totem.oops.oops (str) = 4
totem.oops.oops_backup (str) = 2
totem.secauth (str) = off
totem.transport (str) = udp
totem.version (u32) = 2





And now some things that are a bit suspicious:
reported OK:

totem {
    version: 2
    secauth: off
    cluster_name: STSRHTS13333
    transport: udp
    ip_version: ipv6
}}}}}}}}}}

reported not OK:
totem {
    version: 2
    secauth: off
    cluster_name: STSRHTS13333
    transport: udp
    ip_version: ipv6
}
}

Does not look like closing brace missing:
totem {
    version: 2
    secauth: off
    cluster_name: STSRHTS13333
    transport: udp
    ip_version: ipv6
{{}}}
}
# corosync -f
error   [MAIN  ] parser error: Missing closing brace
error   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1230.

There's some issue visible in how braces and newlines combinations are parsed. It might be worth fixing too. I haven't found a note in man page about any mandatory format. In fact constructs like totem { } (one-line) are used there, suggesting you don't have to put newlines before braces to be accepted.

Considering this was acting the same way in previous versions, it is no regression requiring any immediate action. Also, users should modify corosync.conf via pcs and not directly. I'd suggest to fix it though, perhaps next time you run through that code? :)

Marking as verified with corosync-2.4.0-4.el7.x86_64

Comment 8 Jan Friesse 2016-09-15 06:04:38 UTC
Jardo,
nice catch with the multiple braces on one line. Actual behavior is really far from perfect and it can be described like this:

read line:
  Is line empty or comment - continue
  Line contains { character - start new section + continue
  Line contains : character - it's key/value pait + continue
  Line contains } character - start end section

So it's really easy right now to confuse parser and we should really move to something more context-free than what we have now. I don't think it's a good material for Corosync 2.x/RHEL 7, but for sure good for Corosync 3.x, so I've created upstream issue (https://github.com/corosync/corosync/issues/150) just to make sure this bz don't get lost.

Comment 9 Tomas Jelinek 2016-09-15 08:59:08 UTC
Honzo,

Can you let us know when this is fixed so we can update corosync.conf parser in pcs? Parser in pcs currently mimics corosync's parser behavior in order to work with any config that corosync is able to parse.

Thanks.

Comment 10 Jan Friesse 2016-09-15 13:39:34 UTC
(In reply to Tomas Jelinek from comment #9)
> Honzo,
> 
> Can you let us know when this is fixed so we can update corosync.conf parser
> in pcs? 

I'll try not to forget.

> Parser in pcs currently mimics corosync's parser behavior in order
> to work with any config that corosync is able to parse.
> 
> Thanks.

Comment 12 errata-xmlrpc 2016-11-04 06:49:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2463.html