Bug 1659389

Summary: Consider storing resolved IP address into corosync.conf ringX_addr field
Product: Red Hat Enterprise Linux 8
Reporter: Jan Friesse <jfriesse>
Component: pcs
Assignee: Tomas Jelinek <tojeline>
Status: CLOSED NOTABUG
QA Contact: cluster-qe <cluster-qe>
Severity: high
Priority: high
Version: 8.0
CC: ccaulfie, cfeist, cluster-maint, idevat, kgaillot, michele, mmcgrath, omular, rsteiger, tojeline
Target Milestone: rc
Target Release: 8.0
Hardware: All
OS: All
Last Closed: 2019-01-14 16:22:59 UTC
Type: Bug

Description Jan Friesse 2018-12-14 09:01:06 UTC
Description of problem:

As described in bug 1659269 and bug 1654630, corosync has quite a few problems with resolving hostnames reliably. We should now have a fairly good solution that works most of the time, but it's still not perfect and may require changes in corosync.conf.

Basically, there are two problems that contradict each other:
- Corosync doesn't force IP order - but then we hit bug 1654630
- Corosync forces IP order - but then we hit bug 1659269

A pseudo-solution would be to define the order as IPv4 then IPv6 (because most people use IPv4 as their primary IP family), but then IPv6 setups would be hit.

pcs cluster setup has the big advantage of being executed on a fully booted system, so it doesn't hit bug 1654630. It can also check that the list of all nodes' IPs is equal on all nodes (or at least this was my impression from the Monday 10.12.2008 call). This also means that pcs is already doing the resolving.

Having said all of the above, I would like to ask you to consider storing resolved IPs in corosync.conf for the ringX_addr fields (of course with an option to disable this new default behavior and store the hostname instead).


Actual results:
# pcs cluster setup test node_name

corosync.conf:

    node {
        ring0_addr: node_name
        name: node_name
        nodeid: 1
    }

Expected results:

corosync.conf:

    node {
        ring0_addr: node_ip
        name: node_name
        nodeid: 1
    }
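
To illustrate the request, here is a minimal Python sketch of resolving a node name at setup time and writing the resulting IP into ring0_addr. It is not pcs code: resolve_addresses and render_node are hypothetical helper names, and the sketch assumes the resolver on the machine running the setup gives a usable answer.

```python
import socket


def resolve_addresses(host, family=socket.AF_UNSPEC):
    """Return the unique IP addresses `host` resolves to, in resolver order."""
    infos = socket.getaddrinfo(host, None, family=family, type=socket.SOCK_DGRAM)
    addresses = []
    for info in infos:
        addr = info[4][0]  # sockaddr tuple: the first item is the address string
        if addr not in addresses:
            addresses.append(addr)
    return addresses


def render_node(name, nodeid, ring0_addr):
    """Render one corosync.conf node block (hypothetical helper, not pcs code)."""
    return (
        "node {\n"
        f"    ring0_addr: {ring0_addr}\n"
        f"    name: {name}\n"
        f"    nodeid: {nodeid}\n"
        "}\n"
    )


if __name__ == "__main__":
    # "localhost" stands in for node_name so the example runs anywhere.
    addrs = resolve_addresses("localhost")
    print(render_node("node_name", 1, addrs[0]))
```

Passing socket.AF_INET or socket.AF_INET6 as family restricts the result to one IP family, which is where the ordering problem described above shows up.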

Comment 1 Jan Friesse 2018-12-14 09:03:53 UTC
(typo fix: Monday 10.12.2018, 10.12.2008 was Wednesday and no pcs existed ;) )

Comment 2 Tomas Jelinek 2019-01-07 12:15:56 UTC
*** Bug 1659269 has been marked as a duplicate of this bug. ***

Comment 4 Tomas Jelinek 2019-01-08 12:21:22 UTC
Proposed solution draft:

1) Cluster setup:
* If no address is specified for a node in the 'cluster setup' command, use the address specified for that node in the 'host auth' command. (no change here with respect to the current state)
* If that address is not an IP address, resolve it on all future cluster nodes. From the resulting IP addresses, pick one that is common to all nodes. If no such IP is found, report an error.
* The IP address obtained from the resolving will be either IPv6 or IPv4 depending on other nodes' addresses (specified and default) and ip_version option if specified. If pcs is unable to find IPs for nodes so that a single IP family is used in a ring, then pcs exits with an appropriate error.

2) Node add:
Similar to 'cluster setup', except that the required IP family is determined by the existing cluster nodes. If an address of the required family cannot be found for a new node, pcs will exit with an appropriate error.
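
The selection logic in the draft above could be sketched roughly as follows in Python. This is only an illustration under assumptions, not the pcs implementation: the function names, the shape of the input (for each node name, one list of resolved IPs per resolving node) and the plain 4/6 values for ip_version are all hypothetical, and trying IPv4 before IPv6 when no version is given is an arbitrary choice.

```python
import ipaddress


def common_addresses(views):
    """Return addresses present in every node's view, in the first view's order."""
    if not views:
        return []
    shared = set(views[0]).intersection(*views[1:])
    return [addr for addr in views[0] if addr in shared]


def pick_ring_addresses(resolved, ip_version=None):
    """Pick one IP per node so that the whole ring uses a single IP family.

    resolved: {node_name: [ips_as_seen_by_node_1, ips_as_seen_by_node_2, ...]}
    ip_version: 4, 6, or None (assumption: None means try IPv4, then IPv6).
    """
    candidates = {}
    for name, views in resolved.items():
        candidates[name] = common_addresses(views)
        if not candidates[name]:
            raise ValueError(f"no IP for '{name}' is common to all nodes")

    for version in ([ip_version] if ip_version else [4, 6]):
        chosen = {}
        for name, addrs in candidates.items():
            match = [a for a in addrs if ipaddress.ip_address(a).version == version]
            if not match:
                break  # this family does not cover every node
            chosen[name] = match[0]
        else:
            return chosen
    raise ValueError("unable to find addresses of a single IP family for all nodes")
```

For 'node add', the same function could be called with ip_version fixed to the family already used by the existing cluster nodes.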

Comment 8 Tomas Jelinek 2019-01-14 16:22:59 UTC
The issue has been resolved at corosync level in bz1665211. There is no need to address this in pcs any longer.