Cause:
In the pcs cluster setup command, the user enters IPv4 addresses when ip_version is set to ipv6, or IPv6 addresses when ip_version is set to ipv4.
Consequence:
Pcs creates a cluster that is unable to start because its addresses do not match the specified ip_version.
Fix:
Pcs checks that the addresses match the specified ip_version and exits with an error in case of a mismatch.
Result:
Instead of creating a cluster which cannot start, pcs exits with an explanatory error message.
Description of problem:
Currently, pcs does not validate node addresses with respect to the value of the ip_version option. This may lead to a situation where corosync does not start.
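For illustration, the missing check boils down to comparing each literal address against the requested family. A minimal sketch using Python's standard ipaddress module (a hypothetical helper, not pcs's actual code):

import ipaddress

def find_mismatched_addrs(addresses, ip_version):
    # Hypothetical helper: return addresses whose family does not match
    # the requested ip_version ("ipv4" or "ipv6").
    wanted = 4 if ip_version == "ipv4" else 6
    mismatched = []
    for addr in addresses:
        try:
            if ipaddress.ip_address(addr).version != wanted:
                mismatched.append(addr)
        except ValueError:
            # Not an IP literal (e.g. a host name); the family cannot be
            # decided here, so skip it.
            pass
    return mismatched

print(find_mismatched_addrs(["192.168.122.203"], "ipv6"))
# -> ['192.168.122.203']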
Version-Release number of selected component (if applicable):
pcs-0.10.1-2.el8
How reproducible:
always, easily
Steps to Reproduce:
[root@rh80-node3:~]# pcs cluster setup test rh80-node3 addr=192.168.122.203 transport knet ip_version=ipv6
Destroying cluster on hosts: 'rh80-node3'...
rh80-node3: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'rh80-node3'
rh80-node3: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'rh80-node3'
rh80-node3: successful distribution of the file 'corosync authkey'
rh80-node3: successful distribution of the file 'pacemaker authkey'
Synchronizing pcsd SSL certificates on nodes 'rh80-node3'...
rh80-node3: Success
Sending 'corosync.conf' to 'rh80-node3'
rh80-node3: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.
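Note that the setup completes because pcs writes the mismatched values into corosync.conf as given. A fragment of the kind generated here would look roughly like this (reconstructed for illustration, not copied from the affected node):

totem {
    version: 2
    cluster_name: test
    transport: knet
    ip_version: ipv6
}

nodelist {
    node {
        ring0_addr: 192.168.122.203
        name: rh80-node3
        nodeid: 1
    }
}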
Actual results:
[root@rh80-node3:~]# systemctl start corosync
Job for corosync.service failed because the control process exited with error code.
See "systemctl status corosync.service" and "journalctl -xe" for details.
[root@rh80-node3:~]# systemctl status corosync
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/usr/lib/systemd/system/corosync.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2019-01-17 11:22:28 CET; 6s ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
  Process: 1505 ExecStart=/usr/sbin/corosync -f $COROSYNC_OPTIONS (code=exited, status=8)
 Main PID: 1505 (code=exited, status=8)
Jan 17 11:22:28 rh80-node3 systemd[1]: Starting Corosync Cluster Engine...
Jan 17 11:22:28 rh80-node3 corosync[1505]: [MAIN ] Corosync Cluster Engine ('3.0.0'): started and ready to provide service.
Jan 17 11:22:28 rh80-node3 corosync[1505]: [MAIN ] Corosync built-in features: dbus systemd xmlconf snmp pie relro bindnow
Jan 17 11:22:28 rh80-node3 corosync[1505]: [MAIN ] failed to parse bindnet address '192.168.122.203'
Jan 17 11:22:28 rh80-node3 corosync[1505]: [MAIN ] Corosync Cluster Engine exiting with status 8 at main.c:1351.
Jan 17 11:22:28 rh80-node3 systemd[1]: corosync.service: Main process exited, code=exited, status=8/n/a
Jan 17 11:22:28 rh80-node3 systemd[1]: corosync.service: Failed with result 'exit-code'.
Jan 17 11:22:28 rh80-node3 systemd[1]: Failed to start Corosync Cluster Engine.
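The "failed to parse bindnet address" message is corosync refusing to interpret an IPv4 literal while running in IPv6 mode. The failure mode can be mimicked in Python (an analogy only, not corosync's actual code):

import socket

# Forcing an IPv4 literal through an IPv6 parse fails, which is in
# essence what corosync runs into when ip_version is ipv6.
try:
    socket.inet_pton(socket.AF_INET6, "192.168.122.203")
except OSError as exc:
    print(f"cannot parse as IPv6: {exc}")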
Expected results:
Pcs does not create the cluster; the cluster setup command fails with an error saying the node addresses are not compatible with the value of the ip_version option.
Created attachment 1539784
proposed fix + tests
After fix:
[root@rh80-node1:~]# pcs cluster setup test rh80-node3 addr=192.168.122.203 transport knet ip_version=ipv6
Error: Address '192.168.122.203' cannot be used in link '0' because the link uses IPv6 addresses
Error: Errors have occurred, therefore pcs is unable to continue
[root@rh80-node1:~]# echo $?
1
After fix:
[root@rhel81-node1 ~]# rpm -q pcs
pcs-0.10.1-6.el8.x86_64
[root@rhel81-node1 ~]# pcs cluster setup test rh81-1 addr=192.168.122.201 transport knet ip_version=ipv6
Error: Address '192.168.122.201' cannot be used in the link because the link uses IPv6 addresses
Error: Errors have occurred, therefore pcs is unable to continue
[root@rhel81-node1 ~]# echo $?
1
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:3311