Bug 1667053

Summary: pcs should check that corosync node addresses match the value of ip_version option
Product: Red Hat Enterprise Linux 8 Reporter: Tomas Jelinek <tojeline>
Component: pcs Assignee: Tomas Jelinek <tojeline>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: unspecified Docs Contact:
Priority: medium    
Version: 8.0 CC: cfeist, cluster-maint, idevat, mlisik, nhostako, omular, tojeline
Target Milestone: rc   
Target Release: 8.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: pcs-0.10.1-6.el8 Doc Type: Bug Fix
Doc Text:
Cause: A user enters IPv4 addresses when ip_version is set to ipv6, or IPv6 addresses when ip_version is set to ipv4, in the pcs cluster setup command.
Consequence: pcs creates a cluster that is unable to start because the addresses do not match the specified ip_version.
Fix: pcs checks that the addresses match the specified ip_version and exits with an error on a mismatch.
Result: Instead of creating a cluster that cannot start, pcs exits with an explanatory error message.
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-11-05 20:39:38 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1674005, 1682129    
Bug Blocks:    
Attachments:
Description: proposed fix + tests
Flags: none

Description Tomas Jelinek 2019-01-17 10:30:38 UTC
Description of problem:
Currently, pcs does not validate node addresses against the value of the ip_version option. As a result, pcs can create a cluster whose corosync.conf corosync refuses to start with.


Version-Release number of selected component (if applicable):
pcs-0.10.1-2.el8


How reproducible:
always, easily


Steps to Reproduce:
[root@rh80-node3:~]# pcs cluster setup test rh80-node3 addr=192.168.122.203 transport knet ip_version=ipv6
Destroying cluster on hosts: 'rh80-node3'...
rh80-node3: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'rh80-node3'
rh80-node3: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'rh80-node3'
rh80-node3: successful distribution of the file 'corosync authkey'
rh80-node3: successful distribution of the file 'pacemaker authkey'
Synchronizing pcsd SSL certificates on nodes 'rh80-node3'...
rh80-node3: Success
Sending 'corosync.conf' to 'rh80-node3'
rh80-node3: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.


Actual results:
[root@rh80-node3:~]# systemctl start corosync
Job for corosync.service failed because the control process exited with error code.                                                                                                                                                                                             
See "systemctl status corosync.service" and "journalctl -xe" for details.                                                                                                                                                                                                       
[root@rh80-node3:~]# systemctl status corosync                                                                                                                                                                                                                                
● corosync.service - Corosync Cluster Engine                                                                                                                                                                                                                                    
   Loaded: loaded (/usr/lib/systemd/system/corosync.service; disabled; vendor preset: disabled)                                                                                                                                                                                 
   Active: failed (Result: exit-code) since Thu 2019-01-17 11:22:28 CET; 6s ago                                                                                                                                                                                                 
     Docs: man:corosync                                                                                                                                                                                                                                                         
           man:corosync.conf                                                                                                                                                                                                                                                    
           man:corosync_overview                                                                                                                                                                                                                                                
  Process: 1505 ExecStart=/usr/sbin/corosync -f $COROSYNC_OPTIONS (code=exited, status=8)                                                                                                                                                                                       
 Main PID: 1505 (code=exited, status=8)                                                                                                                                                                                                                                         
                                                                                                                                                                                                                                                                                
Jan 17 11:22:28 rh80-node3 systemd[1]: Starting Corosync Cluster Engine...                                                                                                                                                                                                      
Jan 17 11:22:28 rh80-node3 corosync[1505]:   [MAIN  ] Corosync Cluster Engine ('3.0.0'): started and ready to provide service.                                                                                                                                                  
Jan 17 11:22:28 rh80-node3 corosync[1505]:   [MAIN  ] Corosync built-in features: dbus systemd xmlconf snmp pie relro bindnow                                                                                                                                                   
Jan 17 11:22:28 rh80-node3 corosync[1505]:   [MAIN  ] failed to parse bindnet address '192.168.122.203'                                                                                                                                                                         
Jan 17 11:22:28 rh80-node3 corosync[1505]:   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1351.                                                                                                                                                             
Jan 17 11:22:28 rh80-node3 systemd[1]: corosync.service: Main process exited, code=exited, status=8/n/a                                                                                                                                                                         
Jan 17 11:22:28 rh80-node3 systemd[1]: corosync.service: Failed with result 'exit-code'.                                                                                                                                                                                        
Jan 17 11:22:28 rh80-node3 systemd[1]: Failed to start Corosync Cluster Engine.


Expected results:
pcs does not create a cluster; the cluster setup command fails with an error stating that the node addresses are not compatible with the value of the ip_version option.
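
For illustration only, the kind of check requested here could be sketched as follows. This is not the code from the attached fix. The names check_addresses and ALLOWED_FAMILIES are hypothetical, the error wording is made up for the sketch, and it assumes that the ipv4-6 and ipv6-4 values accept either address family and that non-literal addresses (hostnames) are left unchecked.

```python
import ipaddress

# Assumed mapping from ip_version values to the IP families they permit.
# "ipv4-6" and "ipv6-4" are treated here as accepting both families.
ALLOWED_FAMILIES = {
    "ipv4": {4},
    "ipv6": {6},
    "ipv4-6": {4, 6},
    "ipv6-4": {4, 6},
}


def check_addresses(addresses, ip_version):
    """Return error strings for addresses that conflict with ip_version."""
    allowed = ALLOWED_FAMILIES[ip_version]
    errors = []
    for addr in addresses:
        try:
            family = ipaddress.ip_address(addr).version
        except ValueError:
            # Not an IP literal (for example a hostname); this sketch
            # does not try to resolve it and simply skips the check.
            continue
        if family not in allowed:
            errors.append(
                f"Address '{addr}' does not match ip_version '{ip_version}'"
            )
    return errors


if __name__ == "__main__":
    # The reproducer from this report: an IPv4 address with ip_version=ipv6.
    for message in check_addresses(["192.168.122.203"], "ipv6"):
        print("Error:", message)
```

With a check of this kind running before corosync.conf is distributed, the setup command can stop early instead of leaving behind a cluster that cannot start.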

Comment 4 Tomas Jelinek 2019-03-01 10:48:50 UTC
Created attachment 1539784
proposed fix + tests

After fix:
[root@rh80-node1:~]# pcs cluster setup test rh80-node3 addr=192.168.122.203 transport knet ip_version=ipv6
Error: Address '192.168.122.203' cannot be used in link '0' because the link uses IPv6 addresses
Error: Errors have occurred, therefore pcs is unable to continue
[root@rh80-node1:~]# echo $?
1

Comment 5 Ondrej Mular 2019-05-02 12:04:22 UTC
After fix:
[root@rhel81-node1 ~]# rpm -q pcs
pcs-0.10.1-6.el8.x86_64

[root@rhel81-node1 ~]# pcs cluster setup test rh81-1 addr=192.168.122.201 transport knet ip_version=ipv6
Error: Address '192.168.122.201' cannot be used in the link because the link uses IPv6 addresses
Error: Errors have occurred, therefore pcs is unable to continue
[root@rhel81-node1 ~]# echo $?
1

Comment 10 errata-xmlrpc 2019-11-05 20:39:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:3311