Bug 2028902

Summary: corosync-qdevice not being automatically enabled when declaring a quorum device in RHEL 8.5 cluster
Product: Red Hat Enterprise Linux 8
Reporter: Javier Blanco <jablanco>
Component: pcs
Assignee: Tomas Jelinek <tojeline>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: low
Docs Contact:
Priority: high
Version: 8.5
CC: cluster-maint, gianluca.cecchi, idevat, kmalyjur, mlisik, mmazoure, mpospisi, nhostako, omular, sbradley, svalasti, tojeline
Target Milestone: rc
Keywords: EasyFix, Regression, Triaged
Target Release: 8.6
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: pcs-0.10.12-2.el8
Doc Type: Bug Fix
Doc Text:
Cause: A user adds a quorum device to a cluster.
Consequence: pcs does not configure the cluster nodes to start the quorum device daemon on boot, so after a reboot the nodes do not communicate with the quorum device.
Fix: Fixed a bug that prevented pcs from correctly detecting whether corosync is configured to start on boot.
Result: When a quorum device is added to a cluster, the nodes are properly configured to start the quorum device daemon on boot.
Story Points: ---
Clone Of:
: 2032473
Environment:
Last Closed: 2022-05-10 14:50:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2032473    

Description Javier Blanco 2021-12-03 17:14:18 UTC
Description of problem:

When a quorum device is declared in a RHEL 8.5 cluster, the corosync-qdevice systemd service is not automatically enabled, as it is on RHEL 8.4.

Version-Release number of selected component (if applicable): Red Hat Enterprise Linux 8.5

How reproducible:

Steps to Reproduce:
1. Deploy a basic two-node cluster
2. Deploy a basic quorum device on another system
3. From a cluster node, authenticate to the pcsd daemon on the quorum device machine:

# pcs host auth host.example.net

4. Declare the quorum device on the cluster:

# pcs quorum device add model net host=host.example.net algorithm=lms

Actual results:

[root@fastvm-rhel-8-5-138 ~]# pcs quorum device add model net host=fastvm-rhel-8-4-137 algorithm=lms
Setting up qdevice certificates on nodes...
fastvm-rhel-8-5-138: Succeeded
fastvm-rhel-8-5-139: Succeeded
Enabling corosync-qdevice...                                                  
fastvm-rhel-8-5-139: not enabling corosync-qdevice: corosync is not enabled   --> HERE
fastvm-rhel-8-5-138: not enabling corosync-qdevice: corosync is not enabled   --> HERE
Sending updated corosync.conf to nodes...
fastvm-rhel-8-5-138: Succeeded
fastvm-rhel-8-5-139: Succeeded
fastvm-rhel-8-5-138: Corosync configuration reloaded
Starting corosync-qdevice...
fastvm-rhel-8-5-138: corosync-qdevice started
fastvm-rhel-8-5-139: corosync-qdevice started
[root@fastvm-rhel-8-5-138 ~]# 
[root@fastvm-rhel-8-5-138 ~]# systemctl is-enabled corosync-qdevice
disabled                                                                      --> HERE
[root@fastvm-rhel-8-5-138 ~]# 
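
To spot the affected nodes in a longer transcript, the "not enabling" lines above can be filtered out of the pcs output. This is only a sketch: the function name nodes_not_enabled is hypothetical, and the message format is assumed to match the transcript above, which may differ between pcs versions.

```shell
# Hypothetical helper, not part of pcs.
# Reads the output of `pcs quorum device add` on stdin and prints the name of
# every node for which pcs reported that it skipped enabling corosync-qdevice.
# Assumes lines of the form "<node>: not enabling corosync-qdevice: <reason>".
nodes_not_enabled() {
    sed -n 's/^\([^:]*\): not enabling corosync-qdevice:.*$/\1/p'
}
```

For example, piping the saved output of the command above through nodes_not_enabled would list fastvm-rhel-8-5-139 and fastvm-rhel-8-5-138, one per line.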

Expected results:

[root@fastvm-rhel-8-4-135 ~]# pcs quorum device add model net host=fastvm-rhel-8-4-137 algorithm=lms
Setting up qdevice certificates on nodes...
fastvm-rhel-8-4-135: Succeeded
fastvm-rhel-8-4-136: Succeeded
Enabling corosync-qdevice...
fastvm-rhel-8-4-135: corosync-qdevice enabled                                 --> HERE
fastvm-rhel-8-4-136: corosync-qdevice enabled                                 --> HERE
Sending updated corosync.conf to nodes...
fastvm-rhel-8-4-135: Succeeded
fastvm-rhel-8-4-136: Succeeded
fastvm-rhel-8-4-135: Corosync configuration reloaded
Starting corosync-qdevice...
fastvm-rhel-8-4-135: corosync-qdevice started
fastvm-rhel-8-4-136: corosync-qdevice started
[root@fastvm-rhel-8-4-135 ~]# 
[root@fastvm-rhel-8-4-135 ~]# systemctl is-enabled corosync-qdevice
enabled                                                                       --> HERE
[root@fastvm-rhel-8-4-135 ~]# 
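
The difference between the two transcripts can be reduced to a per-node check. The sketch below is an illustration only: classify_qdevice_state is a hypothetical name, and it assumes that a node is affected when corosync is enabled on boot but corosync-qdevice is not. It takes the two states as arguments so it can be used without touching systemd.

```shell
# Hypothetical helper, not part of pcs.
# $1 = output of `systemctl is-enabled corosync`
# $2 = output of `systemctl is-enabled corosync-qdevice`
# Prints "mismatch" when corosync starts on boot but corosync-qdevice does not
# (the state this report assumes for an affected node), otherwise "consistent".
classify_qdevice_state() {
    corosync_state=$1
    qdevice_state=$2
    if [ "$corosync_state" = "enabled" ] && [ "$qdevice_state" != "enabled" ]; then
        echo "mismatch"
    else
        echo "consistent"
    fi
}
```

On a cluster node it could be called as classify_qdevice_state "$(systemctl is-enabled corosync)" "$(systemctl is-enabled corosync-qdevice)". If it reports a mismatch, running systemctl enable corosync-qdevice on that node enables the unit by hand.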

Additional info:

n/a

Comment 4 Miroslav Lisik 2021-12-16 16:19:43 UTC
DevTestResults:

[root@r8-node-01 ~]# rpm -q pcs
pcs-0.10.12-2.el8.x86_64


### setup qdevice on a node which is not a cluster member

[root@r8-node-03 ~]# dnf -yq install corosync-qnetd

Installed:
  corosync-qnetd-3.0.1-1.el8.x86_64
  nss-tools-3.67.0-7.el8_5.x86_64

[root@r8-node-03 ~]# pcs qdevice setup model net --enable --start
Quorum device 'net' initialized

quorum device enabled
Starting quorum device...
quorum device started

[root@r8-node-03 ~]# systemctl show corosync-qnetd -p ActiveState -p UnitFileState
ActiveState=active
UnitFileState=enabled

### add qdevice to the enabled cluster

[root@r8-node-01 ~]# dnf -yq install corosync-qdevice

Installed:
  corosync-qdevice-3.0.1-1.el8.x86_64
  nss-tools-3.67.0-7.el8_5.x86_64

[root@r8-node-01 ~]# pcs cluster enable --all
r8-node-01: Cluster Enabled
r8-node-02: Cluster Enabled
[root@r8-node-01 ~]# pcs host auth -u hacluster -p password r8-node-03
r8-node-03: Authorized
[root@r8-node-01 ~]# pcs quorum device add model net host=r8-node-03 algorithm=lms
Setting up qdevice certificates on nodes...
r8-node-01: Succeeded
r8-node-02: Succeeded
Enabling corosync-qdevice...
r8-node-01: corosync-qdevice enabled
r8-node-02: corosync-qdevice enabled
Sending updated corosync.conf to nodes...
r8-node-01: Succeeded
r8-node-02: Succeeded
r8-node-01: Corosync configuration reloaded
Starting corosync-qdevice...
r8-node-01: corosync-qdevice started
r8-node-02: corosync-qdevice started

### check corosync-qdevice is enabled

[root@r8-node-01 ~]# systemctl show corosync-qdevice -p ActiveState -p UnitFileState
ActiveState=active
UnitFileState=enabled
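
The final check can be scripted by parsing the two properties printed by systemctl show. This is a minimal sketch, and the function name unit_active_and_enabled is hypothetical. It reads the KEY=VALUE lines on stdin and succeeds only when both states match the output above.

```shell
# Hypothetical helper, not part of pcs.
# Reads `systemctl show <unit> -p ActiveState -p UnitFileState` output on stdin.
# Returns 0 only if ActiveState=active and UnitFileState=enabled were both seen.
unit_active_and_enabled() {
    active=""
    enabled=""
    while IFS='=' read -r key value; do
        case "$key" in
            ActiveState)   active=$value ;;
            UnitFileState) enabled=$value ;;
        esac
    done
    [ "$active" = "active" ] && [ "$enabled" = "enabled" ]
}
```

Usage on a cluster node: systemctl show corosync-qdevice -p ActiveState -p UnitFileState | unit_active_and_enabled && echo OK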

Comment 10 errata-xmlrpc 2022-05-10 14:50:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (pcs bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:1978