Bug 813330 - Quorum dissolved after performing ntpdate
Summary: Quorum dissolved after performing ntpdate
Keywords:
Status: CLOSED DUPLICATE of bug 738468
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.6
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-04-17 14:10 UTC by Eugene S
Modified: 2013-02-22 19:49 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-22 19:49:31 UTC
Target Upstream Version:
Embargoed:



Description Eugene S 2012-04-17 14:10:21 UTC
Description of problem:
The following error message appears after running the ntpdate command on a cluster node:
[root@hsmsc50sfe1a ~]# 
Message from syslogd@hsmsc50sfe1a at Apr 17 12:40:59 ...
clurgmgrd[15696]: <emerg> #1: Quorum Dissolved

[root@hsmsc50sfe1a ~]# clustat
Service states unavailable: Operation requires quorum
Cluster Status for iSFE14081 @ Tue Apr 17 12:41:47 2012
Member Status: Inquorate

Member Name                                                     ID   Status
------ ----                                                     ---- ------
chb_sfe1a                                                           1 Online, Local
chb_sfe2a                                                           2 Offline

The other cluster node then becomes "Disallowed":
[root@hsmsc50sfe1a ~]# cman_tool nodes
NOTE: There are 1 disallowed nodes,
      members list may seem inconsistent across the cluster
Node  Sts   Inc   Joined               Name
   1   M    872   2012-03-25 08:43:07  chb_sfe1a
   2   d    876   2012-03-25 12:46:09  chb_sfe2a


Version-Release number of selected component (if applicable):


How reproducible:
Configure an NTP server on a cluster node, then run the ntpdate command on that node.

Steps to Reproduce:
1. Configure the NTP server.
2. Apply the relevant NTP configuration on the cluster node.
3. Stop the ntpd process.
4. Run ntpdate <NTP_server_IP>.
5. The following error then appears:
Message from syslogd@hsmsc50sfe1a at Apr 17 12:40:59 ...
clurgmgrd[15696]: <emerg> #1: Quorum Dissolved
  

Actual results:


Expected results:


Additional info:
Output of /var/log/messages:
Apr 17 12:36:19 hsmsc50sfe1a ntpd[10848]: 0 makes a poor control keyid
Apr 17 12:36:19 hsmsc50sfe1a ntpd[10848]: frequency initialized 0.000 PPM from /var/lib/ntp/drift
Apr 17 12:37:58 hsmsc50sfe1a ntpd[10848]: ntpd exiting on signal 15
Apr 17 12:40:57 hsmsc50sfe1a openais[5017]: [TOTEM] The token was lost in the OPERATIONAL state.
Apr 17 12:40:57 hsmsc50sfe1a openais[5017]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).
Apr 17 12:40:57 hsmsc50sfe1a openais[5017]: [TOTEM] Transmit multicast socket send buffer size (320000 bytes).
Apr 17 12:40:57 hsmsc50sfe1a openais[5017]: [TOTEM] entering GATHER state from 2.
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] entering GATHER state from 0.
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] Creating commit token because I am the rep.
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] Storing new sequence id for ring 370
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] entering COMMIT state.
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] entering RECOVERY state.
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] position [0] member 192.168.1.49:
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] previous ring seq 876 rep 192.168.1.49
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] aru 2e1 high delivered 2e1 received flag 1
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] Did not need to originate any messages in recovery.
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] Sending initial ORF token
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] CLM CONFIGURATION CHANGE
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] New Configuration:
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] #011r(0) ip(192.168.1.49)
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] Members Left:
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] #011r(0) ip(192.168.1.51)
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] Members Joined:
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CMAN ] quorum lost, blocking activity
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] CLM CONFIGURATION CHANGE
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] New Configuration:
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] #011r(0) ip(192.168.1.49)
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] Members Left:
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] Members Joined:
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [SYNC ] This node is within the primary component and will provide service.
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] entering OPERATIONAL state.
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] got nodejoin message 192.168.1.49
Apr 17 12:40:59 hsmsc50sfe1a clurgmgrd[15696]: <emerg> #1: Quorum Dissolved
Apr 17 12:40:59 hsmsc50sfe1a kernel: dlm: closing connection to node 2
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CPG  ] got joinlist message from node 1
Apr 17 12:40:59 hsmsc50sfe1a ccsd[5003]: Cluster is not quorate.  Refusing connection.
Apr 17 12:40:59 hsmsc50sfe1a ccsd[5003]: Error while processing connect: Connection refused
Apr 17 12:40:59 hsmsc50sfe1a ccsd[5003]: Invalid descriptor specified (-111).
Apr 17 12:40:59 hsmsc50sfe1a ccsd[5003]: Someone may be attempting something evil.
Apr 17 12:40:59 hsmsc50sfe1a ccsd[5003]: Error while processing get: Invalid request descriptor
Apr 17 12:40:59 hsmsc50sfe1a ccsd[5003]: Invalid descriptor specified (-111).
Apr 17 12:40:59 hsmsc50sfe1a ccsd[5003]: Someone may be attempting something evil.
Apr 17 12:40:59 hsmsc50sfe1a ccsd[5003]: Error while processing get: Invalid request descriptor
Apr 17 12:40:59 hsmsc50sfe1a ccsd[5003]: Invalid descriptor specified (-21).
Apr 17 12:40:59 hsmsc50sfe1a ccsd[5003]: Someone may be attempting something evil.
Apr 17 12:40:59 hsmsc50sfe1a ccsd[5003]: Error while processing disconnect: Invalid request descriptor
Apr 17 12:40:59 hsmsc50sfe1a ccsd[5003]: Cluster is not quorate.  Refusing connection.
Apr 17 12:40:59 hsmsc50sfe1a ccsd[5003]: Error while processing connect: Connection refused
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] entering GATHER state from 11.
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] Creating commit token because I am the rep.
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] Storing new sequence id for ring 374
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] entering COMMIT state.
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] entering RECOVERY state.
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] position [0] member 192.168.1.49:
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] previous ring seq 880 rep 192.168.1.49
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] aru e high delivered e received flag 1
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] position [1] member 192.168.1.51:
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] previous ring seq 880 rep 192.168.1.51
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] aru e high delivered e received flag 1
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] Did not need to originate any messages in recovery.
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [TOTEM] Sending initial ORF token
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] CLM CONFIGURATION CHANGE
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] New Configuration:
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] #011r(0) ip(192.168.1.49)
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] Members Left:
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] Members Joined:
Apr 17 12:40:59 hsmsc50sfe1a openais[5017]: [CLM  ] CLM CONFIGURATION CHANGE
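
For triage, the few lines that matter in an excerpt like the one above (token loss, membership change, quorum loss) can be filtered out of the surrounding noise. The following is a minimal sketch, assuming classic syslog-formatted lines as shown in this report; the function name key_events and the KEY_EVENTS list are illustrative choices and not part of cman or openais.

```python
import re

# Substrings copied from the log excerpt in this report; extend as needed.
KEY_EVENTS = (
    "The token was lost",
    "Members Left",
    "quorum lost",
    "Quorum Dissolved",
)

# Classic syslog layout: "Apr 17 12:40:57 host daemon[pid]: message"
# (the "[pid]" part is optional, e.g. for "kernel:").
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<proc>[^\[:]+)(?:\[\d+\])?:\s(?P<msg>.*)$"
)


def key_events(lines):
    """Return (timestamp, process, message) for lines matching KEY_EVENTS."""
    found = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m and any(k in m.group("msg") for k in KEY_EVENTS):
            found.append((m.group("ts"), m.group("proc"), m.group("msg")))
    return found
```

It could be run against the system log with something like `key_events(open("/var/log/messages"))`; comparing the resulting timestamps with the time ntpdate was run shows how closely the token loss follows the clock change.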

Comment 1 John Ha 2012-06-08 19:27:08 UTC
Cluster_Administration component is for the documentation package for Cluster Suite. Reassigning this to the cman component as it involves quorum in the cluster product itself.

Comment 2 Lon Hohberger 2013-02-22 19:49:31 UTC
This slipped through the cracks and was resolved by an openais erratum some time ago:

http://rhn.redhat.com/errata/RHBA-2012-0180.html

*** This bug has been marked as a duplicate of bug 738468 ***

