
Bug 500408

Summary: qdisk failures when one node is killed in cluster
Product: [Retired] Red Hat Cluster Suite
Reporter: Corey Marthaler <cmarthal>
Component: cman
Assignee: Lon Hohberger <lhh>
Status: CLOSED DUPLICATE
QA Contact: Cluster QE <mspqa-list>
Severity: high
Priority: high
Version: 4
CC: cluster-maint, edamato
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-05-12 18:05:01 UTC
Attachments:
Description       Flags
log from taft-01  none
log from taft-02  none
log from taft-03  none
log from taft-04  none

Description Corey Marthaler 2009-05-12 15:27:51 UTC
Description of problem:
Twice now while running revolver on the taft cluster (4 nodes) for the 4.8 regression run, qdisk was unable to deal with a node being shot.


May 12 09:59:19 taft-02 qdiskd[15296]: <warning> qdisk cycle took more than 1 second to complete (1.060000)
May 12 09:59:35 taft-02 qarshd[15549]: Talking to peer 10.15.89.99:57342
May 12 09:59:35 taft-02 qarshd[15549]: Running cmdline: cman_tool status
May 12 09:59:54 taft-02 qdiskd[15296]: <info> Assuming master role
May 12 09:59:55 taft-02 qdiskd[15296]: <notice> Writing eviction notice for node 1
May 12 09:59:55 taft-02 kernel: CMAN: removing node taft-01 from the cluster : Killed by another node
May 12 09:59:56 taft-02 qdiskd[15296]: <notice> Node 1 evicted
May 12 09:59:57 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 09:59:57 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 09:59:58 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 09:59:58 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 09:59:59 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 09:59:59 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 09:59:59 taft-02 qarshd[15563]: Talking to peer 10.15.89.99:57347
May 12 09:59:59 taft-02 qarshd[15563]: Running cmdline: cat /proc/cluster/nodes
May 12 10:00:00 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 10:00:00 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 10:00:01 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 10:00:01 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 10:00:01 taft-02 kernel: CMAN: Initiating transition, generation 7
May 12 10:00:01 taft-02 kernel: CMAN: Started transition, generation 7
May 12 10:00:01 taft-02 kernel: CMAN: Initiating transition, generation 8
May 12 10:00:02 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 10:00:02 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 10:00:03 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 10:00:03 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
[...]

The above messages then scroll for eternity...

I'll attach the logs.
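
For anyone triaging the attached logs, here is a minimal sketch of one way to count how often qdiskd reports a node as undead. It assumes the syslog line format shown in the excerpt above; the names `QDISKD_LINE`, `UNDEAD`, and `count_undead` are made up for illustration and are not part of cman or qdiskd.

```python
import re
from collections import Counter

# Matches qdiskd syslog lines in the format shown in this report
# (assumed format: "<Mon> <day> <HH:MM:SS> <host> qdiskd[<pid>]: <level> <message>").
QDISKD_LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"qdiskd\[\d+\]: <(?P<level>\w+)> (?P<msg>.*)$"
)
# Matches the message text of the repeating "undead" report.
UNDEAD = re.compile(r"^Node (\d+) is undead\.$")


def count_undead(lines):
    """Return a Counter mapping node id to the number of 'is undead' messages."""
    counts = Counter()
    for line in lines:
        m = QDISKD_LINE.match(line.strip())
        if not m:
            continue  # not a qdiskd line (e.g. kernel or qarshd output)
        u = UNDEAD.match(m.group("msg"))
        if u:
            counts[int(u.group(1))] += 1
    return counts


# A few lines copied from the excerpt above.
sample = [
    "May 12 09:59:56 taft-02 qdiskd[15296]: <notice> Node 1 evicted",
    "May 12 09:59:57 taft-02 qdiskd[15296]: <crit> Node 1 is undead.",
    "May 12 09:59:57 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1",
    "May 12 09:59:58 taft-02 qdiskd[15296]: <crit> Node 1 is undead.",
]
print(count_undead(sample))
```

A count that keeps growing after the "Node 1 evicted" line would match the loop described here. To run it against a saved log, pass an open file object to `count_undead` in place of `sample`.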


Version-Release number of selected component (if applicable):
2.6.9-89.ELsmp
cman-1.0.27-1.el4

How reproducible:
2/2

Comment 1 Corey Marthaler 2009-05-12 15:36:51 UTC
Created attachment 343600
log from taft-01

Comment 2 Corey Marthaler 2009-05-12 15:37:25 UTC
Created attachment 343601
log from taft-02

Comment 3 Corey Marthaler 2009-05-12 15:37:55 UTC
Created attachment 343602
log from taft-03

Comment 4 Corey Marthaler 2009-05-12 15:38:22 UTC
Created attachment 343603
log from taft-04

Comment 5 Lon Hohberger 2009-05-12 18:05:01 UTC

*** This bug has been marked as a duplicate of bug 490147 ***