Bug 500408 - qdisk failures when one node is killed in cluster
Summary: qdisk failures when one node is killed in cluster
Keywords:
Status: CLOSED DUPLICATE of bug 490147
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: cman
Version: 4
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-05-12 15:27 UTC by Corey Marthaler
Modified: 2009-05-12 18:05 UTC (History)
CC List: 2 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2009-05-12 18:05:01 UTC
Embargoed:


Attachments
log from taft-01 (60.21 KB, text/plain)
2009-05-12 15:36 UTC, Corey Marthaler
log from taft-02 (267.57 KB, text/plain)
2009-05-12 15:37 UTC, Corey Marthaler
log from taft-03 (267.46 KB, text/plain)
2009-05-12 15:37 UTC, Corey Marthaler
log from taft-04 (267.31 KB, text/plain)
2009-05-12 15:38 UTC, Corey Marthaler

Description Corey Marthaler 2009-05-12 15:27:51 UTC
Description of problem:
Twice now, while running revolver on the taft cluster (4 nodes) for the 4.8 regression run, qdiskd was unable to deal with a node being shot. Excerpt from taft-02:


May 12 09:59:19 taft-02 qdiskd[15296]: <warning> qdisk cycle took more than 1 second to complete (1.060000)
May 12 09:59:35 taft-02 qarshd[15549]: Talking to peer 10.15.89.99:57342
May 12 09:59:35 taft-02 qarshd[15549]: Running cmdline: cman_tool status
May 12 09:59:54 taft-02 qdiskd[15296]: <info> Assuming master role
May 12 09:59:55 taft-02 qdiskd[15296]: <notice> Writing eviction notice for node 1
May 12 09:59:55 taft-02 kernel: CMAN: removing node taft-01 from the cluster : Killed by another node
May 12 09:59:56 taft-02 qdiskd[15296]: <notice> Node 1 evicted
May 12 09:59:57 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 09:59:57 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 09:59:58 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 09:59:58 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 09:59:59 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 09:59:59 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 09:59:59 taft-02 qarshd[15563]: Talking to peer 10.15.89.99:57347
May 12 09:59:59 taft-02 qarshd[15563]: Running cmdline: cat /proc/cluster/nodes
May 12 10:00:00 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 10:00:00 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 10:00:01 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 10:00:01 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 10:00:01 taft-02 kernel: CMAN: Initiating transition, generation 7
May 12 10:00:01 taft-02 kernel: CMAN: Started transition, generation 7
May 12 10:00:01 taft-02 kernel: CMAN: Initiating transition, generation 8
May 12 10:00:02 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 10:00:02 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 10:00:03 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 10:00:03 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
[...]

The "Node 1 is undead" / "Writing eviction notice for node 1" pair above then repeats indefinitely, once per second.
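
For triage, the loop can be spotted in a saved log by counting the "is undead" lines per node. The following is only a minimal sketch: the function name count_undead is hypothetical, the pattern is taken from the log lines quoted above, and nothing here is part of qdiskd or cman.

```python
import re
from collections import Counter

# Pattern assumed from the quoted syslog lines, e.g.
# "... qdiskd[15296]: <crit> Node 1 is undead."
UNDEAD_RE = re.compile(r"qdiskd\[\d+\]: <crit> Node (\d+) is undead\.")


def count_undead(lines):
    """Return a Counter mapping node ID -> number of 'is undead' messages."""
    counts = Counter()
    for line in lines:
        match = UNDEAD_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Sample lines copied from the taft-02 excerpt in this report.
sample = [
    "May 12 09:59:56 taft-02 qdiskd[15296]: <notice> Node 1 evicted",
    "May 12 09:59:57 taft-02 qdiskd[15296]: <crit> Node 1 is undead.",
    "May 12 09:59:57 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1",
    "May 12 09:59:58 taft-02 qdiskd[15296]: <crit> Node 1 is undead.",
]

for node, hits in sorted(count_undead(sample).items()):
    print(f"node {node}: {hits} undead messages")
```

A count that keeps growing for a node already reported as evicted would match the behavior described here. To run it against one of the attached logs, pass an open file object instead of the sample list.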

I'll attach the logs.


Version-Release number of selected component (if applicable):
2.6.9-89.ELsmp
cman-1.0.27-1.el4

How reproducible:
2/2

Comment 1 Corey Marthaler 2009-05-12 15:36:51 UTC
Created attachment 343600
log from taft-01

Comment 2 Corey Marthaler 2009-05-12 15:37:25 UTC
Created attachment 343601
log from taft-02

Comment 3 Corey Marthaler 2009-05-12 15:37:55 UTC
Created attachment 343602
log from taft-03

Comment 4 Corey Marthaler 2009-05-12 15:38:22 UTC
Created attachment 343603
log from taft-04

Comment 5 Lon Hohberger 2009-05-12 18:05:01 UTC

*** This bug has been marked as a duplicate of bug 490147 ***

