Bug 500408 - qdisk failures when one node is killed in cluster
Status: CLOSED DUPLICATE of bug 490147
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: cman
Version: 4
Hardware: All Linux
Priority: high
Severity: high
Assigned To: Lon Hohberger
QA Contact: Cluster QE
Reported: 2009-05-12 11:27 EDT by Corey Marthaler
Modified: 2009-05-12 14:05 EDT (History)

Doc Type: Bug Fix
Last Closed: 2009-05-12 14:05:01 EDT

Attachments
log from taft-01 (60.21 KB, text/plain), 2009-05-12 11:36 EDT, Corey Marthaler
log from taft-02 (267.57 KB, text/plain), 2009-05-12 11:37 EDT, Corey Marthaler
log from taft-03 (267.46 KB, text/plain), 2009-05-12 11:37 EDT, Corey Marthaler
log from taft-04 (267.31 KB, text/plain), 2009-05-12 11:38 EDT, Corey Marthaler

Description Corey Marthaler 2009-05-12 11:27:51 EDT
Description of problem:
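Twice now, while running revolver on the 4-node taft cluster for the 4.8 regression run, qdisk was unable to deal with a node being shot. It writes an eviction notice for the node, reports the node as evicted, then declares it undead and rewrites the eviction notice every second: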


May 12 09:59:19 taft-02 qdiskd[15296]: <warning> qdisk cycle took more than 1 second to complete (1.060000)
May 12 09:59:35 taft-02 qarshd[15549]: Talking to peer 10.15.89.99:57342
May 12 09:59:35 taft-02 qarshd[15549]: Running cmdline: cman_tool status
May 12 09:59:54 taft-02 qdiskd[15296]: <info> Assuming master role
May 12 09:59:55 taft-02 qdiskd[15296]: <notice> Writing eviction notice for node 1
May 12 09:59:55 taft-02 kernel: CMAN: removing node taft-01 from the cluster : Killed by another node
May 12 09:59:56 taft-02 qdiskd[15296]: <notice> Node 1 evicted
May 12 09:59:57 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 09:59:57 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 09:59:58 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 09:59:58 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 09:59:59 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 09:59:59 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 09:59:59 taft-02 qarshd[15563]: Talking to peer 10.15.89.99:57347
May 12 09:59:59 taft-02 qarshd[15563]: Running cmdline: cat /proc/cluster/nodes
May 12 10:00:00 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 10:00:00 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 10:00:01 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 10:00:01 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 10:00:01 taft-02 kernel: CMAN: Initiating transition, generation 7
May 12 10:00:01 taft-02 kernel: CMAN: Started transition, generation 7
May 12 10:00:01 taft-02 kernel: CMAN: Initiating transition, generation 8
May 12 10:00:02 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 10:00:02 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 10:00:03 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 10:00:03 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
[...]

The above messages then scroll for eternity...

I'll attach the logs.
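
For anyone triaging the attached logs, a quick way to confirm the loop is to count the "is undead" messages per node. The following is only a minimal sketch, not a tool from the cluster suite: the function name `count_undead` and the regular expression are assumptions based on the message format quoted above, and the sample lines are copied from the taft-02 excerpt.

```python
import re
from collections import Counter

# Assumed message format, taken from the excerpt in this report, e.g.
# "May 12 09:59:57 taft-02 qdiskd[15296]: <crit> Node 1 is undead."
UNDEAD_RE = re.compile(r"qdiskd\[\d+\]: <crit> Node (\d+) is undead\.")


def count_undead(lines):
    """Return a Counter mapping node ID to the number of 'undead' messages."""
    counts = Counter()
    for line in lines:
        match = UNDEAD_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# A few lines copied from the taft-02 log excerpt above.
SAMPLE = """\
May 12 09:59:56 taft-02 qdiskd[15296]: <notice> Node 1 evicted
May 12 09:59:57 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
May 12 09:59:57 taft-02 qdiskd[15296]: <alert> Writing eviction notice for node 1
May 12 09:59:58 taft-02 qdiskd[15296]: <crit> Node 1 is undead.
""".splitlines()

if __name__ == "__main__":
    for node, n in sorted(count_undead(SAMPLE).items()):
        print(f"node {node}: {n} undead messages")
```

To run it against a real log, pass an open file object instead of `SAMPLE`; a count that keeps growing long after the "Node 1 evicted" line would match the behavior described here.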


Version-Release number of selected component (if applicable):
2.6.9-89.ELsmp
cman-1.0.27-1.el4

How reproducible:
2/2
Comment 1 Corey Marthaler 2009-05-12 11:36:51 EDT
Created attachment 343600
log from taft-01
Comment 2 Corey Marthaler 2009-05-12 11:37:25 EDT
Created attachment 343601
log from taft-02
Comment 3 Corey Marthaler 2009-05-12 11:37:55 EDT
Created attachment 343602
log from taft-03
Comment 4 Corey Marthaler 2009-05-12 11:38:22 EDT
Created attachment 343603
log from taft-04
Comment 5 Lon Hohberger 2009-05-12 14:05:01 EDT

*** This bug has been marked as a duplicate of bug 490147 ***
