Bug 131002 - second and third mount attempts on recovered node hangs
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: gfs
Version: 4
Hardware: i686 Linux
Priority: medium
Severity: medium
Assigned To: David Teigland
QA Contact: GFS Bugs
Keywords: Reopened
Reported: 2004-08-26 11:54 EDT by Corey Marthaler
Modified: 2010-01-11 21:57 EST

Doc Type: Bug Fix
Last Closed: 2009-01-20 15:41:45 EST
Description Corey Marthaler 2004-08-26 11:54:35 EDT
Description of problem:
After morph-06 panicked, the cluster went through recovery and
continued with I/O. The filesystems remained accessible.

I then brought morph-06 back into the cluster and attempted to mount
the 5 filesystems, but after the first filesystem mounted successfully,
the remaining attempts hung.

How reproducible:
Didn't try
Comment 1 Corey Marthaler 2004-08-26 11:55:06 EDT
morph-01:
[root@morph-01 root]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[3 4 5 6 2 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[3 2 4 5 6 1]

DLM Lock Space:  "corey0"                            3   4 run       -
[3 4 5 6 2 1]

DLM Lock Space:  "corey1"                            5   6 run       -
[3 4 5 6 2 1]

DLM Lock Space:  "corey2"                            7   8 run       -
[3 4 5 6 2]

DLM Lock Space:  "corey3"                            9  10 run       -
[3 4 5 6 2]

DLM Lock Space:  "corey4"                           11  12 run       -
[3 4 5 6 2]

GFS Mount Group: "corey0"                            4   5 run       -
[3 4 5 6 2 1]

GFS Mount Group: "corey1"                            6   7 update   
U-4,1,1
[3 4 5 6 2 1]

GFS Mount Group: "corey2"                            8   9 run       -
[3 4 5 6 2]

GFS Mount Group: "corey3"                           10  11 run       -
[3 4 5 6 2]

GFS Mount Group: "corey4"                           12  13 run       -
[3 4 5 6 2]


morph-02:
[root@morph-02 root]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[3 4 5 6 2 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[3 2 4 5 6 1]

DLM Lock Space:  "corey0"                            3   4 run       -
[3 4 5 6 2 1]

DLM Lock Space:  "corey1"                            5   6 run       -
[3 4 5 6 2 1]

DLM Lock Space:  "corey2"                            7   8 run       -
[3 4 5 6 2]

DLM Lock Space:  "corey3"                            9  10 run       -
[3 4 5 6 2]

DLM Lock Space:  "corey4"                           11  12 run       -
[3 4 5 6 2]

GFS Mount Group: "corey0"                            4   5 run       -
[3 4 5 6 2 1]

GFS Mount Group: "corey1"                            6   7 update   
U-4,1,1
[3 4 5 6 2 1]

GFS Mount Group: "corey2"                            8   9 run       -
[3 4 5 6 2]

GFS Mount Group: "corey3"                           10  11 run       -
[3 4 5 6 2]

GFS Mount Group: "corey4"                           12  13 run       -
[3 4 5 6 2]


morph-03:
[root@morph-03 root]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[4 3 5 6 2 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[4 2 3 5 6 1]

DLM Lock Space:  "corey0"                            3   4 run       -
[4 3 5 6 2 1]

DLM Lock Space:  "corey1"                            5   6 run       -
[4 3 5 6 2 1]

DLM Lock Space:  "corey2"                            7   8 run       -
[4 3 5 6 2]

DLM Lock Space:  "corey3"                            9  10 run       -
[4 3 5 6 2]

DLM Lock Space:  "corey4"                           11  12 run       -
[4 3 5 6 2]

GFS Mount Group: "corey0"                            4   5 run       -
[4 3 5 6 2 1]

GFS Mount Group: "corey1"                            6   7 update   
U-4,1,1
[4 3 5 6 2 1]

GFS Mount Group: "corey2"                            8   9 run       -
[4 3 5 6 2]

GFS Mount Group: "corey3"                           10  11 run       -
[4 3 5 6 2]

GFS Mount Group: "corey4"                           12  13 run       -
[4 3 5 6 2]


morph-04:
[root@morph-04 root]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[5 3 4 6 2 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[5 3 2 4 6 1]

DLM Lock Space:  "corey0"                            3   4 run       -
[5 3 4 6 2 1]

DLM Lock Space:  "corey1"                            5   6 run       -
[5 3 4 6 2 1]

DLM Lock Space:  "corey2"                            7   8 run       -
[5 3 4 6 2]

DLM Lock Space:  "corey3"                            9  10 run       -
[5 3 4 6 2]

DLM Lock Space:  "corey4"                           11  12 run       -
[5 3 4 6 2]

GFS Mount Group: "corey0"                            4   5 run       -
[5 3 4 6 2 1]

GFS Mount Group: "corey1"                            6   7 update   
U-4,1,1
[5 3 4 6 2 1]

GFS Mount Group: "corey2"                            8   9 run       -
[5 3 4 6 2]

GFS Mount Group: "corey3"                           10  11 run       -
[5 3 4 6 2]

GFS Mount Group: "corey4"                           12  13 run       -
[5 3 4 6 2]


morph-05:
[root@morph-05 root]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[2 3 4 5 6 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[2 3 4 5 6 1]

DLM Lock Space:  "corey0"                            3   4 run       -
[2 3 4 5 6 1]

DLM Lock Space:  "corey1"                            5   6 run       -
[2 3 4 5 6 1]

DLM Lock Space:  "corey2"                            7   8 run       -
[2 3 4 5 6]

DLM Lock Space:  "corey3"                            9  10 run       -
[2 3 4 5 6]

DLM Lock Space:  "corey4"                           11  12 run       -
[2 3 4 5 6]

GFS Mount Group: "corey0"                            4   5 run       -
[2 3 4 5 6 1]

GFS Mount Group: "corey1"                            6   7 update   
U-4,1,1
[2 3 4 5 6 1]

GFS Mount Group: "corey2"                            8   9 run       -
[2 3 4 5 6]

GFS Mount Group: "corey3"                           10  11 run       -
[2 3 4 5 6]

GFS Mount Group: "corey4"                           12  13 run       -
[2 3 4 5 6]


morph-06:
[root@morph-06 root]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[2 3 4 5 6 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[2 3 4 5 6 1]

DLM Lock Space:  "corey0"                            3   4 run       -
[2 3 4 5 6 1]

DLM Lock Space:  "corey1"                            5   6 run       -
[2 3 4 5 6 1]

GFS Mount Group: "corey0"                            4   5 run       -
[2 3 4 5 6 1]

GFS Mount Group: "corey1"                            6   7 join     
S-6,20,6
[2 3 4 5 6 1]
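
Picking out the odd service from six dumps by eye is error-prone. As a rough aid, the sketch below lists every service in a /proc/cluster/services dump whose State column is anything other than "run". The function name stuck_services is hypothetical, and the parsing assumes the column layout shown in the dumps above (quoted name, then GID, LID, State).

```shell
#!/bin/bash
# Hypothetical helper, not part of the cluster tools: print the type, name
# and state of every service that is not in the "run" state.
# Reads /proc/cluster/services by default, or a saved dump given as $1.
stuck_services() {
    awk '
        # Service lines carry a quoted name, e.g.
        #   GFS Mount Group: "corey1"    6   7 update
        /^[A-Za-z ]+: +"[^"]+"/ {
            type = $0; sub(/: +".*/, "", type)             # text before the colon
            name = $0; sub(/^[^"]*"/, "", name); sub(/".*/, "", name)
            rest = $0; sub(/^[^"]*"[^"]*"/, "", rest)      # GID LID State [Code]
            split(rest, f, " ")
            if (f[3] != "run")
                printf "%s\t%s\t%s\n", type, name, f[3]
        }
    ' "${1:-/proc/cluster/services}"
}
```

Run on each node as `stuck_services`, or as `stuck_services saved-dump.txt` against a saved copy of the output.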
Comment 2 Corey Marthaler 2004-08-27 16:26:15 EDT
I was able to reproduce this mount hang using revolver and by just
shooting one node.
Comment 3 Corey Marthaler 2004-08-27 16:28:44 EDT
[root@morph-06 root]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[5 4 3 2 6 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[5 4 3 2 6 1]

DLM Lock Space:  "corey0"                            3   4 run       -
[5 4 3 2 6 1]

DLM Lock Space:  "corey1"                            5   6 run       -
[5 4 3 2 6 1]

DLM Lock Space:  "corey2"                            7   8 run       -
[5 4 3 2 6 1]

GFS Mount Group: "corey0"                            4   5 run       -
[5 4 3 2 6 1]

GFS Mount Group: "corey1"                            6   7 run       -
[5 4 3 2 6 1]

GFS Mount Group: "corey2"                            8   9 join     
S-6,20,6
[5 4 3 2 6 1]


Comment 4 David Teigland 2004-10-26 01:53:02 EDT
I recently fixed a dlm bug that could cause any gfs mount to hang.
It could be the culprit here.
Comment 5 Corey Marthaler 2004-10-29 17:40:10 EDT
Unable to reproduce. Marking fixed.
Comment 6 Kiersten (Kerri) Anderson 2004-11-16 14:02:40 EST
Updating version to the right level in the defects.  Sorry for the storm.
Comment 7 Wade Mealing 2007-08-02 03:27:29 EDT
I think I have been able to reproduce this. I ran the script below on each
of the nodes. After about 5 hours, two of the nodes were still able to mount
and unmount, but one was not: it hung while mounting. Approximately one hour
later, a second node in my three-node cluster hung while unmounting.

Admittedly, this is a bit brutish, but I think it may expose the same
problem. I have no access to revolver.

Kernel 2.6.9-55.0.2.EL and related packages.

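#!/bin/bash

# Mount and unmount the GFS filesystem in an endless loop.
# The loop only stops when interrupted or when mount/umount hangs.
while true
do
    echo "Mounting ... "
    mount -t gfs /dev/hdb1 /mnt/test
    echo "Unmounting ..."
    umount /mnt/test
done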
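
A possible variant of the loop above, offered only as a sketch: it logs an iteration count and timestamp before each step, so the last log line shows when and where a hang occurred. The function name stress_loop and the DEV, MNT, MOUNT, UMOUNT and MAX variables are hypothetical additions, not part of the original script.

```shell
#!/bin/bash
# Sketch only: the same mount/umount cycle with per-step logging.
# All variable names here are assumptions made for illustration.
DEV=${DEV:-/dev/hdb1}
MNT=${MNT:-/mnt/test}
MOUNT=${MOUNT:-mount}      # overridable, e.g. for a dry run
UMOUNT=${UMOUNT:-umount}
MAX=${MAX:-0}              # 0 means loop forever, like the original

stress_loop() {
    local n=0
    while [ "$MAX" -eq 0 ] || [ "$n" -lt "$MAX" ]; do
        n=$((n + 1))
        echo "$(date '+%Y-%m-%d %H:%M:%S') iteration $n: mounting"
        "$MOUNT" -t gfs "$DEV" "$MNT" || echo "mount failed (exit $?)"
        echo "$(date '+%Y-%m-%d %H:%M:%S') iteration $n: unmounting"
        "$UMOUNT" "$MNT" || echo "umount failed (exit $?)"
    done
}
```

Calling `stress_loop` with the defaults behaves like the original loop; if a step hangs, the final line of output names the iteration and the step that never returned.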


Comment 8 David Teigland 2009-01-20 15:41:45 EST
Closing again; this was fixed and closed in 2004.
Comment 7 would have been something different.
