Bug 147374 - 3 node loss of quorum: can still read/write GFS
Status: CLOSED NOTABUG
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: gfs
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Ken Preslan
QA Contact: GFS Bugs
 
Reported: 2005-02-07 13:28 EST by Derek Anderson
Modified: 2010-01-11 22:03 EST

Last Closed: 2005-02-08 11:22:04 EST

Attachments: None
Description Derek Anderson 2005-02-07 13:28:29 EST
Description of problem:
I have a 3-node cluster running ccsd/cman/fence/dlm/clvmd/gfs, with one
GFS filesystem using lock_dlm locking mounted on all three nodes.  Take
out 2 of the 3 nodes.  The remaining node reports "Activity blocked",
yet it can still read and write the GFS.  The fence domain and clvmd
lock space are in "recovery", but the filesystem lock space remains in
the "run" state.

Maybe this is expected behavior since there is only a single mounter
left running?  I only have 3 nodes set up with RHEL4, so maybe
corey/dean can try to see what happens when quorum is lost while
multiple mounters are left alive.

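The clu_state command used in the captures below appears to be a local
wrapper; a rough sketch of gathering the same information with the
standard RHEL4 cman tooling (assuming those interfaces are present):

### Inspecting cluster and service state by hand (sketch)...
cman_tool status      # membership, votes, quorum (or: cat /proc/cluster/status)
cman_tool services    # fence domain, DLM lock spaces, GFS mount groups
                      # (or: cat /proc/cluster/services)
mount -t gfs          # which GFS filesystems are mounted, and with what options
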
### The GFS...
/dev/VG1/LV1 on /mnt/gfs1 type gfs (rw,noatime,nodiratime)

### Cluster state while quorate...
Node  Votes Exp Sts  Name
   1    1    3   M   link-12
   2    1    3   M   link-11
   3    1    3   M   link-10
Protocol version: 5.0.1
Config version: 2
Cluster name: MILTON
Cluster ID: 4812
Membership state: Cluster-Member
Nodes: 3
Expected_votes: 3
Total_votes: 3
Quorum: 2
Active subsystems: 6
Node addresses: 192.168.44.162

Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[1 3 2]

DLM Lock Space:  "clvmd"                             2   3 run       -
[1 2 3]

DLM Lock Space:  "data1"                            10  10 run       -
[1 2 3]

GFS Mount Group: "data1"                            11  11 run       -
[1 2 3]

### Cluster state after killing link-10 and link-11...
[root@link-12 gfs1]# clu_state
Node  Votes Exp Sts  Name
   1    1    3   M   link-12
   2    1    3   X   link-11
   3    1    3   X   link-10
Protocol version: 5.0.1
Config version: 2
Cluster name: MILTON
Cluster ID: 4812
Membership state: Cluster-Member
Nodes: 1
Expected_votes: 3
Total_votes: 1
Quorum: 2  Activity blocked
Active subsystems: 6
Node addresses: 192.168.44.162

Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 recover 0 -
[1]

DLM Lock Space:  "clvmd"                             2   3 recover 0 -
[1]

DLM Lock Space:  "data1"                            10  10 run       -
[1]

GFS Mount Group: "data1"                            11  11 run       -
[1]
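
The "Activity blocked" flag follows from the vote arithmetic; a minimal
sketch of the numbers above, assuming cman's usual simple-majority
quorum rule:

### Quorum arithmetic (sketch)...
expected_votes=3
quorum=$(( expected_votes / 2 + 1 ))   # -> 2, matching "Quorum: 2" above
total_votes=1                          # only link-12 still contributes a vote
if [ "$total_votes" -lt "$quorum" ]; then
    echo "inquorate: activity blocked"
fi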

### Access the GFS...
[root@link-12 gfs1]# ls
link-10  link-11  link-12
[root@link-12 gfs1]# echo "sdlfkjskldfjlak" >newfile
[root@link-12 gfs1]# ls
link-10  link-11  link-12  newfile
[root@link-12 gfs1]# create_missing -n 10 -d . -v
directory: .
filesize:  16K
number:    10
blocksize: 1
Created ./file1
Created ./file2
Created ./file3
Created ./file4
Created ./file5
Created ./file6
Created ./file7
Created ./file8
Created ./file9
Created ./file10
10 files created
[root@link-12 gfs1]# ls
file1   file2  file4  file6  file8  link-10  link-12
file10  file3  file5  file7  file9  link-11  newfile
[root@link-12 gfs1]# pwd
/mnt/gfs1
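
create_missing appears to be a test helper that writes a number of
fixed-size files; a hypothetical shell equivalent, purely for
illustration (the real tool's options and output may differ):

### Hypothetical stand-in for create_missing (illustration only)...
for i in $(seq 1 10); do
    dd if=/dev/zero of=./file$i bs=1k count=16 2>/dev/null   # 16K per file
    echo "Created ./file$i"
done
echo "10 files created"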

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Set up a 3-node cluster running ccsd/cman/fence/dlm/clvmd/gfs and
   mount one GFS filesystem with lock_dlm locking on all nodes.
2. Take out 2 of the 3 nodes so the cluster loses quorum.
3. Read and write the GFS filesystem on the surviving node.

Actual results:
The surviving node reports "Activity blocked" and the fence domain and
clvmd lock space go into recovery, but the GFS filesystem can still be
read and written.

Expected results:
GFS access on the surviving node would be blocked along with other
cluster activity while the cluster is inquorate.

Additional info:
Comment 1 Derek Anderson 2005-02-07 13:51:21 EST
My mistake.  In the cases where GFS activity was allowed after quorum
loss, the remaining node was the only one to have mounted the GFS, so
there were no other journals to replay when the other nodes died.  Is
this OK, even though the cluster status is "Activity blocked"?  If so,
this can be closed as NOTABUG.
Comment 2 David Teigland 2005-02-08 10:12:51 EST
That's correct, not-a-bug.
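
For context on comment 1: the member list under the "GFS Mount Group"
line in the service output shows which nodes have the filesystem
mounted, i.e. which nodes hold a journal that would need replaying if
they died.  A quick way to check this before killing nodes (a sketch,
assuming the same output layout as shown above):

### Counting the mounters of "data1" (sketch)...
cman_tool services | grep -A1 'GFS Mount Group'
# or: cat /proc/cluster/services
# A member list of [1] means only this node holds a journal, so there is
# nothing to replay when the other nodes die.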
