Bug 512799
| Field | Value |
|---|---|
| Summary | Two-Node cluster does not function properly if a node is shutdown. |
| Product | Red Hat Enterprise Linux 5 |
| Component | cman |
| Version | 5.3 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED DUPLICATE |
| Severity | urgent |
| Priority | low |
| Target Milestone | rc |
| Target Release | --- |
| Reporter | Everett Bennett <everett.bennett> |
| Assignee | David Teigland <teigland> |
| QA Contact | Cluster QE <mspqa-list> |
| CC | adas, bmarzins, cluster-maint, edamato, everett.bennett, lhh, mwaite, rpeterso, swhiteho, teigland |
| Doc Type | Bug Fix |
| Last Closed | 2009-07-22 20:27:54 UTC |
Description
Everett Bennett
2009-07-20 19:44:03 UTC
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] entering GATHER state from 11.
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] Creating commit token because I am the rep.
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] Saving state aru f high seq received f
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] Storing new sequence id for ring 140
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] entering COMMIT state.
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] entering RECOVERY state.
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] position [0] member 158.155.4.146:
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] previous ring seq 316 rep 158.155.4.146
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] aru f high delivered f received flag 1
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] position [1] member 158.155.4.148:
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] previous ring seq 316 rep 158.155.4.148
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] aru a high delivered a received flag 1
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] Did not need to originate any messages in recovery.
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] Sending initial ORF token
Jul 20 15:26:09 ibmblade05 clurgmgrd[5073]: <debug> Membership Change Event
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] CLM CONFIGURATION CHANGE
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] New Configuration:
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] r(0) ip(158.155.4.146)
Jul 20 15:26:09 ibmblade05 clurgmgrd[5073]: <debug> Node 1 is not listening
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] Members Left:
Jul 20 15:26:09 ibmblade05 clurgmgrd[5073]: <debug> Membership Change Event
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] Members Joined:
Jul 20 15:26:09 ibmblade05 clurgmgrd[5073]: <debug> Node 1 is not listening
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] CLM CONFIGURATION CHANGE
Jul 20 15:26:09 ibmblade05 clurgmgrd[5073]: <debug> Membership Change Event
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] New Configuration:
Jul 20 15:26:09 ibmblade05 clurgmgrd[5073]: <debug> Node 1 is not listening
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] r(0) ip(158.155.4.146)
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] r(0) ip(158.155.4.148)
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] Members Left:
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] Members Joined:
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] r(0) ip(158.155.4.148)
Jul 20 15:26:09 ibmblade05 openais[3837]: [SYNC ] This node is within the primary component and will provide service.
Jul 20 15:26:09 ibmblade05 openais[3837]: [TOTEM] entering OPERATIONAL state.
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] got nodejoin message 158.155.4.146
Jul 20 15:26:09 ibmblade05 openais[3837]: [CLM ] got nodejoin message 158.155.4.148
Jul 20 15:26:09 ibmblade05 openais[3837]: [CPG ] got joinlist message from node 2
Jul 20 15:26:47 ibmblade05 clurgmgrd[5073]: <debug> Event: Port Opened
Jul 20 15:26:47 ibmblade05 clurgmgrd[5073]: <info> State change: ibmblade07 UP

*** After the node ibmblade07 is shut down, the application starts on ibmblade05 but never completes.
ibmblade05> clustat
Cluster Status for ime2nodecl @ Mon Jul 20 15:29:40 2009
Member Status: Quorate

 Member Name         ID   Status
 ------ ----         ---- ------
 ibmblade07             1 Offline
 ibmblade05             2 Online, Local, RG-Master

 Service Name         Owner (Last)     State
 ------- ----         ----- ------     -----
 service:imeappl01    ibmblade05
 service:imeslave     (ibmblade05)

ibmblade05> ps -fucgi
UID        PID  PPID  C STIME TTY          TIME CMD
cgi      11374 11165  0 14:59 ?        00:00:00 cgi_daemon -u cgi cgi_startup start
cgi      11375 11374  0 14:59 ?        00:00:00 /opt/dcs/usr/local/bin/bash /opt/dcs_6.0/bin/cgi_startup start
cgi      11537 11375  0 14:59 ?        00:00:00 /opt/dcs/usr/local/bin/bash /opt/dcs_6.0/bin/cgi_startup start -nolog
cgi      11916     1  0 14:59 ?        00:00:00 /opt/dcs_6.0/appst/bin/task_manager
cgi      11917     1  0 14:59 ?        00:00:00 cgi_mlog -s 10000000 -o /var/dcs_6.0/mlog/task_manager.log
cgi      11994     1  0 14:59 ?        00:00:00 /opt/dcs_6.0/dcs/bin/dcs_tape_rpc_serv
cgi      11995     1  0 14:59 ?        00:00:00 cgi_mlog -s 10000000 -o /var/dcs_6.0/mlog/dcs_tape_rpc_serv.log
cgi      12044 11537  0 14:59 ?        00:00:00 /opt/dcs/usr/local/bin/bash /opt/dcs_6.0/uio/files/startup.sh check_db
cgi      12056 12044  0 14:59 ?        00:00:00 /usr/local/bin/bash /opt/dcs_6.0/bin/cgi_check_databases -no_server UIO_DB_PATH:uio_maint_values_db
cgi      12057 12056  0 14:59 ?        00:00:00 /usr/local/bin/bash /opt/dcs_6.0/bin/cgi_check_databases -no_server UIO_DB_PATH:uio_maint_values_db
cgi      12058 12057  0 14:59 ?        00:00:00 uio_check UIO_DB_PATH:uio_maint_values_db

After the 1st node, ibmblade07, is booted back up, the application remains in the same state:

ibmblade05> clustat
Cluster Status for ime2nodecl @ Mon Jul 20 15:54:35 2009
Member Status: Quorate

 Member Name         ID   Status
 ------ ----         ---- ------
 ibmblade07             1 Online, RG-Worker
 ibmblade05             2 Online, Local, RG-Master

 Service Name         Owner (Last)     State
 ------- ----         ----- ------     -----
 service:imeappl01    ibmblade05       starting
 service:imeslave     (ibmblade05)     stopped

ibmblade05> ps -fucgi
UID        PID  PPID  C STIME TTY          TIME CMD
cgi      11374 11165  0 14:59 ?        00:00:00 cgi_daemon -u cgi cgi_start
cgi      11375 11374  0 14:59 ?        00:00:00 /opt/dcs/usr/local/bin/bash
cgi      11537 11375  0 14:59 ?        00:00:00 /opt/dcs/usr/local/bin/bash
cgi      11916     1  0 14:59 ?        00:00:00 /opt/dcs_6.0/appst/bin/task
cgi      11917     1  0 14:59 ?        00:00:00 cgi_mlog -s 10000000 -o /va
cgi      11994     1  0 14:59 ?        00:00:00 /opt/dcs_6.0/dcs/bin/dcs_ta
cgi      11995     1  0 14:59 ?        00:00:00 cgi_mlog -s 10000000 -o /va
cgi      12044 11537  0 14:59 ?        00:00:00 /opt/dcs/usr/local/bin/bash
cgi      12056 12044  0 14:59 ?        00:00:00 /usr/local/bin/bash /opt/dc
cgi      12057 12056  0 14:59 ?        00:00:00 /usr/local/bin/bash /opt/dc
cgi      12058 12057  0 14:59 ?        00:00:00 uio_check UIO_DB_PATH:uio_m
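For what it's worth, uninterruptible (D state) processes and the kernel wait channel they are blocked on can be listed directly; this is a generic sketch (not part of the original report) using only standard ps and awk:

    # list processes stuck in uninterruptible sleep and where in the kernel they wait
    ps -eo pid,stat,wchan:32,args | awk 'NR==1 || $2 ~ /^D/'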
Created attachment 354384 [details]
Cluster configuration file for the 2-node cluster
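The attachment itself is not reproduced inline. As an illustration only, here is a minimal sketch of the kind of two-node cluster.conf this report is about; the cluster name, node names, and the gfs_controld line come from output quoted later in this bug, while the fence device and the <rm> contents are placeholders rather than the actual configuration:

    <?xml version="1.0"?>
    <cluster name="ime2nodecl" config_version="5">
      <!-- two_node="1" with expected_votes="1" lets a single surviving node keep
           quorum, matching the "Flags: 2node" shown by cman_tool status below -->
      <cman two_node="1" expected_votes="1"/>
      <!-- a post_join_delay of 10 seconds is suggested at the end of this report -->
      <fence_daemon post_fail_delay="0" post_join_delay="10"/>
      <clusternodes>
        <clusternode name="ibmblade07" nodeid="1" votes="1">
          <fence>
            <method name="1">
              <device name="blade_fence"/> <!-- placeholder; per-node options omitted -->
            </method>
          </fence>
        </clusternode>
        <clusternode name="ibmblade05" nodeid="2" votes="1">
          <fence>
            <method name="1">
              <device name="blade_fence"/> <!-- placeholder; per-node options omitted -->
            </method>
          </fence>
        </clusternode>
      </clusternodes>
      <fencedevices>
        <!-- placeholder device; the real agent and options depend on the hardware -->
        <fencedevice name="blade_fence" agent="fence_bladecenter" ipaddr="..." login="..." passwd="..."/>
      </fencedevices>
      <gfs_controld plock_ownership="0" plock_rate_limit="0"/>
      <rm>
        <!-- failover domains, resources, and the imeappl01/imeslave services go here -->
      </rm>
    </cluster>

With two_node="1" the surviving node stays quorate on its own, which is exactly why working fencing matters in this setup: the survivor must be able to fence its peer before it can safely recover the GFS/GFS2 journals and continue.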
Kernel level:

ibmblade05> uname -a
Linux ibmblade05 2.6.18-128.1.10.0.1.el5 #1 SMP Thu May 7 17:49:20 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

Processes stuck in disk-wait can't be killed (even with -9). There are a number of reasons this can occur. The service is stuck in the 'starting' state, which is probably correct. Need:

- output of 'cman_tool nodes'
- output of 'group_tool ls'
- sysrq-trace (echo t > /proc/sysrq-trigger)
- full logs (after taking the sysrq-trace)

Oh, and:

- output of 'cman_tool status'
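A minimal sketch of collecting exactly that information on the surviving node (the output file names are arbitrary; the commands are the ones requested above):

    # run as root on the node that is still a cluster member
    cman_tool nodes  > /tmp/cman_nodes.out
    cman_tool status > /tmp/cman_status.out
    group_tool ls    > /tmp/group_ls.out

    # dump every task's kernel stack into the kernel log, then capture the logs
    echo t > /proc/sysrq-trigger
    cp /var/log/messages /tmp/messages.after-sysrq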
Created attachment 354503 [details]
Current logs from running node, ibmblade07
ibmblade07> ps -fucgi
UID        PID  PPID  C STIME TTY          TIME CMD
cgi        615     1  0 11:21 ?        00:00:00 /opt/dcs_6.0/appst/bin/task_manager
cgi        616     1  0 11:21 ?        00:00:00 cgi_mlog -s 10000000 -o /var/dcs_6.0/mlo
cgi        694     1  0 11:21 ?        00:00:00 /opt/dcs_6.0/dcs/bin/dcs_tape_rpc_serv
cgi        695     1  0 11:21 ?        00:00:00 cgi_mlog -s 10000000 -o /var/dcs_6.0/mlo
cgi        751 32690  0 11:21 ?        00:00:00 /opt/dcs/usr/local/bin/bash /opt/dcs_6.0
cgi        765   751  0 11:21 ?        00:00:00 /usr/local/bin/bash /opt/dcs_6.0/bin/cgi
cgi        766   765  0 11:21 ?        00:00:00 /usr/local/bin/bash /opt/dcs_6.0/bin/cgi
cgi        767   766  0 11:21 ?        00:00:00 uio_check UIO_DB_PATH:uio_maint_values_d
cgi      32527 32318  0 11:21 ?        00:00:00 cgi_daemon -u cgi cgi_startup start
cgi      32528 32527  0 11:21 ?        00:00:00 /opt/dcs/usr/local/bin/bash /opt/dcs_6.0
cgi      32690 32528  0 11:21 ?        00:00:00 /opt/dcs/usr/local/bin/bash /opt/dcs_6.0

ibmblade07> clustat
Cluster Status for ime2nodecl @ Tue Jul 21 11:32:37 2009
Member Status: Quorate

 Member Name         ID   Status
 ------ ----         ---- ------
 ibmblade07             1 Online, Local, RG-Master
 ibmblade05             2 Offline

 Service Name         Owner (Last)     State
 ------- ----         ----- ------     -----
 service:imeappl01    ibmblade07       starting
 service:imeslave     (ibmblade07)     stopped

ibmblade07> cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    336   2009-07-21 11:15:57  ibmblade07
   2   X    344                        ibmblade05

ibmblade07> group_tool ls
type             level name       id       state
fence            0     default    00010002 none
[1]
dlm              1     clvmd      00020002 none
[1]
dlm              1     appl01     00040002 none
[1]
dlm              1     data01     00060002 none
[1]
dlm              1     data02     00080002 none
[1]
dlm              1     data03     000a0002 none
[1]
dlm              1     rgmanager  000b0002 none
[1]
gfs              2     appl01     00030002 none
[1]
gfs              2     data01     00050002 none
[1]
gfs              2     data02     00070002 none
[1]
gfs              2     data03     00090002 none
[1]

ibmblade07> cman_tool status
Version: 6.1.0
Config Version: 5
Cluster Name: ime2nodecl
Cluster Id: 39062
Cluster Member: Yes
Cluster Generation: 348
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Total votes: 1
Quorum: 1
Active subsystems: 9
Flags: 2node Dirty
Ports Bound: 0 11 177
Node name: ibmblade07
Node ID: 1
Multicast addresses: 239.192.152.47
Node addresses: 158.155.4.148

Ok, so the cluster states all look fine. The DLM isn't blocked on fencing, and none of the GFS groups are in recovery (which they typically are when they block I/O and so forth). It sounds like some of your processes are waiting for something to occur on the GFS (or GFS2) side of things, and that this isn't a problem with rgmanager, which appears to be operating.

Looking at your logs, you have a process in the disk-wait state, presumably one that should have completed starting some time ago:

uio_check     D ffff810001014f00     0   767    766                     (NOTLB)
 ffff81082079fe08 0000000000000086 ffff81082079fd68 0000000000000092
 0000000000000000 0000000000000002 ffff81081a33b100 ffff81011cb9d080
 000007da158b7269 000000000001a5e8 ffff81081a33b2e8 0000000280088904
Call Trace:
 [<ffffffff885b1a69>] :dlm:dlm_posix_lock+0x172/0x210
 [<ffffffff8009dbfb>] autoremove_wake_function+0x0/0x2e
 [<ffffffff885e0c72>] :gfs2:gfs2_lock+0xc3/0xcf
 [<ffffffff80039903>] fcntl_setlk+0x11e/0x273
 [<ffffffff800db57d>] sys_fstatfs+0x57/0x68
 [<ffffffff800b4702>] audit_syscall_entry+0x16e/0x1a1
 [<ffffffff8002e25b>] sys_fcntl+0x269/0x2dc
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

My guess is that it's stuck because of one of these GFS2 quotad threads (I cut out the stack traces from the first three because they're all the same):

gfs2_quotad   D ffff810001014f00     0 27119    281         27131 27118 (L-TLB)
...
gfs2_quotad   D ffff810001014f00     0 27145    281         27155 27144 (L-TLB)
...
gfs2_quotad   D ffff810001014f00     0 27175    281         27186 27174 (L-TLB)
...
gfs2_quotad   D ffff810001014f00     0 27199    281         27227 27198 (L-TLB)
 ffff81080deabe00 0000000000000046 ffff81080fccc000 ffffffff885d5fe5
 ffff81080e2f7990 000000000000000a ffff81080de940c0 ffff81011cb9d080
 00000804ab028d60 0000000000000709 ffff81080de942a8 00000002885eb311
Call Trace:
 [<ffffffff885d5fe5>] :gfs2:gfs2_glock_dq+0x117/0x132
 [<ffffffff8006395c>] schedule_timeout+0x8a/0xad
 [<ffffffff80095053>] process_timeout+0x0/0x5
 [<ffffffff885e6e75>] :gfs2:gfs2_quotad+0x194/0x1c2
 [<ffffffff8009dbfb>] autoremove_wake_function+0x0/0x2e
 [<ffffffff885e6ce1>] :gfs2:gfs2_quotad+0x0/0x1c2
 [<ffffffff8009d9e3>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032383>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff8009d9e3>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032285>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11

Rgmanager also is not internally "stuck" (if it was, you would not be able to run 'clustat'), leading me to believe that if a bug exists, it is within GFS2 or the DLM and not rgmanager. We'll need:

- dlm_tool lockdump [lockspace]
- dlm_tool lockdebug [lockspace]

... for each file system lockspace in use. I think this is your list of lockspaces, but you should double-check to be sure. There is one file system lockspace for each GFS/GFS2 mount.

* appl01
* data01
* data02
* data03

So, exactly how do I do this?

ibmblade07> gfs2_tool df /dcs/appl01
/dcs/appl01:
  SB lock proto = "lock_dlm"
  SB lock table = "ime2nodecl:appl01"
  SB ondisk format = 1801
  SB multihost format = 1900
  Block size = 4096
  Journals = 8
  Resource Groups = 550
  Mounted lock proto = "lock_dlm"
  Mounted lock table = "ime2nodecl:appl01"
  Mounted host data = "jid=1:id=196610:first=0"
  Journal number = 1
  Lock module flags = 0
  Local flocks = FALSE
  Local caching = FALSE

  Type           Total          Used           Free           use%
  ------------------------------------------------------------------------
  data           35709248       814235         34895013       2%
  inodes         34907929       12916          34895013       0%

ibmblade07> dlm_tool lockdump appl01
can't open /sys/kernel/debug/dlm/appl01_locks: No such file or directory
ibmblade07> dlm_tool lockdump ime2nodecl:appl01
can't open /sys/kernel/debug/dlm/ime2nodecl:appl01_locks: No such file or directory
ibmblade07> dlm_tool lockdump ime2nodecl
can't open /sys/kernel/debug/dlm/ime2nodecl_locks: No such file or directory
ibmblade07> ls -al /sys/kernel/debug/

I have IBM RDAC drivers to deal with, so if I need to boot off a debug kernel, I will need to load the RDAC drivers into the debug kernel. That could take some time. Since I am recreating this issue on Dev servers rather than the customer site, I can update to the latest RPMs and attempt to recreate the issue for further research as well. Just let me know.
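A note on the failed lockdump attempts above: dlm_tool reads its lock dumps from debugfs, and the empty /sys/kernel/debug listing suggests debugfs simply isn't mounted; mounting it should not require a debug kernel. A minimal sketch, assuming the running kernel exposes the DLM debug files once debugfs is mounted (which the stock RHEL 5 kernel is expected to do):

    # make the DLM debug files visible, then retry the dump
    mount -t debugfs none /sys/kernel/debug
    ls /sys/kernel/debug/dlm/          # should now list <lockspace>_locks files
    dlm_tool lockdump appl01           # lockspace name only, no "ime2nodecl:" prefix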
The process stuck in dlm_posix_lock may be a result of bug 474163, which is a bug in the plock_ownership mode.

Is cman-2.0.100 available for RHEL 5.3? So, what would you like for me to do at this time?

I'd suggest using plock_ownership="0" for now to avoid bug 474163; try the tests with that.

Well, we rely on file locks. Is this going to be a problem? We have processes on 2+ nodes accessing files at the same time.

plock_ownership is just a performance-related tunable. It improves performance on file locks that are not shared between nodes, but happen to just be used (and frequently reused) from a single node. For file locks that are used from multiple nodes it actually reduces performance somewhat.

When we first started testing RHEL Clustering, we were using GFS. Now with GFS2, is the line in cluster.conf needed?

ibmblade05> grep plock /etc/cluster/cluster.conf
        <gfs_controld plock_ownership="0" plock_rate_limit="0"/>

Setting plock_ownership to '0' seems to work.

ibmblade05> clustat
Cluster Status for ime2nodecl @ Tue Jul 21 15:26:04 2009
Member Status: Quorate

 Member Name         ID   Status
 ------ ----         ---- ------
 ibmblade07             1 Offline
 ibmblade05             2 Online, Local, RG-Master

 Service Name         Owner (Last)     State
 ------- ----         ----- ------     -----
 service:imeappl01    ibmblade05       started
 service:imeslave     (ibmblade05)     stopped

The gfs_controld daemon and the <gfs_controld/> cluster.conf config are used for both gfs1 and gfs2.

It seems to fix the issue. Should we set plock_ownership to '0' for 2-node clusters or all clusters?

plock_ownership should be 0 for all clusters.

Is this the recommended solution, or should I tell the customer there may be a fix in the near future?

I'd call this a recommended configuration. An important detail I forgot to mention is that plock_ownership cannot be changed without stopping the cluster, changing cluster.conf on all nodes, then restarting the cluster. It's possible that the application would benefit in the future by enabling the plock_ownership optimization. It all depends on how it uses posix locks. If instances of the app on different nodes tend to mostly take their own separate locks, and repeatedly lock/unlock the same ones, then ownership can help. But if the same locks are most often taken from multiple nodes (as you'd expect from an app sharing files), then the optimization wouldn't likely help.

Thanks for the info. One other question: do you have any idea why the 'halted' node was not 'fenced'? What should you expect in a 2-node configuration? Are there other configuration settings required to make 'fencing' work in a 2-node configuration?

(In reply to comment #24)
> Thanks for the info.
> One other question, do you have any idea why the 'halted' node was not
> 'fenced'? What should you expect in a 2-node configuration? Are there other
> configuration settings required to make 'fencing' work in a 2-node
> configuration?

Never mind. It appears to be working properly now. Can you make any other recommendations for a 2-node cluster? Can you review the attached cluster.conf and see if any other 'settings' may be of use?

I found this reference:

- http://www.mail-archive.com/linux-cluster@redhat.com/msg06573.html

Also, mounting the fs with the nodiratime option and setting:

    <dlm plock_ownership="1" plock_rate_limit="0"/>
    <gfs_controld plock_rate_limit="0"/>

in cluster.conf reduced rm -r time on the directory referenced below from ~ 9 mins to ~ 5 mins.

Also, from the dlm_controld man page, you have this setting to consider:

    <dlm>
      <lockspace name="foo" nodir="1">
        <master name="node01" weight="2"/>
        <master name="node02" weight="1"/>
      </lockspace>
    </dlm>

From your experiences, what could you suggest for other tuning considerations?

gfs should always be mounted with noatime (nodiratime adds nothing more); atime hurts performance immensely. The plock settings will have no effect on fs operations apart from file locking. The dlm lockspace/master settings are for specialized configurations and can be ignored; it was an area of research that never produced any general or practical results for users. The attached cluster.conf looks fine (I can't really comment on the <rm> section), but I'd suggest a larger post_join_delay, perhaps 10 seconds, unless you've found a reason to keep it lower.

*** This bug has been marked as a duplicate of bug 474163 ***