Bug 152100

Summary: recovery deadlock after having shot the master
Product: [Retired] Red Hat Cluster Suite
Component: gulm
Version: 3
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Corey Marthaler <cmarthal>
Assignee: Chris Feist <cfeist>
QA Contact: Cluster QE <mspqa-list>
CC: cluster-maint, tao
Doc Type: Bug Fix
Fixed In Version: RHBA-2006-0269
Last Closed: 2006-03-27 18:10:41 UTC

Description Corey Marthaler 2005-03-24 20:23:38 UTC
I had a 5 node cluster up and running I/O to three GFS filesystems:
tank-01.lab.msp.redhat.com: Master
tank-02.lab.msp.redhat.com: Client
tank-03.lab.msp.redhat.com: Slave
tank-04.lab.msp.redhat.com: Client
tank-05.lab.msp.redhat.com: Slave

Revolver then shot the master and tank-03 became the new master. When trying to
bring tank-01 back up, the GFS mounts hung.

hung mount trace:

.
.
.
getuid32()                              = 0
geteuid32()                             = 0
lstat64("/etc/mtab", {st_mode=S_IFREG|0644, st_size=184, ...}) = 0
stat64("/sbin/mount.gfs", 0xbfffa1e0)   = -1 ENOENT (No such file or directory)
rt_sigprocmask(SIG_BLOCK, ~[TRAP SEGV RTMIN], NULL, 8) = 0
mount("/dev/pool/gfs2", "/mnt/gfs2", "gfs", 0xc0ed0000, 0

hung df trace:
.
.
.
statfs("/dev/shm", {f_type=0x1021994, f_bsize=4096, f_blocks=257257,
f_bfree=257257, f_bavail=257257, f_files=257257, f_ffree=257256, f_fsid={0, 0},
f_namelen=255, f_frsize=0}) = 0
stat64("/dev/shm", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=40, ...}) = 0
open("/proc/mounts", O_RDONLY)          = 3
fstat64(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0xb73ef000
read(3, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 358
stat64("/dev/shm", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=40, ...}) = 0
close(3)                                = 0
munmap(0xb73ef000, 4096)                = 0
write(1, "none                   1029028  "..., 65none                   1029028
        0   1029028   0% /dev/shm
) = 65
statfs("/mnt/gfs1",


[root@tank-03 root]# gulm_tool nodelist tank-03:core
 Name: tank-04.lab.msp.redhat.com
  ip    = 192.168.44.94
  state = Logged in
  mode = Client
  missed beats = 0
  last beat = 1111695443976809
  delay avg = 10007189
  max delay = 10021196

 Name: tank-02.lab.msp.redhat.com
  ip    = 192.168.44.92
  state = Logged in
  mode = Client
  missed beats = 0
  last beat = 1111695440705743
  delay avg = 10020195
  max delay = 10020766

 Name: tank-05.lab.msp.redhat.com
  ip    = 10.1.1.95
  state = Logged in
  mode = Slave
  missed beats = 0
  last beat = 1111695447499879
  delay avg = 10002520
  max delay = 992608823

 Name: tank-01.lab.msp.redhat.com
  ip    = 10.1.1.91
  state = Logged in
  mode = Slave
  missed beats = 0
  last beat = 1111695441186751
  delay avg = 10000296
  max delay = 10020731

 Name: tank-03.lab.msp.redhat.com
  ip    = 10.1.1.93
  state = Logged in
  mode = Master
  missed beats = 0
  last beat = 1111695443976819
  delay avg = 10012718
  max delay = 992588286


[root@tank-03 root]# gulm_tool getstats tank-01
I_am = Slave
Master = tank-03.lab.msp.redhat.com
rank = 0
quorate = true
GenerationID = 1111683367067929
run time = 11106
pid = 3012
verbosity = Default
failover = enabled
locked = 1
[root@tank-03 root]# gulm_tool getstats tank-02
I_am = Client
Master = tank-03.lab.msp.redhat.com
rank = -1
quorate = true
GenerationID = 1111683367067929
run time = 11535
pid = 7284
verbosity = Default
failover = enabled
locked = 1
[root@tank-03 root]# gulm_tool getstats tank-03
I_am = Master
quorum_has = 3
quorum_needs = 2
rank = 1
quorate = true
GenerationID = 1111683367067929
run time = 12344
pid = 7279
verbosity = Default
failover = enabled
locked = 1
[root@tank-03 root]# gulm_tool getstats tank-04
I_am = Client
Master = tank-03.lab.msp.redhat.com
rank = -1
quorate = true
GenerationID = 1111683367067929
run time = 12344
pid = 7279
verbosity = Default
failover = enabled
locked = 1
[root@tank-03 root]# gulm_tool getstats tank-05
I_am = Slave
Master = tank-03.lab.msp.redhat.com
rank = 2
quorate = true
GenerationID = 1111683367067929
run time = 12346
pid = 6857
verbosity = Default
failover = enabled
locked = 1

I also saw these errors on tank-04 just after tank-03 fenced tank-01:
Mar 24 11:09:45 tank-04 lock_gulmd_core[7279]: Failed to receive a timely
heartbeat reply from Master. (t:1111684185624678 mb:1)
Mar 24 11:10:00 tank-04 lock_gulmd_core[7279]: Failed to receive a timely
heartbeat reply from Master. (t:1111684200634698 mb:2)
Mar 24 11:10:15 tank-04 lock_gulmd_core[7279]: Failed to receive a timely
heartbeat reply from Master. (t:1111684215654679 mb:3)
Mar 24 11:10:30 tank-04 lock_gulmd_core[7279]: Timeout (15000000) on fd:5
(tank-01.lab.msp.redhat.com:10.1.1.91)
lock_gulm: Checking for journals for node "tank-01.lab.msp.redhat.com"
Mar 24 11:10:49 tank-04 lock_gulmd_core[7279]: Found Master at
tank-03.lab.msp.redhat.com, so I'm a Client.
Mar 24 11:10:49 tank-04 lock_gulmd_core[7279]: Failed to receive a timely
heartbeat reply from Master. (t:1111684249348620 mb:1)
Mar 24 11:10:49 tank-04 lock_gulmd_LTPX[7281]: New Master at
tank-03.lab.msp.redhat.com:10.1.1.93
Mar 24 11:10:49 tank-04 lock_gulmd_LTPX[7281]: Errors trying to login to LT000:
(tank-03.lab.msp.redhat.com:10.1.1.93) 1006:Not Allowed
Mar 24 11:10:49 tank-04 kernel: lock_gulm: Checking for journals for node
"tank-01.lab.msp.redhat.com"
Mar 24 11:10:56 tank-04 lock_gulmd_LTPX[7281]: Errors trying to login to LT000:
(tank-03.lab.msp.redhat.com:10.1.1.93) 1006:Not Allowed
lock_gulm: Checking for journals for node "tank-01.lab.msp.redhat.com"
Mar 24 11:10:58 tank-04 lock_gulmd_LTPX[7281]: Errors trying to login to LT000:
(tank-03.lab.msp.redhat.com:10.1.1.93) 1006:Not Allowed
Mar 24 11:10:59 tank-04 kernel: lock_gulm: Checking for journals for node
"tank-01.lab.msp.redhat.com"
Mar 24 11:11:00 tank-04 lock_gulmd_LTPX[7281]: Errors trying to login to LT000:
(tank-03.lab.msp.redhat.com:10.1.1.93) 1006:Not Allowed
Mar 24 11:11:31 tank-04 last message repeated 30 times
Mar 24 11:12:32 tank-04 last message repeated 56 times
Mar 24 11:13:24 tank-04 last message repeated 46 times
Mar 24 11:13:24 tank-04 kernel: scsi(0) qla2x00_isr MBA_PORT_UPDATE ignored
Mar 24 11:13:25 tank-04 lock_gulmd_LTPX[7281]: Errors trying to login to LT000:
(tank-03.lab.msp.redhat.com:10.1.1.93) 1006:Not Allowed
Mar 24 11:13:26 tank-04 lock_gulmd_LTPX[7281]: Errors trying to login to LT000:
(tank-03.lab.msp.redhat.com:10.1.1.93) 1006:Not Allowed
Mar 24 11:13:26 tank-04 kernel: scsi(0): RSCN database changed -0x68,0x0.
Mar 24 11:13:26 tank-04 kernel: scsi(0): Waiting for LIP to complete...
Mar 24 11:13:26 tank-04 kernel: scsi(0): Topology - (F_Port), Host Loop address
0xffff
Mar 24 11:13:27 tank-04 lock_gulmd_LTPX[7281]: Errors trying to login to LT000:
(tank-03.lab.msp.redhat.com:10.1.1.93) 1006:Not Allowed
Mar 24 11:13:59 tank-04 last message repeated 28 times
Mar 24 11:15:00 tank-04 last message repeated 58 times
Mar 24 11:16:01 tank-04 last message repeated 49 times
Mar 24 11:17:03 tank-04 last message repeated 46 times
Mar 24 11:18:04 tank-04 last message repeated 48 times
.
.
.

Version-Release number of selected component (if applicable):
[root@tank-03 root]# lock_gulmd -V
lock_gulmd 6.0.2.12 (built Mar 23 2005 15:34:18)
Copyright (C) 2004 Red Hat, Inc.  All rights reserved.

Comment 1 Corey Marthaler 2005-03-24 20:44:11 UTC
[root@tank-03 root]# gulm_tool getstats tank-03:lt000
I_am = Master
run time = 13053
pid = 7280
verbosity = Default
id = 0
partitions = 1
out_queue = 0
drpb_queue = 22
locks = 35226
unlocked = 8031
exclusive = 9810
shared = 17385
deferred = 0
lvbs = 1902
expired = 6359
lock ops = 1807295
conflicts = 30
incomming_queue = 0
conflict_queue = 27
reply_queue = 0
free_locks = 54774
free_lkrqs = 49
used_lkrqs = 49
free_holders = 89266
used_holders = 40734
highwater = 1048576


I also grabbed lock dumps from tank-01 and tank-03 and put them in:
/home/msp/cmarthal/pub/bugs/152100

Comment 2 Corey Marthaler 2005-03-24 20:46:55 UTC
One last thing: gulm was using the usedev interface:

[root@tank-02 root]# cat tank-cluster.cca
#CCA:tank-cluster
#nigeb=cluster.ccs mtime=1111531650 size=146
cluster{
        name="tank-cluster"
        lock_gulm{
                servers=["tank-01.lab.msp.redhat.com","tank-03.lab.msp.redhat.com","tank-05.lab.msp.redhat.com"]
        }
}
#dne=cluster.ccs hash=348D0569
#nigeb=nodes.ccs mtime=1111608446 size=864
nodes{
        tank-01.lab.msp.redhat.com{
                ip_interfaces {
                        eth1="10.1.1.91"
                }
                usedev = "eth1"
                fence{
                        fence1{
                                tank-apc{
                                        port="1"
                                        switch="1"
                                }
                        }
                }
        }
        tank-02.lab.msp.redhat.com{
                ip_interfaces {
                        eth1="10.1.1.92"
                }
                usedev = "eth1"
                fence{
                        fence1{
                                tank-apc{
                                        port="2"
                                        switch="1"
                                }
                        }
                }
        }
        tank-03.lab.msp.redhat.com{
                ip_interfaces {
                        eth1="10.1.1.93"
                }
                usedev = "eth1"
                fence{
                        fence1{
                                tank-apc{
                                        port="3"
                                        switch="1"
                                }
                        }
                }
        }
        tank-04.lab.msp.redhat.com{
                ip_interfaces {
                        eth1="10.1.1.94"
                }
                usedev = "eth1"
                fence{
                        fence1{
                                tank-apc{
                                        port="4"
                                        switch="1"
                                }
                        }
                }
        }
        tank-05.lab.msp.redhat.com{
                ip_interfaces {
                        eth1="10.1.1.95"
                }
                usedev = "eth1"
                fence{
                        fence1{
                                tank-apc{
                                        port="5"
                                        switch="1"
                                }
                        }
                }
        }
}
#dne=nodes.ccs hash=9FD1F953
#nigeb=fence.ccs mtime=1111531650 size=105
fence_devices{
        tank-apc{
                agent="fence_apc"
                ipaddr="192.168.47.32"
                login="apc"
                passwd="apc"
        }
}
#dne=fence.ccs hash=B2725FC6


Comment 3 Adam "mantis" Manthei 2005-03-24 21:23:13 UTC
I've been seeing this lockup in my tests as well.

The lock dumps show that there are extra locks hanging around for a node after
that node has been expired.  I've not yet been able to figure out what's going
on in my environment.

My setups had a few more nodes.  There were ~13 clients that each mounted two
GFS filesystems (gfs1 and gfs2).  I had three dedicated gulm server nodes.
Sometimes I could get the lockup to happen right away, other times it took a
while (~days).

I did not have ip_interfaces{} or usedev sections in my nodes.ccs files.

Comment 4 michael conrad tadpol tilstra 2005-03-29 14:39:50 UTC
Well, the "tank-04 lock_gulmd_LTPX[7281]: Errors trying to login to LT000:
(tank-03.lab.msp.redhat.com:10.1.1.93) 1006:Not Allowed" message is why things
aren't mounting.  No locks.  Just gotta figure out why....

Comment 5 michael conrad tadpol tilstra 2005-03-29 16:22:01 UTC
Well, in Corey's case, I'm gonna bet this has something to do with the config
saying tank-04.lab.msp.redhat.com = 10.1.1.94, but gulm_tool nodelist saying
tank-04.lab.msp.redhat.com = 192.168.44.94.

Looking into it.

Comment 6 michael conrad tadpol tilstra 2005-03-29 16:45:00 UTC
So, Corey said that 10.1.1.* can ping 192.168.44.* and vice versa.  That means
gulm_core can go from 192.168.44.* -> 10.1.1.* while gulm_LT tries to connect
from 10.1.1.* to 10.1.1.*, which will confuse the hell out of gulm and give the
above errors.  So this might be the problem here.  I'm not quite sure how this
all works out; need to poke about more.
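
For illustration only, here is a minimal C sketch (not gulm source) of why the
source address can differ on a dual-homed node: connect() on an unbound socket
lets the kernel pick the local address from the routing table for that
destination, and getsockname() shows which one it chose.  The destination
address and port below are placeholders, not gulm's actual wire protocol.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    /* Placeholder destination/port, NOT gulm's real protocol. */
    const char *dest = "10.1.1.93";
    int port = 40040;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return 1;

    struct sockaddr_in peer;
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(port);
    inet_pton(AF_INET, dest, &peer.sin_addr);

    /* No bind() here: the kernel picks the source address from the route
     * to 'dest', so it may be 10.1.1.x or 192.168.44.x on a dual-homed box. */
    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) == 0) {
        struct sockaddr_in local;
        socklen_t len = sizeof(local);
        char buf[INET_ADDRSTRLEN];
        getsockname(fd, (struct sockaddr *)&local, &len);
        printf("kernel chose source address %s\n",
               inet_ntop(AF_INET, &local.sin_addr, buf, sizeof(buf)));
    }
    close(fd);
    return 0;
}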

Comment 7 michael conrad tadpol tilstra 2005-03-29 17:19:27 UTC
What did tank-04 report as its IP?  (grep log for 'I am ')

Comment 8 Corey Marthaler 2005-03-29 17:26:36 UTC
Those machines have since been reimaged to RHEL4, so all log info has been
lost :( but I'm almost positive I saw it was using 10.1.1.94.
 
tank-04 would have had: 
 
/etc/sysconfig/network-scripts/ifcfg-eth1 
TYPE="Ethernet" 
DEVICE="eth1" 
ONBOOT="yes" 
BOOTPROTO="static" 
IPADDR="10.1.1.94" 
NETMASK="255.255.255.0" 
 
/etc/hosts 
127.0.0.1               localhost.localdomain localhost 
10.1.1.91       tank-01-pvt.lab.map.redhat.com  tank-01-pvt 
10.1.1.92       tank-02-pvt.lab.map.redhat.com  tank-02-pvt 
10.1.1.93       tank-03-pvt.lab.map.redhat.com  tank-03-pvt 
10.1.1.94       tank-04-pvt.lab.map.redhat.com  tank-04-pvt 
10.1.1.95       tank-05-pvt.lab.map.redhat.com  tank-05-pvt

Comment 9 michael conrad tadpol tilstra 2005-03-30 14:52:18 UTC
Possibly fixed: binding outgoing sockets to the address we think we should have.
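
A minimal sketch of that approach (an illustration, not the actual gulm patch):
bind() the outgoing socket to the address configured for this node in
ip_interfaces/usedev before calling connect(), so the master always sees the
same source address regardless of which route the kernel would otherwise pick.
The helper name and port number below are hypothetical.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: force the source address of an outgoing TCP
 * connection by binding before connecting. */
static int connect_from(const char *local_ip, const char *peer_ip, int port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Bind to the address configured for this node (ip_interfaces/usedev);
     * port 0 means any ephemeral source port. */
    struct sockaddr_in local;
    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_port = 0;
    inet_pton(AF_INET, local_ip, &local.sin_addr);
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        close(fd);
        return -1;
    }

    struct sockaddr_in peer;
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(port);
    inet_pton(AF_INET, peer_ip, &peer.sin_addr);
    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* e.g. on tank-04: connect_from("10.1.1.94", "10.1.1.93", 40040)
 * (port number is a placeholder) */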

Comment 10 Chris Feist 2005-10-19 01:00:45 UTC
It looks like this bug isn't fixed yet, re-opening.

Comment 14 Chris Feist 2005-11-07 20:55:04 UTC
This has been committed and should be in the next release of GFS for RHEL3.

Comment 17 Red Hat Bugzilla 2006-03-27 18:10:41 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0269.html