Bug 207144

Summary: rgmanager stuck in join/update state (cause: I/O wait on stdout/err)
Product: [Retired] Red Hat Cluster Suite
Component: rgmanager
Reporter: Lenny Maiorani <lenny>
Assignee: Lon Hohberger <lhh>
QA Contact: Cluster QE <mspqa-list>
CC: cluster-maint
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Version: 4
Hardware: x86_64
OS: Linux
Fixed In Version: RHBA-2007-0149
Doc Type: Bug Fix
Last Closed: 2007-05-10 21:17:37 UTC
Attachments:
- Switches logging in rgmanager / clulog utility to use non-blocking output for clulog() calls (flags: none)
- stdout and stderr from rgmanager (flags: none)
- Use nonblocking output and also do not block in recv() in the main thread (flags: none)

Description Lenny Maiorani 2006-09-19 17:54:56 UTC
Description of problem:
After a fence operation occurs, rgmanager does not complete joining the cluster
and does not take over the VIPs.


Version-Release number of selected component (if applicable):
rgmanager-1.9.53-0


How reproducible:
Often


Steps to Reproduce:
1. Cause a node to get fenced.
2. When the node attempts to rejoin the cluster, rgmanager is started but never
takes over the VIPs.
3. clustat times out communicating with rgmanager.

  
Expected results:
VIPs are taken over by the recovered node

Additional info:

output from /proc/cluster/services on each of the 3 nodes:
Node 1
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       U-1,10,2
[1 3]

DLM Lock Space:  "clvmd"                             2   3 run       U-1,10,2
[1 3]

DLM Lock Space:  "crosswalk"                         3   4 run       -
[1 3]

DLM Lock Space:  "V1"                                5   6 run       -
[1 3]

DLM Lock Space:  "Snaporig"                          7   8 run       -
[1 3]

DLM Lock Space:  "Volume2"                           9  10 run       -
[1 3]

DLM Lock Space:  "Magma"                            12  13 run       -
[1]

GFS Mount Group: "crosswalk"                         4   5 run       -
[1 3]

GFS Mount Group: "V1"                                6   7 run       -
[1 3]

GFS Mount Group: "Snaporig"                          8   9 run       -
[1 3]

GFS Mount Group: "Volume2"                          10  11 run       -
[1 3]

User:            "usrm::manager"                    11  12 recover 2 -

Node 2 (fenced node)
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 join      S-4,4,2
[1 3 2]

DLM Lock Space:  "clvmd"                             2   3 join      S-4,4,2
[1 3 2]


Node 3
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 update    U-2,0,2
[1 3]

DLM Lock Space:  "clvmd"                             2   3 update    U-2,0,2
[1 3]

DLM Lock Space:  "crosswalk"                         3   4 run       -
[1 3]

DLM Lock Space:  "V1"                                5   6 run       -
[1 3]

DLM Lock Space:  "Snaporig"                          7   8 run       -
[1 3]

DLM Lock Space:  "Volume2"                           9  10 run       -
[1 3]

GFS Mount Group: "crosswalk"                         4   5 run       -
[1 3]

GFS Mount Group: "V1"                                6   7 run       -
[1 3]

GFS Mount Group: "Snaporig"                          8   9 run       -
[1 3]

GFS Mount Group: "Volume2"                          10  11 run       -
[1 3]

User:            "usrm::manager"                     0  12 join      S-1,80,3
[]





In another case, /proc/cluster/services:

[root@lizard01 ~]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[2 3 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[2 3 1]

DLM Lock Space:  "crosswalk"                         3   4 run       -
[2 3 1]

DLM Lock Space:  "V1"                                5   6 run       -
[2 3 1]

DLM Lock Space:  "Snaporig"                          7   8 run       -
[2 3 1]

DLM Lock Space:  "Volume2"                           9  10 run       -
[2 3 1]

GFS Mount Group: "crosswalk"                         4   5 run       -
[2 3 1]

GFS Mount Group: "V1"                                6   7 run       -
[2 3 1]

GFS Mount Group: "Snaporig"                          8   9 run       -
[2 3 1]

GFS Mount Group: "Volume2"                          10  11 run       -
[2 3 1]

User:            "usrm::manager"                    11  12 join      S-8,40,3
[2 3 1]



Lizard02 

[root@lizard02 ~]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[2 3 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[2 3 1]

DLM Lock Space:  "crosswalk"                         3   4 run       -
[2 3 1]

DLM Lock Space:  "V1"                                5   6 run       -
[2 3 1]

DLM Lock Space:  "Snaporig"                          7   8 run       -
[2 3 1]

DLM Lock Space:  "Volume2"                           9  10 run       -
[2 3 1]

DLM Lock Space:  "Magma"                            12  13 run       -
[2 3]

GFS Mount Group: "crosswalk"                         4   5 run       -
[2 3 1]

GFS Mount Group: "V1"                                6   7 run       -
[2 3 1]

GFS Mount Group: "Snaporig"                          8   9 run       -
[2 3 1]

GFS Mount Group: "Volume2"                          10  11 run       -
[2 3 1]

User:            "usrm::manager"                    11  12 update    U-4,1,1
[2 3 1]

Lizard03
[root@lizard03 ~]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[2 3 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[2 3 1]

DLM Lock Space:  "crosswalk"                         3   4 run       -
[3 2 1]

DLM Lock Space:  "V1"                                5   6 run       -
[3 2 1]

DLM Lock Space:  "Snaporig"                          7   8 run       -
[3 2 1]

DLM Lock Space:  "Volume2"                           9  10 run       -
[3 2 1]

DLM Lock Space:  "Magma"                            12  13 run       -
[2 3]

GFS Mount Group: "crosswalk"                         4   5 run       -
[3 2 1]

GFS Mount Group: "V1"                                6   7 run       -
[3 2 1]

GFS Mount Group: "Snaporig"                          8   9 run       -
[3 2 1]

GFS Mount Group: "Volume2"                          10  11 run       -
[3 2 1]

User:            "usrm::manager"                    11  12 update    U-6,2,1
[2 3 1]





And another:

Dragon01
[root@dragon01 ~]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[2 3 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[2 3 1]

DLM Lock Space:  "crosswalk"                         3   4 run       -
[2 3 1]

DLM Lock Space:  "V1"                                7   6 run       -
[2 3 1]

GFS Mount Group: "crosswalk"                         4   5 run       -
[2 3 1]

GFS Mount Group: "V1"                                8   7 run       -
[2 3 1]

User:            "usrm::manager"                     5   8 join      S-8,40,3
[2 3 1]

Dragon02
[root@dragon02 ~]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[2 3 1]

DLM Lock Space:  "crosswalk"                         3   4 run       -
[3 2 1]

DLM Lock Space:  "Magma"                             6   7 run       -
[3 2]

DLM Lock Space:  "clvmd"                             2   9 run       -
[3 2 1]

DLM Lock Space:  "V1"                                7  10 run       -
[2 3 1]

GFS Mount Group: "crosswalk"                         4   5 run       -
[3 2 1]

GFS Mount Group: "V1"                                8  11 run       -
[2 3 1]

User:            "usrm::manager"                     5   6 update    U-4,1,1
[3 2 1]


Dragon03
[root@dragon03 ~]# cat /proc/cluster/services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[3 2 1]

DLM Lock Space:  "crosswalk"                         3   4 run       -
[3 2 1]

DLM Lock Space:  "Magma"                             6   7 run       -
[3 2]

DLM Lock Space:  "clvmd"                             2   9 run       -
[3 2 1]

DLM Lock Space:  "V1"                                7  10 run       -
[3 2 1]

GFS Mount Group: "crosswalk"                         4   5 run       -
[3 2 1]

GFS Mount Group: "V1"                                8  11 run       -
[3 2 1]

User:            "usrm::manager"                     5   6 update    U-4,1,1
[3 2 1]

Comment 1 Lenny Maiorani 2006-09-20 22:01:00 UTC
More info, possibly related:

frii01 was fenced by frii02. When frii01 came back up, rgmanager on frii02 never
completed marking frii01 as joining. When rgmanager on frii02 was restarted, it
joined properly. Here are some logs from /var/log/messages...

Sep 14 21:20:29 frii01 clurgmgrd[15602]: <info> Logged in SG "usrm::manager" 
Sep 14 21:20:29 frii01 clurgmgrd[15602]: <info> Magma Event: Membership Change 
Sep 14 21:20:29 frii01 clurgmgrd[15602]: <info> State change: Local UP 
Sep 14 21:20:29 frii01 clurgmgrd[15602]: <info> State change: frii02 UP 
Sep 14 21:20:30 frii01 clurgmgrd[15602]: <info> Magma Event: Membership Change 
Sep 14 21:20:30 frii01 clurgmgrd[15602]: <info> State change: frii02 DOWN 


Sep 14 18:54:36 frii02 clurgmgrd[12763]: <info> Magma Event: Membership Change 
Sep 14 18:54:36 frii02 clurgmgrd[12763]: <info> State change: frii01 DOWN 

Sep 14 21:20:17 frii02 clurgmgrd[12763]: <err> #37: Error receiving message
header 
Sep 14 21:20:17 frii02 clurgmgrd[12763]: <notice> Shutting down 
Sep 14 21:20:17 frii02 clurgmgrd[12763]: <notice> Stopping service
172.16.107.225 
Sep 14 21:20:17 frii02 clurgmgrd[12763]: <notice> Stopping service
172.16.106.225 
Sep 14 21:20:17 frii02 clurgmgrd[12763]: <notice> Stopping service
172.16.105.225 
Sep 14 21:20:17 frii02 clurgmgrd[12763]: <notice> Stopping service
172.16.104.225 
Sep 14 21:20:17 frii02 clurgmgrd: [12763]: <info> Removing IPv4 address
172.16.106.225/24 from eth0.106 
Sep 14 21:20:17 frii02 clurgmgrd[12763]: <notice> Stopping service
172.16.103.225 
Sep 14 21:20:17 frii02 clurgmgrd: [12763]: <info> Removing IPv4 address
172.16.107.225/24 from eth0.107 
Sep 14 21:20:17 frii02 clurgmgrd[12763]: <notice> Stopping service
172.16.102.225 
Sep 14 21:20:17 frii02 clurgmgrd: [12763]: <info> Removing IPv4 address
172.16.105.225/24 from eth0.105 
Sep 14 21:20:17 frii02 clurgmgrd[12763]: <notice> Stopping service
172.16.101.225 
Sep 14 21:20:17 frii02 clurgmgrd: [12763]: <info> Removing IPv4 address
172.16.104.225/24 from eth0.104 
Sep 14 21:20:17 frii02 clurgmgrd[12763]: <notice> Stopping service frii-mgmt 
Sep 14 21:20:17 frii02 clurgmgrd: [12763]: <info> Removing IPv4 address
172.16.102.225/24 from eth0.102 
Sep 14 21:20:17 frii02 clurgmgrd: [12763]: <info> Removing IPv4 address
172.16.103.225/24 from eth5 
Sep 14 21:20:18 frii02 clurgmgrd[12763]: <notice> Stopping service snapshot 
Sep 14 21:20:18 frii02 clurgmgrd: [12763]: <info> Removing IPv4 address
172.16.101.225/24 from eth0.101 
Sep 14 21:20:18 frii02 clurgmgrd: [12763]: <info> Removing IPv4 address
172.16.103.226/24 from eth0.103 
Sep 14 21:20:18 frii02 clurgmgrd[12763]: <notice> Stopping service
email_notifier 
Sep 14 21:20:18 frii02 clurgmgrd: [12763]: <info> Executing
/mnt/crosswalk/snapshot/run_snap stop 
Sep 14 21:20:18 frii02 clurgmgrd[12763]: <notice> Stopping service nvbackup 
Sep 14 21:20:18 frii02 clurgmgrd: [12763]: <info> Executing
/opt/crosswalk/email_notifier/run_email_notifier stop 
Sep 14 21:20:18 frii02 clurgmgrd[12763]: <notice> Service snapshot is stopped 
Sep 14 21:20:18 frii02 clurgmgrd: [12763]: <info> Executing
/mnt/igridbackup/nvbackup/nvmain stop 
Sep 14 21:20:18 frii02 clurgmgrd[12763]: <notice> Service email_notifier is
stopped 
Sep 14 21:20:19 frii02 clurgmgrd[12763]: <notice> Service nvbackup is stopped 
Sep 14 21:20:27 frii02 clurgmgrd[12763]: <notice> Service 172.16.106.225 is
stopped 
Sep 14 21:20:27 frii02 clurgmgrd[12763]: <notice> Service 172.16.107.225 is
stopped 
Sep 14 21:20:27 frii02 clurgmgrd[12763]: <notice> Service 172.16.105.225 is
stopped 
Sep 14 21:20:28 frii02 clurgmgrd[12763]: <notice> Service 172.16.104.225 is
stopped 
Sep 14 21:20:28 frii02 clurgmgrd[12763]: <notice> Service 172.16.102.225 is
stopped 
Sep 14 21:20:28 frii02 clurgmgrd[12763]: <notice> Service 172.16.103.225 is
stopped 
Sep 14 21:20:28 frii02 clurgmgrd[12763]: <notice> Service 172.16.101.225 is
stopped 
Sep 14 21:20:28 frii02 clurgmgrd[12763]: <notice> Service frii-mgmt is stopped 
Sep 14 21:20:31 frii02 clurgmgrd[12763]: <notice> Shutdown complete, exiting 



Side note:
The kernel is 2.6.9-34 with a patch recently introduced from bz #199673.

Comment 2 Lon Hohberger 2006-09-21 15:41:48 UTC
Thanks for all of the information.  I'm wondering if this is just a simple join
synchronization problem in the SM magma plugin.

Comment 4 Lenny Maiorani 2006-09-22 21:50:52 UTC
Well, it is something we are seeing VERY regularly after a fence.

Another data point: we changed deadnode_timeout to 64 and hello_timer to 5.


Comment 5 Lenny Maiorani 2006-09-27 20:40:35 UTC
Figured out that one of the services being run by rgmanager was printing to
STDERR and STDOUT. This was causing rgmanager some problems. As soon as we
took the printfs out of that service, the problem stopped.

I guess the bug isn't really that rgmanager doesn't start up, but rather that it
is not properly handling an interrupt.
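
As a minimal sketch of this failure mode (illustrative code, not rgmanager
source): when stdout/stderr are attached to a pipe that nobody drains, a
blocking write() stalls as soon as the kernel pipe buffer fills. The demo
below fills an unread pipe with O_NONBLOCK set, so it reports EAGAIN at the
point where a blocking writer would hang instead:

/* Illustrative demo (not rgmanager source): a write to a pipe that
 * nobody reads stalls once the kernel buffer is full. O_NONBLOCK is
 * set here only so the demo terminates instead of hanging the way a
 * blocking writer (e.g. a printf to an undrained pipe) would. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int pfd[2];
    char chunk[4096];
    size_t total = 0;
    ssize_t n;

    if (pipe(pfd) < 0) {
        perror("pipe");
        return 1;
    }

    /* Non-blocking write end: full buffer -> EAGAIN instead of a hang. */
    fcntl(pfd[1], F_SETFL, fcntl(pfd[1], F_GETFL) | O_NONBLOCK);

    memset(chunk, 'x', sizeof(chunk));
    for (;;) {
        n = write(pfd[1], chunk, sizeof(chunk));
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            printf("pipe full after %zu bytes; a blocking writer "
                   "would be stuck here\n", total);
            break;
        }
        if (n < 0) {
            perror("write");
            return 1;
        }
        total += (size_t)n;
    }
    return 0;
}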

Comment 6 Lon Hohberger 2006-09-28 18:04:26 UTC
Created attachment 137328 [details]
Switches logging in rgmanager / clulog utility to use non-blocking output for clulog() calls
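
The patch itself is in the attachment; as a rough sketch of the approach its
title describes (the actual clulog() change may differ), log descriptors can
be put into non-blocking mode and writes that would block dropped rather than
allowed to stall the daemon:

/* Rough sketch of non-blocking log output along the lines of the
 * attachment title. Helper names are illustrative; this is not the
 * actual clulog() implementation. */
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Put a descriptor (e.g. stdout/stderr) into non-blocking mode. */
static int set_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Best-effort write: if the descriptor would block (pipe not being
 * drained), drop the message rather than wedging the caller. */
static ssize_t log_write(int fd, const char *msg)
{
    ssize_t n = write(fd, msg, strlen(msg));

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0; /* line dropped, daemon keeps running */
    return n;
}

int main(void)
{
    set_nonblock(STDOUT_FILENO);
    set_nonblock(STDERR_FILENO);
    log_write(STDOUT_FILENO, "clurgmgrd: <info> example message\n");
    return 0;
}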

Comment 7 Lenny Maiorani 2006-10-02 20:06:30 UTC
Created attachment 137594 [details]
stdout and stderr from rgmanager

New info: We are able to reproduce this by doing 2 consecutive fences to the
same node. Attached are some logs of STDOUT and STDERR from rgmanager on each
node.

Story: node1 was fenced, and node2 is the failover domain for all node1
services. rgmanager on node1 is shown as started by clustat, but is still in the
"join" state according to /proc/cluster/services.

Comment 8 Lon Hohberger 2006-10-04 21:11:49 UTC
Created attachment 137773 [details]
Use nonblocking output and also do not block in recv() in the main thread
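
For the second half of the attachment title ("do not block in recv() in the
main thread"), a sketch of the general technique (function name and timeout
are illustrative, not the actual rgmanager msg_peek()/dispatch_msg() code):
wait for data with poll() for a bounded time, then peek non-blockingly, so
the event loop cannot get stuck on a peer that stops sending mid-header:

/* Illustrative sketch, not rgmanager source: bounded wait plus a
 * non-blocking MSG_PEEK so the main event loop cannot hang in recv(). */
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Returns bytes peeked, 0 if nothing arrived within timeout_ms,
 * -1 on error. */
ssize_t peek_with_timeout(int sockfd, void *buf, size_t count,
                          int timeout_ms)
{
    struct pollfd pfd = { .fd = sockfd, .events = POLLIN };
    int ret = poll(&pfd, 1, timeout_ms);
    ssize_t n;

    if (ret <= 0)
        return ret; /* 0 = timeout, -1 = poll error */

    n = recv(sockfd, buf, count, MSG_PEEK | MSG_DONTWAIT);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0; /* data raced away; retry on the next loop pass */
    return n;
}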

Comment 9 Lenny Maiorani 2006-10-04 21:15:02 UTC
Fenced node1; when it attempts to rejoin, it gets stuck. Here are backtraces
and /proc/cluster/services info from each node:

node 1:
#0  0x0000003162bbf0a6 in __select_nocancel () from /lib64/tls/libc.so.6
#1  0x0000002a9558be30 in sm_wait_join_complete (p=0x542e00) at sm.c:242
#2  0x0000002a9558c1f7 in sm_login (self=0x542d20, fd=4, 
    groupname=0x4173d0 "usrm::manager") at sm.c:362
#3  0x00000033e5503930 in cp_login (cpp=0x542d20, fd=4, 
    groupname=0x4173d0 "usrm::manager") at plugin.c:708
#4  0x00000033e5502945 in clu_login (fd=4, groupname=0x4173d0 "usrm::manager")
    at global.c:295
#5  0x0000000000406489 in main (argc=2, argv=0x7fbffffd58) at main.c:814
#6  0x0000003162b1c4bb in __libc_start_main () from /lib64/tls/libc.so.6
#7  0x0000000000403a3a in _start ()
#8  0x0000007fbffffd48 in ?? ()
#9  0x000000000000001c in ?? ()
#10 0x0000000000000002 in ?? ()
#11 0x0000007fbffffeeb in ?? ()
#12 0x0000007fbffffeff in ?? ()
#13 0x0000000000000000 in ?? ()

No /proc/cluster/services info for usrm



Node 2:
#0  0x0000003c1560b52a in recv () from /lib64/tls/libpthread.so.0
#1  0x00000039c9a02539 in msg_peek (sockfd=12, buf=0x7fbffffaf0, count=20)
    at message.c:853
#2  0x00000000004055a6 in dispatch_msg (fd=12, nodeid=2) at main.c:346
#3  0x0000000000405de2 in event_loop (clusterfd=4) at main.c:548
#4  0x0000000000406533 in main (argc=2, argv=0x7fbffffd58) at main.c:839
#5  0x0000003c14d1c4bb in __libc_start_main () from /lib64/tls/libc.so.6
#6  0x0000000000403a3a in _start ()
#7  0x0000007fbffffd48 in ?? ()
#8  0x000000000000001c in ?? ()
#9  0x0000000000000002 in ?? ()
#10 0x0000007fbffffeeb in ?? ()
#11 0x0000007fbffffeff in ?? ()
#12 0x0000000000000000 in ?? ()

User:            "usrm::manager"                    16  19 update    U-4,1,1
[3 2 1]



Node 3:
#0  0x0000003f5af0b52a in recv () from /lib64/tls/libpthread.so.0
#1  0x0000003e92802539 in msg_peek (sockfd=9, buf=0x7fbffffaf0, count=20)
    at message.c:853
#2  0x00000000004055a6 in dispatch_msg (fd=9, nodeid=3) at main.c:346
#3  0x0000000000405de2 in event_loop (clusterfd=4) at main.c:548
#4  0x0000000000406533 in main (argc=2, argv=0x7fbffffd58) at main.c:839
#5  0x0000003f5a61c4bb in __libc_start_main () from /lib64/tls/libc.so.6
#6  0x0000000000403a3a in _start ()
#7  0x0000007fbffffd48 in ?? ()
#8  0x000000000000001c in ?? ()
#9  0x0000000000000002 in ?? ()
#10 0x0000007fbffffeeb in ?? ()
#11 0x0000007fbffffeff in ?? ()
#12 0x0000000000000000 in ?? ()

User:            "usrm::manager"                    16  18 update    U-4,1,1
[3 2 1]


Comment 10 Lenny Maiorani 2006-10-05 17:48:56 UTC
This patch has solved the problem. Tested it a few times, and the rejoin happens
each time.

Comment 13 Red Hat Bugzilla 2007-05-10 21:17:37 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0149.html