Bug 561226 - clvmd hangs during stress load
Summary: clvmd hangs during stress load
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: lvm2-cluster
Version: 5.4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Assignee: Petr Rockai
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On: 559999 561227
Blocks: 584706
 
Reported: 2010-02-03 04:10 UTC by Perry Myers
Modified: 2016-04-26 15:23 UTC
CC: 16 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 559999
Environment:
Last Closed: 2011-01-13 22:42:36 UTC
Target Upstream Version:
Embargoed:


Attachments
kern dump from taft-01 (319.18 KB, text/plain)
2010-10-12 17:06 UTC, Corey Marthaler
kern dump from taft-02 (316.82 KB, text/plain)
2010-10-12 17:07 UTC, Corey Marthaler
kern dump from taft-03 (315.87 KB, text/plain)
2010-10-12 17:07 UTC, Corey Marthaler
kern dump from taft-04 (317.21 KB, text/plain)
2010-10-12 17:07 UTC, Corey Marthaler


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:0053 0 normal SHIPPED_LIVE lvm2-cluster bug fix and enhancement update 2011-01-12 17:22:43 UTC

Description Perry Myers 2010-02-03 04:10:44 UTC
+++ This bug was initially created as a clone of Bug #559999 +++

Description of problem:
clvmd hangs and stops servicing operations when it is under stress load for a couple of minutes

Version-Release number of selected component (if applicable):
cman-1.0.27-1.el4
cman-kernel-2.6.9-56.7.el4_8.9


How reproducible:
stress load script:
#!/bin/bash

while true; do
        lvs
done > /dev/null &

while true; do
        echo -n '.';
        vgscan > /dev/null;
        sleep $(($RANDOM%7))
done



Steps to Reproduce:
1. form a cluster with 3 nodes
2. start 1-6 instances of the script on each node (sometimes more is better)
3. wait for dots to stop
  
Actual results:
At some point a message about a broken pipe is received, and some time later all the clvmd processes stop, apparently waiting on something.

This can result in a segmentation fault or an infinite wait; the outcome varies.

Expected results:
clvmd processes all incoming requests and no hang occurs

Additional info:

--- Additional comment from ccaulfie on 2010-01-29 10:47:43 EST ---

More info:

The main thread is waiting in pthread_join() for this thread:

(gdb) bt
#0  0x0000003e6bc08d1a in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/tls/libpthread.so.0
#1  0x000000000040e96a in pre_and_post_thread (arg=Variable "arg" is not available.
) at clvmd.c:1453
#2  0x0000003e6bc06317 in start_thread () from /lib64/tls/libpthread.so.0
#3  0x0000003e6b3c9f03 in clone () from /lib64/tls/libc.so.6
(gdb) l 1450
1445			}
1446	
1447			/* We may need to wait for the condition variable before running the post command */
1448			pthread_mutex_lock(&client->bits.localsock.mutex);
1449			DEBUGLOG("Waiting to do post command - state = %d\n",
1450				 client->bits.localsock.state);
1451	
1452			if (client->bits.localsock.state != POST_COMMAND) {
1453				pthread_cond_wait(&client->bits.localsock.cond,
1454						  &client->bits.localsock.mutex);
(gdb) p client->bits.localsock.state
$7 = PRE_COMMAND
(gdb) 


So it looks like it might be some sort of thread race.

lvm2-cluster-2.02.42-5.el4

--- Additional comment from ccaulfie on 2010-02-02 05:00:00 EST ---

Created an attachment (id=388236)
Patch for testing

This is the patch I have given to Jaroslav for testing, as I can't reproduce this myself.

--- Additional comment from ccaulfie on 2010-02-02 05:40:30 EST ---

It's also worth mentioning that this bug will exist in RHEL5 and RHEL6 too as the code is generic.

Comment 1 Christine Caulfield 2010-04-07 08:10:28 UTC
Assign to mbroz for packaging as the patch is now in CVS.

Comment 2 Milan Broz 2010-04-21 18:43:21 UTC
Fix in lvm2-cluster-2.02.56-10.el5.

Comment 6 Corey Marthaler 2010-10-12 16:56:43 UTC
This doesn't appear to be fixed. 

I ran five instances of the script mentioned in comment #0 on all the nodes in my four-node cluster. Those scripts continued to make progress for about 12 hours, at which point the vgscan cmds hung. The lvs cmds, however, continued to run.

I'll attach kern dumps from each of the systems.

TAFT-01:
[root@taft-01 ~]# ps aux | grep vgs
root     10219  0.0  0.0  77944  1704 pts/3    S+   04:38   0:00 vgscan
root     10438  0.0  0.0  77940  1372 pts/4    S+   04:38   0:00 vgscan
root     10467  0.0  0.0  77940  1368 pts/1    S+   04:38   0:00 vgscan
root     10543  0.0  0.0  77940  1368 pts/2    S+   04:38   0:00 vgscan
root     10694  0.0  0.0  77940  1368 pts/0    S+   04:38   0:00 vgscan
root     12045  0.0  0.0  61212   748 pts/7    S+   11:52   0:00 grep vgs

[root@taft-01 ~]# ps -elf | grep vgs
4 S root     10219 11012  0  75   0 - 19486 -      04:38 pts/3    00:00:00 vgscan
0 S root     10438 12880  0  80   0 - 19485 -      04:38 pts/4    00:00:00 vgscan
0 S root     10467  9594  0  78   0 - 19485 -      04:38 pts/1    00:00:00 vgscan
0 S root     10543 10352  0  80   0 - 19485 -      04:38 pts/2    00:00:00 vgscan
0 S root     10694  8411  0  77   0 - 19485 -      04:38 pts/0    00:00:00 vgscan
0 S root     12356 14973  0  78   0 - 15304 pipe_w 11:52 pts/7    00:00:00 grep vgs

[root@taft-01 ~]# strace -p 10219
Process 10219 attached - interrupt to quit
read(3,

TAFT-04:
[root@taft-04 ~]# ps -elf | grep vgs
0 S root      6102 22608  0  77   0 - 19491 -      04:37 pts/0    00:00:00 vgscan
0 S root      6127 23827  0  80   0 - 19491 -      04:37 pts/1    00:00:00 vgscan
0 S root      6345 24540  0  78   0 - 19491 -      04:38 pts/2    00:00:00 vgscan
0 S root      6426 25180  0  80   0 - 19491 -      04:38 pts/3    00:00:00 vgscan
0 S root      6427 26998  0  78   0 - 19491 -      04:38 pts/4    00:00:00 vgscan
0 S root     12572 12548  0  78   0 - 15310 pipe_w 11:54 pts/7    00:00:00 grep vgs
[root@taft-04 ~]# ps aux | grep vgs
root      6102  0.0  0.0  77964  1376 pts/0    S+   04:37   0:00 vgscan
root      6127  0.0  0.0  77964  1376 pts/1    S+   04:37   0:00 vgscan
root      6345  0.0  0.0  77964  1376 pts/2    S+   04:38   0:00 vgscan
root      6426  0.0  0.0  77964  1372 pts/3    S+   04:38   0:00 vgscan
root      6427  0.0  0.0  77964  1368 pts/4    S+   04:38   0:00 vgscan
root     12576  0.0  0.0  61240   760 pts/7    S+   11:54   0:00 grep vgs

[root@taft-04 ~]# strace -p 6427
Process 6427 attached - interrupt to quit
connect(3, {sa_family=AF_FILE, path="/var/run/lvm/clvmd.sock"...}, 110



2.6.18-225.el5

lvm2-2.02.73-2.el5    BUILT: Mon Aug 30 06:36:20 CDT 2010
lvm2-cluster-2.02.73-2.el5    BUILT: Mon Aug 30 06:38:05 CDT 2010
device-mapper-1.02.54-2.el5    BUILT: Fri Sep 10 12:00:05 CDT 2010
cmirror-1.1.39-10.el5    BUILT: Wed Sep  8 16:32:05 CDT 2010
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009

Comment 7 Corey Marthaler 2010-10-12 17:06:39 UTC
Created attachment 452996 [details]
kern dump from taft-01

Comment 8 Corey Marthaler 2010-10-12 17:07:05 UTC
Created attachment 452997 [details]
kern dump from taft-02

Comment 9 Corey Marthaler 2010-10-12 17:07:28 UTC
Created attachment 452998 [details]
kern dump from taft-03

Comment 10 Corey Marthaler 2010-10-12 17:07:50 UTC
Created attachment 452999 [details]
kern dump from taft-04

Comment 12 Corey Marthaler 2010-10-21 21:10:05 UTC
Running test instances over the weekend with the latest scratch build http://download.devel.redhat.com/brewroot/scratch/mbroz/task_2838441. Should have an update on this Monday.

Comment 13 Corey Marthaler 2010-10-25 19:02:47 UTC
The vgscan commands still eventually end up deadlocking with the test rpms in comment #12.

Comment 14 Petr Rockai 2010-10-26 17:12:55 UTC
Hm. One thing that puzzles me somewhat: are the lvs commands still making progress? It is puzzling because, according to your strace, vgscan is hanging while trying to connect to clvmd. If lvs is also connecting to clvmd, it should experience the same hang, and as far as I can tell it should be connecting.

Comment 15 Petr Rockai 2010-10-27 09:01:07 UTC
I have tracked down and fixed another deadlock in clvmd. It is consistent with the logs here in the sense that all the clvmd threads are hanging on futexes apart from one, which is hanging in vfs_read. I will merge the patch now and hopefully Milan can make a new test build with that patch. Even though the vfs_read is hopeful, there may be other deadlocks... nevertheless, testing it can't hurt.

Comment 18 Corey Marthaler 2010-10-28 18:17:02 UTC
The latest build appears to fix this deadlock. The scripts in comment #0 have been running w/o issues for almost 24 hours.

2.6.18-227.el5

lvm2-2.02.74-1.el5    BUILT: Fri Oct 15 10:26:21 CDT 2010
lvm2-cluster-2.02.74-1.6.el5    BUILT: Wed Oct 27 04:57:56 CDT 2010
device-mapper-1.02.55-1.el5    BUILT: Fri Oct 15 06:15:55 CDT 2010
cmirror-1.1.39-10.el5    BUILT: Wed Sep  8 16:32:05 CDT 2010
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009

Comment 19 Petr Rockai 2010-10-28 21:26:53 UTC
Nice. The fix that Corey has been testing is already upstream, going for POST.

Comment 20 Milan Broz 2010-10-29 12:40:49 UTC
Additional fixes in lvm2-cluster-2.02.74-2.el5.

Comment 22 Corey Marthaler 2010-11-04 14:18:00 UTC
Multiple instances of this test script ran all night. Marking verified in the current rpms.

2.6.18-227.el5

lvm2-2.02.74-1.el5    BUILT: Fri Oct 15 10:26:21 CDT 2010
lvm2-cluster-2.02.74-2.el5    BUILT: Fri Oct 29 07:48:11 CDT 2010
device-mapper-1.02.55-1.el5    BUILT: Fri Oct 15 06:15:55 CDT 2010
cmirror-1.1.39-10.el5    BUILT: Wed Sep  8 16:32:05 CDT 2010
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009

Comment 24 errata-xmlrpc 2011-01-13 22:42:36 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0053.html

