Bug 709758 - corosync process at 100% - 500% CPU usage
corosync process at 100% - 500% CPU usage
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: corosync (Show other bugs)
6.1
Unspecified Linux
urgent Severity urgent
: rc
: ---
Assigned To: Jan Friesse
Cluster QE
:
: 675568 (view as bug list)
Depends On:
Blocks: 717010
  Show dependency treegraph
 
Reported: 2011-06-01 11:02 EDT by Ricky Nelson
Modified: 2011-12-06 06:50 EST (History)
12 users (show)

See Also:
Fixed In Version: corosync-1.4.0-1.el6
Doc Type: Bug Fix
Doc Text:
Cause: A spinlock was used around I/O operations. Consequence: Corosync would consume extremely high CPU under heavy use of IPC operations because the spinlocks would spin during I/O. Fix: Changed spinlock to a mutex. Result: The mutex releases the processor from spinning but enforces correct behavior.
Story Points: ---
Clone Of:
: 717010 (view as bug list)
Environment:
Last Closed: 2011-12-06 06:50:53 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments (Terms of Use)
gzipped core for corosync during moment of high CPU usage (770.95 KB, application/x-gzip)
2011-06-01 16:43 EDT, Ricky Nelson
no flags Details
core for corosync during moment of high CPU usage using test build (709.67 KB, application/x-gzip)
2011-06-03 14:42 EDT, Ricky Nelson
no flags Details
corosync-blackbox during high CPU usage of corosync (2.13 MB, text/plain)
2011-06-06 14:12 EDT, Ricky Nelson
no flags Details
corosync-fplay during high CPU usage of corosync (2.13 MB, text/plain)
2011-06-06 14:14 EDT, Ricky Nelson
no flags Details
corosync-blackbox from separate node during high CPU usage of corosync (2.07 MB, text/plain)
2011-06-06 14:15 EDT, Ricky Nelson
no flags Details
test file (72 bytes, application/octet-stream)
2011-06-06 23:05 EDT, Steven Dake
no flags Details
testall file (380 bytes, application/octet-stream)
2011-06-06 23:06 EDT, Steven Dake
no flags Details
testa file (39 bytes, application/octet-stream)
2011-06-06 23:07 EDT, Steven Dake
no flags Details
blackbox and objctl info captured at last high CPU event with the debug corosync packages (209.96 KB, application/x-gzip)
2011-06-07 14:46 EDT, Ricky Nelson
no flags Details
unsupported test build which may fix problem (2.24 MB, application/x-gzip)
2011-06-08 15:17 EDT, Steven Dake
no flags Details
Stupid threaded prime number detector (717 bytes, text/x-csrc)
2011-06-14 03:39 EDT, Jan Friesse
no flags Details
Final patch which removes spinlocks from corosync (7.61 KB, patch)
2011-06-29 10:23 EDT, Jan Friesse
no flags Details | Diff


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Legacy) 65177 None None None Never

  None (edit)
Description Ricky Nelson 2011-06-01 11:02:35 EDT
Description of problem:
8 node cluster with dual quad-core xeon, 12 gigs ram each.

From time to time in an 8 node cluster, a random node's corosync process will use 100% - 400% CPU. During that time, the node is unresponsive. Services running on that node stales. Corosync.log shows nothing. After 60 to 120 seconds, the CPU usage dies down and the node continues to function normally.

Version-Release number of selected component (if applicable):
corosync-1.2.3-36.el6.x86_64

How reproducible:
Let 8-node cluster run with moderate load, and periodically the corosync process will show up on top/htop using 100%-400% CPU (multicore system).

Additional info:
We've tried utilizing the '-p' option under the cman's init script to have the corosync process start and not set process priority, but that didn't change the behavior.

We've got sosreports and thread dumps from when the high CPU usage occurred, but there's no corosync process listed within the thread dump. Would generating a core be useful for the corosync process?

Any additional thoughts or requests on this?
Comment 3 Steven Dake 2011-06-01 15:13:57 EDT
Ricky,

Need to know if customer is using pacemaker.

regarding comment #1, is the customer using pacemaker?  If so, the method of using the plugin often causes this problem and there appears to be no way to resolve it.  What would be really helpful is to do the following:

RE pacemaker, also see:
https://bugzilla.redhat.com/show_bug.cgi?id=707876

which results in pacemaker breakage without application of the patch.

When corosync pegs the cpus, kill -SEGV the corosync process
This will create a core file in /var/lib/corosync

The core file will tell us what is happening internally to create this locking problem.

Thanks!
-steve
Comment 4 Ricky Nelson 2011-06-01 15:28:40 EDT
(In reply to comment #3)

> Need to know if customer is using pacemaker.

The customer is not using pacemaker.

> 
> When corosync pegs the cpus, kill -SEGV the corosync process
> This will create a core file in /var/lib/corosync
> 
> The core file will tell us what is happening internally to create this locking
> problem.
> 
> Thanks!
> -steve

Working on getting the core file now. Thanks!
Comment 5 Ricky Nelson 2011-06-01 16:43:06 EDT
Created attachment 502378 [details]
gzipped core for corosync during moment of high CPU usage
Comment 6 Steven Dake 2011-06-03 03:24:10 EDT
core backtrace:

Core was generated by `corosync -f -p'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000003ec12e338a in mmap64 () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.25.el6.x86_64 libgcc-4.4.5-6.el6.x86_64 libibverbs-1.1.4-2.el6.x86_64 librdmacm-1.0.10-2.el6.x86_64 libxml2-2.7.6-1.el6.x86_64 nspr-4.8.6-1.el6.x86_64 nss-3.12.8-3.el6_0.x86_64 nss-util-3.12.8-1.el6_0.x86_64 zlib-1.2.3-25.el6.x86_64
(gdb) where
#0  0x0000003ec12e338a in mmap64 () from /lib64/libc.so.6
#1  0x0000003ec2601977 in memory_map (path=<value optimized out>, 
    bytes=1048576, buf=0xdffa68) at coroipcs.c:258
#2  0x0000003ec26037e2 in coroipcs_handler_dispatch (fd=<value optimized out>, 
    revent=1, context=0xdff9c0) at coroipcs.c:1655
#3  0x0000003ec1e06dba in poll_run (handle=7818508918895673344)
    at coropoll.c:435
#4  0x0000000000406c1e in main (argc=<value optimized out>, 
    argv=<value optimized out>, envp=<value optimized out>) at main.c:1792
(gdb) thread apply all bt

Thread 14 (Thread 0x7f0d5e2b3440 (LWP 30301)):
#0  0x0000003ec160d740 in sem_wait () from /lib64/libpthread.so.0
#1  0x0000003ec2602618 in ipc_sem_wait (conn=0xdc3890)
    at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0xdc3890) at coroipcs.c:684
#3  0x0000003ec16077e1 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003ec12e68ed in clone () from /lib64/libc.so.6

Thread 13 (Thread 0x7f0d5d620440 (LWP 1912)):
#0  0x0000003ec160d740 in sem_wait () from /lib64/libpthread.so.0
#1  0x0000003ec2602618 in ipc_sem_wait (conn=0xdd4610)
    at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0xdd4610) at coroipcs.c:684
#3  0x0000003ec16077e1 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003ec12e68ed in clone () from /lib64/libc.so.6

Thread 12 (Thread 0x7f0d5de82440 (LWP 30364)):
#0  0x0000003ec160d740 in sem_wait () from /lib64/libpthread.so.0
#1  0x0000003ec2602618 in ipc_sem_wait (conn=0xdbf530)
    at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0xdbf530) at coroipcs.c:684
#3  0x0000003ec16077e1 in start_thread () from /lib64/libpthread.so.0
---Type <return> to continue, or q <return> to quit---
#4  0x0000003ec12e68ed in clone () from /lib64/libc.so.6

Thread 11 (Thread 0x7f0d5c989440 (LWP 2228)):
#0  0x0000003ec160d740 in sem_wait () from /lib64/libpthread.so.0
#1  0x0000003ec2602618 in ipc_sem_wait (conn=0xddccd0)
    at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0xddccd0) at coroipcs.c:684
#3  0x0000003ec16077e1 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003ec12e68ed in clone () from /lib64/libc.so.6

Thread 10 (Thread 0x7f0d5e6e4440 (LWP 30323)):
#0  0x0000003ec160d740 in sem_wait () from /lib64/libpthread.so.0
#1  0x0000003ec2602618 in ipc_sem_wait (conn=0xdc7bf0)
    at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0xdc7bf0) at coroipcs.c:684
#3  0x0000003ec16077e1 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003ec12e68ed in clone () from /lib64/libc.so.6

Thread 9 (Thread 0x7f0d5da51440 (LWP 1763)):
#0  0x0000003ec160d740 in sem_wait () from /lib64/libpthread.so.0
#1  0x0000003ec2602618 in ipc_sem_wait (conn=0xdcbf50)
    at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0xdcbf50) at coroipcs.c:684
---Type <return> to continue, or q <return> to quit---
#3  0x0000003ec16077e1 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003ec12e68ed in clone () from /lib64/libc.so.6

Thread 8 (Thread 0x7f0d64036440 (LWP 30228)):
#0  0x0000003ec160d740 in sem_wait () from /lib64/libpthread.so.0
#1  0x0000003ec2602618 in ipc_sem_wait (conn=0xdbb1d0)
    at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0xdbb1d0) at coroipcs.c:684
#3  0x0000003ec16077e1 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003ec12e68ed in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x7f0d5d1ef440 (LWP 1909)):
#0  0x0000003ec160d740 in sem_wait () from /lib64/libpthread.so.0
#1  0x0000003ec2602618 in ipc_sem_wait (conn=0xdd02b0)
    at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0xdd02b0) at coroipcs.c:684
#3  0x0000003ec16077e1 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003ec12e68ed in clone () from /lib64/libc.so.6

Thread 6 (Thread 0x7f0d64098440 (LWP 30195)):
#0  0x0000003ec160d740 in sem_wait () from /lib64/libpthread.so.0
#1  0x0000003ec2602618 in ipc_sem_wait (conn=0xdb2fd0)
    at ../include/corosync/coroipc_ipc.h:199
---Type <return> to continue, or q <return> to quit---
#2  pthread_ipc_consumer (conn=0xdb2fd0) at coroipcs.c:684
#3  0x0000003ec16077e1 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003ec12e68ed in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7f0d64067440 (LWP 30227)):
#0  0x0000003ec160d740 in sem_wait () from /lib64/libpthread.so.0
#1  0x0000003ec2602618 in ipc_sem_wait (conn=0xdaec70)
    at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0xdaec70) at coroipcs.c:684
#3  0x0000003ec16077e1 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003ec12e68ed in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7f0d655b7700 (LWP 30108)):
#0  0x0000003ec160d740 in sem_wait () from /lib64/libpthread.so.0
#1  0x0000000000405d60 in corosync_exit_thread_handler (arg=0x0) at main.c:198
#2  0x0000003ec16077e1 in start_thread () from /lib64/libpthread.so.0
#3  0x0000003ec12e68ed in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7f0d5cdbc440 (LWP 2224)):
#0  0x0000003ec160d740 in sem_wait () from /lib64/libpthread.so.0
#1  0x0000003ec2602618 in ipc_sem_wait (conn=0xdd8970)
    at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0xdd8970) at coroipcs.c:684
---Type <return> to continue, or q <return> to quit---h 
#3  0x0000003ec16077e1 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003ec12e68ed in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f0d5fffeda0 (LWP 30107)):
#0  0x0000003ec12dd1e3 in poll () from /lib64/libc.so.6
#1  0x00000000004097fe in prioritized_timer_thread (data=<value optimized out>)
    at timer.c:127
#2  0x0000003ec16077e1 in start_thread () from /lib64/libpthread.so.0
#3  0x0000003ec12e68ed in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f0d6702e700 (LWP 30105)):
#0  0x0000003ec12e338a in mmap64 () from /lib64/libc.so.6
#1  0x0000003ec2601977 in memory_map (path=<value optimized out>, 
    bytes=1048576, buf=0xdffa68) at coroipcs.c:258
#2  0x0000003ec26037e2 in coroipcs_handler_dispatch (fd=<value optimized out>, 
    revent=1, context=0xdff9c0) at coroipcs.c:1655
#3  0x0000003ec1e06dba in poll_run (handle=7818508918895673344)
    at coropoll.c:435
#4  0x0000000000406c1e in main (argc=<value optimized out>, 
    argv=<value optimized out>, envp=<value optimized out>) at main.c:1792
(gdb)
Comment 7 Steven Dake 2011-06-03 03:25:56 EDT
We today fixed a problem in memory_map function which is in this code path and could result in the OS doing some odd things with memory map calls.  Since we can't reproduce this problem on our hardware, we could provide a scratch build if the customer is interested in trying a resolution.
Comment 8 Jan Friesse 2011-06-03 04:05:59 EDT
Test build on http://honzaf.fedorapeople.org/mmap/

Ricky, can you please test "test build" and let me know if it helps or not?

If not, please try to get coredump again.
Comment 9 Ricky Nelson 2011-06-03 14:42:31 EDT
Created attachment 502883 [details]
core for corosync during moment of high CPU usage using test build
Comment 10 Ricky Nelson 2011-06-03 14:43:58 EDT
Still received high CPU usage randomly on the nodes. The latest core dump for corosync is attached.
Comment 12 Steven Dake 2011-06-05 18:27:52 EDT
Ricky,

Regarding comment #10, still receiving high cpu usage with test build in comment #8?

Thanks
-steve
Comment 13 Ricky Nelson 2011-06-05 18:33:58 EDT
(In reply to comment #12)
> Ricky,
> 
> Regarding comment #10, still receiving high cpu usage with test build in
> comment #8?
> 
> Thanks
> -steve

Yes, they're still receiving the high CPU usage for the corosync process with the testbuild. That's correct.
Comment 14 Steven Dake 2011-06-05 19:28:18 EDT
Ricky,

Have customer run corosync-blackbox.  Also another core dump would be helpful.

Thanks
Comment 15 Jan Friesse 2011-06-06 03:24:53 EDT
Decoded coredump:
Program terminated with signal 11, Segmentation fault.
#0  0x0000003cdaedd1e3 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/poll.c:87
87        int result = INLINE_SYSCALL (poll, 3, CHECK_N (fds, nfds), nfds, timeout);
Missing separate debuginfos, use: debuginfo-install libgcc-4.4.5-6.el6.x86_64
(gdb) thread apply all bt

Thread 16 (Thread 0x7eff6c0e1440 (LWP 3138)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007eff6f6ff5d8 in ipc_sem_wait (conn=0x1683e40) at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0x1683e40) at coroipcs.c:674
#3  0x0000003cdb2077e1 in start_thread (arg=0x7eff6c0e1440) at pthread_create.c:301
#4  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 15 (Thread 0x7eff65f53440 (LWP 7381)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007eff6f6ff5d8 in ipc_sem_wait (conn=0x16b3ae0) at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0x16b3ae0) at coroipcs.c:674
#3  0x0000003cdb2077e1 in start_thread (arg=0x7eff65f53440) at pthread_create.c:301
#4  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 14 (Thread 0x7eff66be6440 (LWP 3212)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007eff6f6ff5d8 in ipc_sem_wait (conn=0x1693c20) at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0x1693c20) at coroipcs.c:674
#3  0x0000003cdb2077e1 in start_thread (arg=0x7eff66be6440) at pthread_create.c:301
#4  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 13 (Thread 0x7eff6c0b0440 (LWP 3137)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007eff6f6ff5d8 in ipc_sem_wait (conn=0x168b560) at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0x168b560) at coroipcs.c:674
#3  0x0000003cdb2077e1 in start_thread (arg=0x7eff6c0b0440) at pthread_create.c:301
#4  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 12 (Thread 0x7eff6c04e440 (LWP 3272)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007eff6f6ff5d8 in ipc_sem_wait (conn=0x1697f80) at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0x1697f80) at coroipcs.c:674
#3  0x0000003cdb2077e1 in start_thread (arg=0x7eff6c04e440) at pthread_create.c:301
#4  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 11 (Thread 0x7eff667b5440 (LWP 3233)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007eff6f6ff5d8 in ipc_sem_wait (conn=0x169c2e0) at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0x169c2e0) at coroipcs.c:674
#3  0x0000003cdb2077e1 in start_thread (arg=0x7eff667b5440) at pthread_create.c:301
#4  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 10 (Thread 0x7eff64e8f440 (LWP 7540)):
---Type <return> to continue, or q <return> to quit---
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007eff6f6ff5d8 in ipc_sem_wait (conn=0x16b7e40) at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0x16b7e40) at coroipcs.c:674
#3  0x0000003cdb2077e1 in start_thread (arg=0x7eff64e8f440) at pthread_create.c:301
#4  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 9 (Thread 0x7eff652f1440 (LWP 7107)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007eff6f6ff5d8 in ipc_sem_wait (conn=0x16a8d00) at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0x16a8d00) at coroipcs.c:674
#3  0x0000003cdb2077e1 in start_thread (arg=0x7eff652f1440) at pthread_create.c:301
#4  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 8 (Thread 0x7eff66384440 (LWP 7378)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007eff6f6ff5d8 in ipc_sem_wait (conn=0x16ad060) at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0x16ad060) at coroipcs.c:674
#3  0x0000003cdb2077e1 in start_thread (arg=0x7eff66384440) at pthread_create.c:301
#4  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 7 (Thread 0x7eff6df01700 (LWP 3018)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x0000000000405d60 in corosync_exit_thread_handler (arg=0x0) at main.c:198
#2  0x0000003cdb2077e1 in start_thread (arg=0x7eff6df01700) at pthread_create.c:301
#3  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 6 (Thread 0x7eff652c0440 (LWP 7104)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007eff6f6ff5d8 in ipc_sem_wait (conn=0x16a49a0) at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0x16a49a0) at coroipcs.c:674
#3  0x0000003cdb2077e1 in start_thread (arg=0x7eff652c0440) at pthread_create.c:301
#4  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 5 (Thread 0x7eff64a5e440 (LWP 7543)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007eff6f6ff5d8 in ipc_sem_wait (conn=0x16bc1a0) at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0x16bc1a0) at coroipcs.c:674
#3  0x0000003cdb2077e1 in start_thread (arg=0x7eff64a5e440) at pthread_create.c:301
#4  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 4 (Thread 0x7eff65b22440 (LWP 7027)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007eff6f6ff5d8 in ipc_sem_wait (conn=0x16a0640) at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0x16a0640) at coroipcs.c:674
---Type <return> to continue, or q <return> to quit---
#3  0x0000003cdb2077e1 in start_thread (arg=0x7eff65b22440) at pthread_create.c:301
#4  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 3 (Thread 0x7eff6c07f440 (LWP 3106)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007eff6f6ff5d8 in ipc_sem_wait (conn=0x168f8c0) at ../include/corosync/coroipc_ipc.h:199
#2  pthread_ipc_consumer (conn=0x168f8c0) at coroipcs.c:674
#3  0x0000003cdb2077e1 in start_thread (arg=0x7eff6c07f440) at pthread_create.c:301
#4  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 2 (Thread 0x7eff67ffeda0 (LWP 3017)):
#0  0x0000003cdaedd1e3 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00000000004097fe in prioritized_timer_thread (data=<value optimized out>) at timer.c:127
#2  0x0000003cdb2077e1 in start_thread (arg=0x7eff67ffeda0) at pthread_create.c:301
#3  0x0000003cdaee68ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 1 (Thread 0x7eff6f6f5700 (LWP 3015)):
#0  0x0000003cdaedd1e3 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00007eff6fb12d31 in poll_run (handle=7818508918895673344) at coropoll.c:413
#2  0x0000000000406c1e in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
    at main.c:1792
Comment 16 Jan Friesse 2011-06-06 03:26:42 EDT
It can be seen that sending 11 signal will stop processing of CPU intensive part.

Ricky,
as Steve wrote, can you please send corosync-blackbox output and maybe strace?
Comment 20 Ricky Nelson 2011-06-06 14:12:47 EDT
Created attachment 503297 [details]
corosync-blackbox during high CPU usage of corosync
Comment 21 Ricky Nelson 2011-06-06 14:14:27 EDT
Created attachment 503299 [details]
corosync-fplay during high CPU usage of corosync
Comment 22 Ricky Nelson 2011-06-06 14:15:53 EDT
Created attachment 503300 [details]
corosync-blackbox from separate node during high CPU usage of corosync
Comment 23 Ricky Nelson 2011-06-06 14:18:11 EDT
Attached 'corosync-blackbox' and 'corosync-fplay' from the latest high CPU usage event. Grabbed an strace, but used follow forks during the event which created 1200+ strace files. Getting another without following forks.
Comment 24 Steven Dake 2011-06-06 14:30:51 EDT
corosync-blackbox during low cpu utilization under same loading conditions required.

Thanks
Comment 25 Steven Dake 2011-06-06 14:34:51 EDT
is customer using cmirror?
Comment 26 Steven Dake 2011-06-06 14:45:56 EDT
34920 confdb_initialize followed by confdb_finalize calls in corosync consumer applications.  This is out of norm.
Comment 27 Steven Dake 2011-06-06 14:48:24 EDT
cluster team lead has requested info if there are other processes also at very high cpu utilization during this problem.
Comment 28 Steven Dake 2011-06-06 14:55:59 EDT
is customer doing something related to clvmd at the time this failure occurs?

Regards
-steve
Comment 29 Ricky Nelson 2011-06-06 15:06:30 EDT
(In reply to comment #25)
> is customer using cmirror?

Customer is not using cmirror. I don't see reference to this within the 'ps' output or the cmirror package installed on the systems.

(In reply to comment #27)
> cluster team lead has requested info if there are other processes also at very
> high cpu utilization during this problem.

From a sample 'top' output during a time when the corosync process was using a low amount of CPU:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
16595 root      30  10  347m  55m 7360 S 43.1  0.5   0:01.47 sosreport
 2208 root      RT   0  996m  99m  67m R 31.9  0.8  11:01.98 corosync
 2433 dbus      20   0 23596 1132  772 S 13.1  0.0   0:00.11 dbus-daemon
 2633 haldaemo  20   0 27532 4424 3232 S  9.4  0.0   0:00.45 hald
 2722 nscd      20   0  845m 1516 1028 S  9.4  0.0   0:01.81 nscd
18801 root      19  -1 39632 1632  828 S  7.5  0.0   0:00.04 clulog
 5447 topuser   20   0 13716 1524 1040 S  3.7  0.0   6:15.77 htop
18546 rossnick  20   0 13384 1440  804 R  3.7  0.0   0:00.03 top
18548 rossnick  20   0  105m 1856 1452 S  3.7  0.0   0:00.02 bash
18703 root      19  -1  9664 1644 1148 S  3.7  0.0   0:00.02 ip.sh
   47 root      RT   0     0    0    0 S  1.9  0.0   0:00.21 migration/11
 2018 root      20   0  242m 1488  948 S  1.9  0.0   0:01.49 rsyslogd
 2460 root      20   0     0    0    0 S  1.9  0.0   0:51.47 dlm_recv/12
 2464 root      20   0     0    0    0 S  1.9  0.0   0:32.58 dlm_send

We can obtain a 'top' output again during a high CPU usage, but the customer is not seeing any other processes under high CPU.

(In reply to comment #28)
> is customer doing something related to clvmd at the time this failure occurs?
> 
> Regards
> -steve

Services are running on the nodes during the time where corosync spikes CPU usage. Are you looking to see if there is any I/O going to shared storage?
Comment 30 Steven Dake 2011-06-06 15:15:24 EDT
run only corosync-blackbox during non-spike case but when cluster is under its "normal" load and attach output.  Running with sosreport isn't helpful since it skews the results.

We have some speculation that clvmd operations are being done during the high cpu utilization problem.  clvmd calls confdb_initialize each time a remote operation is done.  There could be thousands of these operations.  If the customer does something such as change their clvmd configuration, this could occur.  One thing that may be helpful here is a clvmd version # and core file from clvmd as well.
Comment 31 Ricky Nelson 2011-06-06 15:22:38 EDT
(In reply to comment #30)
> run only corosync-blackbox during non-spike case but when cluster is under its
> "normal" load and attach output.  Running with sosreport isn't helpful since it
> skews the results.
> 
The top output mentioned in comment 29 was from an earlier event. The corosync-blackbox is from an event when nothing other than the services on the box was running. An sosreport was not running during the time of that capture. I was giving you the only 'top' output that we currently have.

> We have some speculation that clvmd operations are being done during the high
> cpu utilization problem.  clvmd calls confdb_initialize each time a remote
> operation is done.  There could be thousands of these operations.  If the
> customer does something such as change their clvmd configuration, this could
> occur.  One thing that may be helpful here is a clvmd version # and core file
> from clvmd as well.

We're still waiting on the corosync-blackbox from a 'non-spike' issue, but we can get a core from clvmd as well.
Comment 33 Steven Dake 2011-06-06 23:04:59 EDT
Honza,

reproducer that possibly fits the customer symptoms:
as root:
run corosync
run attached testa first (this creates a confdb key)
make sure test is in the working directory (this writes a confdb key)
run attached testall (this runs 40 copies of test at the same time)
run corosync-objctl -t test
this runs fine
in another window
run corosync-objctl -t test

this results in deadlock and a nonresponsive corosync process

not a regression from rhel6, rhel6 exhibits same behavior.
Comment 34 Steven Dake 2011-06-06 23:05:50 EDT
Created attachment 503363 [details]
test file
Comment 35 Steven Dake 2011-06-06 23:06:28 EDT
Created attachment 503364 [details]
testall file
Comment 36 Steven Dake 2011-06-06 23:07:07 EDT
Created attachment 503365 [details]
testa file
Comment 37 Jan Friesse 2011-06-07 05:12:28 EDT
Steve,
you found bug for sure, but I don't think it's bug which customer is talking about. BT is clear:
Thread 3 (Thread 0x7fdd25dcd440 (LWP 28867)):
#0  0x00007fdd4fb9150d in write () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007fdd477f0c4b in ipc_dispatch_send_from_poll_thread (conn=0x1a886a0, msg=0x7fdd25dcc6b0, mlen=840) at confdb.c:853
#2  0x00007fdd477f0f3e in confdb_notify_lib_of_key_change (change_type=<value optimized out>,
    parent_object_handle=<value optimized out>, object_handle=<value optimized out>, object_name_pt=<value optimized out>,
    object_name_len=<value optimized out>, key_name_pt=0x7fdd466df020, key_name_len=3, key_value_pt=0x7fdd466df230,
    key_value_len=3, priv_data_pt=0x1a886a0) at confdb.c:892
#3  0x00007fdd4d61b864 in object_key_changed_notification (object_handle=8402233951250284587,
    key_name=<value optimized out>, key_len=<value optimized out>, new_value=<value optimized out>,
    new_value_len=<value optimized out>) at objdb.c:300
#4  object_key_replace (object_handle=8402233951250284587, key_name=<value optimized out>, key_len=<value optimized out>,
---Type <return> to continue, or q <return> to quit---
    new_value=<value optimized out>, new_value_len=<value optimized out>) at objdb.c:1337
#5  0x00007fdd477f0230 in message_handler_req_lib_confdb_key_replace (conn=0x1c07400, message=<value optimized out>)
    at confdb.c:556
#6  0x00007fdd4ffaa7dd in pthread_ipc_consumer (conn=0x1c07400) at coroipcs.c:730
#7  0x00007fdd4fb8a7e1 in start_thread (arg=0x7fdd25dcd440) at pthread_create.c:301
#8  0x00007fdd4f6d48ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 2 (Thread 0x7fdd446e1440 (LWP 28868)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fdd4fb8c360 in _L_lock_908 () from /lib64/libpthread.so.0
#2  0x00007fdd4fb8c247 in __pthread_mutex_lock (mutex=0x7fdd4d81f5e0) at pthread_mutex_lock.c:82
#3  0x00007fdd4d61d890 in objdb_lock (object_handle=2860872321386676269, key_name=0x7fdd4ffac097, key_len=8,
    value=0x7fdd446e0aec) at objdb.c:113
#4  object_key_increment (object_handle=2860872321386676269, key_name=0x7fdd4ffac097, key_len=8, value=0x7fdd446e0aec)
    at objdb.c:1027
#5  0x0000000000404d97 in corosync_stats_increment_value (handle=2860872321386676269, name=0x7fdd4ffac097 "requests")
    at main.c:1169
#6  0x00007fdd4ffaa7b7 in pthread_ipc_consumer (conn=0x1c0f300) at coroipcs.c:728
#7  0x00007fdd4fb8a7e1 in start_thread (arg=0x7fdd446e1440) at pthread_create.c:301
#8  0x00007fdd4f6d48ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 1 (Thread 0x7fdd507f2700 (LWP 6523)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fdd4fb8c360 in _L_lock_908 () from /lib64/libpthread.so.0
#2  0x00007fdd4fb8c247 in __pthread_mutex_lock (mutex=0x7fdd4d81f5e0) at pthread_mutex_lock.c:82
#3  0x00007fdd4d61d890 in objdb_lock (object_handle=7648030173584949282, key_name=0x4105eb, key_len=6,
    value=0x7fff90196bf8) at objdb.c:113
#4  object_key_increment (object_handle=7648030173584949282, key_name=0x4105eb, key_len=6, value=0x7fff90196bf8)
    at objdb.c:1027
#5  0x0000000000404e68 in corosync_stats_create_connection (name=0x7fff90196e40 "corosync-objctl:CONFDB:28790:18",
    pid=28790, fd=<value optimized out>) at main.c:1073
#6  0x00007fdd4ffaa234 in coroipcs_init_conn_stats (conn=0x1bd58d0) at coroipcs.c:1578
#7  0x00007fdd4ffab971 in coroipcs_handler_dispatch (fd=<value optimized out>, revent=1, context=0x1bd58d0)
    at coroipcs.c:1704
#8  0x00007fdd503bcdba in poll_run (handle=1265737887312248832) at coropoll.c:435
#9  0x0000000000406c1e in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
    at main.c:1792

It can be seen that,
- Thread 3 is waiting for finish write operation. This is blocking because pipe is full and waiting for flush
- flush is normally done in objdb_notify_dispatch, called by main.c, but this call will never appear because main thread is waiting for lock. Of course this lock will never gain because it's locked by thread 3

-> DEADLOCK

This pipe things are introduced by 75087f7c1bbde12f255494ab5a96e2389b04643b by Angus. I'm not entirely sure if that patch is really good so CC Angus.

Back to customer problem. I think it's simple DOS. It's very sad that our own software is doing DOS to our own software, but only one real solution seems to me to limit number of new connections per second to (configurable) 1000, to prevent coroipcs_handler_dispatch do too much CPU intensive work and don't have time to handle incoming network messages -> new membership creation -> no service processing and this all in loop.
Comment 38 Steven Dake 2011-06-07 09:51:46 EDT
my backtrace from test case (requested by jfriesse)
136	2:	movl	%edx, %eax
Missing separate debuginfos, use: debuginfo-install libgcc-4.5.1-4.fc14.x86_64 nspr-4.8.8-1.fc14.x86_64 nss-3.12.10-1.fc14.x86_64 nss-util-3.12.10-1.fc14.x86_64 zlib-1.2.5-2.fc14.x86_64
(gdb) where
#0  __lll_lock_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x000000321aa09125 in _L_lock_880 () from /lib64/libpthread.so.0
#2  0x000000321aa09008 in __pthread_mutex_lock (mutex=0x7fe3fb642c80)
    at pthread_mutex_lock.c:82
#3  0x00007fe3fb43c40f in objdb_lock (object_handle=7358388239251537954, 
    key_name=0x41089d, key_len=6, value=0x7fff09dfe048) at objdb.c:113
#4  object_key_increment (object_handle=7358388239251537954, 
    key_name=0x41089d, key_len=6, value=0x7fff09dfe048) at objdb.c:1027
#5  0x0000000000404f78 in corosync_stats_create_connection (
    name=0x7fff09dfe280 "corosync-objctl:CONFDB:16558:20", pid=16558, 
    fd=<value optimized out>) at main.c:1157
#6  0x00007fe3fbe4cd0a in coroipcs_init_conn_stats (conn=0xa26a10)
    at coroipcs.c:1587
#7  0x00007fe3fbe4eb85 in coroipcs_handler_dispatch (fd=<value optimized out>, 
    revent=1, context=0xa26a10) at coroipcs.c:1715
#8  0x00007fe3fc262813 in poll_run (handle=1197105576937521152)
    at coropoll.c:514
#9  0x000000000040780d in main (argc=<value optimized out>, 
    argv=<value optimized out>, envp=0x7fff09dffa90) at main.c:1891
(gdb) 
#0  __lll_lock_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x000000321aa09125 in _L_lock_880 () from /lib64/libpthread.so.0
#2  0x000000321aa09008 in __pthread_mutex_lock (mutex=0x7fe3fb642c80)
    at pthread_mutex_lock.c:82
#3  0x00007fe3fb43c40f in objdb_lock (object_handle=7358388239251537954, 
    key_name=0x41089d, key_len=6, value=0x7fff09dfe048) at objdb.c:113
#4  object_key_increment (object_handle=7358388239251537954, 
    key_name=0x41089d, key_len=6, value=0x7fff09dfe048) at objdb.c:1027
#5  0x0000000000404f78 in corosync_stats_create_connection (
    name=0x7fff09dfe280 "corosync-objctl:CONFDB:16558:20", pid=16558, 
    fd=<value optimized out>) at main.c:1157
#6  0x00007fe3fbe4cd0a in coroipcs_init_conn_stats (conn=0xa26a10)
    at coroipcs.c:1587
#7  0x00007fe3fbe4eb85 in coroipcs_handler_dispatch (fd=<value optimized out>, 
    revent=1, context=0xa26a10) at coroipcs.c:1715
#8  0x00007fe3fc262813 in poll_run (handle=1197105576937521152)
    at coropoll.c:514
#9  0x000000000040780d in main (argc=<value optimized out>, 
    argv=<value optimized out>, envp=0x7fff09dffa90) at main.c:1891
(gdb) 
#0  __lll_lock_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x000000321aa09125 in _L_lock_880 () from /lib64/libpthread.so.0
#2  0x000000321aa09008 in __pthread_mutex_lock (mutex=0x7fe3fb642c80)
    at pthread_mutex_lock.c:82
#3  0x00007fe3fb43c40f in objdb_lock (object_handle=7358388239251537954, 
    key_name=0x41089d, key_len=6, value=0x7fff09dfe048) at objdb.c:113
#4  object_key_increment (object_handle=7358388239251537954, 
    key_name=0x41089d, key_len=6, value=0x7fff09dfe048) at objdb.c:1027
#5  0x0000000000404f78 in corosync_stats_create_connection (
    name=0x7fff09dfe280 "corosync-objctl:CONFDB:16558:20", pid=16558, 
    fd=<value optimized out>) at main.c:1157
#6  0x00007fe3fbe4cd0a in coroipcs_init_conn_stats (conn=0xa26a10)
    at coroipcs.c:1587
#7  0x00007fe3fbe4eb85 in coroipcs_handler_dispatch (fd=<value optimized out>, 
    revent=1, context=0xa26a10) at coroipcs.c:1715
#8  0x00007fe3fc262813 in poll_run (handle=1197105576937521152)
    at coropoll.c:514
#9  0x000000000040780d in main (argc=<value optimized out>, 
    argv=<value optimized out>, envp=0x7fff09dffa90) at main.c:1891
(gdb) 
#0  __lll_lock_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x000000321aa09125 in _L_lock_880 () from /lib64/libpthread.so.0
#2  0x000000321aa09008 in __pthread_mutex_lock (mutex=0x7fe3fb642c80)
    at pthread_mutex_lock.c:82
#3  0x00007fe3fb43c40f in objdb_lock (object_handle=7358388239251537954, 
    key_name=0x41089d, key_len=6, value=0x7fff09dfe048) at objdb.c:113
#4  object_key_increment (object_handle=7358388239251537954, 
    key_name=0x41089d, key_len=6, value=0x7fff09dfe048) at objdb.c:1027
#5  0x0000000000404f78 in corosync_stats_create_connection (
    name=0x7fff09dfe280 "corosync-objctl:CONFDB:16558:20", pid=16558, 
    fd=<value optimized out>) at main.c:1157
#6  0x00007fe3fbe4cd0a in coroipcs_init_conn_stats (conn=0xa26a10)
    at coroipcs.c:1587
#7  0x00007fe3fbe4eb85 in coroipcs_handler_dispatch (fd=<value optimized out>, 
    revent=1, context=0xa26a10) at coroipcs.c:1715
#8  0x00007fe3fc262813 in poll_run (handle=1197105576937521152)
    at coropoll.c:514
#9  0x000000000040780d in main (argc=<value optimized out>, 
    argv=<value optimized out>, envp=0x7fff09dffa90) at main.c:1891
(gdb) bt thread apply all
No symbol "thread" in current context.
(gdb) th apply all bt
Ambiguous command "th apply all bt": thbreak, thread.
(gdb) thread apply all bt

Thread 7 (Thread 0x7fe3fb236700 (LWP 28437)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007fe3fc05464f in logsys_worker_thread (data=<value optimized out>)
    at logsys.c:711
#2  0x000000321aa06ccb in start_thread (arg=0x7fe3fb236700)
    at pthread_create.c:301
#3  0x000000321a6e0c2d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 6 (Thread 0x7fe3fc4a7da0 (LWP 28438)):
#0  0x000000321aa0bf12 in ?? () from /lib64/libpthread.so.0
#1  0x00000000004095a3 in prioritized_timer_thread (data=<value optimized out>)
    at timer.c:133
#2  0x000000321aa06ccb in start_thread (arg=0x7fe3fc4a7da0)
    at pthread_create.c:301
#3  0x000000321a6e0c2d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 5 (Thread 0x7fe3faa35700 (LWP 28439)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x0000000000406610 in corosync_exit_thread_handler (arg=0x0) at main.c:245
#2  0x000000321aa06ccb in start_thread (arg=0x7fe3faa35700)
---Type <return> to continue, or q <return> to quit---
    at pthread_create.c:301
#3  0x000000321a6e0c2d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 4 (Thread 0x7fe3f8f31440 (LWP 32624)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007fe3fbe4dab8 in ipc_sem_wait (conn=0xa4a830)
    at ../include/corosync/coroipc_ipc.h:200
#2  pthread_ipc_consumer (conn=0xa4a830) at coroipcs.c:682
#3  0x000000321aa06ccb in start_thread (arg=0x7fe3f8f31440)
    at pthread_create.c:301
#4  0x000000321a6e0c2d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 3 (Thread 0x7fe3e31cc440 (LWP 8533)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00007fe3fbe4dab8 in ipc_sem_wait (conn=0xa45550)
    at ../include/corosync/coroipc_ipc.h:200
#2  pthread_ipc_consumer (conn=0xa45550) at coroipcs.c:682
#3  0x000000321aa06ccb in start_thread (arg=0x7fe3e31cc440)
    at pthread_create.c:301
#4  0x000000321a6e0c2d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
---Type <return> to continue, or q <return> to quit---

Thread 2 (Thread 0x7fe3f8065440 (LWP 16637)):
#0  0x000000321aa0df1d in write () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007fe3f992e06d in ipc_dispatch_send_from_poll_thread (
    change_type=<value optimized out>, 
    parent_object_handle=<value optimized out>, 
    object_handle=<value optimized out>, object_name_pt=<value optimized out>, 
    object_name_len=<value optimized out>, key_name_pt=0x7fe3f8166020, 
    key_name_len=3, key_value_pt=0x7fe3f8166230, key_value_len=3, 
    priv_data_pt=0xa45550) at confdb.c:853
#2  confdb_notify_lib_of_key_change (change_type=<value optimized out>, 
    parent_object_handle=<value optimized out>, 
    object_handle=<value optimized out>, object_name_pt=<value optimized out>, 
    object_name_len=<value optimized out>, key_name_pt=0x7fe3f8166020, 
    key_name_len=3, key_value_pt=0x7fe3f8166230, key_value_len=3, 
    priv_data_pt=0xa45550) at confdb.c:892
#3  0x00007fe3fb43b091 in object_key_changed_notification (
    object_handle=6969003145531752491, name_pt=0x7fe3f8166020, name_len=3, 
    value_pt=0x7fe3f8166230, value_len=3, type=OBJECT_KEY_REPLACED)
    at objdb.c:300
#4  0x00007fe3fb44197f in object_key_replace (
    object_handle=6969003145531752491, key_name=<value optimized out>, 
    key_len=<value optimized out>, new_value=<value optimized out>, 
---Type <return> to continue, or q <return> to quit---
    new_value_len=<value optimized out>) at objdb.c:1337
#5  0x00007fe3f992d150 in message_handler_req_lib_confdb_key_replace (
    conn=0xa226f0, message=<value optimized out>) at confdb.c:556
#6  0x00007fe3fbe4dc8a in pthread_ipc_consumer (conn=0xa226f0)
    at coroipcs.c:728
#7  0x000000321aa06ccb in start_thread (arg=0x7fe3f8065440)
    at pthread_create.c:301
#8  0x000000321a6e0c2d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 1 (Thread 0x7fe3fbe43720 (LWP 28436)):
#0  __lll_lock_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x000000321aa09125 in _L_lock_880 () from /lib64/libpthread.so.0
#2  0x000000321aa09008 in __pthread_mutex_lock (mutex=0x7fe3fb642c80)
    at pthread_mutex_lock.c:82
#3  0x00007fe3fb43c40f in objdb_lock (object_handle=7358388239251537954, 
    key_name=0x41089d, key_len=6, value=0x7fff09dfe048) at objdb.c:113
#4  object_key_increment (object_handle=7358388239251537954, 
    key_name=0x41089d, key_len=6, value=0x7fff09dfe048) at objdb.c:1027
#5  0x0000000000404f78 in corosync_stats_create_connection (
    name=0x7fff09dfe280 "corosync-objctl:CONFDB:16558:20", pid=16558, 
    fd=<value optimized out>) at main.c:1157
---Type <return> to continue, or q <return> to quit---
#6  0x00007fe3fbe4cd0a in coroipcs_init_conn_stats (conn=0xa26a10)
    at coroipcs.c:1587
#7  0x00007fe3fbe4eb85 in coroipcs_handler_dispatch (fd=<value optimized out>, 
    revent=1, context=0xa26a10) at coroipcs.c:1715
#8  0x00007fe3fc262813 in poll_run (handle=1197105576937521152)
    at coropoll.c:514
#9  0x000000000040780d in main (argc=<value optimized out>, 
    argv=<value optimized out>, envp=0x7fff09dffa90) at main.c:1891
Comment 39 Steven Dake 2011-06-07 09:54:27 EDT
this test case may be a new bug not related to the customer problem.

Engineering needs to know which application is making the confdb_initialize calls.  This information is available via corosync-objctl.  sosreport generates and stores this information, but I don't recall exactly where it puts it in the report.

Ricky can you see if you can find the corosync-objctl output?
Comment 40 Jan Friesse 2011-06-07 11:53:41 EDT
Ricki,
before sending sosreport, can you please install packages from http://honzaf.fedorapeople.org/ct2/. It contains little patch which adds debug information about IPC connection in form like  "coroipcs_init_conn_stats from corosync-objctl:CONFDB:9397:14 at 1307459802.341996" available via corosync-blackbox command, so we will be able to identify timestamp when problem occured, ...

Thanks,
  Honza
Comment 41 Ricky Nelson 2011-06-07 14:46:40 EDT
Created attachment 503555 [details]
blackbox and objctl info captured at last high CPU event with the debug corosync packages
Comment 42 Ricky Nelson 2011-06-07 14:49:04 EDT
The attached debuginfo.tar.gz contains the 'corosync-blackbox' and 'corosync-objctl' output captured during the last high CPU event while running on the debug corosync packages.
Comment 43 Steven Dake 2011-06-07 14:59:46 EDT
[sdake@beast tmp]$ grep clulog blackbox2.txt | wc -l
2214

clulog spamming corosync - each one opens a confdb connection.
first one:
rec=[1199819] Log Message=coroipcs_init_conn_stats from clulog:CONFDB:13959:38 at 1307470428.8745
last one:
rec=[1230271] Log Message=coroipcs_init_conn_stats from clulog:CONFDB:18332:40 at 1307470518.973788

518-428 = exactly 90 seconds of clulog spamming - 24 clulog calls per second = high cpu utilization.

Ricky are there any error logs on the nodes in /var/log/cluster which might be of interest?
Comment 45 Nicolas Ross 2011-06-07 17:04:09 EDT
Hello, sorry to peak-in, I'm "the customer" in this case... No, I didn't notice anything in the log in /var/log/cluster. In corosync.log, at the time I took the debuginfo, I saw :

Jun 07 14:15:04 corosync [CKPT  ] ========== Checkpoint Information ===========
Jun 07 14:15:04 corosync [CKPT  ] global_ckpt_id: 322
Jun 07 14:15:04 corosync [CKPT  ] Checkpoint dlmplock.GizServer (18):
Jun 07 14:15:04 corosync [CKPT  ]    id:       321
Jun 07 14:15:04 corosync [CKPT  ]    sec cnt:  40
Jun 07 14:15:04 corosync [CKPT  ]    ref cnt:  1
Jun 07 14:15:04 corosync [CKPT  ]    unlinked: 0
(...)

Apart from that, there is nothing in corosync.log nor rgmanager.log or even /var/log/messages that could be of interest.
Comment 46 Steven Dake 2011-06-07 17:51:14 EDT
Nicolas,

I know we have asked alot of you, and appreciate your patience on this matter.  It appears that clulog is doing alot of logging.  We are not sure if clulog is the cause of corosync pegging the cpu, or if corosync pegging the cpu causes clulog to be run by rgmanager or some other scripts.  One way to identify causality is to essentially take clulog out of the picture.

The way to do that is to create a shell script where clulog is that contains

#!/bin/bash
/bin/true

on all the nodes.

This will make clulog into a no op operation from the perspective of corosync.  If the cpu still pegs, then the problem is related to corosync.  If the cpu doesn't peg, then rgmanager or some related scripts are being run, perhaps unexpectedly.

Fabio could provide more definitive answers about rgmanager, but usually starts his work day in +7 hours.
Comment 47 Nicolas Ross 2011-06-07 22:08:33 EDT
Here's what I've done: I've replace the /usr/sbin/clulog binary with the bash script you suggested. Cpu still pegs, but I wasn't in front of all node's top to monitor it. I did see one event and that time I still saw verry high system cpu usage, but corosync was lower in the list of process...

I will monitor more closely tomorrow morning (now + 10hours) from my office console and report back.
Comment 48 Fabio Massimo Di Nitto 2011-06-07 22:53:15 EDT
Hi Nicolas,

I know you might have provided those information already in the ticket you filed via customer portal, but I don't have access to it directly.

If either you or GSS can please provide me with sosreports from the nodes (after the CPU has done spinning or collect for me a full tarball of /var/log/cluster + dmesg from each node and cluster.conf that would be good a start.
A very important info is also at what time the CPU started spinning to associate events around.

If possible, it might very helpful to add <logging debug="on"/> to cluster.conf. This can be done at runtime without any restart for any cluster service or daemons (please let me know if you need detailed procedure on how to do it).
Comment 50 Jan Friesse 2011-06-08 02:56:51 EDT
(In reply to comment #43)
> [sdake@beast tmp]$ grep clulog blackbox2.txt | wc -l
> 2214
> 
> clulog spamming corosync - each one opens a confdb connection.
> first one:
> rec=[1199819] Log Message=coroipcs_init_conn_stats from clulog:CONFDB:13959:38
> at 1307470428.8745
> last one:
> rec=[1230271] Log Message=coroipcs_init_conn_stats from clulog:CONFDB:18332:40
> at 1307470518.973788
> 
> 518-428 = exactly 90 seconds of clulog spamming - 24 clulog calls per second =
> high cpu utilization.
> 
> Ricky are there any error logs on the nodes in /var/log/cluster which might be
> of interest?

Steve,
I don't think that 24 clulog call per second should be problem. What is REAL problem is PEAK.

grep 1307470458 blackbox2.txt |wc -l
-> 113

and start is in 1307470458.726693, so it means
113 calls in 0.3 sec.

grep 1307470459 blackbox2.txt |wc -l
-> 409

409 conn in one sec.

Fabbio, are you really sure that this correct behavior of resource agents?

Anyway,
real solution for me seems to be rate limiting.
Comment 51 Fabio Massimo Di Nitto 2011-06-08 03:09:51 EDT
(In reply to comment #50)

> 
> Fabbio, are you really sure that this correct behavior of resource agents?

I can´t be sure till I see all the logs. It can be a complex combination of 20000 resources calling clulog at the same time or just one going wild. It might be clulog is failing because corosync is already spinning. Can´t say for sure.
Comment 52 Nicolas Ross 2011-06-08 06:36:55 EDT
Fabio, So you nee new sosreports + tarball of /var/log/cluster + dmesg on each node ? Will the sosreports I've made 2 weeks ago be ok or new ones would be needed ? I'll package this up when I'll be at my office (now + 2 h)
Comment 53 Fabio Massimo Di Nitto 2011-06-08 06:48:57 EDT
(In reply to comment #52)
> Fabio, So you nee new sosreports + tarball of /var/log/cluster + dmesg on each
> node ? Will the sosreports I've made 2 weeks ago be ok or new ones would be
> needed ? I'll package this up when I'll be at my office (now + 2 h)

Hi Nicolas,

i just received the old sosreports from GSS. I´ll let you know if I need new ones.

Thanks
Fabio
Comment 54 Nicolas Ross 2011-06-08 08:55:34 EDT
It appears that the clulog script did helped. I changed it about 14 hours ago, and I saw far less spikes in my cpu graphs. But I cannot tell for sure that those spikes are done by corosync.

And since the last hour, I didn't notice any on on my htop screens. I will revert back to the proper clulog binary to re-enable the log...
Comment 55 Nicolas Ross 2011-06-08 09:03:46 EDT
Spoke too soon ;-(

I saw a spike in cpu on one of the node. But it was not corosync that was on the top of my htop window, it was httpd and mysqld processes, and they were not eating up that much cpu, but system cpu was up to 60 to 70% (average over all cores) for no apparent reason. During that time, my ssh session was more responsive then when corosync was eating cpu.
Comment 56 Fabio Massimo Di Nitto 2011-06-08 09:22:01 EDT
Nicolas,

i cannot spot the root cause of the problem by only looking at the old sosreports.

This is how I suggest we proceed:

1) put the original clulog binary in place
2) enable <logging debug="on"/> in cluster.conf
3) run a  script that will collect the output of "ps aux" every second on every node (this file will get gigantic, make sure there is enough space on disk)
4) monitor the CPU usage on each node
5) stop collecting ps aux after a spike
6) collect sosreports from all nodes

hopefully this should be enough to see what´s happening.

We might have to adjust the ps aux sample frequency. If you are in front of the terminal, collect it on the "crazy" node, while it´s happening a few times running would also be good.

Fabio

PS If you need any assistance to create such scripts, please let me know
Comment 57 Nicolas Ross 2011-06-08 09:27:58 EDT
Do I put back the original corosync from the distro ?
Comment 58 Fabio Massimo Di Nitto 2011-06-08 09:39:11 EDT
(In reply to comment #57)
> Do I put back the original corosync from the distro ?

Yes please, that would be preferable.
Comment 59 Nicolas Ross 2011-06-08 10:30:58 EDT
Ok, I reverted back to the distro's corosync and corosynclibs on the 3 nodes I installed it on. I started collecting ps outputs (with date) and a few minutes later I saw a spike on one node (node104 at 10:13:48 EDT), and a while waiting it to finish, I saw another one (node105 ad 10:14:28 EDT).

Note that while the cpu was high on node104, It was a little bit less in corosync that usual, and that the spike on node105 was less hard.

I will send you an email with the location of the data, it's 86 megs compressed.
Comment 60 Fabio Massimo Di Nitto 2011-06-08 10:40:09 EDT
Hi Nicolas, thanks a lot for the help.
Comment 61 Steven Dake 2011-06-08 15:17:44 EDT
Created attachment 503767 [details]
unsupported test build which may fix problem
Comment 62 Steven Dake 2011-06-08 15:22:20 EDT
Nicolas,

See attachment in #61.  If your up for experimenting a bit more, I believe this should resolve the problem.  Please restore your clulog file to its original settings.  Make sure to stop all nodes in the cluster, then start them from original after installing this tarball.

If this does resolve the issue as I expect, this is not an officially supported build.  For that, you need a hotfix, which Ricky can make happen once we have determined this resolves the issue.

Thanks
-steve
Comment 63 Nicolas Ross 2011-06-08 20:47:05 EDT
Why should I stop all nodes before starting it with this fix ? Can I install it on 3 notes at first and restart only those ?
Comment 64 Steven Dake 2011-06-08 21:36:14 EDT
yes if you want to limit to 3 nodes that should be fine, but keep in mind the other nodes you dont update will still consume cpu outside of what we would like to see.
Comment 65 Fabio Massimo Di Nitto 2011-06-09 03:57:55 EDT
Nicolas, Steven

I think there is a small misunderstanding here.

In order for us to make 100% sure the fix is correct, we need all 8 nodes to be upgraded with the new package.

Nicolas, there is no need to stop all nodes at the same time to perform this upgrade as long as all nodes are upgraded at the end of the process.

Nicolas, you can follow the least intrusive procedure here:

1) pick a node
2) stop all cluster services (including cman) on that node
3) install test package
4) start all cluster services
5) repeat from 1) for all nodes
Comment 66 Jose Castillo 2011-06-09 04:44:50 EDT
Hello Nicolas, 

to complement Fabio's instructions in comment #65, take a look to the following article from the knowledge base:

Red Hat Enterprise Linux High Availability/Cluster Rolling Upgrades
https://access.redhat.com/kb/docs/DOC-46622
Comment 67 Nicolas Ross 2011-06-09 07:12:17 EDT
Thanks, I wll install the packages this morning. I know it's unsupported, but is it safe ? Is there any risk for the data on the gfs volumes ?
Comment 68 Jan Friesse 2011-06-09 07:24:53 EDT
(In reply to comment #67)
> Thanks, I wll install the packages this morning. I know it's unsupported, but
> is it safe ? Is there any risk for the data on the gfs volumes ?

Nicolas,
only change is to not use spinlock but mutex for synchronization, so I can say, it's safe. Officially of course, there is risk, but from practical point of view, no new risk at all.
Comment 69 Nicolas Ross 2011-06-09 09:53:41 EDT
I took the advantage of this to update all other packages (including kernel)... I have not yet finished updating all the nodes (3 are done), and after rebooting the 3rd one, it had a spike in cpu due to corosync. That node didn't had any services runing.

I will continue the rounds of update on all 8 nodes and will report back later.
Comment 70 Nicolas Ross 2011-06-09 11:30:38 EDT
Sorry, didn't help.

I was hit with a cpu spike on one node, and then a few seconds later on 2 others.
Comment 71 Fabio Massimo Di Nitto 2011-06-09 11:42:08 EDT
Nicolas, just to be clear, are we talking about generic CPU spike or corosync spikes? some of your previous messages were not totally clear to me.
Comment 72 Nicolas Ross 2011-06-09 12:22:07 EDT
Corosync spikes. In one case, it got up to 600%. I've grabed an htop window : http://router.mq.gizmolink.com/~rossnick/htop2.gif
Comment 73 Steven Dake 2011-06-09 12:49:28 EDT
In that screenshot htop is showing corosync at 86% cpu.  This is normal under plock load even with updated packages.  Where do you get 600% from?  htop shows each thread of corosync separately in its list - so if it were spiking, it would show each thread of corosync separately (6 threads at 100%).

The important question is not whether corosync uses cpu, but if the original reported problem is fixed:
"During that time, the node is unresponsive. Services
running on that node stales. Corosync.log shows nothing. After 60 to 120
seconds, the CPU usage dies down and the node continues to function normally."

Do you still find the node unresponsive with these new tarballs?

Regards
-steve
Comment 74 Nicolas Ross 2011-06-09 13:18:22 EDT
I did not get a snapshot while corosync was at 600%...

I grabed another one :
http://router.mq.gizmolink.com/~rossnick/htop3.gif hete at 330%.

And on that node, I curently have nothing that eats cpu. In fact, at this time, nothing on the cluster should eat more than the equivalent of one core.

Keep in mind, that from my understanding of top, htop and ps, cpu usage reported in thos does not include system cpu not.

So, no, during that time, ssh shells are still unusable, but services apears to responds correctly.
Comment 85 Jan Friesse 2011-06-14 03:39:12 EDT
Created attachment 504615 [details]
Stupid threaded prime number detector
Comment 99 Jan Friesse 2011-06-29 10:23:18 EDT
Created attachment 510475 [details]
Final patch which removes spinlocks from corosync

Patch was sent to ML, reviewed and committed to upstream as 8c717c22b2f137f81ac5ae1a3437d73b62bb451d
Comment 103 Steven Dake 2011-07-22 12:27:56 EDT
*** Bug 675568 has been marked as a duplicate of this bug. ***
Comment 105 Jaroslav Kortus 2011-09-08 09:43:25 EDT
marking verified.
Comment 106 Steven Dake 2011-10-27 14:42:48 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: A spinlock was used around I/O operations.
  Consequence: Corosync would consume extremely high CPU under heavy use of IPC operations because the spinlocks would spin during I/O.
  Fix: Changed spinlock to a mutex.
  Result: The mutex releases the processor from spinning but enforces correct behavior.
Comment 108 errata-xmlrpc 2011-12-06 06:50:53 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1515.html

Note You need to log in before you can comment on or make changes to this bug.