Bug 730289
Summary: clvmd not working properly if multiple clients talk to clvmd (part1)
Product: Red Hat Enterprise Linux 6
Reporter: Milan Broz <mbroz>
Component: lvm2
Assignee: Zdenek Kabelac <zkabelac>
Status: CLOSED ERRATA
QA Contact: Corey Marthaler <cmarthal>
Severity: unspecified
Priority: high
Version: 6.2
CC: agk, cmarthal, coughlan, djansa, dwysocha, heinzm, jbrassow, nperic, prajnoha, prockai, pvrabec, thornber, zkabelac
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: lvm2-2.02.98-4.el6
Doc Type: Bug Fix
Doc Text:
clvmd has been consuming a lot of memory to process every single request. In the current architecture each request spawns a thread, and by default such a thread allocates ~9MB of RAM for its stack. Since clvmd uses mlockall to keep all pages in memory, 200 parallel requests easily grew the running binary's size to 2GB, leading either to the OOM killer or to a very slow system.
The fix reduces the default thread stack size to 128KB, which is enough for the current version of lvm to handle all tasks. This leads to a massive reduction of the runtime memory used by clvmd.
Story Points: ---
Clone Of:
: 883842 (view as bug list)
Last Closed: 2013-02-21 08:03:11 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---
Bug Blocks: 756082
Description
Milan Broz
2011-08-12 11:30:01 UTC
I'll take a closer look at this one - but I think there has been no memory optimisation for multi-threaded code yet, as that is beyond the capability of simple memory-map locking, so clvmd basically works in 'mlockall()' mode with LANG=C. Though I'm not really sure how large the lvm locks are - if you tried to run 200 lvm commands at the same time you would probably exhaust a 2GB machine as well, i.e. at ~30MB per lvm command, 6GB of memory would need to be mapped at the same time. So there needs to be some maximum limit on the number of clvmd threads completing the same command.

Patches to address per-thread stack memory consumption:

Reduce preallocated stack size to 64KiB:
https://www.redhat.com/archives/lvm-devel/2011-October/msg00070.html

128KiB for thread stack size:
https://www.redhat.com/archives/lvm-devel/2011-October/msg00072.html

These reduce memory usage for an overloaded clvmd. It's also worth noting that without this patch:
https://www.redhat.com/archives/lvm-devel/2011-October/msg00069.html
clvmd coredumps when it runs out of free descriptors during the refresh filter operation, so that patch is rather mandatory here.

*** Bug 698393 has been marked as a duplicate of this bug. ***

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
clvmd has been consuming a lot of memory to process every single request. In the current architecture each request spawns a thread, and by default such a thread allocates ~9MB of RAM for its stack. Since clvmd uses mlockall to keep all pages in memory, 200 parallel requests easily grew the running binary's size to 2GB, leading either to the OOM killer or to a very slow system.
The fix reduces the default thread stack size to 128KB, which is enough for the current version of lvm to handle all tasks. This leads to a massive reduction of the runtime memory used by clvmd.

Tested with:

lvm2-2.02.95-5.el6.x86_64
lvm2-cluster-2.02.95-5.el6.x86_64
cmirror-2.02.95-5.el6.x86_64
device-mapper-1.02.74-5.el6.x86_64
device-mapper-persistent-data-0.1.4-1.el6.x86_64
device-mapper-libs-1.02.74-5.el6.x86_64
device-mapper-event-libs-1.02.74-5.el6.x86_64
device-mapper-event-1.02.74-5.el6.x86_64
kernel-2.6.32-265.el6.x86_64

There is no OOM issue and the commands work normally up to 160 iterations of the loop below. But lvm locks up if the number of iterations is 170 (or more) for two commands at the same time:

for i in {1..170}; do lvs & vgs & done

(11:00:42) [root@node01:~]$ jobs | grep "Running" | wc -l
340

The jobs kept sleeping in the background even after 5 minutes. None of the jobs finished and the lvm commands did not return anything. Subsequent lvm commands just hang.

(In reply to comment #10)
> The jobs kept sleeping in the background even after 5 minutes.
>
> None of the jobs finished and lvm commands did not return anything.
> The subsequent lvm commands just hang.

Please try installing lvm2-debuginfo and then use gdb to hook into the process that hung to grab the backtrace... How many LVs/VGs did you use in your test?
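(For illustration only - this is not the actual clvmd code - a minimal sketch of how a per-thread stack can be capped with pthread_attr_setstacksize(), the general mechanism the stack-size patches linked above rely on. The 128KiB value matches the figure from the technical note; all names and the error handling are made up for the example. Note how the mlockall() call makes the stack reservation of every thread count against resident memory, which is why the default ~9MB stack hurt so much.)

#include <pthread.h>
#include <stdio.h>
#include <sys/mman.h>

#define WORKER_STACK_SIZE (128 * 1024)   /* 128KiB instead of the ~9MB default */

static void *worker(void *arg)
{
    (void) arg;
    /* ... handle one client request ... */
    return NULL;
}

int main(void)
{
    pthread_attr_t attr;
    pthread_t tid;
    int err;

    /* With mlockall() in effect, the whole stack reservation of every
       thread is backed by RAM, so the per-thread stack size matters a lot. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        perror("mlockall");

    pthread_attr_init(&attr);
    err = pthread_attr_setstacksize(&attr, WORKER_STACK_SIZE);
    if (err != 0)
        fprintf(stderr, "pthread_attr_setstacksize: error %d\n", err);

    err = pthread_create(&tid, &attr, worker, NULL);
    if (err == 0)
        pthread_join(tid, NULL);

    pthread_attr_destroy(&attr);
    return 0;
}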
(07:20:23) [root@node01:~]$ lvs
  LV       VG       Attr     LSize    Pool Origin Data%  Move Log           Copy%  Convert
  lv_root  VolGroup -wi-ao--    8.52g
  lv_swap  VolGroup -wi-ao-- 1008.00m
  lvol0    my_vg    -wi-a---  252.00m
  mirrored my_vg    mwi-a-m-  500.00m                         mirrored_mlog 100.00
  striped  my_vg    -wi-a---  504.00m

(07:20:30) [root@node01:~]$ vgs
  VG       #PV #LV #SN Attr   VSize VFree
  VolGroup   1   2   0 wz--n- 9.51g     0
  my_vg      4   3   0 wz--nc 5.70g 3.98g

(07:20:33) [root@node01:~]$ pvs
  PV        VG       Fmt  Attr PSize   PFree
  /dev/sdb1 my_vg    lvm2 a--    1.91g   1.66g
  /dev/sdb2 my_vg    lvm2 a--    1.91g   1.90g
  /dev/sde1          lvm2 a--    1.91g   1.91g
  /dev/sdf1          lvm2 a--    1.91g   1.91g
  /dev/sdh1 my_vg    lvm2 a--  968.00m 216.00m
  /dev/sdh2 my_vg    lvm2 a--  968.00m 216.00m
  /dev/vda2 VolGroup lvm2 a--    9.51g       0

Well I am not really sure which one is the first to get stuck, since none of the bg processes actually return anything. So here is a backtrace of the first process:

#0  0x00007f08c8235360 in __read_nocancel () at ../sysdeps/unix/syscall-template.S:82
#1  0x000000000048ebb3 in read (inbuf=0x7fff81ed4260 "-", inlen=<value optimized out>, retbuf=0x7fff81ed42b0) at /usr/include/bits/unistd.h:45
#2  _send_request (inbuf=0x7fff81ed4260 "-", inlen=<value optimized out>, retbuf=0x7fff81ed42b0) at locking/cluster_locking.c:119
#3  0x000000000048f094 in _cluster_request (clvmd_cmd=<value optimized out>, node=0x4c922b "", data=0x7fff81ed4310, len=14, response=0x7fff81ed4358, num=0x7fff81ed4364) at locking/cluster_locking.c:223
#4  0x000000000048f60b in _lock_for_cluster (cmd=0x1ffd110, clvmd_cmd=45 '-', flags=<value optimized out>, name=<value optimized out>) at locking/cluster_locking.c:376
#5  0x000000000048fb66 in _lock_resource (cmd=0x1ffd110, resource=<value optimized out>, flags=256) at locking/cluster_locking.c:424
#6  0x000000000045c054 in _lock_vol (cmd=0x1ffd110, resource=<value optimized out>, flags=256, lv_op=LV_NOOP) at locking/locking.c:387
#7  0x000000000045cbce in lock_vol (cmd=0x1ffd110, vol=0x4b97c4 "#sync_names", flags=256) at locking/locking.c:465
#8  0x000000000047f91d in free_cmd_vgs (cmd_vgs=0x7fff81ed5780) at metadata/replicator_manip.c:594
#9  0x000000000042f800 in process_each_lv (cmd=<value optimized out>, argc=1, argv=<value optimized out>, flags=0, handle=0x2047280, process_single_lv=0x42cbb0 <_lvs_single>) at toollib.c:392
#10 0x000000000042c43b in _report (cmd=0x1ffd110, argc=0, argv=0x7fff81ed5b90, report_type=LVS) at reporter.c:421
#11 0x000000000041fd52 in lvm_run_command (cmd=0x1ffd110, argc=0, argv=0x7fff81ed5b90) at lvmcmdline.c:1099
#12 0x0000000000422712 in lvm2_main (argc=1, argv=0x7fff81ed5b88) at lvmcmdline.c:1468
#13 0x00007f08c8179cdd in __libc_start_main (main=0x438380 <main>, argc=1, ubp_av=0x7fff81ed5b88, init=<value optimized out>, fini=<value optimized out>, rtld_fini=<value optimized out>, stack_end=0x7fff81ed5b78) at libc-start.c:226

If you need more backtrace lines let me know, I can put it onto some pastebin.

duh.. There was just additional line which did not fit to my screen :)

#14 0x0000000000413c79 in _start ()

...so waiting for an answer from clvmd. Could you do the same backtrace + debug from clvmd, seems like it got overloaded, but would be fine to know where exactly in the code (I guess the clvmd debug log will show us).
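(The frames above show the lvm command parked in read() inside _send_request(): it has written its locking request to clvmd and is blocked waiting for a reply that never arrives. As a purely illustrative sketch - hypothetical names, not lvm2's actual code - this is the shape of such a blocking request/response round trip over a local socket, and why hundreds of client processes can all end up sleeping in read() once the daemon stops answering.)

#include <unistd.h>
#include <errno.h>

/* Hypothetical example: send one request and block until the daemon replies.
 * If the daemon never writes a reply, read() sleeps indefinitely - which is
 * where the 340 stuck lvs/vgs processes were sitting. */
static ssize_t send_request(int fd, const void *req, size_t reqlen,
                            void *reply, size_t replylen)
{
    ssize_t n;

    if (write(fd, req, reqlen) != (ssize_t) reqlen)
        return -1;

    do {
        n = read(fd, reply, replylen);   /* blocks until data or EOF arrives */
    } while (n < 0 && errno == EINTR);

    return n;
}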
(...just for the record, attaching the clvmd backtrace I got from Nenad)

#0  0x00007f95f8be754d in read () at ../sysdeps/unix/syscall-template.S:82
#1  0x000000000040b64b in read (thisfd=0x14da8f0, buf=<value optimized out>, maxlen=<value optimized out>, csid=<value optimized out>, new_client=<value optimized out>) at /usr/include/bits/unistd.h:45
#2  local_pipe_callback (thisfd=0x14da8f0, buf=<value optimized out>, maxlen=<value optimized out>, csid=<value optimized out>, new_client=<value optimized out>) at clvmd.c:710
#3  0x000000000040a59f in main_loop (local_sock=<value optimized out>, cmd_timeout=60) at clvmd.c:900
#4  0x000000000040c3b7 in main (argc=<value optimized out>, argv=<value optimized out>) at clvmd.c:608

That's the first "read" in clvmd.c/local_pipe_callback fn...

Using clvmd debug log ended up with no hang, just the error message issued:

cluster request failed: Device or resource busy
Can't get lock for my_vg
Skipping volume group my_vg

(which is OK if the clvmd is under pressure, there was an exact patch for this to issue such error code iirc)

Could you please attach full backtrace of all clvmd threads? (in gdb - "thread apply all bt full")

Here's a pastebin which should sit there for a month:

http://pastebin.test.redhat.com/86662

There are a lot of threads so I pasted just a few of them. Here's the list of threads when gdb is started (just a list):

http://pastebin.test.redhat.com/86663

As for threads themselves:

(gdb) info threads
  343 Thread 0x7f535501b700 (LWP 2611)  0x00007f5355a1e54d in read () at ../sysdeps/unix/syscall-template.S:82
  342 Thread 0x7f5356ca1700 (LWP 2612)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  341 Thread 0x7f5356b9f700 (LWP 2916)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  .
  .
  .
  .
    3 Thread 0x7f5335995700 (LWP 3376)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
    2 Thread 0x7f5335974700 (LWP 3377)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  * 1 Thread 0x7f5356ca27a0 (LWP 2597)  0x00007f5355a1e54d in read () at ../sysdeps/unix/syscall-template.S:82

Info about the process which takes 1st thread (2597):

process 2597
cmdline = 'clvmd'
cwd = '/'
exe = '/usr/sbin/clvmd'
Process: 2597
Exec file: clvmd
State: T
Parent process: 1
Process group: 2597
Session id: 2597
TTY: 0
TTY owner process group: -1
Flags: 0x402140
Minor faults (no memory page): 23700
Minor faults, children: 1670
Major faults (memory page faults): 2
Major faults, children: 1
utime: 29
stime: 51
utime, children: 0
stime, children: 2
jiffies remaining in current time slice: 20
'nice' value: 0
jiffies until next timeout: 343
jiffies until next SIGALRM: 0
start time (jiffies since system boot): 15789
Virtual memory size: 637149184
Resident set size: 20459
rlim: 18446744073709551615
Start of text: 0x400000
End of text: 0x49f084
Start of stack: 0x7fffc0f65200

and process of thread 343:

(gdb) info proc 2611 stat
process 2611
cmdline = 'clvmd'
cwd = '/'
exe = '/usr/sbin/clvmd'
Process: 2611
Exec file: clvmd
State: T
Parent process: 1
Process group: 2597
Session id: 2597
TTY: 0
TTY owner process group: -1
Flags: 0x402040
Minor faults (no memory page): 23700
Minor faults, children: 1670
Major faults (memory page faults): 2
Major faults, children: 1
utime: 29
stime: 51
utime, children: 0
stime, children: 2
jiffies remaining in current time slice: 20
'nice' value: 0
jiffies until next timeout: 343
jiffies until next SIGALRM: 0
start time (jiffies since system boot): 16103
Virtual memory size: 637149184
Resident set size: 20459
rlim: 18446744073709551615
Start of text: 0x400000
End of text: 0x49f084
Start of stack: 0x7fffc0f65200

(In reply to comment #19)
> Here's a pastebin which should sit there for a month:
>
> http://pastebin.test.redhat.com/86662

This one's interesting, leading to libdlm and probably waiting for a lock (the other threads are in pthread_cond_wait):

Thread 343 (Thread 0x7f535501b700 (LWP 2611)):
#0  0x00007f5355a1e54d in read () at ../sysdeps/unix/syscall-template.S:82
        No locals.
#1  0x00007f5355c2ede5 in ?? () from /usr/lib64/libdlm.so.3
        No symbol table info available.
#2  0x00007f5355c2eeaf in ?? () from /usr/lib64/libdlm.so.3
        No symbol table info available.
#3  0x00007f5355a17851 in start_thread (arg=0x7f535501b700) at pthread_create.c:301
        __res = <value optimized out>
        pd = 0x7f535501b700
        now = <value optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139995885188864, 6297545747566397011, 139995895697920, 139995885189568, 0, 3, -6213900416108231085, -6213899588721998253}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <value optimized out>
        pagesize_m1 = <value optimized out>
        sp = <value optimized out>
        freesize = <value optimized out>
#4  0x00007f535576567d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

(In reply to comment #19)
> Here's a pastebin which should sit there for a month:
>
> http://pastebin.test.redhat.com/86662
...
> As for threads themselves:
> .
> .
> .
> .
> 3 Thread 0x7f5335995700 (LWP 3376)  pthread_cond_wait@@GLIBC_2.3.2 ()
>    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
> 2 Thread 0x7f5335974700 (LWP 3377)  pthread_cond_wait@@GLIBC_2.3.2 ()
>    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
> * 1 Thread 0x7f5356ca27a0 (LWP 2597)  0x00007f5355a1e54d in read ()
>    at ../sysdeps/unix/syscall-template.S:82

Nenad, could you please do a full bt of "thread 1" (the one which is also in the "read" call)? This one is not in the bt log in the pastebin...

(In reply to comment #21)
> > * 1 Thread 0x7f5356ca27a0 (LWP 2597)  0x00007f5355a1e54d in read ()
> >    at ../sysdeps/unix/syscall-template.S:82
>
> Nenad, could you please do a full bt of "thread 1" (the one which is also in
> "read" call). This one is not in the bt log in the pastebin...

(When I try this on my machine, I also get the "read" in the last thread; after installing the cluster debuginfo it seems to be:

#0  0x000000381e00e4ed in read () from /lib64/libpthread.so.0
#1  0x0000003820801de5 in read (fd=<value optimized out>) at /usr/include/bits/unistd.h:45
#2  do_dlm_dispatch_v6 (fd=<value optimized out>) at /usr/src/debug/cluster-3.0.12/dlm/libdlm/libdlm.c:529
#3  0x0000003820801eaf in dlm_recv_thread (lsinfo=0x18aa190) at /usr/src/debug/cluster-3.0.12/dlm/libdlm/libdlm.c:1180
#4  0x000000381e0077f1 in start_thread () from /lib64/libpthread.so.0
#5  0x000000381d4e5ccd in clone () from /lib64/libc.so.6

So that might be OK. But "thread 1" is clvmd's main_loop call and it shouldn't be in the "read" unless it's receiving data from a client (which shouldn't happen as the client itself is waiting on clvmd :)). So let's see what the backtrace of your "thread 1" with the read call is.

(In reply to comment #22)
> So let's see what's the backtrace of your "thread 1" with the read call.

(ehm, well, that would be comment #16)

This is not the same run (I had to restart the cluster to do some other tests) but here is the info of thread 1 now:

Thread 1 (Thread 0x7f6d864a87a0 (LWP 2476)):
#0  0x00007f6d8522454d in read () at ../sysdeps/unix/syscall-template.S:82
        No locals.
#1  0x000000000040b64b in read (thisfd=0x1c04bf0, buf=<value optimized out>, maxlen=<value optimized out>, csid=<value optimized out>, new_client=<value optimized out>) at /usr/include/bits/unistd.h:45
        No locals.
#2  local_pipe_callback (thisfd=0x1c04bf0, buf=<value optimized out>, maxlen=<value optimized out>, csid=<value optimized out>, new_client=<value optimized out>) at clvmd.c:710
        len = <value optimized out>
        buffer = "\000\000\004#sync_names\000\000\000\000\000\000\002\000\000\000\v\000\000\000\001\000I\026\000\000\000\005\000\000\000\000\000\000\000\000\000W\004#sync_names\000JOVGg0JH7yLSiYe6QOO3B4jIx85PyakBH0fYwGXQMqVTwSWpbhr0\000M\030,\r\000\000\000\000\355\060\231O\000\000\000\000M\030,\r\000\000\000\000\355\060\231O\000\000\000\000M\030,\r", '\000' <repeats 36 times>, "P\352\272\001\000\000\000\000\020J\323\\\377\177\000\000\254l\357\204m\177\000\000P\352\272\001\000\000\000\000\350Q\357\204m\177\000\000\000\000\000\000\000\000\000\000P\352\272\001\000\000\000\000\020J\323\\\377\177"...
        sock_client = 0x1be55d0
        status = -1
#3  0x000000000040a59f in main_loop (local_sock=<value optimized out>, cmd_timeout=60) at clvmd.c:900
        newfd = 0x0
        ret = <value optimized out>
        lastfd = <value optimized out>
        csid = "\000\000\000"
        select_status = <value optimized out>
        thisfd = <value optimized out>
        tv = {tv_sec = 59, tv_usec = 999872}
        quorate = <value optimized out>
        in = {fds_bits = {580401401977512192, 2486556012189384704, 4803089002626285568, 1120986464257, 1788211223540533416, -3431742916056317951, -1936173725232639768, -6734567919800179253, -8301811925027477502, 887230, 2147463168, 0, 0, 2494994193563254784, -6329773539414375774, 2863311530}}
        ss = {__val = {16426, 0 <repeats 15 times>}}
#4  0x000000000040c3b7 in main (argc=<value optimized out>, argv=<value optimized out>) at clvmd.c:608
        local_sock = 6
        newfd = <value optimized out>
        delfd = <value optimized out>
        lvm_params = {excl_uuid = 0x1074030}
        opt = <value optimized out>
        cmd_timeout = 60
        start_timeout = 30
        cluster_iface = <value optimized out>
        ss = {__val = {18434, 0 <repeats 15 times>}}
        debug_opt = DEBUG_OFF
        debug_arg = DEBUG_OFF
        clusterwide_opt = 0
        old_mask = <value optimized out>
        ret = 1
        longopts = {{name = 0x475f23 "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}}

and the last thread in this case is 342:

Thread 342 (Thread 0x7f6d84821700 (LWP 2490)):
#0  0x00007f6d8522454d in read () at ../sysdeps/unix/syscall-template.S:82
        No locals.
#1  0x00007f6d85434de5 in read (fd=<value optimized out>) at /usr/include/bits/unistd.h:45
        No locals.
#2  do_dlm_dispatch_v6 (fd=<value optimized out>) at /usr/src/debug/cluster-3.0.12.1/dlm/libdlm/libdlm.c:529
        resultbuf = "\006\000\000\000\000\000\000\000\001\000\000\000H\000\000\000\340\371@\000\000\000\000\000p,\r\203m\177\000\000\310,\r\203m\177\000\000\376\377\376\377\001\000\177\002", '\000' <repeats 55 times>
        result = 0x7f6d84820d90
        status = <value optimized out>
        astaddr = <value optimized out>
#3  0x00007f6d85434eaf in dlm_recv_thread (lsinfo=0x1baea50) at /usr/src/debug/cluster-3.0.12.1/dlm/libdlm/libdlm.c:1180
        lsi = 0x1baea50
#4  0x00007f6d8521d851 in start_thread (arg=0x7f6d84821700) at pthread_create.c:301
        __res = <value optimized out>
        pd = 0x7f6d84821700
        now = <value optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140108351280896, 5333901287343311301, 140108361789952, 140108351281600, 0, 3, -5413647869769365051, -5413651424327710267}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <value optimized out>
        pagesize_m1 = <value optimized out>
        sp = <value optimized out>
        freesize = <value optimized out>
#5  0x00007f6d84f6b67d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
        No locals.

The block on "read" in the "local_pipe_callback" fn happens because there is no data to read, I assume. The strange thing is that "local_pipe_callback", where that blocking happens, is called based on a previous "select" call in "main_loop". That's odd. It seems like some uncaught error code somewhere (select?). I'll try to investigate more...

The problem is in the way clvmd gets started: if the open files limit is set to 1024 when starting clvmd, the clvmd process hits it and gives out a few errors we have noticed in one of the previous comments:

cluster request failed: Device or resource busy

because it actually reached the open file limit, but it completes the run.
If, however, the limit of open files is increased before starting clvmd (I increased it to 4096, but I think any lower number would do, as long as it is bigger than 1024), it gets stuck as reported in my first posts.

This gives an easy way to reproduce it. Hope this helps in tracking it down.

(In reply to comment #26)
> If however, before starting the clvmd the limit of open files is increased (I
> increased it to 4096, but I think it would do with any lower number, just
> bigger than 1024) it gets stuck as reported in my first posts.
>
> This gives an easy way to reproduce it.
> Hope this helps in tracking it down.

Yes, reproduced now. Thanks.

I believe this bug has been reasonably resolved.

This BZ is not fixed. LVM commands get deadlocked again in the same way.

Here's the setup:

core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 29479
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 29479
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Packages installed:

lvm2-debuginfo-2.02.98-2.el6.x86_64
lvm2-libs-2.02.98-2.el6.x86_64
lvm2-devel-2.02.98-2.el6.x86_64
lvm2-2.02.98-2.el6.x86_64
lvm2-cluster-2.02.98-2.el6.x86_64
device-mapper-event-1.02.77-2.el6.x86_64
device-mapper-event-devel-1.02.77-2.el6.x86_64
device-mapper-1.02.77-2.el6.x86_64
device-mapper-event-libs-1.02.77-2.el6.x86_64
device-mapper-devel-1.02.77-2.el6.x86_64
device-mapper-libs-1.02.77-2.el6.x86_64
kernel-2.6.32-330.el6.x86_64

Executed:

for j in $(seq 1 200); do lvs vg_test& vgs& done

Output:

[1] 6579
[2] 6580
[3] 6581
[4] 6582
[5] 6583
[6] 6584
.
.
.
[399] 7191
[400] 7192
(05:14:09) [root@r6-node01:~]$
  LV    VG      Attr      LSize   Pool Origin Data%  Move Log Copy%  Convert
  onelv vg_test -wi-a---- 500.00m
  twolv vg_test -wi-a---- 500.00m
  LV    VG      Attr      LSize   Pool Origin Data%  Move Log Copy%  Convert
  onelv vg_test -wi-ao--- 500.00m
  twolv vg_test -wi-ao--- 500.00m
  LV    VG      Attr      LSize   Pool Origin Data%  Move Log Copy%  Convert
  onelv vg_test -wi-ao--- 500.00m
  twolv vg_test -wi-ao--- 500.00m

and then it gets stuck. No further LVM commands can be executed.

The minimum number of iterations needed to make it crash, as far as I have seen, is 170, and as before the ulimit for open files should be greater than 1024. If not, some of the threads give out exit code 5 and return:

"
Can't get lock for VolGroup
cluster request failed: Device or resource busy
"

We need to look at this again. But it is too late now, moving to 6.5.

The memory size used by clvmd has been reduced; the next problem here is in the setting of network socket connections. The given use-case of contacting clvmd by so many users at the same time is, however, not a 'real-life' situation - thus it should not really be a big problem for now. (The memory consumption has been a much bigger issue.)

Moving this bug back to 6.4 - we'll make it into two parts - one for 6.4 with things already resolved and one for 6.5 for things still left to resolve.

I've made local tests with over 1000 lvs commands running in parallel - and the test passed in 24 seconds on my 4GB laptop. But there are a few things which could influence the overall process - so we would need to know the size of memory on the machine where the test is failing.
If you want to run that many commands in parallel, you have to allow for the quite large 'default' amount of preallocated memory for each lvm2 command. The fact that I'm able to run 1000 lvm commands on my machine is because I use lowered boundaries for preallocated memory in lvm.conf.

So my assumption about why it is failing is that your box runs out of memory in a limited-memory environment - if you had a couple of GB you would be just fine - since in my test it looks like clvmd itself is happy with around 60MB of real used memory - but since a lot of running lvm commands take a lot of memory, you are getting the memory problem anyway.

We will need to consider whether we will start to watch out for available system resources - but that's out of 6.4.

(In reply to comment #38)
> We will need to consider, if we will start to watch out for available system
> resources - but that's out of 6.4.

That can now be tracked by bug #883842.

Even though it is maybe not a "normal" use case scenario, this is still what happens and should be noted:

If a system is running with an open file limit (ulimit -n) greater than 1024, clvmd is not capable of handling more than 340 parallel requests at a time (meaning 170 iterations of two commands, for example). What happens is that it just gets stuck after returning from maybe 10 of them; the rest are just sitting and waiting. The system memory has no visible effect on this behavior (tried with 4GB and 2GB) - just the number of open files allowed.

At the moment the commands stopped returning, this was the memory usage:

(06:11:13) [root@r6-node01:~]$ free
             total       used       free     shared    buffers     cached
Mem:       3793056     965256    2827800          0      15840     100088
-/+ buffers/cache:      849328    2943728
Swap:      2064376          0    2064376

We'll document the limits in the technical note.

Marking this Verified, with the condition of making a technical note describing this behavior (ulimit >1024), as stated in Comment #42.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0501.html
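(Editorial reference note on the documented ulimit > 1024 behavior: the main_loop backtraces above show that clvmd's loop is driven by select(), and select()/FD_SET() can only track descriptor numbers below FD_SETSIZE, which glibc fixes at 1024, regardless of how high RLIMIT_NOFILE is raised. Whether that is the root cause of the remaining hang was not confirmed in this report, so treat it only as a general constraint on select()-based daemons. The sketch below is illustrative only - not clvmd's actual code - showing how a daemon could compare its open-file limit against FD_SETSIZE at startup; the function name and warn-only policy are assumptions for the example.)

#include <stdio.h>
#include <sys/select.h>
#include <sys/resource.h>

/* Hypothetical startup check: warn when the process may be handed descriptor
 * numbers that a select()-based loop cannot monitor. */
static void check_fd_limits(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return;
    }

    printf("RLIMIT_NOFILE: soft=%llu hard=%llu, FD_SETSIZE=%d\n",
           (unsigned long long) rl.rlim_cur,
           (unsigned long long) rl.rlim_max,
           FD_SETSIZE);

    if (rl.rlim_cur > FD_SETSIZE)
        fprintf(stderr,
                "warning: open-file limit (%llu) exceeds FD_SETSIZE (%d); "
                "descriptors >= %d cannot be monitored with select()\n",
                (unsigned long long) rl.rlim_cur, FD_SETSIZE, FD_SETSIZE);
}

int main(void)
{
    check_fd_limits();
    return 0;
}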