After updating to qpid-dispatch 0.4-13, the memory leak was greatly reduced; however, a minor memory increase is still seen over a large number of installs. This is related to https://bugzilla.redhat.com/show_bug.cgi?id=1312419, which fixed the major link and memory leak.
For QE: reproducer: see https://bugzilla.redhat.com/show_bug.cgi?id=1312419#c0 (the bash script in description with hammer command) and monitor qdrouterd on Sat and/or Caps. Simplified reproducer both for QE and qpid devels: https://bugzilla.redhat.com/show_bug.cgi?id=1312419#c3
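For the "monitor qdrouterd" part, a minimal sketch of a memory sampler follows. It is not taken from the referenced bug; the function names rss_kb and monitor are made up for illustration, and it assumes a Linux host where /proc/<pid>/status exposes VmRSS and where a single qdrouterd process is running.

```bash
#!/bin/sh
# Hypothetical helpers for illustration; not part of Satellite or qpid-dispatch.

# Print the resident set size (VmRSS, in kB) of the given PID, read from /proc.
rss_kb() {
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

# Print "<epoch seconds> <rss in kB>" for PID $1, once every $2 seconds, $3 times.
# Note: plain POSIX sh, so the variables below are global rather than local.
monitor() {
    pid=$1
    interval=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s %s\n' "$(date +%s)" "$(rss_kb "$pid")"
        sleep "$interval"
        i=$((i + 1))
    done
}
```

Possible usage while the reproducer loop runs: source the file, then run monitor "$(pidof qdrouterd)" 60 720 > qdrouterd-rss.log to collect one sample per minute for 12 hours. A value that keeps climbing after the initial warm-up would suggest the leak is still present.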
Is there 0.4-14 package available (mentioned elsewhere)? Can't find it on brew.

Re-verifying 0.4-13:

- using standalone python script against standalone qdrouterd+qpidd, no leak at the end (memory utilization stabilizes after 10 minutes, this seems an improvement)
- using standalone python script against Satellite's qdrouterd, segfault in proton found - see [1]
- IMHO not reproducible using just goferd / just Satellite tools, so filed against Interconnect rather
- installing&removing packages via katello-agent and qdrouterd, memory still grows forever

[1] https://issues.jboss.org/browse/ENTMQIC-1889
(In reply to Pavel Moravec from comment #7)
> Is there 0.4-14 package available (mentioned elsewhere)? Can't find it on
> brew.
>
> Re-verifying 0.4-13:
>
> - using standalone python script against standalone qdrouterd+qpidd, no leak
> at the end (memory utilization stabilizes after 10 minutes, this seems an
> improvement)
> - using standalone python script against Satellite's qdrouterd, segfault in
> proton found - see [1]
> - IMHO not reproducible using just goferd / just Satellite tools, so filed
> against Interconnect rather
> - installing&removing packages via katello-agent and qdrouterd, memory still
> grows forever
>
> [1] https://issues.jboss.org/browse/ENTMQIC-1889

NACK to 0.4-13 (or rather to qpid-proton-c-0.9-12), it causes the segfault even when installing/removing a package via Satellite. So the ENTMQIC-1889 does affect Satellite.

Coredump of qdrouterd on Sat - see ENTMQIC-1889.

Coredump of qdrouterd on Caps (little bit different but cause is the same):

#0  pn_do_transfer (transport=0x1a1f450, frame_type=<optimized out>, channel=<optimized out>, args=0x1a15ba0, payload=0x7feddc2532a0) at /usr/src/debug/qpid-proton-0.9/proton-c/src/transport/transport.c:1303
1303      if (!ssn->state.incoming_window) {
(gdb) bt
#0  pn_do_transfer (transport=0x1a1f450, frame_type=<optimized out>, channel=<optimized out>, args=0x1a15ba0, payload=0x7feddc2532a0) at /usr/src/debug/qpid-proton-0.9/proton-c/src/transport/transport.c:1303
#1  0x00007fede8a0af7b in pni_dispatch_action (payload=0x7feddc2532a0, args=0x1a15ba0, channel=<optimized out>, frame_type=0 '\000', lcode=<optimized out>, transport=0x1a1f450) at /usr/src/debug/qpid-proton-0.9/proton-c/src/dispatcher/dispatcher.c:74
#2  pni_dispatch_frame (args=0x1a15ba0, transport=0x1a1f450, frame=...) at /usr/src/debug/qpid-proton-0.9/proton-c/src/dispatcher/dispatcher.c:116
#3  pn_dispatcher_input (transport=transport@entry=0x1a1f450, bytes=0x1a37fe0 "", available=0, batch=batch@entry=true, halt=halt@entry=0x1a1f5d2) at /usr/src/debug/qpid-proton-0.9/proton-c/src/dispatcher/dispatcher.c:135
#4  0x00007fede8a12f7c in pn_input_read_amqp (transport=0x1a1f450, layer=<optimized out>, bytes=<optimized out>, available=<optimized out>) at /usr/src/debug/qpid-proton-0.9/proton-c/src/transport/transport.c:1672
#5  0x00007fede8a209f1 in process_input_ssl (transport=0x1a1f450, layer=0, input_data=0x1a2b49b "\210\322\321\a߉\210\327\213\023\364H\235\217m͘5R\002a\373\027\365F\221n\277\244\032S\275\326\301\263\306Y\305#\203k\277\236\260hߋ%\312r'\336}\306i\224&\213ӾW\037\375F\177\211L\254\225\356\027櫠h\372+\200\217\270\032ũ\034\271%\004\233\250\034\343Rk.Q\017~\215\226\347\266j/Wcb1J\253?\002\266\256lG\271\016\332\300@qfy\343\031\234$\026\357\016\304k\002\201BC`_\037\372\207\245\216L\342£\233\231\247\273/τa\356\357\366\001\340+\301l\205$F^\367(}R4v\022U\314\061", available=0) at /usr/src/debug/qpid-proton-0.9/proton-c/src/ssl/openssl.c:934
#6  0x00007fede8a1303a in transport_consume (transport=transport@entry=0x1a1f450) at /usr/src/debug/qpid-proton-0.9/proton-c/src/transport/transport.c:1604
#7  0x00007fede8a14452 in pn_transport_process (transport=transport@entry=0x1a1f450, size=<optimized out>) at /usr/src/debug/qpid-proton-0.9/proton-c/src/transport/transport.c:2690
#8  0x00007fede8c56d73 in qdpn_connector_process (c=c@entry=0x1a13dc0) at /usr/src/debug/qpid-dispatch-0.4/src/posix/driver.c:711
#9  0x00007fede8c60b4c in process_connector (cxtr=0x1a13dc0, qd_server=0x1991350) at /usr/src/debug/qpid-dispatch-0.4/src/server.c:328
#10 thread_run (arg=<optimized out>) at /usr/src/debug/qpid-dispatch-0.4/src/server.c:626
#11 0x00007fede87d2dc5 in start_thread () from /lib64/libpthread.so.0
#12 0x00007fede7d2e28d in clone () from /lib64/libc.so.6
(gdb) list
1298      int err = pn_data_scan(args, "D.[I?Iz.oo.D?LC]", &handle, &id_present, &id, &tag,
1299                             &settled, &more, &has_type, &type, transport->disp_data);
1300      if (err) return err;
1301      pn_session_t *ssn = pn_channel_state(transport, channel);
1302
1303      if (!ssn->state.incoming_window) {
1304        return pn_do_error(transport, "amqp:session:window-violation", "incoming session window exceeded");
1305      }
1306
1307      pn_link_t *link = pn_handle_state(ssn, handle);
(gdb) p ssn
$1 = <optimized out>
(gdb)

So, under some conditions, "pn_channel_state(transport, channel)" returns null.
Moving this to an unspecified release until a fix is made available.
Verified in Satellite 6.2.4 async based on the reproducer steps below as well as the no-break automation test results.

# rpm -qa | grep qpid-dispatch
libqpid-dispatch-0.4-21.el6sat.x86_64
qpid-dispatch-debuginfo-0.4-21.el6sat.x86_64
qpid-dispatch-router-0.4-21.el6sat.x86_64
qpid-dispatch-tools-0.4-21.el6sat.x86_64

# rpm -qa | grep qpid-dispatch
qpid-dispatch-debuginfo-0.4-21.el7sat.x86_64
qpid-dispatch-tools-0.4-21.el7sat.x86_64
libqpid-dispatch-0.4-21.el7sat.x86_64
qpid-dispatch-router-0.4-21.el7sat.x86_64

Followed the steps outlined in https://bugzilla.redhat.com/show_bug.cgi?id=1312419#c0. However, I had to modify the script to be in line with Satellite 6.2; example below. I was unable to detect any memory leaks.

# cat repro.sh
action=install
while true; do
    hammer -u admin -p changeme host package $action --host chd-7 --packages sos
    if [ "$action" == "install" ]; then
        action=remove
    else
        action=install
    fi
    date
    sleep 5
    date
done
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:2855