Bug 1501364

Summary: Memory leak (>1G) in spikes while running spice-gtk
Product: [Fedora] Fedora Reporter: Christophe de Dinechin <dinechin>
Component: spice-gtkAssignee: Christophe de Dinechin <dinechin>
Status: CLOSED EOL QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 26CC: alon, bdpepple, berrange, cfergeau, fidencio, hdegoede, marcandre.lureau, sandmann, uraeus, victortoso, wtaymans
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-05-29 11:53:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Heaptrack screen shot showing the spikes
none
Heaptrack trace showing the problem
none
A case where I had 6GB leak in 80 seconds none

Description Christophe de Dinechin 2017-10-12 12:56:29 UTC
Created attachment 1337743 [details]
Heaptrack screen shot showing the spikes

Description of problem:
When running spice-gtk for extended periods of time in "streaming agent" mode (H264 stream), the application may end up using several GB of memory instead of its usual 100M. Using heaptrack, this was traced to the following stack trace in gstreamer:

g_malloc in ?? (libglib-2.0.so.0)
g_slice_alloc in ?? (libglib-2.0.so.0)
_sysmem_new_block in gstallocator.c:417 (libgstreamer-1.0.so.0)
default_alloc in gstallocator.c:516 (libgstreamer-1.0.so.0)
gst_allocator_alloc in gstallocator.c:314 (libgstreamer-1.0.so.0)
gst_buffer_new_allocate in gstbuffer.c:839 (libgstreamer-1.0.so.0)
video_buffer_pool_alloc in gstvideopool.c:254 (libgstvideo-1.0.so.0)
do_alloc_buffer in gstbufferpool.c:278 (libgstreamer-1.0.so.0)
default_acquire_buffer in gstbufferpool.c:1134 (libgstreamer-1.0.so.0)
gst_buffer_pool_acquire_buffer in gstbufferpool.c:1265 (libgstreamer-1.0.so.0)
default_prepare_output_buffer in gstbasetransform.c:1597 (libgstbase-1.0.so.0)
default_generate_output in gstbasetransform.c:2094 (libgstbase-1.0.so.0)
gst_base_transform_chain in gstbasetransform.c:2276 (libgstbase-1.0.so.0)
gst_pad_chain_data_unchecked in gstpad.c:4215 (libgstreamer-1.0.so.0)
gst_pad_push_data in gstpad.c:4471 (libgstreamer-1.0.so.0)
gst_pad_push in gstpad.c:4590 (libgstreamer-1.0.so.0)
gst_base_transform_chain in gstbasetransform.c:2312 (libgstbase-1.0.so.0)
gst_pad_chain_data_unchecked in gstpad.c:4215 (libgstreamer-1.0.so.0)
gst_pad_push_data in gstpad.c:4471 (libgstreamer-1.0.so.0)
gst_pad_push in gstpad.c:4590 (libgstreamer-1.0.so.0)
gst_base_transform_chain in gstbasetransform.c:2312 (libgstbase-1.0.so.0)
gst_pad_chain_data_unchecked in gstpad.c:4215 (libgstreamer-1.0.so.0)
gst_pad_push_data in gstpad.c:4471 (libgstreamer-1.0.so.0)
gst_pad_push in gstpad.c:4590 (libgstreamer-1.0.so.0)
gst_base_transform_chain in gstbasetransform.c:2312 (libgstbase-1.0.so.0)
gst_pad_chain_data_unchecked in gstpad.c:4215 (libgstreamer-1.0.so.0)
gst_pad_push_data in gstpad.c:4471 (libgstreamer-1.0.so.0)
gst_pad_push in gstpad.c:4590 (libgstreamer-1.0.so.0)
gst_base_transform_chain in gstbasetransform.c:2312 (libgstbase-1.0.so.0)
gst_pad_chain_data_unchecked in gstpad.c:4215 (libgstreamer-1.0.so.0)
gst_pad_push_data in gstpad.c:4471 (libgstreamer-1.0.so.0)
gst_pad_push in gstpad.c:4590 (libgstreamer-1.0.so.0)
gst_proxy_pad_chain_default in gstghostpad.c:127 (libgstreamer-1.0.so.0)
gst_pad_chain_data_unchecked in gstpad.c:4215 (libgstreamer-1.0.so.0)
gst_pad_push_data in gstpad.c:4471 (libgstreamer-1.0.so.0)
gst_pad_push in gstpad.c:4590 (libgstreamer-1.0.so.0)
gst_queue_push_one in gstqueue.c:1383 (libgstcoreelements.so)
gst_queue_loop in gstqueue.c:1536 (libgstcoreelements.so)
gst_task_func in gsttask.c:332 (libgstreamer-1.0.so.0)
default_func in gsttaskpool.c:69 (libgstreamer-1.0.so.0)
<unresolved function> in ?? (libglib-2.0.so.0)
<unresolved function> in ?? (libglib-2.0.so.0)
start_thread in ?? (libpthread.so.0)
__GI___clone in ?? (libc.so.6)

Version-Release number of selected component (if applicable):
Experimental spice-gtk running against a "streaming agent", which uses H264 protocol.
Tested both with gstreamer1-1.12.2-1.fc26.src.rpm and with a git build of commit 6617b01af92b72f5ee348be111bf3a2bb2288ea1

How reproducible: Always, although it may take hours to show up. See "More info" for "facilitators"

Steps to Reproduce:
1. Run a spice server with streaming agent configuration (H264 stream)
2. Run a spice-gtk client against the server (e.g spicy -h myhost -p 5901)
3. Run something on the server that causes a large number of screen changes (e.g. movie playback in a loop)
4. Wait (this may take a few hours)


Actual results:
Memory leak spikes, in the order of several hundred megabytes, occuring at irregular intervals. See attachments.


Expected results:
No memory leak :-)


Additional info:
I noticed that suspending and resuming the client process with Control-Z + fg often triggers the problem wihin a few seconds. Normal "overloading" of the client CPU (e.g. spinners) does not seem to make much of a difference, though.

Comment 1 Christophe de Dinechin 2017-10-12 13:11:09 UTC
Created attachment 1337770 [details]
Heaptrack trace showing the problem

This is a heaptrack trace that shows the problem. It was taken after suspending the spice client twice with Control-Z, wihch I believe corresponds to the two observed spikes.

Comment 2 Christophe de Dinechin 2017-10-12 13:38:40 UTC
Created attachment 1337791 [details]
A case where I had 6GB leak in 80 seconds

This is an interesting case where I saw a 6GB leak in about 80 seconds (starting at 40s).

One interesting and somewhat unique feature of that case is that the increase seems to be more progressive than in a huge step

Comment 3 Christophe de Dinechin 2017-10-12 13:58:53 UTC
Hitting Control-Z and then 'fg' the job seems to work quite often. So I thought it might be some race condition or threading issue. But I was not successful reproducing the problem with 300 spinners.

Comment 4 Christophe de Dinechin 2017-10-13 10:14:31 UTC
Further analysis (see http://blackbox.dinechin.org/171013.html) indicate that the problem apparently originates in the spice-gtk client. Apparently, when the  problem occurs, time goes backwards. I believe gstreamer gstpad scheduling may not be robust to that case, and may leak when that happens.

Comment 5 Christophe de Dinechin 2017-10-13 10:16:29 UTC
Changed project based on latest findings. Issue is probably in how we schedule, although there may be a bug in gstreamer leaking memory when the schedule time for frames is not monotonic.

Comment 6 Marc-Andre Lureau 2018-02-25 22:24:16 UTC
assigning to Christophe, who did analysis. Did you also submit fixes?

Comment 7 Fedora End Of Life 2018-05-03 08:30:32 UTC
This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '26'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 26 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged  change the 'version' to a later Fedora
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 8 Fedora End Of Life 2018-05-29 11:53:25 UTC
Fedora 26 changed to end-of-life (EOL) status on 2018-05-29. Fedora 26
is no longer maintained, which means that it will not receive any
further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.