Description of problem:
While investigating BZ 1269424 we found a leak around json/decoder.py:381 when xmlrpc is used as the channel type.

Most of the decode calls come from:
guestagent.py:482 -------- uniline = line.decode('utf8', 'replace')
vm.py:1617\1615
misc.py:953

It seems like we have a wrong declaration.

Version-Release number of selected component (if applicable):
master branch 4.17.999

How reproducible:
100%

Steps to Reproduce:
1. running 57 VMs + guest agent + 12 SDs.

Actual results:
json/decoder.py has been called and drives the leak.

Expected results:
json/decoder.py should not be called when using xmlrpc.

Additional info:
Eldad, I do not understand this bug report. How did you measure the memory leak? How big is it? Why do you think that it is caused by json/decoder.py?

We use json in the code for things other than jsonrpc.
(In reply to Eldad Marciano from comment #0)
> Actual results:
> json/decoder.py has been called and drives the leak.
>
> Expected results:
> json/decoder.py should not be called when using xmlrpc.

This is a bit too simplistic. We didn't expect these numbers in this flow, and this report is a hint we should also look elsewhere, *perhaps* in the Guest Agent area, because the traffic between VDSM and GA is in json rpc, regardless of what Engine wants to use. So it doesn't follow that we shouldn't use json at all in this flow.

Furthermore, I'm not sure json "drives" the leak here. Surely it dominates the consumption, but that can be for a few reasons. For example, if the VDSM thread that handles GA messages is too slow, decoded data can pile up waiting to be processed. But if we give it more time, it can drain the queue, actually resolving "the leak".

A proper leak is when we forget the reference to some data, so we can't ever release its memory, or when we fail to release this data, like https://gerrit.ovirt.org/#/c/48616/
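To illustrate the distinction between data piling up behind a slow consumer and a proper leak, a minimal sketch in Python 2 (hypothetical code, not taken from vdsm):

    _log = []  # module-level reference that nothing ever clears

    def handle_line(line):
        # a proper leak: every decoded line stays reachable forever,
        # so its memory can never be released
        _log.append(line.decode('utf8', 'replace'))

    import Queue  # Python 2 stdlib, matching vdsm of that era
    _pending = Queue.Queue()

    def enqueue_line(line):
        # not a leak: a slow consumer makes memory grow for a while,
        # but the entries are released once the consumer catches up
        _pending.put(line.decode('utf8', 'replace'))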
(In reply to Eldad Marciano from comment #0)
> most of the decode calls come from:
> guestagent.py:482 -------- uniline = line.decode('utf8', 'replace')
> vm.py:1617\1615
> misc.py:953

Are you sure that the places you pointed to are not string.decode?

> Expected results:
> json/decoder.py should not be called when using xmlrpc.

Your expected assumption is wrong, because the json library is not only used by jsonrpc.
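For clarity, a minimal illustration of the two different "decode"s involved here (Python 2, hypothetical values):

    import json

    line = 'caf\xc3\xa9\n'

    # str.decode: charset decoding, bytes -> unicode; json is not involved,
    # so calls like guestagent.py:482 never touch json/decoder.py
    uniline = line.decode('utf8', 'replace')             # u'caf\xe9\n'

    # json decoding: parses a JSON document; this is the code path that
    # shows up as json/decoder.py in the tracemalloc output
    obj = json.JSONDecoder().decode('{"key": "value"}')  # {u'key': u'value'}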
(In reply to Dan Kenigsberg from comment #1)
> Eldad, I do not understand this bug report. How did you measure the memory
> leak? How big is it? Why do you think that it is caused by json/decoder.py?
>
> We use json in the code for things other than jsonrpc.

We took a memory dump every hour for 12 hours after vdsm restarted (12 snapshots), and compared the first and last tracemalloc snapshots. See the results:

* Top 10 lines:
----------------------------------------------------
#  1: json/decoder.py:381           : 6069.27 KiB (3.85%)
#  2: site-packages/libvirt.py:5149 : 0.00 KiB (0.00%)
#  3: site-packages/libvirt.py:5102 : 2953.84 KiB (1.87%)
#  4: python2.7/subprocess.py:1407  : 1000.04 KiB (0.63%)
#  5: python2.7/xmlrpclib.py:737    : 715.58 KiB (0.45%)
#  6: python2.7/xmlrpclib.py:735    : 410.95 KiB (0.26%)
#  7: profiling/memory.py:80        : 1523.45 KiB (0.97%)
#  8: python2.7/xmlrpclib.py:735    : 230.03 KiB (0.15%)
#  9: ioprocess/__init__.py:351     : 47.30 KiB (0.03%)
# 10: python2.7/genericpath.py:102  : 181.77 KiB (0.12%)
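For reference, this is roughly how the snapshots were taken and compared (a minimal sketch assuming the pytracemalloc backport on Python 2.7; not the exact profiling script):

    import tracemalloc

    tracemalloc.start(25)
    first = tracemalloc.take_snapshot()
    # ... 12 hours of load, one snapshot per hour in the real runs ...
    last = tracemalloc.take_snapshot()

    # top 10 allocation sites by growth between the two snapshots
    for stat in last.compare_to(first, 'lineno')[:10]:
        print(stat)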
(In reply to Piotr Kliczewski from comment #3)
> (In reply to Eldad Marciano from comment #0)
> > most of the decode calls come from:
> > guestagent.py:482 -------- uniline = line.decode('utf8', 'replace')
> > vm.py:1617\1615
> > misc.py:953
>
> Are you sure that the places you pointed to are not string.decode?
>
> > Expected results:
> > json/decoder.py should not be called when using xmlrpc.
>
> Your expected assumption is wrong, because the json library is not only
> used by jsonrpc.

Based on the tracemalloc results, I looked for callers of the decode method (json/decoder.py:381) in the vdsm code (a simple Ctrl+G / "find usages"); the list above is the result.

Actually this is very strange, because I tried to exercise the json module with the following simple code:

    json.load(open(jsonfile, 'r'))
    json.JSONDecoder.decode()

Full code here - http://pastebin.test.redhat.com/330223

and I found stable memory for the different json libs. Still investigating.
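A minimal sketch of that kind of stability check ('sample.json' is a placeholder; the full script is in the pastebin link above):

    import json
    import resource

    def peak_rss_kib():
        # peak RSS of this process in KiB (Linux semantics of ru_maxrss)
        return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

    before = peak_rss_kib()
    for _ in xrange(10000):
        with open('sample.json') as f:
            json.load(f)
    # a near-zero delta after many iterations suggests json itself does
    # not leak; note peak RSS can only grow, never shrink
    print('RSS delta: %d KiB' % (peak_rss_kib() - before))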
I'm not sure what the status of this bug is...
In comment #5 you stated that you are still investigating. Any updates for this BZ?
Eldad, ping?
It was pending for a while; we still haven't investigated it deeper. We should re-prioritize this bug, will update ASAP. Gil, what do you think?
Eldad, Gil, any updates?
@Eldad, Gil: is this still relevant?
We should re-investigate it. Meanwhile, reducing priority.
Seems like it's not reproduced anymore: in our last scale testing on an idle vdsm with ~70 VMs and ~12 SDs active, no leaks were found. vdsm-4.18.0-0.el7ev.x86_64