Bug 1689702

Summary: GC overhead limit exceeded due to org.ovirt.vdsm.jsonrpc.client.events.SubscriptionHolder
Product: Red Hat Enterprise Virtualization Manager
Reporter: Germano Veit Michel <gveitmic>
Component: ovirt-engine
Assignee: Martin Perina <mperina>
Status: CLOSED ERRATA
QA Contact: Lucie Leistnerova <lleistne>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.2.6
CC: emarcus, gdeolive, knoha, lsvaty, mgoldboi, mkalinin, mperina, pkliczew, Rhev-m-bugs, tnisan
Target Milestone: ovirt-4.3.4
Keywords: CodeChange, ZStream
Target Release: 4.3.1
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: vdsm-jsonrpc-java-1.4.17, ovirt-engine-4.3.4
Doc Type: Enhancement
Doc Text:
A new configuration variable 'EventPurgeTimeoutInHours' has been added to set the number of hours an event can stay in the queue before being cleaned up. The variable can be modified using engine-config. The initial default value is 3 hours.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-06-20 14:48:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1703275

Description Germano Veit Michel 2019-03-17 23:41:07 UTC
Description of problem:

2019-02-21 00:42:44,620+09 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) [] Internal server error GC overhead limit exceeded

The heap dump shows a large number of objects sitting in the event deque of each of the 2 hosts (the environment has only 2 hosts):

Class Name                                                           | Shallow Heap | Retained Heap | Percentage
-----------------------------------------------------------------------------------------------------------------
org.ovirt.vdsm.jsonrpc.client.events.SubscriptionHolder @ 0x6ce539218|           40 | 2,516,477,992 |     61.85%
|- java.util.concurrent.ConcurrentLinkedDeque @ 0x6ce539338          |           24 | 2,516,477,536 |     61.85%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x70b617fb8  |           24 |        28,144 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x71d295f48  |           24 |        28,144 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x7a6b1c768  |           24 |        28,144 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6fb3c9270  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6fb3c92e8  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x701ff6518  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x701ff6590  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x70b617fe8  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x71d295f60  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x7a6b30eb8  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f0d4a080  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f0d4a0b0  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f0d4a170  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f0d4a1a0  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f1a909d0  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f1a909e8  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f1a90ac0  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f1a90ad8  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f1a90b80  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f1a90b98  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f1a90c70  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f1a90c88  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f1a90d30  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f1a90d48  |           24 |        28,136 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6d01dce88  |           24 |        28,128 |      0.00%
|  '- Total: 25 of 105,690 entries; 105,665 more                     |              |               |           
|- java.lang.String[4] @ 0x6ce8d8be8                                 |           32 |           128 |      0.00%
|- java.lang.String @ 0x6ce8d8b70  removed                           |           24 |            96 |      0.00%
|- java.util.ArrayList @ 0x6ce8d8ae0                                 |           24 |            80 |      0.00%
|- java.lang.String @ 0x6ce8d8b30  VM_status                         |           24 |            64 |      0.00%
|- java.util.concurrent.locks.ReentrantLock @ 0x6ce8d8ab0            |           16 |            48 |      0.00%
'- Total: 6 entries                                                  |              |               |           

org.ovirt.vdsm.jsonrpc.client.events.SubscriptionHolder @ 0x6ce8d8398|           40 | 1,436,008,600 |     35.30%
|- java.util.concurrent.ConcurrentLinkedDeque @ 0x6ce8d88e0          |           24 | 1,436,008,144 |     35.30%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x7bb691328  |           24 |        45,592 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f0f11c28  |           24 |        27,864 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x7753385c8  |           24 |        27,864 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x77533f2a0  |           24 |        27,864 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6f0f11c58  |           24 |        27,856 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6fb63d848  |           24 |        27,856 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6fb648a80  |           24 |        27,856 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6fb661620  |           24 |        27,856 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6fb6682f0  |           24 |        27,856 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6fbacdb98  |           24 |        27,856 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6fbacdbb0  |           24 |        27,856 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x73881a650  |           24 |        27,856 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x73881a668  |           24 |        27,856 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x7389da8c8  |           24 |        27,856 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x7389da8f8  |           24 |        27,856 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x748f873b8  |           24 |        27,856 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x748f9b9e8  |           24 |        27,856 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6d0a9d718  |           24 |        27,848 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6d150ae98  |           24 |        27,848 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6d2fd4688  |           24 |        27,848 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6d31c8688  |           24 |        27,848 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6d33ab6c0  |           24 |        27,848 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6d37fbe40  |           24 |        27,848 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6d3a17c80  |           24 |        27,848 |      0.00%
|  |- java.util.concurrent.ConcurrentLinkedDeque$Node @ 0x6d43f63d8  |           24 |        27,848 |      0.00%
|  '- Total: 25 of 58,946 entries; 58,921 more                       |              |               |           
|- java.lang.String[4] @ 0x6ce8d8910                                 |           32 |           128 |      0.00%
|- java.lang.String @ 0x6ce8d89d0  removed                           |           24 |            96 |      0.00%
|- java.util.ArrayList @ 0x6ce8d8a30                                 |           24 |            80 |      0.00%
|- java.lang.String @ 0x6ce8d8960  VM_status                         |           24 |            64 |      0.00%
|- java.util.concurrent.locks.ReentrantLock @ 0x6ce8d8a80            |           16 |            48 |      0.00%
'- Total: 6 entries                                                  |              |               |           

Version-Release number of selected component (if applicable):
rhvm-4.2.6.4-0.1.el7ev.noarch

How reproducible:
Unknown
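
For illustration only: the heap dump in the description shows two ConcurrentLinkedDeque instances growing without bound, and the Doc Text describes a time-based cleanup controlled by 'EventPurgeTimeoutInHours'. The sketch below shows one way such a purge could work. The class PurgingEventQueue and its methods are hypothetical names, not the actual vdsm-jsonrpc-java code, and it assumes events are enqueued in time order so that the oldest entry is always at the head.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical illustration; this is NOT the real SubscriptionHolder.
final class PurgingEventQueue<T> {

    // An event paired with the time it was enqueued.
    private static final class Entry<T> {
        final T event;
        final Instant enqueuedAt;

        Entry(T event, Instant enqueuedAt) {
            this.event = event;
            this.enqueuedAt = enqueuedAt;
        }
    }

    private final ConcurrentLinkedDeque<Entry<T>> deque = new ConcurrentLinkedDeque<>();
    private final Duration purgeTimeout;

    PurgingEventQueue(Duration purgeTimeout) {
        this.purgeTimeout = purgeTimeout;
    }

    // Appends an event stamped with the given time (assumed non-decreasing).
    void put(T event, Instant now) {
        deque.addLast(new Entry<>(event, now));
    }

    // Returns the oldest event, or null if the queue is empty.
    T poll() {
        Entry<T> entry = deque.pollFirst();
        return entry == null ? null : entry.event;
    }

    // Drops events enqueued before (now - purgeTimeout); returns how many were removed.
    int purge(Instant now) {
        Instant cutoff = now.minus(purgeTimeout);
        int removed = 0;
        Entry<T> head;
        while ((head = deque.peekFirst()) != null && head.enqueuedAt.isBefore(cutoff)) {
            // Remove this exact entry; returns false if a consumer took it first.
            if (deque.removeFirstOccurrence(head)) {
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return deque.size();
    }
}
```

With a 3-hour timeout (the default stated in the Doc Text), calling purge periodically would keep a deque with no active consumer from accumulating entries older than 3 hours. How the real fix schedules the cleanup is not shown in this report.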

Comment 3 Germano Veit Michel 2019-03-17 23:48:32 UTC
Hosts run vdsm-4.20.39.1-1.el7ev

Comment 4 Germano Veit Michel 2019-03-18 00:02:32 UTC
vdsm-jsonrpc-java-1.4.14-1.el7ev

Comment 17 Keigo Noha 2019-04-19 00:59:59 UTC
Hi Ravi,

Thank you for your work on this bugzilla.
Do we still need logs from the RHEV host side?

Best Regards,
Keigo Noha

Comment 22 RHV bug bot 2019-05-16 15:29:21 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{'rhevm-4.3.z': '?'}', ]

For more info please contact: rhv-devops

Comment 25 Martin Perina 2019-06-12 16:44:04 UTC
Looks good to me, thanks

Comment 27 errata-xmlrpc 2019-06-20 14:48:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:1566

Comment 28 Daniel Gur 2019-08-28 13:11:27 UTC
sync2jira

Comment 29 Daniel Gur 2019-08-28 13:15:39 UTC
sync2jira