Created attachment 1197102 [details]
Logs and config files

During a write-heavy load test, JDG throws an OutOfMemoryError in the backup data center. This happens after ~10 minutes of heavy load (6 HotRod clients writing without any delay between requests; the overall load is about 1300 reqs/s with 33kB values, write-only).

Description of test scenario:
* 2 data centers (data center LON with nodes A,B; data center NYC with nodes C,D) with two JDG servers in each DC
* six HotRod clients writing data only in LON (33kB values, writing as quickly as possible)
* ASYNC replication between DCs
* JGroups is using multiple site masters set to 2 (all nodes are site masters)

The logs from individual nodes show the following pattern:

1) node C (in receiving data center NYC):
   [GC (Allocation Failure) [PSYoungGen: 1048576K->56409K(1223168K)] 1133628K->141469K(4019712K), 0.1291044 secs] [Times: user=0.46 sys=0.01, real=0.13 secs]
2) node C:
   java.lang.OutOfMemoryError: Java heap space
3) node A (in sending data center LON):
   WARN [org.jgroups.protocols.TCP] (HotRodServerWorker-3) Discarding message because TCP send_queue is full and hasn't been releasing for 300 ms
4) node A:
   WARN [org.jgroups.protocols.TCP] (ConnectionMap.Acceptor [172.18.1.4:7610]) JGRP000006: failed accepting connection from peer: java.net.SocketTimeoutException: Read timed out
5) node A:
   ERROR [org.jgroups.protocols.relay.RELAY2] (HotRodServerWorker-3) node0/LON: no route to NYC: dropping message

We also created a heap dump on the OOM error on node C; the interesting part is the following:

Class Name                                                  | Shallow Heap | Retained Heap
--------------------------------------------------------------------------------------------
org.jgroups.util.Table @ 0x7002337c0                        |          112 | 3,043,681,288
org.infinispan.container.DefaultDataContainer @ 0x700190b50 |           56 |   331,759,576
--------------------------------------------------------------------------------------------

Note: Overall heap is 4 GB.
The clients keep writing to only ten thousand entries, 33kB each, which gives 330 MB overall (this corresponds to the retained heap of the data container above).

Attaching logs and config files. Nodes edg-perf02, edg-perf03 are nodes A,B from the description above; edg-perf04, edg-perf05 are nodes C,D; nodes edg-perf06, edg-perf07 are the nodes with HotRod clients (but only edg-perf06 writes data).
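
For reference, the sizes quoted above can be cross-checked with a few lines of arithmetic. This is only a rough sketch: it assumes "33kB" means 33 * 1000 bytes, it takes the total heap from the GC line (4019712K), and it treats the whole 1300 reqs/s as arriving at a single node, which is an upper bound. The function name summarize is made up for this illustration, and the heap dump excerpt does not show what the Table actually holds.

```python
# Back-of-the-envelope check of the figures quoted in this report.
# Assumption: "33kB" is 33 * 1000 bytes (not 33 * 1024).
VALUE_BYTES = 33 * 1000
ENTRIES = 10_000              # number of entries being written
WRITES_PER_SEC = 1300         # overall load reported for the test

# Retained heap values copied from the heap dump excerpt (bytes).
TABLE_RETAINED = 3_043_681_288        # org.jgroups.util.Table
CONTAINER_RETAINED = 331_759_576      # DefaultDataContainer

# Total heap from the GC log line "(4019712K)", converted to bytes.
HEAP_TOTAL = 4_019_712 * 1024


def summarize():
    """Return derived figures; 'summarize' is a hypothetical helper name."""
    dataset_bytes = ENTRIES * VALUE_BYTES
    ingest_bytes_per_sec = WRITES_PER_SEC * VALUE_BYTES
    return {
        "dataset_bytes": dataset_bytes,
        "ingest_bytes_per_sec": ingest_bytes_per_sec,
        # How many times larger the Table is than the data container.
        "table_vs_container": TABLE_RETAINED / CONTAINER_RETAINED,
        # Share of the total heap retained by the Table.
        "table_heap_fraction": TABLE_RETAINED / HEAP_TOTAL,
        # Seconds of incoming writes that would add up to the Table's size,
        # if every write reached this node and none was released.
        "seconds_of_writes_in_table": TABLE_RETAINED / ingest_bytes_per_sec,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the expected data set is 330 MB, which matches the data container, while the Table retains roughly nine times that amount, or on the order of 70 seconds of incoming writes at the quoted rate.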