Red Hat Bugzilla – Bug 1299549
Performance degradation when using REST API on WebSphere
Last modified: 2016-09-20 01:13:17 EDT
Description of problem:
REST API calls are becoming significantly slower over time when Business Central is run on WebSphere.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Run WasRestPerformanceTest from the jBPM Integration test suite.
2. Observe that the test duration keeps increasing.
The same process, run repeatedly, becomes slower and slower. Here are the times to run 100 processes (in ms):
The execution should always take almost the same amount of time.
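A minimal sketch of how the per-batch timings could be collected outside the test suite. This is not the code of WasRestPerformanceTest; the class `BatchTimer`, its methods, and the empty placeholder task are hypothetical, and a real measurement would replace the placeholder with a REST call that starts one process.

```java
import java.util.ArrayList;
import java.util.List;

/** Times repeated batches of a task so that slowdown across batches becomes visible. */
final class BatchTimer {

    /** Runs the given number of batches and returns each batch's duration in milliseconds. */
    static List<Long> timeBatches(Runnable call, int batches, int callsPerBatch) {
        List<Long> durationsMs = new ArrayList<>();
        for (int b = 0; b < batches; b++) {
            long start = System.nanoTime();
            for (int i = 0; i < callsPerBatch; i++) {
                call.run();
            }
            durationsMs.add((System.nanoTime() - start) / 1_000_000);
        }
        return durationsMs;
    }

    /** Ratio of the last batch duration to the first; values well above 1.0 suggest degradation. */
    static double slowdown(List<Long> durationsMs) {
        if (durationsMs.isEmpty()) {
            throw new IllegalArgumentException("no batches were timed");
        }
        long first = Math.max(1, durationsMs.get(0)); // guard against division by zero
        long last = durationsMs.get(durationsMs.size() - 1);
        return (double) last / first;
    }

    public static void main(String[] args) {
        // Hypothetical placeholder: substitute a REST call that starts one process.
        Runnable startProcess = () -> { };
        List<Long> durations = timeBatches(startProcess, 10, 100);
        System.out.println("Batch durations (ms): " + durations);
        System.out.println("Slowdown (last/first): " + slowdown(durations));
    }
}
```

With a healthy container the ratio should stay close to 1.0 from batch to batch, which is the expected behaviour described above.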
This is reproducible using both direct REST calls and the provided remote client classes. The issue affects only the REST interface: when you run the same processes over SOAP or JMS, there is no performance degradation over time.
This issue causes many problems when running tests from our test suite. The tests take 3-4 hours on WebSphere, compared with 20 minutes on other containers, and many of them fail because of timeouts.
Tomas, would it be possible for you to take a snapshot of the heap? That way I can at least look at it with MAT or another tool to analyze the memory used.
Please see https://bugzilla.redhat.com/show_bug.cgi?id=1310739 for status updates.
Tomas, is this a regression? If so, do you know what the last version was that was working as expected?
Kris, we discovered this problem on version 6.2, but it may have been there for a while. The tests on WebSphere have always been a little slower than on other containers. Then I wrote extensive REST Query API tests, around 200 simple tests that together take little more than 3 minutes on most containers, and the whole test suite started taking several hours on WebSphere. That was when I noticed something was wrong and started investigating.
FYI: at the moment, I've been able to conclude the following:
The problem is *not* a memory leak: if I let a test suite run repeatedly (for more than 2 hours, roughly 20 times), it plateaus at around 11x slower (~700 seconds as opposed to 60) -- and heap dumps taken at the end show that the memory usage does *not* increase and remains steady at roughly 2/3 of the heap. The reserved memory space (heap) also does not expand.
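For anyone who wants to watch the same two numbers without taking full heap dumps, here is a rough sketch using the standard `java.lang.management` API. The class `HeapSnapshot` is hypothetical and was not part of the investigation; "used" corresponds to memory usage and "committed" to the reserved heap space mentioned above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Captures used and committed heap sizes so they can be compared between test runs. */
final class HeapSnapshot {
    final long usedBytes;
    final long committedBytes;

    HeapSnapshot(long usedBytes, long committedBytes) {
        this.usedBytes = usedBytes;
        this.committedBytes = committedBytes;
    }

    /** Reads the current heap usage of this JVM. */
    static HeapSnapshot take() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return new HeapSnapshot(heap.getUsed(), heap.getCommitted());
    }

    /** Fraction of the committed heap that is in use (0.0 if nothing is committed). */
    double usedFraction() {
        return committedBytes == 0 ? 0.0 : (double) usedBytes / committedBytes;
    }

    public static void main(String[] args) {
        HeapSnapshot s = take();
        System.out.printf("used=%d bytes, committed=%d bytes, fraction=%.2f%n",
                s.usedBytes, s.committedBytes, s.usedFraction());
    }
}
```

Logging a snapshot after each test run would show whether the used fraction stays flat while the run time grows, which is the pattern reported here.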
I'm currently trying to isolate the problem by taking many "thread" (javacore) stack dumps at both the beginning and the end of the test cycle, so that I can spot any unusual activity.
I will probably also end up manually debugging a REST call on the server.
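As an illustration of the thread-dump approach, the sketch below collects stack traces of all live threads through the standard `ThreadMXBean` API. It does not produce an IBM javacore file and is not what was used in this investigation; the class `ThreadDumpSketch` is hypothetical, and the code would have to run inside the server JVM (or be adapted to a remote JMX connection) to be useful.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Builds a plain-text dump of all live threads in the current JVM. */
final class ThreadDumpSketch {

    static String dumpAllThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: do not request locked monitor or synchronizer details.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
    }
}
```

Comparing a dump taken early in the test cycle with one taken late can show which frames the request threads spend their time in once the slowdown has set in.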
Fixed on 6.4.x. Commits:
Pull request for 6.3.x:
Unfortunately, while the performance problem is fixed, it looks like custom-type serialization is now broken.
Currently working on fixing that.
(Custom-type serialization is only broken on WebSphere, btw.)
I'm setting this back to MODIFIED since the fix has already been included in the ER3 tag. The problem with custom-type serialization on WebSphere, which was found on a community snapshot after the fix for this BZ was provided, will be tracked in another BZ when ER3 is released.
The above problem (custom-type serialization problems after the fix) can be fixed by changing the *configuration* of the WebSphere instance.
In the custom properties of the JVM instance (reachable via the server configuration), add the following custom property:
Property name: org.apache.wink.jaxbcontextcache
Property value: off
This fixes the custom-type serialization problem.
Also, obviously, it turns off caching for the JAXBContext instances. This makes performance a little bit slower, but not significantly -- especially since for the kie-remote-* services, caching isn't needed in the first place. Furthermore, it's also not needed for the kie-server REST API nor the guvnor REST API.
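To confirm that the setting actually reached the server JVM, a small check like the following could be run inside the application. It assumes that WebSphere exposes JVM custom properties as Java system properties; the class `WinkCacheCheck` is hypothetical, and the exact-match comparison against "off" simply mirrors the value given above rather than describing how Wink itself parses the property.

```java
/** Reports whether the Wink JAXBContext cache property is set to "off" in this JVM. */
final class WinkCacheCheck {
    // Property name taken from the configuration described in this bug.
    static final String PROPERTY = "org.apache.wink.jaxbcontextcache";

    /** Returns true only when the given property value is exactly "off". */
    static boolean isCacheDisabled(String value) {
        return "off".equals(value);
    }

    public static void main(String[] args) {
        // Assumption: the custom property is visible as a system property.
        String value = System.getProperty(PROPERTY);
        System.out.println(PROPERTY + " = " + value
                + " (cache disabled: " + isCacheDisabled(value) + ")");
    }
}
```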
Verified on BPM Suite 6.3.0 ER3