Bug 1376822 - Write performance of hawkular services running in docker container is 2 times worse compared to VM
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Middleware Manager
Classification: JBoss
Component: Other
Version: 7.0.0 TP2
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Paul Gier
QA Contact: Filip Brychta
URL:
Whiteboard: hawkular
Depends On:
Blocks:
Reported: 2016-09-16 13:55 UTC by Filip Brychta
Modified: 2018-01-04 15:36 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-01-04 15:36:16 UTC


Description Filip Brychta 2016-09-16 13:55:29 UTC
Description of problem:
Setup 1:
VM with 2 CPUs and 4 GB of memory running Cassandra and Hawkular Services directly

Setup 2:
VM with 2 CPUs and 4 GB of memory running Cassandra and Hawkular Services as docker containers

Write throughput of setup 1 is twice that of setup 2.


Version-Release number of selected component (if applicable):
Hawkular services DR01

How reproducible:
Always

Steps to Reproduce:
1. Create 2 identical VMs
2. Start Cassandra and Hawkular Services on VM1 (directly, using the upstream zip build)
3. Start the Cassandra and Hawkular Services docker containers on VM2
4. Run write performance tests against each VM

Actual results:
Throughput on VM1 is 2 times better than on VM2

Expected results:
Docker containers are expected to add some overhead, but it should be investigated whether such a big difference is acceptable.

Additional info:
iostat shows that both VMs are using close to 100% of the CPU, but VM2 spends more of it in system time (%system 21.97 vs. 8.25) and less in user time (%user 68.93 vs. 81.84).

VM1:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          81.84    7.95    8.25    1.25    0.15    0.55

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.00     0.90    0.00    5.59     0.00  2080.72   743.86     2.85  508.71    0.00  508.71  23.48  13.14
dm-0              0.00     0.00    0.00    6.19     0.00  2080.72   671.87     2.96  477.47    0.00  477.47  21.21  13.14
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00



VM2:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          68.93    6.12   21.97    0.24    2.50    0.24

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda              44.80     4.90    6.50    6.40   205.20  1378.80   245.58     0.82   63.54    4.51  123.50   9.43  12.17
dm-0              0.00     0.00    0.10   10.50     0.40  1378.80   260.23     0.94   88.28  169.00   87.51  10.37  10.99
dm-1              0.00     0.00   51.20    0.00   204.80     0.00     8.00     0.10    1.91    1.91    0.00   0.23   1.18
dm-2              0.00     0.00    0.30    0.10     1.20     0.40     8.00     0.02   42.25   56.33    0.00  42.25   1.69
dm-3              0.00     0.00    0.30    0.10     1.20     0.40     8.00     0.02   44.00   58.67    0.00  44.00   1.76
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
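
The samples above look like extended iostat reports; the exact invocation is not stated in this report, so the interval and count in the usage comment below are assumptions. As a minimal sketch, a small helper (hypothetical name cpu_system_pct) can pull the %system value out of each report so the two VMs can be compared over several samples:

```shell
# Print the %system value from every avg-cpu section of iostat output read on stdin.
# Assumes the column order shown above: %user %nice %system %iowait %steal %idle.
cpu_system_pct() {
  awk '/^avg-cpu:/ { getline; print $3 }'
}

# Possible use on each VM while the write test is running
# (interval of 5 seconds and 3 reports are assumed values):
#   iostat -x 5 3 | cpu_system_pct
```

Note that the first report printed by iostat covers the time since boot, so only the later samples reflect the load during the test.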

Comment 3 Paul Gier 2016-09-29 20:38:29 UTC
I came across some info about poor performance with the default docker network config, which uses a virtual bridge (docker0) to route requests between containers (https://github.com/docker/docker/issues/7857).

When starting the containers, can you try setting them to connect directly to the host network (--net=host) instead of the default settings?

docker run --name hawkular-cassandra --net=host -d -e CASSANDRA_START_RPC=true brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/jboss/cassandra:latest

docker run -d --net=host -e CASSANDRA_NODES=localhost -e HAWKULAR_BACKEND=remote -e DB_TIMEOUT=20 -p 8080:8080 -p 8443:8443 -p 9990:9990 brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/hawkular/hawkular-services:latest

Also note that when using the host network directly, docker will not automatically update firewall config, so you may need to manually open port 8080 to connect to hawkular-services from a remote host.

firewall-cmd --zone=public --add-port=8080/tcp

If necessary, you can later undo the firewall change using this:

firewall-cmd --zone=public --remove-port=8080/tcp

Comment 4 Filip Brychta 2016-10-03 16:40:39 UTC
Retested with --net=host without any significant performance improvement.

Also retested on more powerful VMs with 4 CPUs.
In this case the performance difference (container vs. direct zip installation) was only 25%.

Comment 5 Mike Foley 2016-10-06 15:30:22 UTC
Next action item is for QE to baseline write disk I/O on the VM and in the container, to determine whether disk I/O not involving Hawkular Services is fundamentally different inside and outside a container.

Results will be documented here and discussed at the next performance call, where the next action item / iteration will be determined.
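
One way such a baseline could be taken is sketched below. This is not the procedure QE used (none is recorded in this bug); the helper name write_baseline, the default size of 256 MB and the use of dd are all assumptions, and the bs=1M syntax assumes a Linux dd.

```shell
# Write SIZE_MB megabytes of zeros into DIR and print dd's throughput summary line.
# write_baseline is a hypothetical helper; 256 MB is an arbitrary default.
write_baseline() {
  dir=$1
  size_mb=${2:-256}
  file="$dir/write_baseline.$$"
  # conv=fsync forces the data to disk before dd reports,
  # so the page cache does not hide the cost of the write.
  dd if=/dev/zero of="$file" bs=1M count="$size_mb" conv=fsync 2>&1 | tail -n 1
  rm -f "$file"
}

# Possible use, once on the VM and once inside the container,
# pointing DIR at the filesystem that holds the Cassandra data:
#   write_baseline /var/tmp 512
```

Comparing the two summary lines gives a rough view of raw write throughput with Hawkular Services out of the picture.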

Comment 6 Mike Foley 2016-10-10 15:01:44 UTC
Adding the "documentation" keyword. The outcome of this bugzilla will likely need to be communicated in customer-facing documentation or guidance.

Comment 7 Heiko W. Rupp 2016-10-10 15:18:28 UTC
"
It's using the loop device. Please use the docker thin pool according to these docs:
https://access.redhat.com/documentation/en/red-hat-enterprise-linux-atomic-host/7/paged/getting-started-with-containers/chapter-7-managing-storage-with-docker-formatted-containers

There's even a warning in the docker info output.
"
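
A quick way to check whether a host is in this situation is to look at the docker info output for loop file entries or the loopback warning. The sketch below assumes the devicemapper storage driver, where docker info lists "Data loop file" and "Metadata loop file" when loopback storage is in use; the helper name uses_loop_device is hypothetical.

```shell
# Read `docker info` output on stdin; exit 0 if it mentions loop files
# or the loopback warning, non-zero otherwise.
uses_loop_device() {
  grep -qiE 'loop file|loopback'
}

# Possible use on the docker host (requires a running docker daemon):
#   docker info 2>&1 | uses_loop_device && echo "docker storage is on a loop device"
```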
