Bug 1250310
Summary: | CPU usage of etcd is too high after setting up etcd cluster | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Gaoyun Pei <gpei> |
Component: | Installer | Assignee: | Scott Dodson <sdodson> |
Status: | CLOSED WORKSFORME | QA Contact: | Ma xiaoqiang <xiama> |
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | 3.0.0 | CC: | eparis, gpei, jchaloup, jokerman, libra-bugs, libra-onpremise-devel, matt, mmccomas, tstclair, xtian |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2015-09-10 17:07:13 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | 1250707 | | |
Bug Blocks: | | | |
Description
Gaoyun Pei 2015-08-05 06:25:16 UTC
Tim, have you seen anything like this? Here's the etcd.conf template we're using; are there other tuning changes we should make? https://github.com/openshift/openshift-ansible/blob/master/roles/etcd/templates/etcd.conf.j2

@Scott, with our wide-open raw k8s env we do not see this type of behavior, so I have several questions:

1. Are you using an external etcd now, i.e. using the actual unit files vs. your previously bundled version? If so, GOMAXPROCS is set in that unit file, but that doesn't explain the overload.
2. Are you starting from a clean etcd baseline, i.e. wiped the entire contents of /var/lib/etcd ...?
3. Can you scrape the API /metrics endpoints to find out which offending calls are eating all the bandwidth? Writes will now be much more expensive, and if there are a lot of writes in OpenShift that don't exist in raw k8s then there will be an issue, especially if there are overlapping list operations along with writes.

Appears to be TLS on peer connections. You may need an etcd bump - https://github.com/coreos/etcd/issues/2539

<tstclair> yichengq_ ping.. Do you regularly test with tls on peer connections? We are seeing a huge perf hit on peer tls. https://bugzilla.redhat.com/show_bug.cgi?id=1250310
<yichengq_> tstclair: 2.0 could have this problem because transport layer is not optimized
<yichengq_> tstclair: we have improved it at 2.1

We are running etcd version 2.1.1+git on our cluster; I will try to reconfigure our setup and verify.

Removing TLS from peer connections reduced the CPU usage to around 2%. Upgrading to 2.1 and re-enabling TLS shows CPU usage staying around 2% as well.

Partially remediated by https://github.com/openshift/openshift-ansible/pull/427. On my three VMs this cut usage in half, but that's on a cluster that's pretty much idle, and I'm not sure how well this remediation scales to an active cluster. Setting this ON_QA to get feedback from QA as to how much it improves the situation in their testing. I don't see this as a fix, however.

After setting up an env using the new openshift-ansible, the CPU usage taken by etcd is reduced a lot.

[root@etcd-1 ~]# ps aux | grep etcd
etcd 8800 12.9 0.5 26608 22716 ? Ssl 13:55 0:16 /usr/bin/etcd
[root@etcd-2 ~]# ps aux | grep etcd
etcd 6414 14.9 0.8 36408 32048 ? Ssl 13:55 0:17 /usr/bin/etcd
[root@etcd-3 ~]# ps aux | grep etcd
etcd 6531 22.4 0.5 27696 23192 ? Ssl 13:55 0:23 /usr/bin/etcd

I made a small comparison between the new env and the old one: create projects concurrently and get the average creation time of each project.

- Creating 30 projects concurrently: 16.309s (new) vs. 17.0467s (old)
- Creating 100 projects concurrently: 48.8168s (new) vs. 45.2532s (old); however, 29 requests failed on the old env due to TLS handshake timeout, while all 100 requests succeeded on the new env.

During the test, the CPU usage of the leader etcd in the new env topped out at 36.5% and started dropping when the requests finished, while the leader etcd in the old env once hit 107% CPU usage. Overall, the new env works better than the old one in my testing.

From the QE side, this issue mainly affects full functional testing when the three etcd servers are installed on master/node1/node2 (this is done to save instance usage on OpenStack). Builds or deployments would sometimes fail because etcd takes too much CPU resource on the nodes. I'd prefer to mark this as verified if etcd really works well during the next round of testing.

Gaoyun, OK, if you make it through the full functional suite we can mark this as verified.
-- Scott

*** This bug has been marked as a duplicate of bug 1250707 ***

During OSE-3.0.2 full functional testing, QE didn't encounter the same issue that happened in OSE-3.0.1. No build or deployment failed due to slow system performance. QE monitored the etcd CPU usage from beginning to end; the leader etcd topped out at 34.2% CPU and the follower etcd topped out at 18.2% CPU. It turns out the tuning to etcd is an acceptable workaround for OSE-3.0.x with etcd 2.0, so I'll change this to WORKSFORME. etcd version: etcd-2.0.13-2.el7.x86_64. Thanks.

I believe etcd 2.1.1 is already in the RHEL-Extras channel.

Hi Gaoyun,

can you test the same with etcd-2.1.1 [1] to confirm there is no regression? Or was it already tested with 2.1.1?

[1] https://brewweb.devel.redhat.com/buildinfo?buildID=452521

Jan

Hi Jan,

Thanks for providing the etcd-2.1.1 package. I tried building an ose-3.0.2 env with an etcd-2.1.1 cluster today. The CPU usage of etcd is around 1%~3%, which is much reduced. QE will start using etcd-2.1.1 in ose/aep from the next round of testing.

(In reply to Jan Chaloupka from comment #14)
> Hi Gaoyun,
>
> can you test the same for etcd-2.1.1 [1] if there is no regression? Or was
> it already tested with 2.1.1?
>
> [1] https://brewweb.devel.redhat.com/buildinfo?buildID=452521
>
> Jan
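For anyone reproducing the diagnosis and verification described above, here is a minimal shell sketch of the checks discussed in the thread: confirming whether peer traffic is configured for TLS, scraping the etcd /metrics endpoint, and sampling etcd CPU usage while a test run is in progress. The hostname, certificate paths, config file location, and sampling interval are illustrative assumptions, not values taken from this cluster.

    # Check whether peer connections are configured for TLS: an https:// peer
    # URL plus ETCD_PEER_* cert settings means peer traffic is encrypted.
    # (Assumes the /etc/etcd/etcd.conf layout laid down by openshift-ansible.)
    grep -E '^ETCD_(LISTEN_PEER_URLS|INITIAL_ADVERTISE_PEER_URLS|PEER_(CA|CERT|KEY)_FILE)' /etc/etcd/etcd.conf

    # Scrape the metrics endpoint suggested in the thread to see which calls
    # dominate; the host and CA path are placeholders, and --cert/--key would
    # be needed as well if client certificate auth is enforced.
    curl -s --cacert /etc/etcd/ca.crt https://etcd-1.example.com:2379/metrics

    # Record the installed etcd version and sample per-process CPU usage
    # every 5 seconds while the functional suite runs.
    rpm -q etcd
    while sleep 5; do ps -o %cpu=,rss=,cmd= -C etcd; done | tee /tmp/etcd-cpu.log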