Bug 1703581

Summary:

Default space quota in etcd is not enough for large scale clusters

Product:

OpenShift Container Platform

Reporter:

Naga Ravi Chaitanya Elluri <nelluri>

Component:

Etcd

Assignee:

Sam Batschelet <sbatsche>

Status:

CLOSED CURRENTRELEASE

QA Contact:

ge liu <geliu>

Severity:

medium

Docs Contact:

Priority:

unspecified

Version:

4.1.0

CC:

akamra, akrzos, dgoodwin, gblomqui, jeder, jupierce, nelluri, nmalik, scuppett, tkatarki

Target Milestone:

---

Keywords:

OpsBlocker

Target Release:

4.1.0

Hardware:

Unspecified

OS:

Linux

Whiteboard:

aos-scalability-41

Last Closed:

2019-05-10 11:43:39 UTC

Type:

Bug

Attachments:

Description	Flags
etcd member logs	none
Controller logs	none

Description Naga Ravi Chaitanya Elluri 2019-04-26 19:00:23 UTC

Created attachment 1559333
etcd member logs

Description of problem:
We ran a scale test that created around 9500 namespaces with a bunch of objects on a 250-node cluster. etcd reported that it had exceeded its database space quota, raised an alarm, and put the cluster into maintenance mode, in which it accepted only key reads and deletes. To get the cluster back to a functional state, we had to run defragmentation to release the compacted space for the DB to use and then disarm the alarm.
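The recovery described above can be sketched as an ordered list of etcdctl invocations. This is only a minimal sketch and not necessarily the exact commands used in this test: the endpoint URL is a placeholder, TLS flags are omitted, and `recovery_commands` is a hypothetical helper name. `defrag`, `alarm disarm`, and `alarm list` are etcdctl v3 subcommands, and etcd 3.3 needs ETCDCTL_API=3 in the environment to use them.

```python
import shlex


def recovery_commands(endpoint):
    """Return the etcdctl invocations for the recovery, in order.

    The endpoint is a placeholder; a real cluster also needs TLS flags.
    """
    base = ["etcdctl", "--endpoints=" + endpoint]
    return [
        base + ["defrag"],           # release space freed by compaction
        base + ["alarm", "disarm"],  # clear the raised alarm
        base + ["alarm", "list"],    # confirm no alarm is still raised
    ]


if __name__ == "__main__":
    # etcd 3.3 needs ETCDCTL_API=3 in the environment for these subcommands.
    for cmd in recovery_commands("https://127.0.0.1:2379"):
        print("ETCDCTL_API=3 " + " ".join(shlex.quote(part) for part in cmd))
```

The helper only builds and prints the commands, so it can be reviewed before anything is run against a live member. Defragmentation applies per member, so the same sequence would be repeated for each endpoint.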

Version-Release number of selected component (if applicable):
Etcd Version: 3.3.10
OCP: 4.1 beta4/4.1.0-0.nightly-2019-04-22-005054
Installer: v4.1.0-201904211700-dirty


How reproducible:
We encountered this issue for the first time, but I think it can be easily reproduced if etcd defragmentation is not done regularly and the space quota is not enough for a large scale cluster with a lot of objects running.


Steps to Reproduce:
1. Install a large scale cluster using the default space quotas for etcd.
2. Load the cluster with a bunch of objects.
3. Check the etcd component status, endpoints and alarm status.

Actual results:
- etcd component status was unhealthy
- etcd server got overloaded
- controller and etcd logs reported - "etcdserver: mvcc: database space exceeded". The DB size was 2.2G when we hit the issue.
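One way to confirm the same symptom in collected logs is to count the lines that carry the error string quoted above. This is a minimal sketch: `count_space_exceeded` is a hypothetical helper, and the sample lines are synthetic placeholders rather than excerpts from the attached logs.

```python
SPACE_EXCEEDED = "etcdserver: mvcc: database space exceeded"


def count_space_exceeded(lines):
    """Count log lines that contain the quota error message."""
    return sum(1 for line in lines if SPACE_EXCEEDED in line)


if __name__ == "__main__":
    # Synthetic sample lines for illustration only.
    sample = [
        "sample line 1: etcdserver: mvcc: database space exceeded",
        "sample line 2: unrelated message",
        "sample line 3: etcdserver: mvcc: database space exceeded",
    ]
    print(count_space_exceeded(sample))
```

In practice the input would be the lines of the etcd member log or the controller log, for example from `open(path)`.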

Expected results:
- The default space quota should be at least 4GB, and the cluster should remain functional.
- Components including etcd report a healthy state.
- A Prometheus rule alerts users to run defragmentation when needed.
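The condition such an alert might check can be sketched as a ratio of DB size to quota. This is not an actual Prometheus rule. It assumes the cluster ran with etcd's built-in 2 GiB default quota (this report does not state the configured value), reads the reported 2.2G as 2.2e9 bytes, and uses an arbitrary 80% threshold; `needs_defrag_alert` is a hypothetical name.

```python
DEFAULT_QUOTA_BYTES = 2 * 1024**3   # etcd's built-in default (assumed to apply here)
PROPOSED_QUOTA_BYTES = 4 * 1024**3  # the 4GB requested above, read as 4 GiB


def needs_defrag_alert(db_size_bytes, quota_bytes, threshold=0.8):
    """Return True when the DB uses at least `threshold` of the quota."""
    return db_size_bytes >= threshold * quota_bytes


if __name__ == "__main__":
    observed = 2_200_000_000  # the 2.2G DB size from this report, assumed decimal
    print(needs_defrag_alert(observed, DEFAULT_QUOTA_BYTES))
    print(needs_defrag_alert(observed, PROPOSED_QUOTA_BYTES))
```

Under these assumptions the observed size is already past a 2 GiB quota and sits at roughly half of a 4 GiB quota, so an alert of this shape would fire before the quota is exhausted rather than after.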

Additional info:

Comment 1 Naga Ravi Chaitanya Elluri 2019-04-26 19:01:14 UTC

Created attachment 1559345
Controller logs

Comment 2 Jeremy Eder 2019-04-29 15:30:36 UTC

Is there any technical issue with making the default etcd DB size large enough to handle the largest cluster that we support? This way we never have to tune it?

Comment 3 Sam Batschelet 2019-04-29 15:54:05 UTC

> Is there any technical issue with making the default etcd DB size large enough to handle the largest cluster that we support? This way we never have to tune it?

Yes. The main issue is that the DB size a cluster can support depends heavily not only on workload and number of nodes but also on hardware. Disk speed, dedicated vs. colocated data-dir partitions, and RAM/CPU all play a part in this equation. While hitting the alarm is not what we want, we also don't want a customer with an 8GB etcd DB that we can never stabilize without new hardware. So we are not yet in a set-it-and-forget-it situation. At scale, we will need to tune properly to optimize performance.

I feel setting a sane default such as 4GB is reasonable for now. In the future, we can look to auto-tune based on ENV etc. We need more tools to help folks maintain etcd; that should improve. If we explicitly know all the variables of the deployment hardware, we can revisit the default based on common deployments.

Comment 21 ge liu 2019-05-08 08:05:30 UTC

Based on the long list of comments, this is not a simple issue, and we may open an RFE to discuss it in more depth if necessary. Currently, the etcd space quota is 4GB.

Comment 26 ge liu 2019-05-10 08:20:15 UTC

Verified with Beta 5 Final Build (4.1.0-rc.1):

#[storage]
ETCD_QUOTA_BACKEND_BYTES=7516192768
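For readability, the byte value above can be converted to GiB. This is a minimal sketch with a hypothetical helper name, `parse_quota_gib`. The value works out to 7 GiB, rather than the 4GB mentioned in comment 21.

```python
def parse_quota_gib(env_line):
    """Parse an ETCD_QUOTA_BACKEND_BYTES=<bytes> line and return GiB."""
    name, _, value = env_line.strip().partition("=")
    if name != "ETCD_QUOTA_BACKEND_BYTES":
        raise ValueError("unexpected variable: " + name)
    return int(value) / 1024**3


if __name__ == "__main__":
    print(parse_quota_gib("ETCD_QUOTA_BACKEND_BYTES=7516192768"))  # 7.0
```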