Bug 1703581

Summary: Default space quota in etcd is not enough for large scale clusters
Product: OpenShift Container Platform Reporter: Naga Ravi Chaitanya Elluri <nelluri>
Component: Etcd Assignee: Sam Batschelet <sbatsche>
Status: CLOSED CURRENTRELEASE QA Contact: ge liu <geliu>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.1.0 CC: akamra, akrzos, dgoodwin, gblomqui, jeder, jupierce, nelluri, nmalik, scuppett, tkatarki
Target Milestone: --- Keywords: OpsBlocker
Target Release: 4.1.0   
Hardware: Unspecified   
OS: Linux   
Whiteboard: aos-scalability-41
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-05-10 11:43:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description         Flags
etcd member logs    none
Controller logs     none

Description Naga Ravi Chaitanya Elluri 2019-04-26 19:00:23 UTC
Created attachment 1559333
etcd member logs

Description of problem:
We ran a scale test that created around 9500 namespaces with a bunch of objects on a 250-node cluster. etcd complained about exceeding its database space, raised an alarm, and put the cluster into maintenance mode, in which it accepted only key reads and deletes. To get the cluster back to a functional state, we had to run defragmentation to release the compacted space for the DB to reuse and then disable the alarm.
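
For reference, a recovery like the one described above could be sketched as the etcdctl sequence below. This is a minimal illustration, not the exact commands that were run: the helper name `recovery_commands` is hypothetical, the endpoint is a placeholder, and the code only builds the command lines without executing them. It assumes the v3 etcdctl API (`ETCDCTL_API=3` exported on etcd 3.3). If key history has not already been compacted, an `etcdctl compact <revision>` step would normally come before the defrag.

```python
from typing import List


def recovery_commands(endpoint: str) -> List[List[str]]:
    """Build (not run) etcdctl v3 commands for a space-quota recovery.

    Assumes ETCDCTL_API=3 is exported in the environment that runs them.
    """
    base = ["etcdctl", f"--endpoints={endpoint}"]
    return [
        base + ["alarm", "list"],    # confirm which alarm is raised
        base + ["defrag"],           # release compacted space back to the filesystem
        base + ["alarm", "disarm"],  # clear the alarm so writes are accepted again
    ]


if __name__ == "__main__":
    # Placeholder endpoint, for illustration only.
    for cmd in recovery_commands("https://127.0.0.1:2379"):
        print(" ".join(cmd))
```

Defragmentation applies per member, so in practice the defrag step would be repeated against each etcd endpoint.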

Version-Release number of selected component (if applicable):
Etcd Version: 3.3.10
OCP: 4.1 beta4/4.1.0-0.nightly-2019-04-22-005054
Installer: v4.1.0-201904211700-dirty


How reproducible:
We encountered this issue for the first time, but I think it can be easily reproduced if etcd defragmentation is not done regularly and the space quota is not enough for a large-scale cluster running a lot of objects.


Steps to Reproduce:
1. Install a large scale cluster using the default space quotas for etcd.
2. Load the cluster with a bunch of objects.
3. Check the etcd component status, endpoints and alarm status.

Actual results:
- etcd component status was unhealthy
- etcd server got overloaded
- controller and etcd logs reported - "etcdserver: mvcc: database space exceeded". The DB size was 2.2G when we hit the issue.

Expected results:
- The default space quota should be at least 4GB, and the cluster should be functional.
- Components, including etcd, report a healthy state.
- A Prometheus rule alerts users to run defragmentation when needed.
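
The alerting idea in the last item can be sketched as a simple threshold check on DB size relative to the quota. This is an illustration in Python, not a Prometheus rule: the function name `should_alert` and the 80% threshold are assumptions, and the 2 GiB quota in the example is etcd's documented default rather than a value read from this cluster.

```python
GIB = 1024 ** 3


def should_alert(db_size_bytes: int, quota_bytes: int, threshold: float = 0.8) -> bool:
    """Return True when the DB has used at least `threshold` of its quota."""
    if quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    return db_size_bytes / quota_bytes >= threshold


if __name__ == "__main__":
    default_quota = 2 * GIB  # assumed: etcd's default backend quota
    print(should_alert(1 * GIB, default_quota))         # DB at 50% of quota
    print(should_alert(int(1.8 * GIB), default_quota))  # DB at 90% of quota
```

A real rule would compare the equivalent etcd metrics over time; the point here is only that the check is a ratio against the configured quota, so it keeps working if the quota is raised.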

Additional info:

Comment 1 Naga Ravi Chaitanya Elluri 2019-04-26 19:01:14 UTC
Created attachment 1559345
Controller logs

Comment 2 Jeremy Eder 2019-04-29 15:30:36 UTC
Is there any technical issue with making the default etcd DB size large enough to handle the largest cluster that we support?   This way we never have to tune it?

Comment 3 Sam Batschelet 2019-04-29 15:54:05 UTC
> Is there any technical issue with making the default etcd DB size large enough to handle the largest cluster that we support?   This way we never have to tune it?

Yes, the main issue is that the db size the cluster can support depends greatly not only on workload/number of nodes but also on hardware. Speed of disks, dedicated vs colocated data-dir partitions, and RAM/CPU procs all play a part in this equation as well. While hitting the alarm is not what we want, we also don't want a customer with an 8GB etcd db that we can never stabilize without new hardware. So we are not yet in a set-it-and-forget-it situation. At scale, we will need to properly tune to optimize performance.

I feel setting a sane default such as 4GB is reasonable for now. In the future, we can look to auto-tune based on ENV etc. We need more tools to help folks maintain etcd; that should improve. If we explicitly know all variables of the deployment hardware, we can revisit the default based on common deployments.

Comment 21 ge liu 2019-05-08 08:05:30 UTC
Based on the long list of comments, this issue is not a simple one, and we may open an RFE to discuss it in more depth if necessary. Currently, the etcd space quota is 4GB.

Comment 26 ge liu 2019-05-10 08:20:15 UTC
Verified with Beta 5 Final Build (4.1.0-rc.1):

#[storage]
ETCD_QUOTA_BACKEND_BYTES=7516192768
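
For readers checking the value: 7516192768 bytes is exactly 7 GiB (7 * 1024**3). The conversion can be sketched in Python as below; `parse_quota` is a hypothetical helper, and the sample text simply mirrors the two lines above.

```python
GIB = 1024 ** 3


def parse_quota(env_text: str) -> int:
    """Return ETCD_QUOTA_BACKEND_BYTES from env-file text, in bytes."""
    for line in env_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments such as "#[storage]" and blank lines
        key, value = line.split("=", 1)
        if key.strip() == "ETCD_QUOTA_BACKEND_BYTES":
            return int(value.strip())
    raise KeyError("ETCD_QUOTA_BACKEND_BYTES not found")


if __name__ == "__main__":
    sample = "#[storage]\nETCD_QUOTA_BACKEND_BYTES=7516192768\n"
    quota = parse_quota(sample)
    print(quota / GIB)  # 7.0
```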