1520794 – etcdserver: mvcc: database space exceeded issue after upgrade from OCP 3.6 to 3.7

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Master
Sub Component:
Version:	3.7.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	urgent
Severity:	urgent
Target Milestone:	---
Target Release:	---
Assignee:	Stefan Schimanski
QA Contact:	Wang Haoran
Docs Contact:
URL:
Whiteboard:
Depends On:	1514612
Blocks:

Reported:	2017-12-05 07:36 UTC by Miheer Salunke
Modified:	2023-09-15 00:05 UTC
CC List:	8 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2018-01-08 22:30:01 UTC
Target Upstream Version:
Embargoed:

Comment 4 Michal Fojtik 2017-12-05 09:33:50 UTC

Have you tried the defrag procedure for etcd? The procedure is roughly explained in this GitHub comment: https://github.com/kubernetes/kops/issues/4005#issuecomment-349048006

Comment 5 Ryan Howe 2017-12-05 16:48:07 UTC

Yes, the following steps were run:

~~~
# export ETCDCTL_API=3
# source /etc/etcd/etcd.conf
# rev=$(etcdctl3 --cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE --endpoints=$ETCD_LISTEN_CLIENT_URLS endpoint status --write-out="json" |  egrep -o '"revision":[0-9]*' | egrep -o '[0-9]*' -m1)
# etcdctl --cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE --endpoints=$ETCD_LISTEN_CLIENT_URLS compact $rev
# etcdctl --cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE --endpoints=$ETCD_LISTEN_CLIENT_URLS defrag
# etcdctl3 --cert=$ETCD_PEER_CERT_FILE --key=$ETCD_PEER_KEY_FILE --cacert=$ETCD_TRUSTED_CA_FILE --endpoints=$ETCD_LISTEN_CLIENT_URLS alarm disarm
~~~
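
The revision lookup in the first command can be checked in isolation. The following is a minimal sketch, assuming the JSON shape shown in `sample` (a hypothetical, abbreviated stand-in, not output captured from this cluster). `extract_revision` is an illustrative helper name, and `grep -E` is used in place of `egrep`.

```shell
#!/bin/sh
# Hypothetical, abbreviated sample of `endpoint status --write-out=json` output.
# The endpoint and all values are made up for illustration only.
sample='[{"Endpoint":"https://192.0.2.10:2379","Status":{"header":{"revision":123456,"raft_term":7},"dbSize":2147483648}}]'

# extract_revision (illustrative helper): read JSON on stdin and print the
# first "revision" value, using the same two-stage grep approach as the
# steps above.
extract_revision() {
    grep -Eo '"revision":[0-9]+' | grep -Eo -m1 '[0-9]+'
}

rev=$(printf '%s\n' "$sample" | extract_revision)
echo "$rev"
```

With the sample above, this prints `123456`, which is the value that would then be passed to `compact`.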

I think the following issue is being hit: 

https://github.com/kubernetes/kubernetes/issues/45037
https://github.com/coreos/etcd/issues/8009
https://github.com/coreos/etcd/pull/8210

Comment 10 Eric Rich 2018-01-08 22:30:01 UTC

Ultimately, this was diagnosed as a configuration issue: one etcd host was not configured with the recommended 4 GB quota limit and was therefore defaulting to the 2 GB quota limit.
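
A per-host check along these lines may help spot such a mismatch. It is only a sketch: it assumes the quota is set through an `ETCD_QUOTA_BACKEND_BYTES` entry in `/etc/etcd/etcd.conf` (the file sourced in comment 5) and that 4 GB means 4294967296 bytes. `check_quota` is a hypothetical helper, not part of etcd or OpenShift.

```shell
#!/bin/sh
# Assumed recommended quota: 4 GiB expressed in bytes (4 * 1024^3).
RECOMMENDED_BYTES=4294967296

# check_quota (hypothetical helper): inspect a KEY=VALUE config file and
# report whether ETCD_QUOTA_BACKEND_BYTES is set to at least the
# recommended value. Returns non-zero when it is unset or too low.
check_quota() {
    conf=$1
    # Use the last uncommented assignment, stripping optional double quotes.
    value=$(sed -n 's/^ETCD_QUOTA_BACKEND_BYTES=//p' "$conf" | tail -n 1 | tr -d '"')
    if [ -z "$value" ]; then
        echo "unset: etcd falls back to its default quota"
        return 1
    fi
    if [ "$value" -lt "$RECOMMENDED_BYTES" ]; then
        echo "low: $value bytes"
        return 1
    fi
    echo "ok: $value bytes"
}

# Usage (run on every etcd host and compare the results):
#   check_quota /etc/etcd/etcd.conf
```

Running the check on each member separately matters here, because only one of the hosts was misconfigured.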

If this is still an issue, please file a new BZ capturing the issue.

Comment 11 Red Hat Bugzilla 2023-09-15 00:05:30 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.
