Bug 1290839

Summary: Performance Degradation With 3 node etcd cluster all in vms on same system
Product: Red Hat Enterprise Linux 7
Component: etcd
Version: 7.1
Hardware: Unspecified
OS: Unspecified
Status: CLOSED INSUFFICIENT_DATA
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Target Release: ---
Whiteboard:
Fixed In Version:
Reporter: Devan Goodwin <dgoodwin>
Assignee: Jan Chaloupka <jchaloup>
QA Contact: atomic-bugs <atomic-bugs>
Docs Contact:
CC: agoldste, rrati, tstclair
Keywords: Extras
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-07-24 13:29:49 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Attachments:
  Vagrantfile used to bring up this environment. (flags: none)
  etcd.conf from one of the affected cluster members (flags: none)

Description Devan Goodwin 2015-12-11 15:35:04 UTC
Created attachment 1104731 [details]
Vagrantfile used to bring up this environment.

Description of problem:

This issue surfaced while attempting local testing of an HA OpenShift environment with 3 OpenShift masters (each running a separate etcd process), 3 nodes, and a load balancer, for 7 VMs total. Each VM uses a virtual disk on the same physical disk.

The Vagrant box is the RHEL 7.1 CDK image.

Vagrantfile attached.

Version-Release number of selected component (if applicable): etcd-2.1.1-2.el7.x86_64


How reproducible: Very


Steps to Reproduce:

Perform an openshift-ansible installation for an HA environment with this VM configuration.

Actual results:

The atomic-openshift-master-api service makes a number of raw etcd requests on first startup. These requests are very slow (~2s each in this case), and the service is still working through them when it hits the 90s systemd start timeout, which kills the process and thus the install.

Example from logs:

Dec 08 12:15:45 master1.aos.example.com atomic-openshift-master-api[26581]: [2.316744162s] [2.316744162s] END
Dec 08 12:15:48 master1.aos.example.com atomic-openshift-master-api[26581]: I1208 12:15:48.242100   26581 trace.go:57] Trace "Update *api.ClusterPolicyBinding" (started 2015-12-08 12:15:45.757181437 -0500 EST):
Dec 08 12:15:48 master1.aos.example.com atomic-openshift-master-api[26581]: [2.484902084s] [2.484902084s] END
Dec 08 12:15:49 master1.aos.example.com atomic-openshift-master-api[26581]: I1208 12:15:49.861606   26581 trace.go:57] Trace "Update *api.ClusterPolicyBinding" (started 2015-12-08 12:15:48.246875312 -0500 EST):
Dec 08 12:15:49 master1.aos.example.com atomic-openshift-master-api[26581]: [1.614713007s] [1.614713007s] END
Dec 08 12:15:52 master1.aos.example.com atomic-openshift-master-api[26581]: I1208 12:15:52.579077   26581 trace.go:57] Trace "Update *api.ClusterPolicyBinding" (started 2015-12-08 12:15:49.866518612 -0500 EST):
Dec 08 12:15:52 master1.aos.example.com atomic-openshift-master-api[26581]: [2.712543775s] [2.712543775s] END
Dec 08 12:15:55 master1.aos.example.com atomic-openshift-master-api[26581]: I1208 12:15:55.054774   26581 trace.go:57] Trace "Update *api.ClusterPolicyBinding" (started 2015-12-08 12:15:52.583697062 -0500 EST):
Dec 08 12:15:55 master1.aos.example.com atomic-openshift-master-api[26581]: [2.471049882s] [2.471049882s] END
Dec 08 12:15:56 master1.aos.example.com systemd[1]: atomic-openshift-master-api.service start operation timed out. Terminating.
Dec 08 12:15:56 master1.aos.example.com systemd[1]: atomic-openshift-master-api.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 08 12:15:56 master1.aos.example.com systemd[1]: Failed to start Atomic OpenShift Master API.
Dec 08 12:15:56 master1.aos.example.com systemd[1]: Unit atomic-openshift-master-api.service entered failed state.
Dec 08 12:15:56 master1.aos.example.com systemd[1]: atomic-openshift-master-api.service failed.
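
Given the 90s start timeout seen above, one possible stopgap (not a fix for the underlying etcd latency) would be to raise the unit's start timeout with a systemd drop-in; the path and value below are illustrative only:

# /etc/systemd/system/atomic-openshift-master-api.service.d/timeout.conf
[Service]
TimeoutStartSec=300

# then reload systemd and retry the start:
systemctl daemon-reload
systemctl restart atomic-openshift-master-api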


The problem is reproducible with raw etcd curl commands on the affected VMs:

Ping reports 0.3 ms or less between the VMs.

Write:

(root@master1 ~) $ time curl --cacert /etc/etcd/ca.crt --cert
/etc/etcd/peer.crt --key /etc/etcd/peer.key -H "Content-Type:
application/json" https://192.168.133.10:2379/v2/keys/hello -XPUT -d
value="world 2"
{"action":"set","node":{"key":"/hello","value":"","modifiedIndex":78,"createdIndex":78},"prevNode":{"key":"/hello","value":"","modifiedIndex":77,"createdIndex":77}}
curl --cacert /etc/etcd/ca.crt --cert /etc/etcd/peer.crt --key  -H
-XPUT -d  0.03s user 0.02s system 3% cpu 1.509 total

Read:

(root@master1 ~) $ time curl --cacert /etc/etcd/ca.crt --cert
/etc/etcd/peer.crt --key /etc/etcd/peer.key -H "Content-Type:
application/json" https://192.168.133.10:2379/v2/keys/hello
{"action":"get","node":{"key":"/hello","value":"","modifiedIndex":79,"createdIndex":79}}
curl --cacert /etc/etcd/ca.crt --cert /etc/etcd/peer.crt --key  -H
0.03s user 0.03s system 84% cpu 0.064 total

Quorum read:

(root@master1 ~) $ time curl --cacert /etc/etcd/ca.crt --cert
/etc/etcd/peer.crt --key /etc/etcd/peer.key -H "Content-Type:
application/json"
"https://192.168.133.10:2379/v2/keys/hello?quorum=true"
{"action":"get","node":{"key":"/hello","value":"","modifiedIndex":79,"createdIndex":79}}
curl --cacert /etc/etcd/ca.crt --cert /etc/etcd/peer.crt --key  -H
0.04s user 0.03s system 5% cpu 1.217 total


These commands were run on the same system that the URL points at; if I point them at another node in the cluster, the time jumps to exactly the ~2s we're seeing during service startup.
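
A quick way to compare write latency against each member is a small timing loop like the one below; the second and third IPs are placeholders for illustration, substitute the real cluster members:

for ep in 192.168.133.10 192.168.133.11 192.168.133.12; do
  echo "== $ep =="
  time curl -s --cacert /etc/etcd/ca.crt --cert /etc/etcd/peer.crt \
    --key /etc/etcd/peer.key -XPUT -d value="ping" \
    "https://$ep:2379/v2/keys/latency-test" > /dev/null
done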


In case it's relevant, disk read speeds within the guest:


$ hdparm -Tt /dev/vda2

/dev/vda2:
 Timing cached reads:   12540 MB in  2.00 seconds = 6273.90 MB/sec
 Timing buffered disk reads:  16 MB in  3.51 seconds =   4.56 MB/sec

And on the host: (same disk)

$ sudo hdparm -Tt /dev/sdb1

/dev/sdb1:
 Timing cached reads:   26900 MB in  2.00 seconds = 13463.35 MB/sec
 Timing buffered disk reads: 314 MB in  3.01 seconds = 104.39 MB/sec
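
Note that hdparm only measures read throughput; etcd is mostly sensitive to small synchronous writes (it fsyncs its write-ahead log), so a rough write-latency check inside the guest might look like the following. dd with oflag=dsync is only a crude approximation (fio would be more representative), and /var/lib/etcd is assumed here to be the etcd data directory:

dd if=/dev/zero of=/var/lib/etcd/ddtest bs=512 count=1000 oflag=dsync
rm -f /var/lib/etcd/ddtest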


The etcd.conf the installer generates is also attached.
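
If the shared-disk latency itself cannot be improved, the attached etcd.conf could in principle be tuned for slower I/O; the values below are illustrative only, not a verified fix for this report, and mainly help cluster stability (leader elections) rather than per-request latency:

# illustrative excerpt, /etc/etcd/etcd.conf (values in milliseconds)
ETCD_HEARTBEAT_INTERVAL="500"
ETCD_ELECTION_TIMEOUT="2500"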


NOTE: I am not sure if this is a bug or just poor VM configuration. We do have a report (via Jordan Liggitt) of almost exactly the same behavior when running etcd in 3 Docker containers on the same host.

Comment 1 Devan Goodwin 2015-12-11 15:35:35 UTC
Created attachment 1104732 [details]
etcd.conf from one of the affected cluster members

Comment 2 Timothy St. Clair 2015-12-11 15:51:03 UTC
This seems odd to me in light of https://bugzilla.redhat.com/show_bug.cgi?id=1250707

@rrati, could you do a quick verification on our cluster?

Also, we've verified this numerous times using etcdctl:

# etcdctl -C https://ose3-master1.ansible.test.example.com:2379,https://ose3-master2.ansible.test.example.com:2379,https://ose3-master3.ansible.test.example.com:2379 --ca-file /etc/etcd/ca.crt --cert-file=/etc/openshift/master/master.etcd-client.crt --key-file=/etc/openshift/master/master.etcd-client.key cluster-health 

# etcdctl -C https://ose3-master1.ansible.test.example.com:2379,https://ose3-master2.ansible.test.example.com:2379,https://ose3-master3.ansible.test.example.com:2379 --ca-file /etc/etcd/ca.crt --cert-file=/etc/openshift/master/master.etcd-client.crt --key-file=/etc/openshift/master/master.etcd-client.key member list
 
+ getting & setting keys, etc.
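
For concreteness, the key get/set checks mentioned above would look roughly like this (same endpoint and certificate paths as in the cluster-health command; the key name is arbitrary):

# etcdctl -C https://ose3-master1.ansible.test.example.com:2379 --ca-file /etc/etcd/ca.crt --cert-file=/etc/openshift/master/master.etcd-client.crt --key-file=/etc/openshift/master/master.etcd-client.key set /latency-test "hello"

# etcdctl -C https://ose3-master1.ansible.test.example.com:2379 --ca-file /etc/etcd/ca.crt --cert-file=/etc/openshift/master/master.etcd-client.crt --key-file=/etc/openshift/master/master.etcd-client.key get /latency-test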

Comment 5 Andy Goldstein 2016-06-24 13:50:37 UTC
Is this still an issue?