Description of problem:
We see many timeouts in etcd when the API service is started. etcd is healthy and stable while the API and controller services are down.
However, as soon as we start the API service on one node, the etcd member immediately becomes unhealthy and reports "context deadline exceeded".
We also see etcd changing its leader very often.
The whole host becomes unstable: no SSH commands can be executed, and ping requests are lost (not rejected).
Version-Release number of selected component (if applicable):
OpenShift Container Platform - atomic-openshift-3.11.88-1.git.0.47f4e98.el7.x86_64
Steps to Reproduce:
Will attach the logs in a private comment.