Description of problem:
The customer is seeing many of these messages:

Apr 4 04:51:05 Y33864 etcd: failed to send out heartbeat on time (exceeded the 500ms timeout for 509.305952ms)
Apr 4 04:51:05 Y33864 etcd: server is likely overloaded
Apr 4 04:51:05 Y33864 etcd: failed to send out heartbeat on time (exceeded the 500ms timeout for 509.330964ms)
Apr 4 04:51:05 Y33864 etcd: server is likely overloaded

Version-Release number of selected component (if applicable):
atomic-openshift-3.5.5.31.24-1.git.0.ff74e0b.el7.x86_64
etcd-3.2.5-1.el7.x86_64

How reproducible:
Frequently

Actual results:
No apparent issues other than the message in question.

Expected results:
No "failed to send out heartbeat" messages.

Additional info:
We checked the metrics data and sysstat data and found nothing, so we would like advice on what to check next.
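
One possible next step is to quantify how often the warning fires and how large the reported values are, so the bursts can be lined up against the sysstat samples. The following is only a minimal sketch under stated assumptions: the regular expression is derived solely from the sample lines quoted above, it handles only values printed in ms, and the function name summarize_heartbeat_warnings is hypothetical, not part of etcd or OpenShift.

```python
import re
from collections import Counter

# Pattern inferred from the sample log lines in this report (an assumption).
# It only matches durations printed with an "ms" suffix.
HEARTBEAT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+etcd: "
    r"failed to send out heartbeat on time "
    r"\(exceeded the (?P<timeout>[\d.]+)ms timeout for (?P<exceeded>[\d.]+)ms\)"
)


def summarize_heartbeat_warnings(lines):
    """Count heartbeat warnings per minute and collect the reported values."""
    per_minute = Counter()
    exceeded_ms = []
    for line in lines:
        match = HEARTBEAT_RE.match(line.strip())
        if match is None:
            continue  # skip unrelated lines such as "server is likely overloaded"
        minute = match.group("stamp")[:-3]  # drop the ":SS" seconds field
        per_minute[minute] += 1
        exceeded_ms.append(float(match.group("exceeded")))
    return {
        "count": len(exceeded_ms),
        "max_exceeded_ms": max(exceeded_ms) if exceeded_ms else None,
        "per_minute": dict(per_minute),
    }


if __name__ == "__main__":
    sample = [
        "Apr 4 04:51:05 Y33864 etcd: failed to send out heartbeat on time "
        "(exceeded the 500ms timeout for 509.305952ms)",
        "Apr 4 04:51:05 Y33864 etcd: server is likely overloaded",
        "Apr 4 04:51:05 Y33864 etcd: failed to send out heartbeat on time "
        "(exceeded the 500ms timeout for 509.330964ms)",
        "Apr 4 04:51:05 Y33864 etcd: server is likely overloaded",
    ]
    print(summarize_heartbeat_warnings(sample))
```

Feeding it the full system log instead of the four sample lines would show whether the warnings cluster in particular minutes, which could then be compared with disk and CPU activity recorded for the same period.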
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days