Bug 1507590

Summary: Etcd daemon on Openshift masters fails sending heartbeats
Product: OpenShift Container Platform
Reporter: Javier Ramirez <javier.ramirez>
Component: Master
Assignee: Stefan Schimanski <sttts>
Status: CLOSED NOTABUG
QA Contact: Wang Haoran <haowang>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.5.1
CC: aos-bugs, asolanas, dmoessne, erich, hgomes, jeder, jokerman, mfojtik, mmariyan, mmccomas, sttts
Target Milestone: ---
Target Release: 3.5.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-03-07 11:19:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Javier Ramirez 2017-10-30 16:12:00 UTC
Description of problem:

Customer is seeing a lot of these messages:

Apr  4 04:51:05 Y33864 etcd: failed to send out heartbeat on time (exceeded the 500ms timeout for 509.305952ms)
Apr  4 04:51:05 Y33864 etcd: server is likely overloaded
Apr  4 04:51:05 Y33864 etcd: failed to send out heartbeat on time (exceeded the 500ms timeout for 509.330964ms)
Apr  4 04:51:05 Y33864 etcd: server is likely overloaded

Version-Release number of selected component (if applicable):
atomic-openshift-3.5.5.31.24-1.git.0.ff74e0b.el7.x86_64
etcd-3.2.5-1.el7.x86_64

How reproducible:
Frequently

Actual results:
No apparent issues other than the message in question


Expected results:
No "failed to send out heartbeat" message

Additional info:
We checked the metrics data and sysstat data and found nothing abnormal, so we would like advice on what to check next.
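
As a possible first step before deeper investigation, the warnings themselves can be quantified. The following is a minimal Python sketch, not part of the original report, that extracts the two durations printed in each "failed to send out heartbeat on time" line and summarizes them. The function names (parse_heartbeat_warnings, summarize) are hypothetical, and the pattern assumes the durations appear as a single number followed by one of ns, us, µs, ms, or s, as in the lines quoted above; compound durations would need extra handling.

```python
import re
from statistics import mean

# Conversion factors from the unit suffix in the log line to milliseconds.
UNIT_TO_MS = {"ns": 1e-6, "us": 1e-3, "µs": 1e-3, "ms": 1.0, "s": 1000.0}

# Matches the warning text quoted in this report. "ms" is listed before "s"
# so the longer suffix is tried first.
HEARTBEAT_RE = re.compile(
    r"failed to send out heartbeat on time "
    r"\(exceeded the (?P<timeout>[\d.]+)(?P<tunit>ns|us|µs|ms|s) timeout "
    r"for (?P<exceed>[\d.]+)(?P<eunit>ns|us|µs|ms|s)\)"
)


def parse_heartbeat_warnings(lines):
    """Return a list of (timeout_ms, exceeded_ms) tuples, one per matching line."""
    samples = []
    for line in lines:
        m = HEARTBEAT_RE.search(line)
        if m:
            timeout_ms = float(m["timeout"]) * UNIT_TO_MS[m["tunit"]]
            exceeded_ms = float(m["exceed"]) * UNIT_TO_MS[m["eunit"]]
            samples.append((timeout_ms, exceeded_ms))
    return samples


def summarize(samples):
    """Summarize the 'exceeded' durations; values are None when nothing matched."""
    exceeded = [e for _, e in samples]
    if not exceeded:
        return {"count": 0, "max_ms": None, "mean_ms": None}
    return {"count": len(exceeded), "max_ms": max(exceeded), "mean_ms": mean(exceeded)}


if __name__ == "__main__":
    # Sample lines copied from the description above.
    sample_log = [
        "Apr  4 04:51:05 Y33864 etcd: failed to send out heartbeat on time (exceeded the 500ms timeout for 509.305952ms)",
        "Apr  4 04:51:05 Y33864 etcd: server is likely overloaded",
        "Apr  4 04:51:05 Y33864 etcd: failed to send out heartbeat on time (exceeded the 500ms timeout for 509.330964ms)",
    ]
    print(summarize(parse_heartbeat_warnings(sample_log)))
```

Running the same functions over a full day of syslog, grouped by timestamp, might show whether the warnings cluster around particular times, which could then be compared against the sysstat data already collected.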

Comment 16 Red Hat Bugzilla 2023-09-18 00:12:51 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days