Bug 1463304

Summary: tendrl-node-monitoring - 'Error connecting to central store (etcd), trying again...'
Product: [Red Hat Storage] Red Hat Storage Console Reporter: Lubos Trilety <ltrilety>
Component: node-monitoring Assignee: anmol babu <anbabu>
Status: CLOSED WONTFIX QA Contact: sds-qe-bugs
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 3 CC: mkarnik, nthomas, sankarshan
Target Milestone: alpha   
Target Release: 3-alpha   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2018-11-19 05:40:29 UTC Type: Bug

Description Lubos Trilety 2017-06-20 14:28:32 UTC
Description of problem:
If tendrl-node-monitoring fails to connect to etcd, it enters a loop in which it repeatedly logs that it cannot connect, with almost no delay between attempts and without newlines between the messages.
Example of log:
# journalctl -u tendrl-node-monitoring
-- Logs begin at Út 2017-06-20 04:55:27 EDT, end at Út 2017-06-20 10:10:57 EDT. --
čen 20 04:55:32 localhost.localdomain systemd[1]: Started Daemon to manage tendrl node monitoring.
čen 20 04:55:32 localhost.localdomain systemd[1]: Starting Daemon to manage tendrl node monitoring...
čen 20 04:57:20 <hostname> tendrl-node-monitoring[527]: Creating namespace.node_monitoring from source tendrl.node_monitoringnamespace.node_monitoring created!Finding objects in namespace.node_monitorin
čen 20 04:57:20 <hostname> tendrl-node-monitoring[527]: connecting to central store (etcd), trying again...Error connecting to central store (etcd), trying again...Error connecting to central store (etc
čen 20 04:59:42 <hostname> tendrl-node-monitoring[527]: ror connecting to central store (etcd), trying again...Error connecting to central store (etcd), trying again...Error connecting to central store 
čen 20 04:59:42 <hostname> tendrl-node-monitoring[527]: ..Error connecting to central store
...

Version-Release number of selected component (if applicable):
tendrl-node-agent-3.0-alpha.10.el7scon.noarch
tendrl-commons-3.0-alpha.10.el7scon.noarch
tendrl-node-monitoring-3.0-alpha.6.el7scon.noarch

How reproducible:
100%

Steps to Reproduce:
1. Stop etcd on the Tendrl server
2. Start tendrl-node-monitoring on any machine
3. Wait until tendrl-node-monitoring logs 'Error connecting to central store (etcd), trying again...'
4. Wait a while, then start etcd on the Tendrl server

Actual results:
tendrl-node-monitoring repeatedly logs 'Error connecting to central store (etcd), trying again...' with almost no delay between messages and without any newline.
If we wait a long time before starting etcd on the server, tendrl-node-monitoring keeps logging the failure even after etcd is running again, probably because too many of those messages have been buffered. As a result, no new messages from tendrl-node-monitoring are logged for a long time.

Expected results:
tendrl-node-monitoring should periodically check whether the connection is available. However, there should be some sleep between attempts, and logging only the first failure would probably be enough. The message should end with a newline character.
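
The expected behaviour above could look roughly like the following Python sketch. It is only an illustration, not the actual tendrl-node-monitoring code: the function name wait_for_store, the connect callable, the delay values and the choice of ConnectionError as the retriable error are all assumptions made for the example.

```python
import logging
import time

logger = logging.getLogger("node_monitoring_sketch")  # hypothetical logger name


def wait_for_store(connect, initial_delay=1.0, max_delay=30.0,
                   retriable=(ConnectionError,), sleep=time.sleep):
    """Call connect() until it succeeds, sleeping between attempts.

    connect is a hypothetical zero-argument callable that returns a client
    object or raises one of the exceptions listed in `retriable`.
    """
    delay = initial_delay
    failures = 0
    while True:
        try:
            client = connect()
        except retriable as exc:
            failures += 1
            if failures == 1:
                # Log only the first failure; each logging record is emitted
                # as its own line, so the message ends with a newline.
                logger.error(
                    "Error connecting to central store (etcd), "
                    "trying again... (%s)", exc)
            sleep(delay)
            # Back off exponentially, capped at max_delay seconds.
            delay = min(delay * 2, max_delay)
        else:
            if failures:
                logger.info(
                    "Connected to central store (etcd) after %d failed "
                    "attempt(s)", failures)
            return client
```

With this shape, a long etcd outage produces one error line and one recovery line instead of an unbounded stream of buffered messages, and the growing sleep keeps the retry loop from spinning.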

Additional info:
Restarting tendrl-node-monitoring fixes the issue once etcd is running.

Comment 4 Shubhendu Tripathi 2018-11-19 05:40:29 UTC
This product is EOL now