Bug 1199203

Summary: etcd does not log to the journal
Product: Red Hat Enterprise Linux 7 Reporter: Scott Dodson <sdodson>
Component: etcdAssignee: Jan Chaloupka <jchaloup>
Status: CLOSED NEXTRELEASE QA Contact: Guohua Ouyang <gouyang>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.1CC: ekin.meroglu, eparis, gouyang, sdodson
Target Milestone: rcKeywords: Extras
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-08-03 06:23:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Scott Dodson 2015-03-05 16:00:49 UTC
Description of problem:
The systemd service suppresses StandardOutput with a note that etcd logs directly to the journal, however I don't see any signs of it doing so.

Version-Release number of selected component (if applicable):
etcd-0.4.6-0.12.el7.x86_64

How reproducible:
100%

My config :

# grep -v \# /etc/etcd/etcd.conf 
addr = "openshift-minion-1:4001"
bind_addr = "0.0.0.0:4001"
data_dir = "/var/lib/etcd/default.etcd/"
peers = [ "openshift-minion-2:7001","openshift-master:7001"]
max_retry_attempts = 99
name = "openshift-minion-1"
snapshot = true

[peer]
addr = "openshift-minion-1:7001"
bind_addr = "0.0.0.0:7001"

This is what I have in the journal :
# journalctl -u etcd
-- Logs begin at Wed 2015-03-04 13:00:54 EST, end at Thu 2015-03-05 10:58:11 EST. --
Mar 04 13:36:13 openshift-minion-1 systemd[1]: Starting Etcd Server...
Mar 04 13:36:13 openshift-minion-1 systemd[1]: Started Etcd Server.
Mar 04 13:36:17 openshift-minion-1 systemd[1]: etcd.service: main process exited, code=exited, status=1/FAILURE
Mar 04 13:36:17 openshift-minion-1 systemd[1]: Unit etcd.service entered failed state.

Comment 2 Jan Chaloupka 2015-03-09 15:49:20 UTC
Hi Scott,

looks like the service did not started at all so etcd did not get a change to log anything. Can you run etcd with options set instead of using configuration file?

# etcd -addr openshift-minion-1:4001 -bind-addr 0.0.0.0:4001 ...

What does it tell you?

Comment 3 Guohua Ouyang 2015-03-11 08:57:40 UTC
Could not reproduce this problem with etcd-0.4.6-0.13.el7.x86_64.

# systemctl status etcd
etcd.service - Etcd Server
   Loaded: loaded (/usr/lib/systemd/system/etcd.service; disabled)
   Active: active (running) since Wed 2015-03-11 15:44:34 CST; 1h 9min ago
 Main PID: 22447 (etcd)
   CGroup: /system.slice/etcd.service
           └─22447 /usr/bin/etcd


# journalctl -u etcd
-- Logs begin at Fri 2015-03-06 13:03:19 CST, end at Wed 2015-03-11 16:50:01 CST. --
Mar 06 14:54:47 dhcp-9-220.nay.redhat.com systemd[1]: Starting Etcd Server...
Mar 06 14:54:47 dhcp-9-220.nay.redhat.com systemd[1]: Started Etcd Server.
Mar 06 14:54:47 dhcp-9-220.nay.redhat.com etcd[11443]: default is starting a new cluster
Mar 06 14:54:47 dhcp-9-220.nay.redhat.com etcd[11443]: etcd server [name default, listen on :4001, advertised url http://127.0.0.1:4001]
Mar 06 14:54:47 dhcp-9-220.nay.redhat.com etcd[11443]: peer server [name default, listen on :7001, advertised url http://127.0.0.1:7001]
Mar 06 14:54:47 dhcp-9-220.nay.redhat.com etcd[11443]: default starting in peer mode
Mar 06 14:54:47 dhcp-9-220.nay.redhat.com etcd[11443]: default: state changed from 'initialized' to 'follower'.
Mar 06 14:54:47 dhcp-9-220.nay.redhat.com etcd[11443]: default: state changed from 'follower' to 'leader'.
Mar 06 14:54:47 dhcp-9-220.nay.redhat.com etcd[11443]: default: leader changed from '' to 'default'.
Mar 06 14:54:50 dhcp-9-220.nay.redhat.com systemd[1]: Stopping Etcd Server...
Mar 06 14:54:50 dhcp-9-220.nay.redhat.com systemd[1]: etcd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Mar 06 14:54:50 dhcp-9-220.nay.redhat.com systemd[1]: Stopped Etcd Server.
Mar 06 14:54:50 dhcp-9-220.nay.redhat.com systemd[1]: Unit etcd.service entered failed state.
Mar 06 14:54:50 dhcp-9-220.nay.redhat.com systemd[1]: Starting Etcd Server...
Mar 06 14:54:50 dhcp-9-220.nay.redhat.com systemd[1]: Started Etcd Server.
Mar 06 14:54:50 dhcp-9-220.nay.redhat.com etcd[11694]: Peer URLs in log:  / default (http://127.0.0.1:7001)
Mar 06 14:54:50 dhcp-9-220.nay.redhat.com etcd[11694]: etcd server [name default, listen on :4001, advertised url http://127.0.0.1:4001]
Mar 06 14:54:50 dhcp-9-220.nay.redhat.com etcd[11694]: peer server [name default, listen on :7001, advertised url http://127.0.0.1:7001]
Mar 06 14:54:50 dhcp-9-220.nay.redhat.com etcd[11694]: default starting in peer mode
Mar 06 14:54:50 dhcp-9-220.nay.redhat.com etcd[11694]: default: state changed from 'initialized' to 'follower'.
Mar 06 14:54:50 dhcp-9-220.nay.redhat.com etcd[11694]: default: state changed from 'follower' to 'candidate'.
Mar 06 14:54:50 dhcp-9-220.nay.redhat.com etcd[11694]: default: state changed from 'candidate' to 'leader'.

Comment 4 Scott Dodson 2015-03-11 13:39:40 UTC
(In reply to Jan Chaloupka from comment #2)
> Hi Scott,
> 
> looks like the service did not started at all so etcd did not get a change
> to log anything. Can you run etcd with options set instead of using
> configuration file?
> 
> # etcd -addr openshift-minion-1:4001 -bind-addr 0.0.0.0:4001 ...
> 
> What does it tell you?

I've re-created the environment with the same configs adding verbose = true and very_verbose = true. etcd is running but I still get nothing in the journal.

[root@openshift-minion-1 ~]# systemctl status etcd
etcd.service - Etcd Server
   Loaded: loaded (/usr/lib/systemd/system/etcd.service; disabled)
   Active: active (running) since Wed 2015-03-11 09:34:58 EDT; 1min 12s ago
 Main PID: 2284 (etcd)
   CGroup: /system.slice/etcd.service
           └─2284 /usr/bin/etcd

Mar 11 09:34:58 openshift-minion-1 systemd[1]: etcd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Mar 11 09:34:58 openshift-minion-1 systemd[1]: Unit etcd.service entered failed state.
Mar 11 09:34:58 openshift-minion-1 systemd[1]: Starting Etcd Server...
Mar 11 09:34:58 openshift-minion-1 systemd[1]: Started Etcd Server.
[root@openshift-minion-1 ~]# journalctl -u etcd
-- Logs begin at Wed 2015-03-11 09:01:07 EDT, end at Wed 2015-03-11 09:34:58 EDT. --
Mar 11 09:08:06 openshift-minion-1 systemd[1]: Starting Etcd Server...
Mar 11 09:08:06 openshift-minion-1 systemd[1]: Started Etcd Server.
Mar 11 09:13:00 openshift-minion-1 systemd[1]: etcd.service: main process exited, code=exited, status=1/FAILURE
Mar 11 09:13:00 openshift-minion-1 systemd[1]: Unit etcd.service entered failed state.
Mar 11 09:18:35 openshift-minion-1 systemd[1]: Starting Etcd Server...
Mar 11 09:18:35 openshift-minion-1 systemd[1]: Started Etcd Server.
Mar 11 09:18:49 openshift-minion-1 systemd[1]: Started Etcd Server.
Mar 11 09:25:07 openshift-minion-1 systemd[1]: etcd.service: main process exited, code=exited, status=1/FAILURE
Mar 11 09:25:07 openshift-minion-1 systemd[1]: Unit etcd.service entered failed state.
Mar 11 09:28:17 openshift-minion-1 systemd[1]: Starting Etcd Server...
Mar 11 09:28:17 openshift-minion-1 systemd[1]: Started Etcd Server.
Mar 11 09:34:35 openshift-minion-1 systemd[1]: Stopping Etcd Server...
Mar 11 09:34:35 openshift-minion-1 systemd[1]: etcd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Mar 11 09:34:35 openshift-minion-1 systemd[1]: Unit etcd.service entered failed state.
Mar 11 09:34:35 openshift-minion-1 systemd[1]: Starting Etcd Server...
Mar 11 09:34:35 openshift-minion-1 systemd[1]: Started Etcd Server.
Mar 11 09:34:58 openshift-minion-1 systemd[1]: Stopping Etcd Server...
Mar 11 09:34:58 openshift-minion-1 systemd[1]: etcd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Mar 11 09:34:58 openshift-minion-1 systemd[1]: Unit etcd.service entered failed state.
Mar 11 09:34:58 openshift-minion-1 systemd[1]: Starting Etcd Server...
Mar 11 09:34:58 openshift-minion-1 systemd[1]: Started Etcd Server.

I've updated to the latest build in brew :
[root@openshift-minion-1 ~]# rpm -q etcd
etcd-0.4.6-0.13.el7.x86_64

When I run it from the command line (removed verbose logging) :
[root@openshift-minion-1 ~]# etcd
[etcd] Mar 11 09:37:06.515 INFO      | openshift-minion-1: peer added: 'openshift-master'
[etcd] Mar 11 09:37:06.566 INFO      | openshift-minion-1: peer added: 'openshift-minion-2'
[etcd] Mar 11 09:37:06.595 INFO      | Peer URLs in log:  / openshift-minion-1 (http://openshift-minion-2:7001,http://openshift-master:7001,http://openshift-minion-1:7001)
[etcd] Mar 11 09:37:06.597 INFO      | Send Join Request to http://openshift-minion-2:7001/join
[etcd] Mar 11 09:37:06.665 INFO      | openshift-minion-1 joined the cluster via peer openshift-minion-2:7001
[etcd] Mar 11 09:37:06.666 INFO      | etcd server [name openshift-minion-1, listen on 0.0.0.0:4001, advertised url http://openshift-minion-1:4001]
[etcd] Mar 11 09:37:06.666 INFO      | peer server [name openshift-minion-1, listen on 0.0.0.0:7001, advertised url http://openshift-minion-1:7001]
[etcd] Mar 11 09:37:06.666 INFO      | openshift-minion-1 starting in peer mode
[etcd] Mar 11 09:37:06.666 INFO      | openshift-minion-1: state changed from 'initialized' to 'follower'.

Comment 5 Guohua Ouyang 2015-03-23 04:32:49 UTC
I saw this problem when the etcd service is running but in an unexpected state, no logs recorded.

Could you check that whether etcdctl or curl works with the etcd server?(set/rm some keys)

Comment 6 Scott Dodson 2015-03-23 15:06:04 UTC
[root@ose3-master ~]# curl -L http://`hostname`:4001/v2/keys/foo -XPUT -d value=bar
{"action":"set","node":{"key":"/foo","value":"bar","modifiedIndex":9,"createdIndex":9},"prevNode":{"key":"/foo","value":"bar","modifiedIndex":8,"createdIndex":8}}

[root@ose3-master ~]# curl -L http://`hostname`:4001/v2/keys/foo
{"action":"get","node":{"key":"/foo","value":"bar","modifiedIndex":9,"createdIndex":9}}

Additionally, I can run OpenShift 3 against this environment so I'm assuming it's working properly.

Comment 7 Scott Dodson 2015-03-23 16:03:56 UTC
To be clear, I can reproduce this as simply as :

1) Provision OS1 internal RHEL7 host using rhel-guest-image-7.0-20140618.1
2) rpm -ivh http://download.devel.redhat.com/brewroot/packages/etcd/0.4.6/0.13.el7/x86_64/etcd-0.4.6-0.13.el7.x86_64.rpm
3) systemctl start etcd

At which point the only output to the journal is 

Mar 23 11:57:52 etcd.os1.phx2.redhat.com systemd[1]: Started Etcd Server.

Adding verbose=true and very_verbose=true to /etc/etcd/etcd.conf has no effect on the journal, however if I run etcd from the console I get copious verbose logging.

Comment 8 Guohua Ouyang 2015-03-23 23:42:45 UTC
I could reproduce this problem on RHEL, atomic and guest-image, my previous comment #3 is outdated logs.