1455509 – Document correct way to retrieve logs from ceph daemons running in container

Bug 1455509 - Document correct way to retrieve logs from ceph daemons running in container

Summary: Document correct way to retrieve logs from ceph daemons running in container

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	Documentation
Sub Component:
Version:	2.3
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	rc
Target Release:	2.3
Assignee:	Bara Ancincova
QA Contact:	Rachana Patel
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1437905

Reported:	2017-05-25 11:47 UTC by Boris Ranto
Modified:	2017-07-11 17:08 UTC
CC List:	12 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-07-11 17:08:33 UTC
Embargoed:
Dependent Products:


Description Boris Ranto 2017-05-25 11:47:58 UTC

Description of problem:
Currently, we do not export /var/log/ceph to the containers, and we remove the containers once they are stopped, so we have no way of getting the logs after, e.g., an OSD daemon crashes. The logs are simply lost, and we have no idea why the daemon in the container stopped.

Version-Release number of selected component (if applicable):
ceph-ansible-2.2.6-1.el7scon.noarch

How reproducible:
Always

Steps to Reproduce:
1. Look for the logs in /var/log/ceph on the host

Actual results:
No logs.

Expected results:
The logs are exported from the host.

Additional info:

Comment 2 seb 2017-05-29 13:39:45 UTC

This makes sense and would require changes in both ceph-ansible and ceph-docker, because even if we bind-mount /var/log/ceph, the daemons won't log anything there: they are configured to log to stderr.

Comment 3 Christina Meno 2017-05-30 17:29:26 UTC

I think this could be seen to block GA of containers on the basis that "Basic functionality of a new or legacy feature not working."

Logs are an important element in supporting any component in production.
If we can document an acceptable workaround I would be happy.

I'm worried that the best we can do without change is:
set 'DEBUG=stayalive'
re-invoke docker run  # hoping that the condition repeats
then collect the logs with "sudo docker exec -i -t $HOSTNAME /bin/bash"
run journalctl

This seems like it will add to the burden of supporting the product. What do you think?

Comment 4 seb 2017-05-31 09:14:27 UTC

I'm tempted to say yes and I generally agree, users should keep a consistent experience. However, if we really do this, we can hardly implement any log rotation, which at some point will cause issues.

That's why I think we should rely on Docker's logging capabilities instead. In Docker the logging driver is journald, so journald is responsible for collecting and storing the logs. So, for a monitor for example, just use "journalctl -u ceph-mon" to get the full history of all the logs.

In the end, I think this is the best approach. Although, this will require some documentation on how to access a log history from a particular container.

What do you think?
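
As a rough illustration of the approach proposed above, the sketch below builds the journalctl command for a daemon's systemd unit. The helper name ceph_log_cmd is hypothetical, and the unit name ceph-mon is taken from the comment above; the real unit name on a host depends on how the daemon was deployed. The helper only prints the command, so nothing is read from the journal until you run the printed line yourself.

```shell
#!/bin/sh
# Sketch only. ceph_log_cmd is a hypothetical helper, not part of ceph-ansible.
# It prints a journalctl command that shows the full log history of one unit.
#   $1 = systemd unit name (assumed here: ceph-mon, as in the comment above)
ceph_log_cmd() {
    # --no-pager writes straight to stdout so the output can be redirected
    printf 'journalctl --no-pager -u %s\n' "$1"
}

# Print the command for a monitor; copy it into a shell on the container host.
ceph_log_cmd ceph-mon
```

Printing rather than executing keeps the sketch safe to try on a machine that has no journal for that unit.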

Comment 5 John Poelstra 2017-05-31 15:16:48 UTC

discussed at program meeting... need doc approval from Gregory and should have by the end of today which will resolve this bug

Comment 6 Christina Meno 2017-06-01 16:25:58 UTC

Ok

Comment 8 seb 2017-06-02 09:39:29 UTC

Just added a couple of comments.

Comment 20 seb 2017-06-12 13:39:49 UTC

The only difference between the logs from "journalctl" and the daemon logs inside the container is that journalctl shows the output of the container entrypoint AND the daemon logs.

Normally you should see the same log once the daemon starts (inside the container and from journalctl).

I tend to disagree with the statement "we lose the logs from journalctl if the container dies/restarts"; this is not what I observed. Even if a new container is created after each restart, the unit file remains the same, thus the log entries from journalctl will remain.

Running journalctl -u ceph-osd gives you ALL the container processes/logs that once existed. If you look at the dates, you are comparing the creation of the first container (journalctl source), May 19 16:58:49, with the last logs of your last container, 2017-06-07. This comparison is irrelevant.

Run journalctl -f -u ceph-osd and you will see the latest logs.
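
To make the distinction between the full history and the latest entries concrete, here is a minimal sketch. The helper osd_log_cmd and its mode names are hypothetical; the unit name ceph-osd is the one used in this comment. The flags are written separately (-f -u UNIT) so that -u receives the unit name as its argument. The helper only prints the commands; the follow variant would block until interrupted if it were run directly.

```shell
#!/bin/sh
# Sketch only. osd_log_cmd is a hypothetical helper.
#   $1 = systemd unit name
#   $2 = mode: "follow" (live tail), "today" (entries since midnight),
#        anything else (full history across all past containers of the unit)
osd_log_cmd() {
    case "$2" in
        follow) printf 'journalctl -f -u %s\n' "$1" ;;
        today)  printf 'journalctl --no-pager --since today -u %s\n' "$1" ;;
        *)      printf 'journalctl --no-pager -u %s\n' "$1" ;;
    esac
}

osd_log_cmd ceph-osd all
osd_log_cmd ceph-osd follow
osd_log_cmd ceph-osd today
```

The "today" mode is one way to avoid comparing entries from the first container with entries from the most recent one.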

Comment 22 Rachana Patel 2017-06-12 16:06:32 UTC

Based on Comment #21,  verified doc text.

Looks good to me.
One suggestion for all commands in the section 'Viewing Log Files of Containerized Ceph Daemons':

we can drop the '.service' written at the end,
e.g.
journalctl -u ceph-[daemon]@[ID].service

can be

journalctl -u ceph-[daemon]@[ID]


Both commands are valid, so if you don't want to change it and would rather move this to verified, no issues.
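
The equivalence of the two forms can be sketched as follows, on the assumption that a unit name given without a type suffix is treated as a ".service" unit. The helper with_service_suffix is hypothetical and only mimics that default, and the instance ID 0 is a made-up value for illustration.

```shell
#!/bin/sh
# Sketch only. with_service_suffix is a hypothetical helper that mimics the
# assumed default of appending ".service" when no such suffix is present.
with_service_suffix() {
    case "$1" in
        *.service) printf '%s\n' "$1" ;;
        *)         printf '%s.service\n' "$1" ;;
    esac
}

# Both spellings resolve to the same unit name ("0" is a made-up instance ID).
with_service_suffix 'ceph-osd@0'
with_service_suffix 'ceph-osd@0.service'
```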

Comment 24 Rachana Patel 2017-07-03 13:20:32 UTC

lgtm
