Bug 1330372

Summary: [RFE] Reference for log files
Product: Red Hat Ceph Storage Reporter: Shinobu KINJO <skinjo>
Component: Documentation Assignee: ceph-docs <ceph-docs>
Status: CLOSED NOTABUG QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 1.3.2 CC: asriram, kdreyer, khartsoe, nlevine, sweil
Target Milestone: rc Keywords: FutureFeature
Target Release: 3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-01-09 23:16:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Shinobu KINJO 2016-04-26 04:59:02 UTC
[REQUEST]

Create a reference for the messages in the MON, OSD, MDS, RGW, 
and client logs.


[SUMMARY]

Since there is no reference for the log files, it is quite 
difficult to find out or understand what is going on inside 
the Ceph cluster.

If customers could understand what each message in the log 
file means, it would help Red Hat a great deal by reducing 
our workload. They could resolve simple issues themselves, 
so the number of cases would go down.


[MOTIVATION]
 1. Customers could find solutions to simple problems by 
    themselves.

 2. It enables them to do the first stage of analysis.
   - This also reduces our workload.

 3. It prevents them from causing further problems.
   - Understanding what is going on would stop them from 
     performing further operations without knowing the effect.

 4. They would love RHCS.


[EXAMPLE]

Consider a log line like this:
 * This example is from an OSD log.
 * The line has been modified slightly.

osd.1.log:2016-04-25 22:32:45.205289 7fdb23e8d700  0 \
(1)osd.1 \
(2)13 send_incremental_map \
(3)12 -> \
(4)13 to \
(5)0x7fdb41b87600 \
(6)172.16.0.4:6801/17629

As you can imagine, it is quite difficult for customers to 
understand what this line, and each word in it, means unless 
they go to git, clone the Ceph source, and read the code.

(1) osd.1                 // OSD Id
(2) 13                    // A new epoch 
(3) 12                    // An old epoch
(4) 13                    // A new epoch
(5) 0x7fdb41b87600        // Address of connection
(6) 172.16.0.4:6801/17629 // Peer Address
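
As a rough illustration of what such a reference would make 
possible, here is a minimal Python sketch that splits this one 
message shape into the fields annotated above. It is only a 
sketch under assumptions: the line is the example with the (1)-(6) 
markers and line continuations removed, the field names are 
hypothetical labels taken from the annotations, and the "thread" 
and "level" names for the two values after the timestamp are my 
assumption, since they are not annotated above. It is not part 
of Ceph.

```python
import re

# Matches only the send_incremental_map message shape from the example.
# All group names are hypothetical labels, not names used by Ceph itself.
SEND_INCREMENTAL_MAP = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>[0-9a-f]+)\s+"          # assumed: thread id (not annotated)
    r"(?P<level>-?\d+)\s+"               # assumed: log level (not annotated)
    r"(?P<osd_id>osd\.\d+)\s+"           # (1) OSD Id
    r"(?P<epoch>\d+)\s+"                 # (2) a new epoch
    r"send_incremental_map\s+"
    r"(?P<old_epoch>\d+)\s+->\s+"        # (3) an old epoch
    r"(?P<new_epoch>\d+)\s+to\s+"        # (4) a new epoch
    r"(?P<connection>0x[0-9a-f]+)\s+"    # (5) address of connection
    r"(?P<peer_address>\S+)"             # (6) peer address
)


def explain_send_incremental_map(line):
    """Return the annotated fields as a dict, or None if the line does not match."""
    match = SEND_INCREMENTAL_MAP.search(line)
    return match.groupdict() if match else None


# The example line, rejoined without the (1)-(6) markers.
line = (
    "osd.1.log:2016-04-25 22:32:45.205289 7fdb23e8d700  0 osd.1 13 "
    "send_incremental_map 12 -> 13 to 0x7fdb41b87600 172.16.0.4:6801/17629"
)

fields = explain_send_incremental_map(line)
for name, value in fields.items():
    print(f"{name:>13}: {value}")
```

Every other message would need its own pattern, which is exactly 
why a documented reference would be more useful than ad hoc 
scripts like this one.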

If sysadmins could understand, if not everything, at least a 
bit more of the actual meaning of each word, it would be a 
great help to them.

And we would be able to work with them more interactively.

Any thoughts about this request would be appreciated.