Bug 1319892 - The API /api/v2/cluster/{cluster-fsid}/server should list all participating servers in the cluster, whether mon or OSD nodes
Summary: The api /api/v2/cluster/{cluster-fsid}/server should list all participating s...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Calamari
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 2.0
Assignee: Christina Meno
QA Contact: Harish NV Rao
URL:
Whiteboard:
Depends On:
Blocks: 1291304
Reported: 2016-03-21 18:28 UTC by Shubhendu Tripathi
Modified: 2016-08-23 19:35 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-23 19:35:03 UTC
Target Upstream Version:


Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2016:1755 | normal | SHIPPED_LIVE | Red Hat Ceph Storage 2.0 bug fix and enhancement update | 2016-08-23 23:23:52 UTC

Description Shubhendu Tripathi 2016-03-21 18:28:30 UTC
Description of problem:
The API /api/v2/cluster/{cluster-fsid}/server should list all servers participating in the cluster, whether they are mon or OSD nodes.

Version-Release number of selected component (if applicable):


How reproducible:
Always


Steps to Reproduce:
1. Create a Ceph cluster with 3 mon nodes and one OSD node (3 disks used as OSDs).
2. Access the API /api/v2/cluster/{cluster-fsid}/server.

Actual results:
The result does not list all the servers participating in the cluster

Expected results:
The api should list all the servers participating in the cluster
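
The comparison between expected and actual results can be sketched in Python. This is only a minimal sketch and not part of Calamari: the function names and the sample host names are hypothetical, and it assumes the endpoint returns a JSON list of objects with a "hostname" key. Hosts are compared by short name so that FQDNs and short names match each other.

```python
def short_name(host):
    """Return the first DNS label, so FQDNs and short names compare equal."""
    return host.split(".")[0]


def missing_servers(listing, expected_hosts):
    """Return the expected hosts that do not appear in the server listing.

    `listing` is the parsed JSON body of /api/v2/cluster/{fsid}/server,
    assumed to be a list of dicts with a "hostname" key.
    """
    listed = {short_name(entry["hostname"]) for entry in listing}
    return sorted(h for h in expected_hosts if short_name(h) not in listed)


# Hypothetical sample: one mon and one OSD host are listed, one mon is missing.
sample = [
    {"hostname": "mon1.example.com"},
    {"hostname": "osd1"},
]
expected = ["mon1", "mon2.example.com", "osd1.example.com"]
print(missing_servers(sample, expected))
```

An empty result would mean that every expected server is present in the listing.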

Additional info:

Comment 2 Shubhendu Tripathi 2016-04-01 20:27:50 UTC
An observation: only the mon nodes in the listing have ceph_version set; for OSD nodes it is null.

Similarly, the FQDN for a mon node is shown in a format such as "dhcp47-98.lab.eng.blr.redhat.com", whereas for OSD nodes it is shown as a short name such as "dhcp47-95".

This information should be uniform across the nodes.
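
A uniformity check along these lines could look like the following Python sketch. The function name is hypothetical, the version string in the sample is a placeholder, and the code assumes each entry carries "hostname", "fqdn" and "ceph_version" keys.

```python
def non_uniform_fields(listing):
    """Map each kind of inconsistency to the hostnames that show it.

    Assumes every entry has "hostname", "fqdn" and "ceph_version" keys.
    """
    problems = {"short_fqdn": [], "null_ceph_version": []}
    for entry in listing:
        # A fully qualified name contains at least one dot.
        if "." not in (entry.get("fqdn") or ""):
            problems["short_fqdn"].append(entry["hostname"])
        if entry.get("ceph_version") is None:
            problems["null_ceph_version"].append(entry["hostname"])
    return problems


# Host names are the ones quoted above; the version string is a placeholder.
nodes = [
    {"hostname": "dhcp47-98.lab.eng.blr.redhat.com",
     "fqdn": "dhcp47-98.lab.eng.blr.redhat.com",
     "ceph_version": "example-version"},
    {"hostname": "dhcp47-95", "fqdn": "dhcp47-95", "ceph_version": None},
]
print(non_uniform_fields(nodes))
```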

Comment 3 Christina Meno 2016-04-11 22:54:23 UTC
I can't reproduce this.

Would you please indicate which versions are exhibiting this behavior?

Also, please provide the output of
sudo ceph -s
and
sudo ceph osd tree

Comment 4 Shubhendu Tripathi 2016-04-12 04:06:43 UTC
This was certainly not with the latest Calamari builds.
We are not yet able to create a cluster successfully with the new builds; once that works, I will try to reproduce the issue and update the details if I hit it again.

Comment 5 Shubhendu Tripathi 2016-04-12 08:24:00 UTC
I created a new cluster with the latest ceph-2.0 and Calamari builds from:

http://puddle.ceph.redhat.com/puddles/ceph/2/2016-04-08.1/CEPH-2.repo
http://puddle.ceph.redhat.com/puddles/rhscon/2/2016-04-10.1/RHSCON-2.repo

I configured the first mon using the command:


curl -d "{\"calamari\": true, \"host\": \"dhcp46-204.lab.eng.blr.redhat.com\", \"fsid\": \"deedcb4c-a67a-4997-93a6-92149ad2622a\", \"interface\": \"eth0\", \"monitor_secret\": \"AQA7P8dWAAAAABAAH/tbiZQn/40Z8pr959UmEA==\", \"cluster_network\": \"10.70.44.0/22\", \"public_network\": \"10.70.44.0/22\", \"redhat_storage\": false}" http://dhcp47-73.lab.eng.blr.redhat.com:8181/api/mon/configure/


and then configured two more mons for the same cluster using the command:


curl -d "{\"calamari\": true, \"host\": \"dhcp46-204.lab.eng.blr.redhat.com\", \"fsid\": \"deedcb4c-a67a-4997-93a6-92149ad2622a\", \"interface\": \"eth0\", \"monitor_secret\": \"AQA7P8dWAAAAABAAH/tbiZQn/40Z8pr959UmEA==\", \"cluster_network\": \"10.70.44.0/22\", \"public_network\": \"10.70.44.0/22\", \"redhat_storage\": false, \"monitors\": [{\"host\": \"dhcp47-55.lab.eng.blr.redhat.com\", \"interface\": \"eth0\"}, {\"host\": \"dhcp46-181.lab.eng.blr.redhat.com\", \"interface\": \"eth0\"}]}" http://dhcp47-73.lab.eng.blr.redhat.com:8181/api/mon/configure/
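
The hand-escaped JSON in these commands is easy to get wrong, so here is a minimal Python sketch that builds the same request body. The field names mirror the curl commands above; the function name, its parameters and its defaults are hypothetical, and the monitor secret is replaced by a placeholder.

```python
import json


def mon_configure_payload(host, fsid, monitor_secret, network,
                          interface="eth0", monitors=None):
    """Build the JSON body for the mon configure request.

    Field names mirror the curl commands; this helper itself is hypothetical.
    The same network is used for cluster_network and public_network,
    as in the commands above.
    """
    payload = {
        "calamari": True,
        "host": host,
        "fsid": fsid,
        "interface": interface,
        "monitor_secret": monitor_secret,
        "cluster_network": network,
        "public_network": network,
        "redhat_storage": False,
    }
    if monitors:
        # Additional mons are listed as host/interface pairs.
        payload["monitors"] = [
            {"host": m, "interface": interface} for m in monitors
        ]
    return json.dumps(payload)


body = mon_configure_payload(
    host="dhcp46-204.lab.eng.blr.redhat.com",
    fsid="deedcb4c-a67a-4997-93a6-92149ad2622a",
    monitor_secret="<monitor secret>",  # placeholder, not a real key
    network="10.70.44.0/22",
    monitors=["dhcp47-55.lab.eng.blr.redhat.com",
              "dhcp46-181.lab.eng.blr.redhat.com"],
)
print(body)
```

The resulting string can be written to a file and passed to curl with -d @file, which avoids the shell escaping entirely.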


But now, if I access http://<mon node ip>:8002/api/v2/cluster/<fsid>/server, it lists only the mon node referred to in the URL.
My understanding is that if there are 3 mon nodes in the cluster, all three should be listed regardless of which mon I use to access the URL /api/v2/cluster/{fsid}/server.


Once OSD configuration works, I will try configuring some OSDs in the same setup and update this bug.

If you want to look at the setup, here are the details:

10.70.47.73 (root/redhat) - skyring server
10.70.46.204, 10.70.47.55, 10.70.46.181 (root/redhat) - mon nodes

Comment 7 Christina Meno 2016-04-21 20:08:37 UTC
Do you still have doubts about this?

Comment 8 Shubhendu Tripathi 2016-04-21 20:34:36 UTC
Even with the latest builds, if a cluster has two mon nodes and two OSD nodes, accessing /api/v2/cluster/{fsid}/server through Calamari on one mon lists only three nodes: the mon used to invoke the Calamari API and the two OSD nodes.

My understanding is that, irrespective of which mon/Calamari instance I invoke the API from, it should list all four nodes.

One such working setup is available now at 

dhcp47-100.lab.eng.blr.redhat.com (MON+Calamari)
dhcp47-138.lab.eng.blr.redhat.com (MON)
dhcp46-161.lab.eng.blr.redhat.com (OSD)
dhcp47-38.lab.eng.blr.redhat.com (OSD)

credentials root/redhat

The output of /api/v2/cluster/{fsid}/server from the Calamari node (dhcp47-100.lab.eng.blr.redhat.com) is shown below:

[
    {
        "fqdn": "dhcp46-161", 
        "hostname": "dhcp46-161", 
        "services": [
            {
                "fsid": "deedcb4c-a67a-4997-93a6-92149ad2622a", 
                "type": "osd", 
                "id": "0", 
                "running": true
            }
        ], 
        "frontend_addr": "10.70.46.161", 
        "backend_addr": "10.70.46.161", 
        "frontend_iface": null, 
        "backend_iface": null, 
        "managed": false, 
        "last_contact": null, 
        "boot_time": null, 
        "ceph_version": null
    }, 
    {
        "fqdn": "dhcp47-100.lab.eng.blr.redhat.com", 
        "hostname": "dhcp47-100.lab.eng.blr.redhat.com", 
        "services": [
            {
                "fsid": "deedcb4c-a67a-4997-93a6-92149ad2622a", 
                "type": "mon", 
                "id": "dhcp47-100", 
                "running": true
            }
        ], 
        "frontend_addr": "10.70.47.100", 
        "backend_addr": null, 
        "frontend_iface": null, 
        "backend_iface": null, 
        "managed": true, 
        "last_contact": "2016-04-21T20:14:39.144849+00:00", 
        "boot_time": "2016-04-21T17:07:45+00:00", 
        "ceph_version": "10.1.1-1.el7cp"
    }, 
    {
        "fqdn": "dhcp47-38", 
        "hostname": "dhcp47-38", 
        "services": [
            {
                "fsid": "deedcb4c-a67a-4997-93a6-92149ad2622a", 
                "type": "osd", 
                "id": "1", 
                "running": true
            }
        ], 
        "frontend_addr": "10.70.47.38", 
        "backend_addr": "10.70.47.38", 
        "frontend_iface": null, 
        "backend_iface": null, 
        "managed": false, 
        "last_contact": null, 
        "boot_time": null, 
        "ceph_version": null
    }
]


The other mon node (dhcp47-138.lab.eng.blr.redhat.com) is not listed in this output.
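
To make the gap easier to see, the listing can be grouped by service type. The Python sketch below is illustrative only: the function name is hypothetical and the data is an abbreviated copy of the output above, keeping just the fields the code reads.

```python
from collections import defaultdict


def hosts_by_service_type(listing):
    """Group hostnames by the type of each service they run ("mon", "osd")."""
    grouped = defaultdict(list)
    for entry in listing:
        for service in entry.get("services", []):
            grouped[service["type"]].append(entry["hostname"])
    return dict(grouped)


# Abbreviated copy of the listing above (only the fields used here).
listing = [
    {"hostname": "dhcp46-161",
     "services": [{"type": "osd", "id": "0"}]},
    {"hostname": "dhcp47-100.lab.eng.blr.redhat.com",
     "services": [{"type": "mon", "id": "dhcp47-100"}]},
    {"hostname": "dhcp47-38",
     "services": [{"type": "osd", "id": "1"}]},
]
summary = hosts_by_service_type(listing)
print(summary)
```

With this data the "mon" group holds a single host, although the cluster has two mons.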

Comment 9 Harish NV Rao 2016-06-14 12:32:57 UTC
The API /api/v2/cluster/{cluster-fsid}/server lists all the mons and OSDs, but with incorrect values for a few fields.

1) The details for a non-leader mon do not show the right values for:
        "boot_time": "1970-01-01T00:00:00+00:00" (expected: actual boot time)
        "ceph_version": null (expected: the actual ceph version)
        "fqdn": cephqe4 (expected: the actual FQDN)

2) The details for all OSD hosts show the following:
        "last_contact": null, 
        "boot_time": null, 
        "ceph_version": null
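
Fields like these can be flagged automatically. The following Python sketch is an illustration under assumptions: the function name is hypothetical, the last_contact timestamp in the sample is made up, and timestamps are assumed to be ISO 8601 strings with a UTC offset, as in the values quoted above.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def suspect_fields(entry):
    """Return the names of fields that are null or hold the Unix epoch."""
    suspects = []
    for field in ("last_contact", "boot_time", "ceph_version"):
        value = entry.get(field)
        if value is None:
            suspects.append(field)
        elif field != "ceph_version" and datetime.fromisoformat(value) == EPOCH:
            # An epoch timestamp usually means the value was never populated.
            suspects.append(field)
    return suspects


# Non-leader mon as described above; last_contact is a made-up timestamp.
mon = {
    "last_contact": "2016-06-14T12:00:00+00:00",
    "boot_time": "1970-01-01T00:00:00+00:00",
    "ceph_version": None,
}
# OSD host with all three fields null.
osd = {"last_contact": None, "boot_time": None, "ceph_version": None}
print(suspect_fields(mon))
print(suspect_fields(osd))
```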

Tested on:
 calamari-server-1.4.0-0.12.rc15.el7cp.x86_64
 ceph version 10.2.1-13.el7cp

Comment 11 Christina Meno 2016-06-16 13:56:49 UTC
Harish, would you please file a separate BZ for the issue you found in comment 9? Those values are hard to get, and I don't think the storage console is blocked by them. Anyway, we can decide that in the other BZ.

Comment 12 Harish NV Rao 2016-06-21 11:58:25 UTC
(In reply to Gregory Meno from comment #11)
> Harish, would you please file a separate BZ for the issue you found in
> comment 9? Those values are hard to get, and I don't think the storage
> console is blocked by them. Anyway, we can decide that in the other BZ.

Thanks Gregory. I've created https://bugzilla.redhat.com/show_bug.cgi?id=1348529 for the issue described in comment 9. 

Moving this defect to the verified state as the original issue is fixed.

Tested on:
 calamari-server-1.4.0-0.12.rc15.el7cp.x86_64
 ceph version 10.2.1-13.el7cp

Comment 13 Shubhendu Tripathi 2016-07-04 11:18:40 UTC
Works fine...

Comment 15 errata-xmlrpc 2016-08-23 19:35:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1755.html

