Description of problem:
The API /api/v2/cluster/{cluster-fsid}/server should list all servers participating in the cluster, whether they are mon or OSD nodes.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Create a Ceph cluster using 3 mons and one OSD node (3 disks as OSDs).
2. Access the API /api/v2/cluster/{cluster-fsid}/server

Actual results:
The result does not list all the servers participating in the cluster.

Expected results:
The API should list all the servers participating in the cluster.

Additional info:
An observation: only mon nodes in the listing have ceph_version set; for OSD nodes it is null. Similarly, the FQDN for a mon node is shown in the full form, e.g. "dhcp47-98.lab.eng.blr.redhat.com", whereas for OSD nodes it is shown in the short form "dhcp47-95". This information should be uniform across the nodes.
I can't reproduce this. Would you please indicate which versions are exhibiting this behavior? Also, please provide the output of sudo ceph -s and sudo ceph osd tree.
This certainly was not with the latest calamari builds. We are not yet able to create a cluster successfully with the new builds. Once that works, I will try to reproduce the issue and update the details if I hit it again.
I created a new cluster with the latest ceph-2.0 and calamari builds from
http://puddle.ceph.redhat.com/puddles/ceph/2/2016-04-08.1/CEPH-2.repo
http://puddle.ceph.redhat.com/puddles/rhscon/2/2016-04-10.1/RHSCON-2.repo

Configured the first mon using the command

curl -d "{\"calamari\": true, \"host\": \"dhcp46-204.lab.eng.blr.redhat.com\", \"fsid\": \"deedcb4c-a67a-4997-93a6-92149ad2622a\", \"interface\": \"eth0\", \"monitor_secret\": \"AQA7P8dWAAAAABAAH/tbiZQn/40Z8pr959UmEA==\", \"cluster_network\": \"10.70.44.0/22\", \"public_network\": \"10.70.44.0/22\", \"redhat_storage\": false}" http://dhcp47-73.lab.eng.blr.redhat.com:8181/api/mon/configure/

and then configured two more mons for the same cluster using the command

curl -d "{\"calamari\": true, \"host\": \"dhcp46-204.lab.eng.blr.redhat.com\", \"fsid\": \"deedcb4c-a67a-4997-93a6-92149ad2622a\", \"interface\": \"eth0\", \"monitor_secret\": \"AQA7P8dWAAAAABAAH/tbiZQn/40Z8pr959UmEA==\", \"cluster_network\": \"10.70.44.0/22\", \"public_network\": \"10.70.44.0/22\", \"redhat_storage\": false, \"monitors\": [{\"host\": \"dhcp47-55.lab.eng.blr.redhat.com\", \"interface\": \"eth0\"}, {\"host\": \"dhcp46-181.lab.eng.blr.redhat.com\", \"interface\": \"eth0\"}]}" http://dhcp47-73.lab.eng.blr.redhat.com:8181/api/mon/configure/

But now if I access http://<mon node ip>:8002/api/v2/cluster/<fsid>/server, it lists only the mon node referred to in the URL. My understanding is that if there are 3 mon nodes in the cluster, all three should be listed regardless of which mon I use to access /api/v2/cluster/{fsid}/server.

Once OSD configuration works, I will try configuring some OSDs in the same setup and update.

If you want to have a look at the setup, the details are:
10.70.47.73 (root/redhat) - skyring server
10.70.46.204, 10.70.47.55, 10.70.46.181 (root/redhat) - mon nodes
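For reference, the request bodies in the curl commands above escape every quote by hand, which is easy to get wrong. A minimal sketch of building the same JSON in Python follows. The function name mon_configure_payload and its parameter names are hypothetical; only the JSON keys are taken from the commands above, and nothing here is confirmed as the official client for /api/mon/configure/.

```python
import json


def mon_configure_payload(host, fsid, monitor_secret, network,
                          interface="eth0", monitors=None):
    """Build the JSON body used in the curl commands above.

    Hypothetical helper: the keys mirror the commands in this comment,
    and the same network is used for cluster_network and public_network
    because that is what the commands do.
    """
    payload = {
        "calamari": True,
        "host": host,
        "fsid": fsid,
        "interface": interface,
        "monitor_secret": monitor_secret,
        "cluster_network": network,
        "public_network": network,
        "redhat_storage": False,
    }
    if monitors:
        # Additional mons are passed as a list of host/interface pairs.
        payload["monitors"] = [
            {"host": m, "interface": interface} for m in monitors
        ]
    return json.dumps(payload)


first_mon = mon_configure_payload(
    host="dhcp46-204.lab.eng.blr.redhat.com",
    fsid="deedcb4c-a67a-4997-93a6-92149ad2622a",
    monitor_secret="AQA7P8dWAAAAABAAH/tbiZQn/40Z8pr959UmEA==",
    network="10.70.44.0/22",
)

more_mons = mon_configure_payload(
    host="dhcp46-204.lab.eng.blr.redhat.com",
    fsid="deedcb4c-a67a-4997-93a6-92149ad2622a",
    monitor_secret="AQA7P8dWAAAAABAAH/tbiZQn/40Z8pr959UmEA==",
    network="10.70.44.0/22",
    monitors=[
        "dhcp47-55.lab.eng.blr.redhat.com",
        "dhcp46-181.lab.eng.blr.redhat.com",
    ],
)
```

The resulting strings can be written to a file and passed to curl with -d @file, which avoids shell escaping altogether.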
Do you still have doubts about this?
With the latest builds as well, if I have two mons and two OSD nodes in a cluster, accessing /api/v2/cluster/{fsid}/server through one mon's calamari lists only three nodes: the mon used to invoke the calamari API and the two OSD nodes. My understanding is that irrespective of which mon/calamari I invoke the API from, it should list all four nodes.

One such working setup is available now at
dhcp47-100.lab.eng.blr.redhat.com (MON+Calamari)
dhcp47-138.lab.eng.blr.redhat.com (MON)
dhcp46-161.lab.eng.blr.redhat.com (OSD)
dhcp47-38.lab.eng.blr.redhat.com (OSD)
credentials root/redhat

The output of /api/v2/cluster/{fsid}/server from the Calamari node (dhcp47-100.lab.eng.blr.redhat.com) is shown below:

[
    {
        "fqdn": "dhcp46-161",
        "hostname": "dhcp46-161",
        "services": [
            {
                "fsid": "deedcb4c-a67a-4997-93a6-92149ad2622a",
                "type": "osd",
                "id": "0",
                "running": true
            }
        ],
        "frontend_addr": "10.70.46.161",
        "backend_addr": "10.70.46.161",
        "frontend_iface": null,
        "backend_iface": null,
        "managed": false,
        "last_contact": null,
        "boot_time": null,
        "ceph_version": null
    },
    {
        "fqdn": "dhcp47-100.lab.eng.blr.redhat.com",
        "hostname": "dhcp47-100.lab.eng.blr.redhat.com",
        "services": [
            {
                "fsid": "deedcb4c-a67a-4997-93a6-92149ad2622a",
                "type": "mon",
                "id": "dhcp47-100",
                "running": true
            }
        ],
        "frontend_addr": "10.70.47.100",
        "backend_addr": null,
        "frontend_iface": null,
        "backend_iface": null,
        "managed": true,
        "last_contact": "2016-04-21T20:14:39.144849+00:00",
        "boot_time": "2016-04-21T17:07:45+00:00",
        "ceph_version": "10.1.1-1.el7cp"
    },
    {
        "fqdn": "dhcp47-38",
        "hostname": "dhcp47-38",
        "services": [
            {
                "fsid": "deedcb4c-a67a-4997-93a6-92149ad2622a",
                "type": "osd",
                "id": "1",
                "running": true
            }
        ],
        "frontend_addr": "10.70.47.38",
        "backend_addr": "10.70.47.38",
        "frontend_iface": null,
        "backend_iface": null,
        "managed": false,
        "last_contact": null,
        "boot_time": null,
        "ceph_version": null
    }
]

The other mon node (dhcp47-138.lab.eng.blr.redhat.com) is not listed in this output.
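The check described in this comment (every node of the setup should appear in the listing) can be sketched in Python as below. This is only an illustration: the helper names short_name and missing_hosts are hypothetical, the entries are trimmed to the fqdn and hostname fields from the output above, and hosts are compared by short name on the assumption that the short and full forms seen in the listing refer to the same machine.

```python
def short_name(host):
    """Return the part of a host name before the first dot."""
    return host.split(".", 1)[0]


def missing_hosts(expected_hosts, servers):
    """Return the expected hosts that do not appear in the server listing."""
    listed = {short_name(s["hostname"]) for s in servers}
    return sorted(h for h in expected_hosts if short_name(h) not in listed)


# Entries trimmed from the /api/v2/cluster/{fsid}/server output above.
servers = [
    {"fqdn": "dhcp46-161", "hostname": "dhcp46-161"},
    {"fqdn": "dhcp47-100.lab.eng.blr.redhat.com",
     "hostname": "dhcp47-100.lab.eng.blr.redhat.com"},
    {"fqdn": "dhcp47-38", "hostname": "dhcp47-38"},
]

# The four nodes of the setup described above.
expected = [
    "dhcp47-100.lab.eng.blr.redhat.com",
    "dhcp47-138.lab.eng.blr.redhat.com",
    "dhcp46-161.lab.eng.blr.redhat.com",
    "dhcp47-38.lab.eng.blr.redhat.com",
]

print(missing_hosts(expected, servers))
# prints ['dhcp47-138.lab.eng.blr.redhat.com']
```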
The API /api/v2/cluster/{cluster-fsid}/server lists all the mons and OSDs, but with incorrect values for a few fields.

1) The details for a non-leader mon do not show the right values for:
"boot_time": "1970-01-01T00:00:00+00:00" (expected: actual boot time)
"ceph_version": null (expected: should show proper ceph version)
"fqdn": cephqe4 (expected: should show actual fqdn)

2) The details for all OSD hosts show the following:
"last_contact": null,
"boot_time": null,
"ceph_version": null

Tested on:
calamari-server-1.4.0-0.12.rc15.el7cp.x86_64
ceph version 10.2.1-13.el7cp
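A rough way to spot these placeholder values in a listing is sketched below in Python. The helper suspect_fields is hypothetical, and the rule that an fqdn without a dot is not fully qualified is an assumption based on the short names reported in this bug. The two sample entries reuse field values from the listing posted earlier in this report.

```python
EPOCH = "1970-01-01T00:00:00+00:00"


def suspect_fields(server):
    """Return the names of fields that hold placeholder values."""
    problems = []
    # A null or Unix-epoch boot_time means no real value was collected.
    if server.get("boot_time") in (None, EPOCH):
        problems.append("boot_time")
    if server.get("ceph_version") is None:
        problems.append("ceph_version")
    if server.get("last_contact") is None:
        problems.append("last_contact")
    # Assumption: a fully qualified name contains at least one dot.
    if "." not in (server.get("fqdn") or ""):
        problems.append("fqdn")
    return problems


osd_host = {
    "fqdn": "dhcp46-161",
    "last_contact": None,
    "boot_time": None,
    "ceph_version": None,
}
mon_host = {
    "fqdn": "dhcp47-100.lab.eng.blr.redhat.com",
    "last_contact": "2016-04-21T20:14:39.144849+00:00",
    "boot_time": "2016-04-21T17:07:45+00:00",
    "ceph_version": "10.1.1-1.el7cp",
}

print(suspect_fields(osd_host))
# prints ['boot_time', 'ceph_version', 'last_contact', 'fqdn']
print(suspect_fields(mon_host))
# prints []
```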
Harish, would you please file a separate BZ for the issue you found in comment 9? Those values are hard to get and I don't think that storage console is blocked by them. Anyway, we can decide that in the other BZ.
(In reply to Gregory Meno from comment #11)
> Harish, would you please file a separate BZ for the issue you found in
> comment 9? Those values are hard to get and I don't think that storage
> console is blocked by them. Anyway, we can decide that in the other BZ.

Thanks Gregory. I've created https://bugzilla.redhat.com/show_bug.cgi?id=1348529 for the issue described in comment 9.

Moving this defect to the verified state as the original issue is fixed.

Tested on:
calamari-server-1.4.0-0.12.rc15.el7cp.x86_64
ceph version 10.2.1-13.el7cp
Works fine...
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-1755.html