Bug 1315800

Summary: OSDs statuses
Product: [Red Hat Storage] Red Hat Storage Console
Reporter: Lubos Trilety <ltrilety>
Component: core
Assignee: Nishanth Thomas <nthomas>
Status: CLOSED CURRENTRELEASE
QA Contact: sds-qe-bugs
Severity: medium
Docs Contact:
Priority: unspecified
Version: 2
CC: mkudlej, nthomas, sankarshan
Target Milestone: ---
Target Release: 2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: rhscon-ceph-0.0.23-1.el7scon.x86_64, rhscon-core-0.0.24-1.el7scon.x86_64, rhscon-ui-0.0.39-1.el7scon.noarch
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-19 05:34:01 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Lubos Trilety 2016-03-08 15:43:39 UTC
Description of problem:
OSD statuses are not updated properly. E.g. I removed a disk from a host, but the UI does not reflect it.

In the mongo DB:
> db.storage_logical_units.find().forEach(printjson)
...
{
	"_id" : ObjectId("56dee42e1147a0b5b90ad8d4"),
	"sluid" : BinData(0,"kTdJpxkaSwmxFfcGd4f9Cw=="),
	"name" : "osd.2",
	"type" : 1,
	"clusterid" : BinData(0,"/vAMdjn9SEKN+ZFooPbreQ=="),
	"nodeid" : BinData(0,"HUoZQVVqRYu97EXSLpYjYw=="),
	"storageid" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAA=="),
	"storagedeviceid" : BinData(0,"OpwOyydGS+ykTZ8BF9Yafg=="),
	"storagedevicesize" : NumberLong("10737418240"),
	"status" : 0,
	"options" : {
		"in" : "true",
		"up" : "true",
		"node" : "ltrilety-usm2-node2.os1.phx2.redhat.com",
		"publicip4" : "172.16.180.7",
		"clusterip4" : "172.16.180.7",
		"device" : "/dev/vdc",
		"fstype" : "xfs"
	},
	"storageprofile" : "general",
	"state" : "In",
	"almstatus" : 5,
	"almcount" : 0
}
...

On the Ceph side:
# ceph osd stat --cluster testCluster
     osdmap e46: 6 osds: 5 up, 5 in; 6 remapped pgs


Version-Release number of selected component (if applicable):
rhscon-core-0.0.8-11.el7.x86_64
rhscon-ui-0.0.20-1.el7.noarch
rhscon-ceph-0.0.6-11.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Remove a disk from a host which is part of a cluster

Actual results:
There is no change on the skyring server side; the OSD state is not updated.

Expected results:
The OSD state should change to down/error in the DB and in the UI.

Additional info:

Comment 3 Darshan 2016-05-05 10:01:03 UTC
This has been fixed as follows:

We listen to events emitted by calamari; calamari emits an event when there is a change in OSD state. Based on the event we update the status of the OSD in the skyring DB, which in turn is reflected in the UI.
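
For illustration only, a minimal sketch of such an event handler, assuming the mgo MongoDB driver and a hypothetical OsdEvent struct; the real event format, DB name, and status mapping live in the skyring/rhscon code:

package main

import (
	"log"
	"strconv"

	"gopkg.in/mgo.v2"
	"gopkg.in/mgo.v2/bson"
)

// OsdEvent is a hypothetical, simplified view of a calamari OSD event.
type OsdEvent struct {
	OsdName string // e.g. "osd.2"
	Up      bool
	In      bool
}

// handleOsdEvent writes the reported OSD state into the
// storage_logical_units collection so the UI picks it up.
func handleOsdEvent(session *mgo.Session, e OsdEvent) error {
	// DB name assumed for this sketch.
	coll := session.DB("skyring").C("storage_logical_units")

	// Map the event flags to the display state; the real mapping
	// (and the numeric status/almstatus codes) is more detailed.
	state := "Down"
	if e.Up && e.In {
		state = "In"
	}

	// The real handler would also scope the selector by clusterid.
	return coll.Update(
		bson.M{"name": e.OsdName},
		bson.M{"$set": bson.M{
			"state":      state,
			"options.up": strconv.FormatBool(e.Up),
			"options.in": strconv.FormatBool(e.In),
		}},
	)
}

func main() {
	session, err := mgo.Dial("localhost")
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	// Example: calamari reported osd.2 as down and out.
	if err := handleOsdEvent(session, OsdEvent{OsdName: "osd.2"}); err != nil {
		log.Printf("failed to update OSD state: %v", err)
	}
}

The update keys off the collection and field names shown in the mongo document above (name, state, options.up, options.in); it is a sketch of the flow, not the actual fix.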

Comment 4 Lubos Trilety 2016-08-02 15:36:12 UTC
Tested on:
rhscon-core-0.0.38-1.el7scon.x86_64
rhscon-ceph-0.0.38-1.el7scon.x86_64
rhscon-core-selinux-0.0.38-1.el7scon.noarch
rhscon-ui-0.0.51-1.el7scon.noarch

The OSD state is now changed properly; however, it is not updated correctly on any dashboard. A new BZ 1359129 has been created for that.