Bug 1312265

Summary: A new disk is not added automatically
Product: [Red Hat Storage] Red Hat Storage Console
Reporter: Lubos Trilety <ltrilety>
Component: agent
Assignee: Darshan <dnarayan>
Status: CLOSED ERRATA
QA Contact: Lubos Trilety <ltrilety>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 2
CC: dnarayan, japplewh, ltrilety, mkudlej, nthomas, sankarshan
Target Milestone: ---
Target Release: 2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: rhscon-core-0.0.26-1.el7scon.x86_64 rhscon-ceph-0.0.26-1.el7scon.x86_64
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-08-23 19:47:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1318557, 1349786
Bug Blocks: 1344192

Description Lubos Trilety 2016-02-26 09:25:39 UTC
Description of problem:
When a new disk is added to a host monitored by the console, it is not detected and counted automatically as it should be.

Version-Release number of selected component (if applicable):
rhscon-agent-0.0.3-1.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Accept a host in the UI
2. Add a disk to the host

Actual results:
Nothing happens; the system does not notice that a new disk was added.

Expected results:
The new disk is detected and counted properly.

Additional info:
Workaround: Restart systemd-skynetd service on all accepted hosts. After that it works.
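A minimal sketch of the workaround, assuming password-less SSH as root and assuming the agent's systemd unit is the skynetd service referred to above (both the unit name and the host list are placeholders to adjust):

# Hypothetical helper: restart the agent service on every accepted host.
# The "skynetd" unit name and the host list are assumptions, not confirmed.
import subprocess

accepted_hosts = ["node1.example.com", "node2.example.com"]  # placeholder hosts
for host in accepted_hosts:
    subprocess.run(["ssh", "root@" + host, "systemctl", "restart", "skynetd"],
                   check=False)  # check=False: do not raise if one host fails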

Comment 10 Lubos Trilety 2016-05-30 16:47:43 UTC
It doesn't work again on rhscon-core-0.0.21-1.el7scon.x86_64 and even suggested workaround doesn't help.

Comment 11 Lubos Trilety 2016-06-01 14:59:17 UTC
(In reply to Lubos Trilety from comment #10)
> It doesn't work again on rhscon-core-0.0.21-1.el7scon.x86_64 and even
> suggested workaround doesn't help.

If I add a disk to a node which is not part of a cluster, it doesn't work either. In both cases the storage_nodes collection in MongoDB is not updated.
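For reference, a minimal pymongo sketch for inspecting what the console has recorded for a node; the database name ("skyring"), the "storage_nodes" collection and the "hostname"/"storagedisks" field names are assumptions based on this report and may need adjusting:

# Sketch: list the disks MongoDB currently holds for one node, so you can
# see whether a newly added disk ever shows up in the storage_nodes collection.
# DB/collection/field names are assumptions -- verify them in your deployment.
from pymongo import MongoClient

client = MongoClient("localhost", 27017)
node = client["skyring"]["storage_nodes"].find_one(
    {"hostname": "osd-node-1.example.com"})  # hypothetical hostname
if node is None:
    print("node not found")
else:
    for disk in node.get("storagedisks", []):
        print(disk.get("devname"), disk.get("size"), "ssd:", disk.get("ssd"))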

Comment 12 Lubos Trilety 2016-06-02 08:59:57 UTC
After some investigation, we found that:
1. The storaged service is not started after the node reboots (a quick check is sketched after this list).
2. If an SSD disk is available, it has to be used as the journal disk while adding new OSDs (as of now we need 2 disks to create 1 OSD).
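A quick way to confirm finding 1 on a node, assuming storaged ships as a systemd unit named storaged.service (a sketch for checking, not part of the fix):

# Sketch: check whether storaged came back after the reboot and start it if not.
# Assumes a systemd-managed host with a unit named "storaged.service".
import subprocess

state = subprocess.run(["systemctl", "is-active", "storaged"],
                       capture_output=True, text=True)
print("storaged state:", state.stdout.strip())  # e.g. "active" or "inactive"
if state.returncode != 0:
    # non-zero return code means the unit is not active; start it manually
    subprocess.run(["systemctl", "start", "storaged"], check=False)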

Comment 13 Darshan 2016-06-20 12:18:58 UTC
Both the issues mentioned in comment 12 have been taken care of as part of this fix.

Comment 15 Lubos Trilety 2016-07-15 15:37:31 UTC
Tested on:
rhscon-ceph-0.0.31-1.el7scon.x86_64
rhscon-core-selinux-0.0.32-1.el7scon.noarch
rhscon-ui-0.0.46-1.el7scon.noarch
rhscon-core-0.0.32-1.el7scon.x86_64

Still more than one disk has to be added for the cluster to be expanded. The other one is not used for the journal, though.

Comment 16 Lubos Trilety 2016-07-15 16:24:25 UTC
(In reply to Lubos Trilety from comment #15)
> Tested on:
> rhscon-ceph-0.0.31-1.el7scon.x86_64
> rhscon-core-selinux-0.0.32-1.el7scon.noarch
> rhscon-ui-0.0.46-1.el7scon.noarch
> rhscon-core-0.0.32-1.el7scon.x86_64
> 
> Still more than one disk has to be added for cluster to be expanded. The
> other one is not used for journal though.

Correction: the second one is still used as journal. The problem could be in the SSD detection. During cluster creation the disk was detected as an SSD; however, later when I check the nodes in the DB it says ssd: false.

Comment 17 Darshan 2016-07-18 07:13:33 UTC
(In reply to Lubos Trilety from comment #16)
> (In reply to Lubos Trilety from comment #15)
> > Tested on:
> > rhscon-ceph-0.0.31-1.el7scon.x86_64
> > rhscon-core-selinux-0.0.32-1.el7scon.noarch
> > rhscon-ui-0.0.46-1.el7scon.noarch
> > rhscon-core-0.0.32-1.el7scon.x86_64
> > 
> > Still more than one disk has to be added for cluster to be expanded. The
> > other one is not used for journal though.
> 
> Correction the second one is still used as journal. The problem could be in
> the ssd detection. During cluster creation a disk was detected as ssd.
> However later when I check nodes in db it is saying ssd: false.

If expansion of the cluster is working fine and you still see an issue with SSD detection, can you raise a separate bug for that?

Comment 18 Darshan 2016-07-18 10:18:45 UTC
Could you please check the following values:

1. After the node is accepted successfully, check if the nodes collection in the DB has marked the SSD disks correctly.

2. After successful cluster creation, check the slus (OSDs) collection in the DB.
For each SLU there will be an "options" field, and under that a map called "journal" of the form:
{
"journaldisk": name of the journal disk,
"ssd": whether this disk is an SSD (true/false),
"size": the size of the journal disk,
"available": the available size of this journal disk at the time this OSD was created
}

In the above map, check if "ssd" is true for all the OSDs, and also check if the minimum value of the "available" field among all OSD journal entries is greater than the journal size.

If "ssd" is true and "available" (the least value for that journal disk) is greater than the journal size, then a new disk addition will use this SSD disk as the journal.
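A minimal pymongo sketch of this check; the database name ("skyring"), the "slus" collection, the options.journal field names and the journal size constant are assumptions drawn from this report, so adjust them to your deployment:

# Sketch: verify that every OSD journal sits on an SSD and that the smallest
# remaining "available" value per journal disk still exceeds the journal size.
from collections import defaultdict
from pymongo import MongoClient

JOURNAL_SIZE = 5 * 1024 ** 3  # assumed OSD journal size in bytes; adjust to your config

client = MongoClient("localhost", 27017)
least_available = defaultdict(lambda: float("inf"))
all_ssd = True

for slu in client["skyring"]["slus"].find():
    journal = slu.get("options", {}).get("journal", {})
    all_ssd = all_ssd and bool(journal.get("ssd"))
    disk = journal.get("journaldisk", "")
    least_available[disk] = min(least_available[disk], journal.get("available", 0))

ok = all_ssd and all(avail > JOURNAL_SIZE for avail in least_available.values())
print("new disk addition should reuse the SSD journal:", ok)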

Comment 19 Darshan 2016-07-18 14:36:30 UTC
It works as expected for me. This is how I tested the flow:

1. Created a cluster by providing 1 monitor machine and 1 OSD machine with 3 disks (1 SSD (vdb), 2 rotational (vdc, vdd)). The cluster was created successfully with 2 OSDs (the 2 rotational disks became OSDs) having the SSD as the journal for both. A few details below:
  OSD-0 had the following detail for its journal map:

  "journal":{"available":30064771072,"journaldisk":"/dev/vdb","osdjournal":"","reweight":0,"size":35433480192,"ssd":true}

  OSD-1 had the following detail for its journal map:

  "journal":{"available":24696061952,"journaldisk":"/dev/vdb","osdjournal":"","reweight":0,"size":35433480192,"ssd":true}


2. Added 1 new rotational disk (vde) to the OSD machine. The cluster got automatically expanded and a new OSD, OSD-2, was created with the SSD (vdb) as its journal.

  OSD-2 had the following detail for its journal map:

  "journal":{"available":19327352832,"journaldisk":"/dev/vdb","osdjournal":"","reweight":0,"size":35433480192,"ssd":true}

3. Added 1 more new rotational disk (vdf) to the OSD machine. The cluster got automatically expanded and a new OSD, OSD-3, was created with the SSD (vdb) as its journal.

  OSD-3 had the following detail for its journal map:

  "journal":{"available":13958643712,"journaldisk":"/dev/vdb","osdjournal":"","reweight":0,"size":35433480192,"ssd":true}

Now the cluster has 4 OSDs mapped to a single SSD disk. The 2 later OSDs were created as part of the disk addition.

Please note that the nodes collection in the DB will not have an entry for the original SSD disk (vdb in my case) after the cluster is created, because once it gets partitioned we store only the partitions and not the parent. However, I can still see the partitions of the SSD (in my case vdb1, vdb2, vdb3, vdb4).
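As a sanity check on the numbers above: the "available" value recorded for /dev/vdb drops by exactly one journal's worth for every OSD created. The 5 GiB (5368709120 bytes) step is inferred from these deltas and presumably corresponds to the configured Ceph OSD journal size:

# Worked arithmetic on the journal map values quoted in this comment.
available = [
    35433480192,  # total size of the SSD journal disk /dev/vdb
    30064771072,  # "available" recorded when OSD-0 was created
    24696061952,  # "available" recorded when OSD-1 was created
    19327352832,  # "available" recorded when OSD-2 was created (after adding vde)
    13958643712,  # "available" recorded when OSD-3 was created (after adding vdf)
]
deltas = [a - b for a, b in zip(available, available[1:])]
print(deltas)                          # [5368709120, 5368709120, 5368709120, 5368709120]
print(set(deltas) == {5 * 1024 ** 3})  # True: each OSD consumed exactly 5 GiB of journal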

Comment 20 Lubos Trilety 2016-07-18 14:44:19 UTC
(In reply to Darshan from comment #18)
> Could you please check the following values:
> 
> 1. After node is accepted successfully, check if the nodes collection in DB
> has marked the SSD Disks correctly.

{
"devname" : "/dev/vdb",
"fstype" : "",
"fsuuid" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAA=="),
"model" : "",
"mountpoint" : [
    ""
],
"name" : "/dev/vdb",
"parent" : "",
"size" : NumberLong("268435456000"),
"type" : "disk",
"used" : false,
"ssd" : true,
"vendor" : "0x1af4",
"storageprofile" : "ssd",
"diskid" : BinData(0,"ywL5rD7iRvyupR6zqBFYFg==")
}

> 
> 2. After successful cluster creation, Check the slus(OSDs) collection in the
> DB.
> For each Slu there will be options, under that there will be a map called
> "journal" of the form :
> {
> "journal Disk": name of journal disk,
> "ssd" : whether this is ssd or not(true/false),
> "size": The size of the journal disk,
> "available": The available size of this journal disk while this disk was
> created.
> }

Will provide it later; cluster creation fails for me.

> 
> In above map check if "ssd" is TRUE for all the OSDs, and also check if the
> minimum value of "available" field among all OSD journal entries is greater
> than the journal size.
> 
> If ssd is True and "available"(least value for that journal disk) > journal
> size then new disk addition will use this ssd disk as journal.

Comment 21 Darshan 2016-07-19 05:26:20 UTC
Moving it to ON_QA as it works for me. I tested it yesterday and it works as expected; I have described the way I tested it in comment 19.

Comment 22 Lubos Trilety 2016-07-19 12:27:57 UTC
(In reply to Lubos Trilety from comment #20)
> (In reply to Darshan from comment #18)
> > Could you please check the following values:
> > 
> > 1. After node is accepted successfully, check if the nodes collection in DB
> > has marked the SSD Disks correctly.
> 
> {
> "devname" : "/dev/vdb",
> "fstype" : "",
> "fsuuid" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAA=="),
> "model" : "",
> "mountpoint" : [
>     ""
> ],
> "name" : "/dev/vdb",
> "parent" : "",
> "size" : NumberLong("268435456000"),
> "type" : "disk",
> "used" : false,
> "ssd" : true,
> "vendor" : "0x1af4",
> "storageprofile" : "ssd",
> "diskid" : BinData(0,"ywL5rD7iRvyupR6zqBFYFg==")
> }
> 
> > 
> > 2. After successful cluster creation, Check the slus(OSDs) collection in the
> > DB.
> > For each Slu there will be options, under that there will be a map called
> > "journal" of the form :
> > {
> > "journal Disk": name of journal disk,
> > "ssd" : whether this is ssd or not(true/false),
> > "size": The size of the journal disk,
> > "available": The available size of this journal disk while this disk was
> > created.
> > }
> 
> Will provide later, cluster creation fails for me.
> 
> > 
> > In above map check if "ssd" is TRUE for all the OSDs, and also check if the
> > minimum value of "available" field among all OSD journal entries is greater
> > than the journal size.
> > 
> > If ssd is True and "available"(least value for that journal disk) > journal
> > size then new disk addition will use this ssd disk as journal.

{
...
	"name" : "osd.0",
...
	"options" : {
...
		"device" : "/dev/vdd",
		"journal" : {
			"journaldisk" : "/dev/vdb",
			"ssd" : true,
...
			"available" : NumberLong("257698037760")
		},
...
}
{
...
	"name" : "osd.1",
...
	"options" : {
...
		"device" : "/dev/vdc",
		"journal" : {
			"journaldisk" : "/dev/vdb",
			"ssd" : true,
...
			"available" : NumberLong("263066746880")
		},
...
}

Will check it again. Anyway, as you can see it is marked as SSD and it has enough space for another OSD disk.

Comment 23 Lubos Trilety 2016-07-22 15:21:57 UTC
Tested on:
rhscon-core-0.0.34-1.el7scon.x86_64
rhscon-ui-0.0.48-1.el7scon.noarch
rhscon-core-selinux-0.0.34-1.el7scon.noarch
rhscon-ceph-0.0.33-1.el7scon.x86_64

Comment 25 errata-xmlrpc 2016-08-23 19:47:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2016:1754