1624962 – [RFE] Set flag noup during scaleout and unset it when all new OSD's daemons are running

Bug 1624962 - [RFE] Set flag noup during scaleout and unset it when all new OSD's daemons are running

Summary: [RFE] Set flag noup during scaleout and unset it when all new OSD's daemons a...

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	Ceph-Ansible
Sub Component:
Version:	3.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	3.2
Assignee:	Sébastien Han
QA Contact:	Vasishta
Docs Contact:	Bara Ancincova
URL:
Whiteboard:
Depends On:	1651060
Blocks:	1629656
TreeView+	depends on / blocked

Reported:	2018-09-03 17:18 UTC by Vikhyat Umrao
Modified:	2021-12-10 17:19 UTC (History)
CC List:	12 users (show)
Fixed In Version:	RHEL: ceph-ansible-3.2.0-0.1.beta6.el7cp Ubuntu: ceph-ansible_3.2.0~beta6-2redhat1
Doc Type:	Enhancement
Doc Text:	.The `noup` flag is now set before creating OSDs to distribute PGs properly The `ceph-ansible` utility now sets the `noup` flag before creating OSDs to prevent them from changing their status to `up` before all OSDs are created. Previously, if the flag was not set, placement groups (PGs) were created on only one OSD and got stuck in creation or activation. With this update, the `noup` flag is set before creating OSDs and unset after the creation is complete. As a result, PGs are distributed properly among all OSDs.
Clone Of:
Environment:
Last Closed:	2019-01-03 19:01:53 UTC
Embargoed:
Dependent Products:

Attachments	(Terms of Use)

Links
System	ID	Priority	Status	Summary	Last Updated
Github	ceph ceph-ansible pull 3175	None	closed	Day 2 playbooks	2020-10-22 14:44:19 UTC
Red Hat Issue Tracker	RHCEPH-2689	None	None	None	2021-12-10 17:19:34 UTC
Red Hat Product Errata	RHBA-2019:0020	None	None	None	2019-01-03 19:02:06 UTC

Description Vikhyat Umrao 2018-09-03 17:18:51 UTC

Description of problem:
[RFE] Set flag noup during scaleout and unset it when all new OSD's daemons are running

Version-Release number of selected component (if applicable):
RHCS 3

How reproducible:
Ansible adds OSD's one by one and if this noup flag is not set it causes a lot of PG's to get created to only one OSD and pgs get stuck in creation/activation.

This feature will help all the new OSD's to come online and then when noup would be unset then pg distribution will happen properly.

Comment 1 Vikhyat Umrao 2018-09-03 17:21:20 UTC

Why PG get stuck in activating you can read this KCS - https://access.redhat.com/solutions/3526531 This is a new feature in Luminous(RHCS 3) to avoid a large number of pgs to get mapped to one OSD.

Comment 3 mamccoma 2018-09-21 19:01:56 UTC

*** Test from lab environment with "noup" flag ***:

Before any changes in my environment (baseline):

[root@vm250-137 ~]# ceph -s
  cluster:
    id:     256b60c8-8d8e-47bb-9dfe-492055072a7e
    health: HEALTH_WARN
            application not enabled on 1 pool(s)
            1/3 mons down, quorum vm250-8,vm250-137
 
  services:
    mon:         3 daemons, quorum vm250-8,vm250-137, out of quorum: vm250-194
    mgr:         vm250-137(active), standbys: vm250-8
    osd:         9 osds: 9 up, 9 in
    rgw:         2 daemons active
    tcmu-runner: 2 daemons active
 
  data:
    pools:   9 pools, 576 pgs
    objects: 231 objects, 3873 bytes
    usage:   1112 MB used, 87876 MB / 88988 MB avail
    pgs:     576 active+clean
 
  io:
    client:   170 B/s rd, 0 op/s rd, 0 op/s wr


[root@vm250-137 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF 
-1       0.08464 root default                               
-5       0.02888     host vm250-248                         
 1   hdd 0.00929         osd.1          up  1.00000 1.00000 
 3   hdd 0.00980         osd.3          up  1.00000 1.00000 
 6   hdd 0.00980         osd.6          up  1.00000 1.00000 
-7       0.02788     host vm251-254                         
 2   hdd 0.00929         osd.2          up  1.00000 1.00000 
 5   hdd 0.00929         osd.5          up  1.00000 1.00000 
 8   hdd 0.00929         osd.8          up  1.00000 1.00000 
-3       0.02788     host vm253-212                         
 0   hdd 0.00929         osd.0          up  1.00000 1.00000 
 4   hdd 0.00929         osd.4          up  1.00000 1.00000 
 7   hdd 0.00929         osd.7          up  1.00000 1.00000 
--------------------------------------------------------------------------------

** Remove OSDs/OSD node (vm251-254) and apply flags to simulate adding a new node with OSDs:

[root@vm250-137 ~]# ceph -s
  cluster:
    id:     256b60c8-8d8e-47bb-9dfe-492055072a7e
    health: HEALTH_WARN
            noup,nobackfill,norecover flag(s) set
            55/693 objects misplaced (7.937%)
            Degraded data redundancy: 176/693 objects degraded (25.397%), 352 pgs unclean, 21 pgs degraded, 352 pgs undersized
            application not enabled on 1 pool(s)
            1/3 mons down, quorum vm250-8,vm250-137
 
  services:
    mon:         3 daemons, quorum vm250-8,vm250-137, out of quorum: vm250-194
    mgr:         vm250-137(active), standbys: vm250-8
    osd:         6 osds: 6 up, 6 in; 224 remapped pgs
                 flags noup,nobackfill,norecover
    rgw:         2 daemons active
    tcmu-runner: 2 daemons active
 
  data:
    pools:   9 pools, 576 pgs
    objects: 231 objects, 3873 bytes
    usage:   750 MB used, 59087 MB / 59837 MB avail
    pgs:     176/693 objects degraded (25.397%)
             55/693 objects misplaced (7.937%)
             331 active+undersized
             214 active+clean+remapped
             21  active+undersized+degraded
             10  active+clean
 
  io:
    client:   127 B/s rd, 0 op/s rd, 0 op/s wr


[root@vm250-137 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF 
-1       0.05676 root default                               
-5       0.02888     host vm250-248                         
 1   hdd 0.00929         osd.1          up  1.00000 1.00000 
 3   hdd 0.00980         osd.3          up  1.00000 1.00000 
 6   hdd 0.00980         osd.6          up  1.00000 1.00000 
-3       0.02788     host vm253-212                         
 0   hdd 0.00929         osd.0          up  1.00000 1.00000 
 4   hdd 0.00929         osd.4          up  1.00000 1.00000 
 7   hdd 0.00929         osd.7          up  1.00000 1.00000 
-------------------------------------------------------------------------------

** ceph-ansible playbook fails on this non-containerized task at the end of the playbook?? but still appears to be successful in applying the changes **

TASK [ceph-osd : manually prepare ceph "filestore" non-containerized osd disk(s) with collocated osd data and journal] *****
changed: [vm251-254] => (item=[{'_ansible_parsed': True, 'stderr_lines': [], u'cmd': u"parted --script /dev/sdb print | egrep -sq '^ 1.*ceph'", u'end': u'2018-09-21 14:44:00.954490', '_ansible_no_log': False, u'stdout': u'', '_ansible_item_result': True, u'changed': False, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u"parted --script /dev/sdb print | egrep -sq '^ 1.*ceph'", u'removes': None, u'creates': None, u'chdir': None, u'stdin': None}}, u'start': u'2018-09-21 14:44:00.910970', u'delta': u'0:00:00.043520', 'item': u'/dev/sdb', u'rc': 1, u'msg': u'non-zero return code', 'stdout_lines': [], 'failed_when_result': False, u'stderr': u'', '_ansible_ignore_errors': None, u'failed': False}, u'/dev/sdb'])
changed: [vm251-254] => (item=[{'_ansible_parsed': True, 'stderr_lines': [], u'cmd': u"parted --script /dev/sdc print | egrep -sq '^ 1.*ceph'", u'end': u'2018-09-21 14:44:01.467960', '_ansible_no_log': False, u'stdout': u'', '_ansible_item_result': True, u'changed': False, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u"parted --script /dev/sdc print | egrep -sq '^ 1.*ceph'", u'removes': None, u'creates': None, u'chdir': None, u'stdin': None}}, u'start': u'2018-09-21 14:44:01.408488', u'delta': u'0:00:00.059472', 'item': u'/dev/sdc', u'rc': 1, u'msg': u'non-zero return code', 'stdout_lines': [], 'failed_when_result': False, u'stderr': u'', '_ansible_ignore_errors': None, u'failed': False}, u'/dev/sdc'])
changed: [vm251-254] => (item=[{'_ansible_parsed': True, 'stderr_lines': [], u'cmd': u"parted --script /dev/sdd print | egrep -sq '^ 1.*ceph'", u'end': u'2018-09-21 14:44:02.006833', '_ansible_no_log': False, u'stdout': u'', '_ansible_item_result': True, u'changed': False, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u"parted --script /dev/sdd print | egrep -sq '^ 1.*ceph'", u'removes': None, u'creates': None, u'chdir': None, u'stdin': None}}, u'start': u'2018-09-21 14:44:01.959515', u'delta': u'0:00:00.047318', 'item': u'/dev/sdd', u'rc': 1, u'msg': u'non-zero return code', 'stdout_lines': [], 'failed_when_result': False, u'stderr': u'', '_ansible_ignore_errors': None, u'failed': False}, u'/dev/sdd'])
failed: [vm251-254] (item=[{'_ansible_parsed': True, 'stderr_lines': [], u'cmd': u"parted --script /dev/sdd print | egrep -sq '^ 1.*ceph'", u'end': u'2018-09-21 14:44:02.437479', '_ansible_no_log': False, u'stdout': u'', '_ansible_item_result': True, u'changed': False, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u"parted --script /dev/sdd print | egrep -sq '^ 1.*ceph'", u'removes': None, u'creates': None, u'chdir': None, u'stdin': None}}, u'start': u'2018-09-21 14:44:02.417113', u'delta': u'0:00:00.020366', 'item': u'/dev/sdd', u'rc': 1, u'msg': u'non-zero return code', 'stdout_lines': [], 'failed_when_result': False, u'stderr': u'', '_ansible_ignore_errors': None, u'failed': False}, u'/dev/sdd']) => {"changed": true, "cmd": ["ceph-disk", "prepare", "--cluster", "ceph", "--filestore", "/dev/sdd"], "delta": "0:00:01.538766", "end": "2018-09-21 14:44:46.513581", "item": [{"_ansible_ignore_errors": null, "_ansible_item_result": true, "_ansible_no_log": false, "_ansible_parsed": true, "changed": false, "cmd": "parted --script /dev/sdd print | egrep -sq '^ 1.*ceph'", "delta": "0:00:00.020366", "end": "2018-09-21 14:44:02.437479", "failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "parted --script /dev/sdd print | egrep -sq '^ 1.*ceph'", "_uses_shell": true, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": "/dev/sdd", "msg": "non-zero return code", "rc": 1, "start": "2018-09-21 14:44:02.417113", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}, "/dev/sdd"], "msg": "non-zero return code", "rc": 1, "start": "2018-09-21 14:44:44.974815", "stderr": "Could not create partition 2 from 34 to 1048609\nError encountered; not saving changes.\n'/sbin/sgdisk --new=2:0:+512M --change-name=2:ceph journal --partition-guid=2:e07fb99d-f87e-4a44-aa6b-e6f466f7aef2 --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/sdd' failed with status code 4", "stderr_lines": ["Could not create partition 2 from 34 to 1048609", "Error encountered; not saving changes.", "'/sbin/sgdisk --new=2:0:+512M --change-name=2:ceph journal --partition-guid=2:e07fb99d-f87e-4a44-aa6b-e6f466f7aef2 --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/sdd' failed with status code 4"], "stdout": "", "stdout_lines": []}
failed: [vm251-254] (item=[{'_ansible_parsed': True, 'stderr_lines': [], u'cmd': u"parted --script /dev/sdb print | egrep -sq '^ 1.*ceph'", u'end': u'2018-09-21 14:44:02.846679', '_ansible_no_log': False, u'stdout': u'', '_ansible_item_result': True, u'changed': False, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u"parted --script /dev/sdb print | egrep -sq '^ 1.*ceph'", u'removes': None, u'creates': None, u'chdir': None, u'stdin': None}}, u'start': u'2018-09-21 14:44:02.830390', u'delta': u'0:00:00.016289', 'item': u'/dev/sdb', u'rc': 1, u'msg': u'non-zero return code', 'stdout_lines': [], 'failed_when_result': False, u'stderr': u'', '_ansible_ignore_errors': None, u'failed': False}, u'/dev/sdb']) => {"changed": true, "cmd": ["ceph-disk", "prepare", "--cluster", "ceph", "--filestore", "/dev/sdb"], "delta": "0:00:00.308885", "end": "2018-09-21 14:44:47.732516", "item": [{"_ansible_ignore_errors": null, "_ansible_item_result": true, "_ansible_no_log": false, "_ansible_parsed": true, "changed": false, "cmd": "parted --script /dev/sdb print | egrep -sq '^ 1.*ceph'", "delta": "0:00:00.016289", "end": "2018-09-21 14:44:02.846679", "failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "parted --script /dev/sdb print | egrep -sq '^ 1.*ceph'", "_uses_shell": true, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": "/dev/sdb", "msg": "non-zero return code", "rc": 1, "start": "2018-09-21 14:44:02.830390", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}, "/dev/sdb"], "msg": "non-zero return code", "rc": 1, "start": "2018-09-21 14:44:47.423631", "stderr": "ceph-disk: Error: Device is mounted: /dev/sdb1", "stderr_lines": ["ceph-disk: Error: Device is mounted: /dev/sdb1"], "stdout": "", "stdout_lines": []}
failed: [vm251-254] (item=[{'_ansible_parsed': True, 'stderr_lines': [], u'cmd': u"parted --script /dev/sdc print | egrep -sq '^ 1.*ceph'", u'end': u'2018-09-21 14:44:03.289726', '_ansible_no_log': False, u'stdout': u'', '_ansible_item_result': True, u'changed': False, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u"parted --script /dev/sdc print | egrep -sq '^ 1.*ceph'", u'removes': None, u'creates': None, u'chdir': None, u'stdin': None}}, u'start': u'2018-09-21 14:44:03.273331', u'delta': u'0:00:00.016395', 'item': u'/dev/sdc', u'rc': 1, u'msg': u'non-zero return code', 'stdout_lines': [], 'failed_when_result': False, u'stderr': u'', '_ansible_ignore_errors': None, u'failed': False}, u'/dev/sdc']) => {"changed": true, "cmd": ["ceph-disk", "prepare", "--cluster", "ceph", "--filestore", "/dev/sdc"], "delta": "0:00:00.306034", "end": "2018-09-21 14:44:48.455621", "item": [{"_ansible_ignore_errors": null, "_ansible_item_result": true, "_ansible_no_log": false, "_ansible_parsed": true, "changed": false, "cmd": "parted --script /dev/sdc print | egrep -sq '^ 1.*ceph'", "delta": "0:00:00.016395", "end": "2018-09-21 14:44:03.289726", "failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "parted --script /dev/sdc print | egrep -sq '^ 1.*ceph'", "_uses_shell": true, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": "/dev/sdc", "msg": "non-zero return code", "rc": 1, "start": "2018-09-21 14:44:03.273331", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}, "/dev/sdc"], "msg": "non-zero return code", "rc": 1, "start": "2018-09-21 14:44:48.149587", "stderr": "ceph-disk: Error: Device is mounted: /dev/sdc1", "stderr_lines": ["ceph-disk: Error: Device is mounted: /dev/sdc1"], "stdout": "", "stdout_lines": []}

PLAY RECAP *****************************************************************************************************************
vm251-254                  : ok=63   changed=4    unreachable=0    failed=1   


--------------------------------------------------------------------------------

*** Changes after the run of the playbook ***

ID CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF
-1	 0.08464 root default
-5	 0.02888     host vm250-248
 1   hdd 0.00929         osd.1          up  1.00000 1.00000
 3   hdd 0.00980         osd.3          up  1.00000 1.00000
 6   hdd 0.00980         osd.6          up  1.00000 1.00000
-7	 0.02788     host vm251-254
 2   hdd 0.00929         osd.2        down        0 1.00000
 5   hdd 0.00929         osd.5        down        0 1.00000
 8   hdd 0.00929         osd.8        down        0 1.00000
-3	 0.02788     host vm253-212
 0   hdd 0.00929         osd.0          up  1.00000 1.00000
 4   hdd 0.00929         osd.4          up  1.00000 1.00000
 7   hdd 0.00929         osd.7          up  1.00000 1.00000



[root@vm250-137 ~]# ceph osd unset nobackfill
nobackfill is unset
[root@vm250-137 ~]# ceph osd unset norecover
norecover is unset
[root@vm250-137 ~]# ceph osd unset noup
noup is unset
[root@vm250-137 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF 
-1       0.08464 root default                               
-5       0.02888     host vm250-248                         
 1   hdd 0.00929         osd.1          up  1.00000 1.00000 
 3   hdd 0.00980         osd.3          up  1.00000 1.00000 
 6   hdd 0.00980         osd.6          up  1.00000 1.00000 
-7       0.02788     host vm251-254                         
 2   hdd 0.00929         osd.2          up  1.00000 1.00000 
 5   hdd 0.00929         osd.5          up  1.00000 1.00000 
 8   hdd 0.00929         osd.8          up  1.00000 1.00000 
-3       0.02788     host vm253-212                         
 0   hdd 0.00929         osd.0          up  1.00000 1.00000 
 4   hdd 0.00929         osd.4          up  1.00000 1.00000 
 7   hdd 0.00929         osd.7          up  1.00000 1.00000


** Cluster has now successfully re-balanced **:

[root@vm250-137 ~]# ceph -s
  cluster:
    id:     256b60c8-8d8e-47bb-9dfe-492055072a7e
    health: HEALTH_WARN
            application not enabled on 1 pool(s)
            1/3 mons down, quorum vm250-8,vm250-137
 
  services:
    mon:         3 daemons, quorum vm250-8,vm250-137, out of quorum: vm250-194
    mgr:         vm250-137(active), standbys: vm250-8
    osd:         9 osds: 9 up, 9 in
    rgw:         2 daemons active
    tcmu-runner: 2 daemons active
 
  data:
    pools:   9 pools, 576 pgs
    objects: 231 objects, 3873 bytes
    usage:   1090 MB used, 87898 MB / 88988 MB avail
    pgs:     576 active+clean
 
  io:
    client:   85 B/s rd, 0 op/s rd, 0 op/s wr

Comment 4 Sébastien Han 2018-09-25 13:24:09 UTC

Vikhyat, for day 2 operations it is encouraged to use the playbook osd-configure.yml, which will add new OSDs. So we are going to add your request in this playbook.

Comment 5 Sébastien Han 2018-10-17 15:18:04 UTC

Present in https://github.com/ceph/ceph-ansible/releases/tag/v3.2.0beta6

Comment 10 Sébastien Han 2018-11-07 10:05:31 UTC

lgtm

Comment 11 Vasishta 2018-11-19 12:39:09 UTC

Observed that noup falg was set and unset as required.
Moving to VERIFIED state.

ceph-ansible-3.2.0-0.1.rc3.el7cp.noarch

Comment 13 errata-xmlrpc 2019-01-03 19:01:53 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0020

Note You need to log in before you can comment on or make changes to this bug.