Bug 1630430 - OSP 11 -> 12 upgrade with ceph - Switch to containerized ceph daemons fails with more than 99 OSDs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 3.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: z1
Target Release: 3.1
Assignee: Sébastien Han
QA Contact: Vasishta
Docs Contact: Bara Ancincova
URL:
Whiteboard:
Depends On:
Blocks: 1584264
 
Reported: 2018-09-18 15:46 UTC by Matt Flusche
Modified: 2021-12-10 17:39 UTC
CC List: 15 users

Fixed In Version: RHEL: ceph-ansible-3.1.9-1.el7cp Ubuntu: ceph-ansible_3.1.9-2redhat1
Doc Type: Bug Fix
Doc Text:
.Ansible successfully converts non-containerized Ceph deployments containing more than 99 OSDs to containers

The `ceph-ansible` utility failed to convert bare-metal Ceph Storage clusters that contained more than 99 OSDs to containers because of insufficient regular expressions used in the `switch-from-non-containerized-to-containerized-ceph-daemons.yml` playbook. The playbook has been updated, and converting non-containerized clusters to containerized ones now works as expected.
Clone Of:
Environment:
Last Closed: 2018-11-09 01:00:34 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github ceph ceph-ansible pull 3171 0 None closed switch: allow switch big clusters (more than 99 osds) 2020-02-25 15:18:27 UTC
Red Hat Issue Tracker RHCEPH-2712 0 None None None 2021-12-10 17:39:17 UTC
Red Hat Product Errata RHBA-2018:3530 0 None None None 2018-11-09 01:01:29 UTC

Description Matt Flusche 2018-09-18 15:46:04 UTC
Description of problem:

https://github.com/ceph/ceph-ansible/issues/3128

The following change is required to the "collect running osds and ceph-disk unit(s)" task:

systemctl list-units | grep "loaded active" | grep -Eo 'ceph-osd@[0-9]{1,3}.service|ceph-disk@dev-[a-z]{3,4}[0-9]{1}.service'
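
(For context: comment 17 below quotes the old RE as 'ceph-osd@[0-9]{1,2}.service', which caps the OSD ID at two digits. A minimal sketch of why that misses three-digit IDs, assuming only the quantifier differs from the pre-fix playbook line, which is not reproduced here:)

$ echo 'ceph-osd@100.service' | grep -Eo 'ceph-osd@[0-9]{1,2}.service'   # old two-digit cap: no output, grep exits 1
$ echo 'ceph-osd@100.service' | grep -Eo 'ceph-osd@[0-9]{1,3}.service'   # workaround above: prints ceph-osd@100.service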

Comment 4 Matt Wisch 2018-09-18 16:00:42 UTC
{1,3} is just what I used to work around the issue with 200 OSDs, but {1,} may be better to avoid capping it artificially again.
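
An unbounded variant of the filter from the description, assuming only the ceph-osd quantifier changes (a sketch, not necessarily the exact line that landed upstream):

systemctl list-units | grep "loaded active" | grep -Eo 'ceph-osd@[0-9]{1,}.service|ceph-disk@dev-[a-z]{3,4}[0-9]{1}.service'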

Comment 5 Sébastien Han 2018-10-12 15:34:41 UTC
Present in https://github.com/ceph/ceph-ansible/releases/tag/v3.0.46

Comment 16 Vasishta 2018-10-29 10:43:46 UTC
Hi,

By looking at the fix [1], we can see that the only change is to a regular expression.
I think we can verify it by checking whether the updated RE matches OSD IDs above 99, unlike the old RE.

There was a similar bug, BZ 1612854, which was verified in a similar manner.

Regards,
Vasishta Shastry
QE, Ceph

Comment 17 Vasishta 2018-10-29 13:01:13 UTC
Following the plan mentioned in comment 16:

I added OSD service names with IDs 0-10 and 100-110 to a test file and tried to match all the names with each RE. The old RE matched only IDs 0-10, while the new RE matched all the service names.

$ for i in {0..10};do echo ceph-osd@$i.service >> test_file ;done
$ for i in {100..110};do echo ceph-osd@$i.service >> test_file ;done

$ grep -Eo 'ceph-osd@[0-9]{1,2}.service' test_file
ceph-osd@0.service
.
.
ceph-osd@10.service


$ grep -Eo 'ceph-osd@[0-9]+.service' test_file
ceph-osd@0.service
.
.
ceph-osd@10.service
ceph-osd@100.service
.
.
ceph-osd@110.service
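
The same check can be condensed by counting matching lines; the expected counts below assume the 22-entry test_file built above:

$ grep -Ec 'ceph-osd@[0-9]{1,2}.service' test_file   # 11, only IDs 0-10 match the old RE
$ grep -Ec 'ceph-osd@[0-9]+.service' test_file       # 22, IDs 0-10 and 100-110 all match the new RE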


We are planning to move this BZ to VERIFIED on the morning of 30 Oct (IST).
Please let us know if there are any concerns or suggestions.

Regards,
Vasishta Shastry
QE, Ceph

Comment 20 Sébastien Han 2018-10-31 10:45:04 UTC
lgtm Bara thanks!

Comment 22 errata-xmlrpc 2018-11-09 01:00:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3530

