Bug 1561456
Summary: | [ceph-ansible] [ceph-container] : shrink OSD with NVMe disks - failing as OSD services are not stopped | ||||||
---|---|---|---|---|---|---|---|
Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Vasishta <vashastr> | ||||
Component: | Ceph-Ansible | Assignee: | Sébastien Han <shan> | ||||
Status: | CLOSED ERRATA | QA Contact: | Vasishta <vashastr> | ||||
Severity: | high | Docs Contact: | Erin Donnelly <edonnell> | ||||
Priority: | unspecified | ||||||
Version: | 3.0 | CC: | adeza, agunn, aschoen, ceph-eng-bugs, edonnell, gmeno, hnallurv, jquinn, kdreyer, nthomas, sankarshan, shan, tchandra | ||||
Target Milestone: | z3 | ||||||
Target Release: | 3.0 | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | RHEL: ceph-ansible-3.0.32-1.el7cp Ubuntu: ceph-ansible_3.0.32-2redhat1 | Doc Type: | Bug Fix | ||||
Doc Text: |
.The `shrink-osd` playbook supports NVMe drives
Previously, the `shrink-osd` Ansible playbook did not support shrinking OSDs backed by an NVMe drive. NVMe drive support has been added in this release.
|
Story Points: | --- | ||||
Clone Of: | Environment: | ||||||
Last Closed: | 2018-05-15 18:20:31 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1553254, 1557269, 1572368, 1600697 | ||||||
Attachments: |
|
Can you check if the container is still running? Perhaps we tried to stop the wrong service. Thanks.

Hi Sebastien,

As I remember, the container was still running. (Unfortunately, I don't have the environment as of now.) I think we must have tried to stop the wrong service, as I see "name": "ceph-osd@nvme0n1p" in the log (attachment). Following the naming convention, the service name should have been "ceph-osd@nvme0n1". The logic we have in shrink-osd.yml [1] to find the service name does not seem to work for NVMe disks:

    - name: stop osd services (container)
      service:
        name: "ceph-osd@{{ item.0.stdout[:-1] | regex_replace('/dev/', '') }}"

I think it would have been fine if we could use "item.0.stdout[:-2]", but only for NVMe disks.

[1] https://github.com/ceph/ceph-ansible/blob/37117071ebb7ab3cf68b607b6760077a2b46a00d/infrastructure-playbooks/shrink-osd.yml#L119-L121

Regards,
Vasishta Shastry
AQE, Ceph

*** Bug 1555793 has been marked as a duplicate of this bug. ***

Will be in the next release, v3.0.32.

Working fine with ceph-ansible-3.0.32-1.el7cp.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1563
|
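The truncation problem described above can be shown with a minimal Python sketch of the playbook's string handling (the playbook itself uses Jinja2; the function names and the regex-based alternative here are illustrative only, not the actual patch shipped in ceph-ansible 3.0.32):

```python
import re

def broken_service_name(partition: str) -> str:
    # Mirrors the playbook expression: stdout[:-1] | regex_replace('/dev/', '')
    # i.e. drop the last character, then strip the /dev/ prefix.
    return "ceph-osd@" + partition[:-1].replace("/dev/", "")

def partition_aware_service_name(partition: str) -> str:
    # Strip the whole partition suffix instead of a single character:
    # "sdb1" -> "sdb", "nvme0n1p1" -> "nvme0n1".
    dev = partition.replace("/dev/", "")
    return "ceph-osd@" + re.sub(r"p?\d+$", "", dev)

print(broken_service_name("/dev/sdb1"))                # ceph-osd@sdb (works)
print(broken_service_name("/dev/nvme0n1p1"))           # ceph-osd@nvme0n1p (wrong unit)
print(partition_aware_service_name("/dev/nvme0n1p1"))  # ceph-osd@nvme0n1
```

Dropping one character happens to work for SCSI/SATA partitions (`sdb1`), but NVMe partitions carry a `p<N>` suffix, so the derived systemd unit name keeps a stray `p` and the wrong service is stopped.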
Created attachment 1414140 [details]
File contains ansible-playbook log

Description of problem:
Shrinking an OSD backed by an NVMe disk fails in the task "deallocate osd(s) id when ceph-disk destroy fail" with:
"Error EBUSY: osd.<id> is still up; must be down before removal. "

Version-Release number of selected component (if applicable):
ceph-ansible-3.0.28-1.el7cp.noarch

How reproducible:
Always (3/3)

Steps to Reproduce:
1. Configure a containerized cluster with NVMe disks for OSDs.
2. Try to shrink an OSD.

Actual results:
TASK [deallocate osd(s) id when ceph-disk destroy fail] --------------------------------
"stderr_lines": [
    "Error EBUSY: osd.7 is still up; must be down before removal. "
],

Expected results:
The OSD is removed successfully.

Additional info:
The task TASK [stop osd services (container)] completed with status 'ok', even though the OSD was still up.