Bug 1275631
| Field | Value |
|---|---|
| Summary | After replacing the disk, ceph osd tree still showing the old disk entry |
| Product | [Red Hat Storage] Red Hat Ceph Storage |
| Reporter | Tanay Ganguly <tganguly> |
| Component | RADOS |
| Assignee | Samuel Just <sjust> |
| Status | CLOSED DUPLICATE |
| QA Contact | ceph-qe-bugs <ceph-qe-bugs> |
| Severity | high |
| Priority | unspecified |
| Version | 1.3.1 |
| CC | ceph-eng-bugs, dzafman, hnallurv, jdurgin, kchai, kdreyer, kurs, sjust |
| Target Milestone | rc |
| Target Release | 1.3.1 |
| Hardware | x86_64 |
| OS | Linux |
| Doc Type | Bug Fix |
| Type | Bug |
| Last Closed | 2015-11-03 18:11:27 UTC |
| Attachments | Calamari_GUI, Calamari_Dashboard, Log |
Description
Tanay Ganguly
2015-10-27 11:45:40 UTC
Sam, mind looking into this one (or re-assigning as appropriate)? Is this a bug in the docs (replace-osds.adoc), or something else?

Some more information: the ceph -s output shows 29 OSDs, but only 28 up and in.

Snippet:
osdmap e695: 29 osds: 28 up, 28 in

The Calamari GUI also shows the wrong information: OSD 28/29 In & Up, 1 down. PFA the screenshots (Dashboard and OSD Workbench).

Created attachment 1087142 [details]
Calamari_GUI
Created attachment 1087143 [details]
Calamari_Dashboard
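For context, a hedged sketch of how the mismatch described above could be inspected from the CLI (the osd id is taken from the later comments; the grep patterns and expected output are illustrative assumptions, not captured from this cluster):

    ceph -s | grep osdmap        # e.g. "osdmap e695: 29 osds: 28 up, 28 in"
    ceph osd tree                # the stale, replaced disk stays listed as down under its host
    ceph osd dump | grep "^osd"  # per-osd up/in state; the leftover id never comes up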
After I restart the newly added OSD, I am seeing an I/O error, as it is still pointing to the ceph-7 directory. But osd.22 is getting started. PFA the log of osd.22.

[root@cephqe5 ~]# /etc/init.d/ceph stop osd.22
find: ‘/var/lib/ceph/osd/ceph-7’: Input/output error
=== osd.22 ===
Stopping Ceph osd.22 on cephqe5...kill 66039...kill 66039...done

[root@cephqe5 ~]# /etc/init.d/ceph start osd.22
find: ‘/var/lib/ceph/osd/ceph-7’: Input/output error
=== osd.22 ===
create-or-move updated item name 'osd.22' weight 1.09 at location {host=cephqe5,root=default} to crush map
Starting Ceph osd.22 on cephqe5...
Running as unit run-71865.service.

Created attachment 1087157 [details]
Log
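As a way to confirm what the log above suggests (the new daemon tripping over the stale ceph-7 directory), a hypothetical check on the OSD host; the paths assume the standard /var/lib/ceph/osd layout and the ids are the ones from this bug:

    ls /var/lib/ceph/osd/                  # both ceph-7 (stale) and ceph-22 (new) directories present
    cat /var/lib/ceph/osd/ceph-7/whoami    # an I/O error here points at a dead mount left by the old disk
    cat /var/lib/ceph/osd/ceph-22/whoami   # should print 22 for the replacement OSD
    mount | grep /var/lib/ceph/osd         # shows which device (if any) still backs ceph-7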
I don't know why osd.7 wasn't found as the next open slot. I have not been able to reproduce this on v0.94.3. I even did rm/create multiple times in a tight loop to check for a race condition. If the customer created a new OSD before removing the old one, that would explain this. Or if the customer created 2 new OSDs (osd.7/osd.22) and removed osd.7 a second time.

The previous comment is not important. Based on the bug description, the instructions are wrong because ceph-deploy must also be doing a "ceph osd create". New instructions are pending, and I've noted a concern that the new instructions might lead to the same problem, and how to fix it.

I've added a comment to https://bugzilla.redhat.com/show_bug.cgi?id=1210539 and am marking this a duplicate.

*** This bug has been marked as a duplicate of bug 1210539 ***
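For reference, a sketch of the removal sequence implied by the comment above, i.e. fully freeing the old id before ceph-deploy runs "ceph osd create"; this is an illustration using osd.7 from this bug, not the wording of the updated replace-osds.adoc instructions:

    /etc/init.d/ceph stop osd.7     # on the OSD host, stop the failed daemon
    ceph osd out osd.7              # mark it out so data rebalances away
    ceph osd crush remove osd.7     # drop it from the CRUSH map, so ceph osd tree no longer lists it
    ceph auth del osd.7             # remove its cephx key
    ceph osd rm 7                   # free the id in the OSD map
    # only after "ceph osd rm" will "ceph osd create" hand back id 7 instead of allocating 22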