Bug 1265973
Summary: After an upgrade from 1.1 to 1.3 through 1.2.3, the OSD process crashes.

| Field | Value |
|---|---|
| Product | [Red Hat Storage] Red Hat Ceph Storage |
| Component | RADOS |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | unspecified |
| Version | 1.3.0 |
| Target Milestone | rc |
| Target Release | 1.3.1 |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | shilpa <smanjara> |
| Assignee | Samuel Just <sjust> |
| QA Contact | ceph-qe-bugs <ceph-qe-bugs> |
| CC | ceph-eng-bugs, dzafman, flucifre, kchai, kdreyer, sjust, vakulkar |
| Fixed In Version | ceph-0.94.3-2.el7cp (RHEL), Ceph v0.94.3.2 (Ubuntu) |
| Doc Type | Known Issue |
| Type | Bug |
| Last Closed | 2015-11-23 20:22:44 UTC |
| Bug Blocks | 1230323 |

Doc Text:

> OSD fails after upgrading from Ceph version 1.1 to 1.3
> When Ceph version 1.3 creates a new Object Storage Device (OSD) on a Ceph cluster where monitors still have maps created with Ceph version 1.1, or the new OSD has not communicated with a monitor since Ceph version 1.1, the OSD process terminates unexpectedly.
Comment 6 (Kefu Chai, 2015-09-24 10:57:01 UTC)
ktdreyer suspects that there could be some special config option which could trigger the problem, because we already have firefly x hammer upgrade test suites in ceph-qa-suites. Sam, would you mind looking into this OSD crash, or else re-assigning as appropriate?

Sam: upgrade/restart the monitors one after another, then upgrade and restart the OSDs one after another. This is what I learnt from Shilpa.
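For reference, a minimal sketch of that rolling-upgrade order, assuming sysvinit service control as used by Ceph 0.94-era packages; the package-manager invocation and the daemon IDs `mon.a` and `osd.0` are hypothetical, not taken from this bug:

```sh
#!/bin/sh
# Sketch of the rolling upgrade order described above (assumptions noted
# in the lead-in; mon.a and osd.0 are hypothetical daemon IDs).

# 1. Upgrade and restart each monitor, one host at a time.
sudo yum update -y ceph              # apt-get on Ubuntu
sudo /etc/init.d/ceph restart mon.a
ceph quorum_status                   # confirm the restarted mon rejoined quorum

# 2. Only after ALL mons are upgraded: upgrade and restart each OSD,
#    again one at a time.
sudo yum update -y ceph
sudo /etc/init.d/ceph restart osd.0
ceph -s                              # wait for the cluster to settle before the next OSD
```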
Created attachment 1076658 [details]
10.12.27.11 monstore
Created attachment 1076829 [details]
Reproducer yaml
From Sam in #rh-ceph today: This bug is caused by starting a new hammer OSD on a cluster where the mons still have maps created in dumpling. Alternatively, start a hammer OSD which has not spoken to a mon since dumpling. We are keeping this in the 1.3.1 release.

Per Ken, QE should re-do the test, making sure that they bring their 1.1 -> 1.2 cluster up to "active+clean" before proceeding to Hammer (a sketch of this gating check follows at the end of this thread).

Added to Known Issue tracker (1262054) for the Doc team to add to release notes.

Re-assigning to correct known-issue tracker (1.3.0). Thanks Harish, good eyes!

Verified on ceph-0.94.3-3.el7cp.x86_64. Upgraded from 1.1 (RHEL 6.7) -> 1.2.3 -> 1.3.1 (RHEL 7.1). No crashes found. I/Os are running fine.

```
# ceph health
HEALTH_OK
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2015:2512

The same resolution was also delivered through a second advisory:
https://access.redhat.com/errata/RHSA-2015:2066
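For the record, a minimal sketch of the "active+clean" gate Ken describes above, to run before moving the 1.2 cluster to Hammer. Only `ceph health` and `ceph pg stat` are standard CLI commands; the polling loop around them is illustrative, not a supported tool:

```sh
#!/bin/sh
# Illustrative gate: do not start the Hammer (1.3) upgrade until the
# 1.1 -> 1.2 cluster reports full health, i.e. all PGs are active+clean.
until ceph health | grep -q HEALTH_OK; do
    echo "cluster not yet healthy: $(ceph pg stat)"
    sleep 10
done
echo "HEALTH_OK reached; safe to proceed with the Hammer upgrade"
```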