
Bug 1488149

Summary: [ceph-container] - dmcrypt - osds failed to start after node reboot
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Vasishta <vashastr>
Component: Container
Assignee: Sébastien Han <shan>
Status: CLOSED CURRENTRELEASE
QA Contact: Harish NV Rao <hnallurv>
Severity: high
Docs Contact:
Priority: unspecified
Version: 2.4
CC: dang, hchen, jim.curtis, jschluet, kdreyer, pprakash, seb, shan, tserlin
Target Milestone: rc
Target Release: 2.4
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ceph-2-rhel-7-docker-candidate-30274-20170914211241
Doc Type: Bug Fix
Doc Text:
.Encrypted containerized OSDs start as expected after a reboot
Encrypted containerized OSD daemons failed to start after a reboot. In addition, the following message was added to the OSD log file:
----
filestore(/var/lib/ceph/osd/bb-1) mount failed to open journal /var/lib/ceph/osd/bb-1/journal: (2) No such file or directory
----
This bug has been fixed, and such OSDs now start as expected in this situation.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-05-30 15:28:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1503598    
Bug Blocks: 1473436, 1479701    
Attachments:
File contains log snippet of an OSD service (flags: none)

Description Vasishta 2017-09-04 12:59:00 UTC
Created attachment 1321841 [details]
File contains log snippet of an OSD service

Description of problem:

After upgrading a cluster (OSDs with dmcrypt and dedicated journals) from 2.3 to 2.4, the OSDs fail to start when the OSD node is rebooted; the mount fails with 'failed to open journal'.

It is not clear whether this issue depends on the upgrade, as the upgrade itself worked fine: the OSD services started and ran correctly after following the upgrade procedure (service start).
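The symptom is consistent with the OSD's journal symlink pointing at a device node that was not recreated after reboot (i.e. the dm-crypt mapping for the journal partition never being opened). A minimal sketch of that failure mode, using a throwaway dangling symlink rather than a real dm-crypt device (all paths here are hypothetical):

```shell
# Sketch (assumption): /var/lib/ceph/osd/<cluster>-<id>/journal is a symlink
# to a device node that dm-crypt creates when the mapping is opened at boot.
# If the mapping is missing, the symlink dangles and open() fails with
# ENOENT -- the "(2) No such file or directory" seen in the OSD log.
workdir=$(mktemp -d)
ln -s "$workdir/dm-journal" "$workdir/journal"   # no device behind the link
err=$(cat "$workdir/journal" 2>&1 || true)       # stands in for the OSD's open()
echo "$err"                                      # reports: No such file or directory
```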

Version-Release number of selected component (if applicable):
ceph version 10.2.7-32.el7cp
brew-pulp-docker01.web.<-->:8888/rhceph:2.4

How reproducible:
Always (2/2)

Steps to Reproduce:
1. Upgrade a containerized ceph cluster with encrypted OSDs and dedicated journals from 2.3 to 2.4
2. Reboot an OSD Node 


Actual results:
OSD services do not start after the node reboot.
Log snippet - filestore(/var/lib/ceph/osd/bb-1) mount failed to open journal /var/lib/ceph/osd/bb-1/journal: (2) No such file or directory

Expected results:
OSD services must start after a node reboot.

Comment 3 seb 2017-09-05 13:11:49 UTC
There is a fix for this in 3.0.
I'm not sure how to do a backport for this.

@Ken, which branch should I use to backport a fix?
Thanks!

Comment 5 seb 2017-09-05 14:13:14 UTC
lgtm.

Comment 10 Sébastien Han 2017-10-18 09:03:14 UTC
LGTM, thanks

Comment 11 Jon Schlueter 2017-10-18 14:14:35 UTC
The latest build of the ceph2 container image, rhceph-rhel7-docker-2.4-3, is broken; see bug 1503598. The revert was not sufficient or complete to get us back to something that works.


$ docker run  -it  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:2.4-3  version
common_functions.sh: line 3: disk_list.sh: No such file or directory
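That error pattern is what you get when a script sources a helper by bare filename: bash looks the name up via $PATH and falls back to the current directory, so the same image can work or break depending on where the entrypoint runs from. A hedged illustration of the mechanism (not the actual container scripts; all files here are created in a temp directory):

```shell
# Hypothetical reconstruction: common_functions.sh sources disk_list.sh by
# bare filename, so the lookup depends on $PATH and the current directory.
workdir=$(mktemp -d)
echo 'echo helper-loaded' > "$workdir/disk_list.sh"
echo 'source disk_list.sh' > "$workdir/common_functions.sh"

# From an unrelated directory the helper is not found, mirroring the error:
fail_out=$(cd / && bash "$workdir/common_functions.sh" 2>&1 || true)
echo "$fail_out"   # contains: disk_list.sh: No such file or directory

# From the helper's own directory (or if sourced by full path) it loads fine:
ok_out=$(cd "$workdir" && bash common_functions.sh)
echo "$ok_out"     # helper-loaded
```

Sourcing helpers by an absolute path (e.g. relative to the script's own location) avoids this class of breakage.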

Comment 14 Ken Dreyer (Red Hat) 2018-05-30 15:28:13 UTC
Please reopen if this is still an issue.