Bug 1395171 - [ceph-ansible]: When rolling-upgrading a cluster from 2.0 to 2.1 with only encrypted OSDs, the start OSD task fails
Keywords:
Status: CLOSED DUPLICATE of bug 1366808
Alias: None
Product: Red Hat Storage Console
Classification: Red Hat Storage
Component: ceph-ansible
Version: 2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Sébastien Han
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-11-15 10:36 UTC by Tejas
Modified: 2020-02-14 18:08 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-15 11:50:41 UTC
Embargoed:


Attachments
ansible playbook log (303.70 KB, text/plain)
2016-11-15 10:36 UTC, Tejas

Description Tejas 2016-11-15 10:36:40 UTC
Created attachment 1220799
ansible playbook log

Description of problem:
      I have a ceph 2.0 cluster with colocated and dedicated-journal encrypted OSDs. While upgrading it to 2.1 using the rolling update playbook, the "start ceph osds (systemd)" task fails as follows:
TASK: [start ceph osds (systemd)] ********************************************* 
failed: [magna056] => (item=0) => {"failed": true, "item": "0"}
msg: Job for ceph-osd failed because start of the service was attempted too often. See "systemctl status ceph-osd" and "journalctl -xe" for details.
To force a start use "systemctl reset-failed ceph-osd" followed by "systemctl start ceph-osd" again.

ok: [magna056] => (item=3)
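
Not part of the original report, but for reference: the manual recovery that the error message points at would look roughly like this on the failing OSD node (magna056). The "@0" instance name is an assumption based on the failing item above; the plain "ceph-osd" unit named in the Ansible message may in fact be the templated ceph-osd@<id>.service on a systemd-managed Jewel OSD.

# Sketch only; unit name assumed from the failing item=0 above.
systemctl status ceph-osd@0.service       # see why the start was attempted too often
journalctl -xe -u ceph-osd@0.service      # recent journal entries for the unit
systemctl reset-failed ceph-osd@0.service
systemctl start ceph-osd@0.service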




Version-Release number of selected component (if applicable):
ceph-ansible-1.0.5-44.el7scon.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a ceph 2.0 cluster using ceph-ansible with 1 colocated encrypted OSD on each node.
2. Create 1 dedicated-journal encrypted OSD using ceph-deploy, since that is not supported in ceph-ansible.
3. Try to upgrade the cluster to 2.1 using ceph-ansible (a sketch of the commands is shown below).
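
Not from the report itself: a minimal sketch of what steps 2 and 3 can look like on the admin node. The ceph-deploy host/disk arguments are assumptions taken from the "ceph-disk list" output further below, and the playbook location and inventory path assume a stock RHCS package install; none of these are stated in the report.

# Step 2 (sketch): dedicated-journal encrypted OSD via ceph-deploy.
ceph-deploy osd create --dmcrypt magna092:sdc:sdd

# Step 3 (sketch): rolling upgrade via ceph-ansible.
cd /usr/share/ceph-ansible
ansible-playbook rolling_update.yml -i /etc/ansible/hosts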



Additional info:

The playbook log is attached to this bug.
The cluster was created from the CDN with all the standard options.

devices:
  - /dev/sdb

dmcrypt_journal_collocation: true
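
These two settings would normally sit in the OSD group variables picked up by ceph-ansible. A quick way to confirm what the playbook actually read (the path below is an assumption based on a stock package install with the vars copied to group_vars/osds.yml; the report does not state where they live):

# Assumed path; adjust to wherever the group_vars file actually is.
grep -n -E 'devices|dmcrypt_journal_collocation' /usr/share/ceph-ansible/group_vars/osds.yml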


[root@magna013 ~]# ceph -s
    cluster b8b5f3c0-4da0-4601-8ad0-eae086af3723
     health HEALTH_WARN
            pool default.rgw.buckets.data has many more objects per pg than average (too few pgs?)
     monmap e1: 3 mons at {magna003=10.8.128.3:6789/0,magna013=10.8.128.13:6789/0,magna023=10.8.128.23:6789/0}
            election epoch 22, quorum 0,1,2 magna003,magna013,magna023
     osdmap e66: 6 osds: 6 up, 6 in
            flags sortbitwise
      pgmap v8147: 152 pgs, 12 pools, 313 GB data, 83791 objects
            940 GB used, 4615 GB / 5556 GB avail
                 152 active+clean
  client io 6678 B/s rd, 43844 kB/s wr, 7 op/s rd, 107 op/s wr



OSD setup common to all the OSD nodes:

[root@magna092 ~]# ceph-disk list
/dev/dm-0 other, unknown
/dev/dm-1 other, xfs, mounted on /var/lib/ceph/osd/ceph-1
/dev/dm-2 other, unknown
/dev/dm-3 other, xfs, mounted on /var/lib/ceph/osd/ceph-4
/dev/sda :
 /dev/sda1 other, ext4, mounted on /
/dev/sdb :
 /dev/sdb2 ceph journal (dmcrypt LUKS /dev/dm-0), for /dev/sdb1
 /dev/sdb3 ceph lockbox, active, for /dev/sdb1
 /dev/sdb1 ceph data (dmcrypt LUKS /dev/dm-1), cluster ceph, osd.1, journal /dev/sdb2
/dev/sdc :
 /dev/sdc3 ceph lockbox, active, for /dev/sdc1
 /dev/sdc1 ceph data (dmcrypt LUKS /dev/dm-3), cluster ceph, osd.4, journal /dev/sdd1
/dev/sdd :
 /dev/sdd1 ceph journal (dmcrypt LUKS /dev/dm-2), for /dev/sdc1
[root@magna092 ~]# 
[root@magna092 ~]# 
[root@magna092 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       917G  2.7G  868G   1% /
devtmpfs         16G     0   16G   0% /dev
tmpfs            16G     0   16G   0% /dev/shm
tmpfs            16G  8.7M   16G   1% /run
tmpfs            16G     0   16G   0% /sys/fs/cgroup
tmpfs           3.2G     0  3.2G   0% /run/user/1000
/dev/sdb3       8.7M  179K  7.9M   3% /var/lib/ceph/osd-lockbox/7d213946-3019-4662-b2c7-85d650a05404
/dev/dm-1       922G  119G  803G  13% /var/lib/ceph/osd/ceph-1
/dev/sdc3       8.7M  179K  7.9M   3% /var/lib/ceph/osd-lockbox/605dfe9b-0d1d-45dc-a381-3dc5d318e07f
/dev/dm-3       932G  197G  735G  22% /var/lib/ceph/osd/ceph-4

Comment 2 Harish NV Rao 2016-11-15 11:50:41 UTC
Closing this as a duplicate of bug 1366808, since the same functionality will be tested through bug 1366808 once it is fixed.

*** This bug has been marked as a duplicate of bug 1366808 ***

