Bug 1644265

Summary: ceph-volume simple command failing for Containerized cluster
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Ramakrishnan Periyasamy <rperiyas>
Component: Ceph-Volume
Assignee: Alfredo Deza <adeza>
Status: CLOSED ERRATA
QA Contact: Parikshith <pbyregow>
Severity: high
Priority: high
Docs Contact:
Version: 3.2
CC: adeza, ceph-eng-bugs, ceph-qe-bugs, hnallurv, pasik, rperiyas, tserlin
Target Milestone: rc
Target Release: 3.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: RHEL: ceph-12.2.8-32.el7cp; Ubuntu: ceph_12.2.8-31redhat1
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
    Bug report
    ceph version:
    ansible-2.6.6-1.el7ae.noarch
    ceph-ansible-3.2.0-0.1.beta9.el7cp.noarch
    ceph version 12.2.8-23.el7cp (aa5600d395653e8260105d6a44e140b77d79f952) luminous (stable)
Last Closed: 2019-01-03 19:02:13 UTC
Type: Bug

Description Ramakrishnan Periyasamy 2018-10-30 10:39:18 UTC
Description of problem:
Configured a ceph-disk based containerized cluster and tried to migrate the OSDs to ceph-volume, but the ceph-volume simple scan command does not work.
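
For reference, the attempted migration follows the usual ceph-volume simple workflow (a rough sketch; <osd-id> and <osd-fsid> are placeholders whose real values come from the JSON written by the scan step):

    ceph-volume simple scan /var/lib/ceph/osd/ceph-3     # writes /etc/ceph/osd/<osd-id>-<osd-fsid>.json
    ceph-volume simple activate <osd-id> <osd-fsid>      # mounts the OSD and enables it again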

Command outputs:
------------------
[ubuntu@magna060 ~]$ sudo docker ps
CONTAINER ID        IMAGE                                                                                                                   COMMAND             CREATED             STATUS              PORTS               NAMES
7b63937778ee        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-3.2-rhel-7-containers-candidate-46791-20181026171445   "/entrypoint.sh"    About an hour ago   Up About an hour                        ceph-osd-magna060-sdb
4f253803614f        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-3.2-rhel-7-containers-candidate-46791-20181026171445   "/entrypoint.sh"    About an hour ago   Up About an hour                        ceph-osd-magna060-sdc
[ubuntu@magna060 ~]$ sudo docker exec -it ceph-osd-magna060-sdb bash
[root@magna060 /]# lsblk
NAME                                     MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                        8:0    0 931.5G  0 disk  
└─sda1                                     8:1    0 931.5G  0 part  /var/lib/ceph
sdb                                        8:16   0 931.5G  0 disk  
├─sdb1                                     8:17   0   100M  0 part  
│ └─26caa252-66a1-490d-8374-08874d3b2dda 253:0    0    98M  0 crypt /var/lib/ceph/osd/ceph-3
├─sdb2                                     8:18   0 931.4G  0 part  
│ └─166dfd82-7e5a-46be-9239-401937f9a8ad 253:2    0 931.4G  0 crypt 
└─sdb5                                     8:21   0    10M  0 part  /var/lib/ceph/osd-lockbox/26caa252-66a1-490d-8374-08874d3b2dda
sdc                                        8:32   0 931.5G  0 disk  
├─sdc1                                     8:33   0   100M  0 part  
│ └─f90f4093-ea5d-4d4e-b2e7-bc28d21f5018 253:1    0    98M  0 crypt 
├─sdc2                                     8:34   0 931.4G  0 part  
│ └─9054fad8-fb19-4035-8942-0445bf2374bd 253:3    0 931.4G  0 crypt 
└─sdc5                                     8:37   0    10M  0 part  
sdd                                        8:48   0 931.5G  0 disk  
├─sdd1                                     8:49   0     1G  0 part  
│ └─693a30fa-20c4-4ee1-89bd-2c24e198200e 253:4    0  1022M  0 crypt 
├─sdd2                                     8:50   0   576M  0 part  
│ └─889d6fdf-0034-4d6e-aa41-43588aaaada6 253:6    0   574M  0 crypt 
├─sdd3                                     8:51   0     1G  0 part  
│ └─5c8c69e0-aac5-432b-917c-d140c7bdf93d 253:5    0  1022M  0 crypt 
└─sdd4                                     8:52   0   576M  0 part  
  └─0f9b0107-0fc9-45d0-ba90-ad742c8b213a 253:7    0   574M  0 crypt 

[root@magna060 /]# ceph-volume simple scan /dev/sdb1
Running command: /usr/sbin/cryptsetup status /dev/sdb1
 stderr: Device sdb1 not found
-->  RuntimeError: Lockbox partition was not found for device: /dev/sdb1

[root@magna060 /]# ceph-volume simple scan /var/lib/ceph/osd/ceph-3
 stderr: lsblk: /var/lib/ceph/osd/ceph-3: not a block device
 stderr: lsblk: /var/lib/ceph/osd/ceph-3: not a block device
Running command: /usr/sbin/cryptsetup status /dev/mapper/26caa252-66a1-490d-8374-08874d3b2dda
Running command: /usr/sbin/cryptsetup status 26caa252-66a1-490d-8374-08874d3b2dda
Running command: mount -v  /tmp/tmp69BA66
 stderr: mount:  is write-protected, mounting read-only
 stderr: mount: unknown filesystem type '(null)'
-->  RuntimeError: command returned non-zero exit status: 32
[root@magna060 /]# 


Execution of the ceph-volume simple command from outside the container
--------------------------------------------------------------
[ubuntu@magna060 ~]$ lsblk
NAME                                     MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                        8:0    0 931.5G  0 disk  
└─sda1                                     8:1    0 931.5G  0 part  /
sdb                                        8:16   0 931.5G  0 disk  
├─sdb1                                     8:17   0   100M  0 part  
│ └─26caa252-66a1-490d-8374-08874d3b2dda 253:0    0    98M  0 crypt 
├─sdb2                                     8:18   0 931.4G  0 part  
│ └─166dfd82-7e5a-46be-9239-401937f9a8ad 253:2    0 931.4G  0 crypt 
└─sdb5                                     8:21   0    10M  0 part  
sdc                                        8:32   0 931.5G  0 disk  
├─sdc1                                     8:33   0   100M  0 part  
│ └─f90f4093-ea5d-4d4e-b2e7-bc28d21f5018 253:1    0    98M  0 crypt 
├─sdc2                                     8:34   0 931.4G  0 part  
│ └─9054fad8-fb19-4035-8942-0445bf2374bd 253:3    0 931.4G  0 crypt 
└─sdc5                                     8:37   0    10M  0 part  
sdd                                        8:48   0 931.5G  0 disk  
├─sdd1                                     8:49   0     1G  0 part  
│ └─693a30fa-20c4-4ee1-89bd-2c24e198200e 253:4    0  1022M  0 crypt 
├─sdd2                                     8:50   0   576M  0 part  
│ └─889d6fdf-0034-4d6e-aa41-43588aaaada6 253:6    0   574M  0 crypt 
├─sdd3                                     8:51   0     1G  0 part  
│ └─5c8c69e0-aac5-432b-917c-d140c7bdf93d 253:5    0  1022M  0 crypt 
└─sdd4                                     8:52   0   576M  0 part  

[ubuntu@magna060 ~]$ sudo docker exec ceph-osd-magna060-sdb ceph-volume simple scan /dev/sdb1
-->  RuntimeError: Lockbox partition was not found for device: /dev/sdb1
Running command: /usr/sbin/cryptsetup status /dev/sdb1
 stderr: Device sdb1 not found
[ubuntu@magna060 ~]$


Version-Release number of selected component (if applicable):
ansible-2.6.6-1.el7ae.noarch
ceph-ansible-3.2.0-0.1.beta9.el7cp.noarch
ceph version 12.2.8-23.el7cp (aa5600d395653e8260105d6a44e140b77d79f952) luminous (stable)

How reproducible:
2/2

Steps to Reproduce:
1. Configure a containerized cluster with ceph-disk OSDs.
2. Run the ceph-volume simple scan command inside or outside the container (example invocations below).
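
Example invocations (taken from the outputs above; the container name comes from docker ps):

    # inside the container
    sudo docker exec -it ceph-osd-magna060-sdb bash
    ceph-volume simple scan /dev/sdb1

    # from the host
    sudo docker exec ceph-osd-magna060-sdb ceph-volume simple scan /dev/sdb1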

Actual results:


Expected results:


Additional info:

Comment 3 Alfredo Deza 2018-10-30 11:32:59 UTC
ceph-volume relies on lsblk to capture the PARTLABEL value. In another issue seen with containers (https://tracker.ceph.com/issues/36098), lsblk failed to capture this value; I think that is why the lockbox partition is not being detected.

Since the OSD is running/mounted, could you try scanning the directory instead? That should work around this issue.

In this case, the directory would be: /var/lib/ceph/osd/ceph-3
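
For example, from the host that would be something like (container name and OSD path taken from the outputs above):

> sudo docker exec ceph-osd-magna060-sdb ceph-volume simple scan /var/lib/ceph/osd/ceph-3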

Comment 4 Alfredo Deza 2018-11-02 18:06:38 UTC
Ramakrishnan, can you run the following commands and paste the output so that we can confirm that it is indeed the lsblk command that is failing?

> sudo lsblk -P /dev/sdb1 -o PARTLABEL


And also:

> sudo blkid /dev/sdb1

Comment 19 errata-xmlrpc 2019-01-03 19:02:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0020