Bug 1666822 - ceph-volume does not always populate dictionary key rotational
Summary: ceph-volume does not always populate dictionary key rotational
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Volume
Version: 3.2
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 3.3
Assignee: Andrew Schoen
QA Contact: Eliad Cohen
Docs Contact: Bara Ancincova
URL:
Whiteboard:
Duplicates: 1674022 (view as bug list)
Depends On:
Blocks: 1578730 1726135
 
Reported: 2019-01-16 16:39 UTC by John Fulton
Modified: 2024-01-06 04:25 UTC (History)
CC List: 29 users

Fixed In Version: RHEL: ceph-12.2.12-26.el7cp Ubuntu: ceph_12.2.12-22redhat1xenial
Doc Type: Bug Fix
Doc Text:
.`ceph-volume` can determine whether a device is rotational even if the device is not in the `/sys/block/` directory
If the device name did not exist in the `/sys/block/` directory, the `ceph-volume` utility could not determine whether the device was rotational. This was the case, for example, for loopback devices or devices listed in the `/dev/disk/by-path/` directory. Consequently, the `lvm batch` subcommand failed. With this update, `ceph-volume` uses the `lsblk` command to determine whether a device is rotational if no information is found in `/sys/block/` for the given device. As a result, `lvm batch` works as expected in this case.
Clone Of:
Environment:
Last Closed: 2019-08-21 15:10:24 UTC
Embargoed:
kholtz: needinfo-


Attachments (Terms of Use)
Example workaround for OSPd (5.06 KB, application/gzip)
2019-03-29 03:01 UTC, John Fulton
no flags


Links
System ID Private Priority Status Summary Last Updated
Github ceph ceph pull 26957 0 'None' closed ceph-volume: look for rotational data in lsblk 2021-02-18 08:19:40 UTC
Github ceph ceph pull 26989 0 'None' closed luminous: ceph-volume: look for rotational data in lsblk 2021-02-18 08:19:43 UTC
Github ceph ceph pull 28060 0 'None' closed ceph-volume: use the Device.rotational property instead of sys_api 2021-02-18 08:19:40 UTC
Github ceph ceph pull 28519 0 'None' closed luminous: ceph-volume: use the Device.rotational property instead of sys_api 2021-02-18 08:19:40 UTC
Red Hat Issue Tracker RHCEPH-7346 0 None None None 2023-09-07 19:41:54 UTC
Red Hat Knowledge Base (Solution) 3954161 0 Configure None How do I control which devices are configured as WAL and DB devices when deploying Bluestore with OSPd? 2019-07-15 13:09:40 UTC
Red Hat Knowledge Base (Solution) 3974241 0 None None Overcloud ceph install fails with KeyError: 'rotational' 2019-07-16 20:37:59 UTC
Red Hat Product Errata RHSA-2019:2538 0 None None None 2019-08-21 15:10:49 UTC

Description John Fulton 2019-01-16 16:39:32 UTC
When using ceph-volume with ceph-nautilus (dev) [1] via ceph-ansible master and passing a loopback device [2] for the disk, my deployment fails because the generated dictionary does not contain the key 'rotational'. Could ceph-volume please handle this case and populate the dictionary to have a 'rotational' key? I'm not sure if the value the key maps to matters for this case.

Though loopback devices shouldn't be used in production, OpenStack's TripleO CI system uses a loopback device to simulate a block device for Ceph deployment, and this issue prevents us from getting our CI working with ceph-volume (it currently works with ceph-disk) with the limited resources we have.


[1]

[root@fultonj ~]# podman exec -ti 7224e510ead3 ceph --version
ceph version 14.0.1-2605-g6b17068 (6b170687d1b8ffc393eaf9194b615758049fcc40) nautilus (dev)
[root@fultonj ~]# 

[2]
    devices:
      - /dev/loop3

as created with:

dd if=/dev/zero of=/var/lib/ceph-osd.img bs=1 count=0 seek=7G
losetup /dev/loop3 /var/lib/ceph-osd.img
sgdisk -Z /dev/loop3
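For reference, a quick way to check what rotational information the kernel and lsblk expose for the device (illustrative commands, using /dev/loop3 from above):

cat /sys/block/loop3/queue/rotational    # 1 = rotational, 0 = non-rotational
lsblk --nodeps -o NAME,ROTA /dev/loop3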

2019-01-16 16:16:10,638 p=265329 u=root |  TASK [ceph-osd : read information about the devices] ***************************
2019-01-16 16:16:10,638 p=265329 u=root |  task path: /home/stack/ceph-ansible/roles/ceph-osd/tasks/main.yml:24
2019-01-16 16:16:10,638 p=265329 u=root |  Wednesday 16 January 2019  16:16:10 +0000 (0:00:00.049)       0:01:16.293 ***** 
2019-01-16 16:16:10,820 p=265329 u=root |  Using module file /usr/lib/python3.6/site-packages/ansible/modules/system/parted.py
2019-01-16 16:16:12,025 p=265329 u=root |  ok: [fultonj] => (item=/dev/loop3) => changed=false 
  disk:
    dev: /dev/loop3
    logical_block: 512
    model: Loopback device
    physical_block: 512
    size: 7168.0
    table: unknown
    unit: mib
  invocation:
    module_args:
      align: optimal
      device: /dev/loop3
      flags: null
      label: msdos
      name: null
      number: null
      part_end: 100%
      part_start: 0%
      part_type: primary
      state: info
      unit: MiB
  item: /dev/loop3
  partitions: []
  script: unit 'MiB' print
2019-01-16 16:16:12,058 p=265329 u=root |  TASK [ceph-osd : include check_gpt.yml] ****************************************
2019-01-16 16:16:12,058 p=265329 u=root |  task path: /home/stack/ceph-ansible/roles/ceph-osd/tasks/main.yml:31
2019-01-16 16:16:12,058 p=265329 u=root |  Wednesday 16 January 2019  16:16:12 +0000 (0:00:01.420)       0:01:17.713 ***** 
2019-01-16 16:16:12,080 p=265329 u=root |  skipping: [fultonj] => changed=false 
  skip_reason: Conditional result was False
2019-01-16 16:16:12,112 p=265329 u=root |  TASK [ceph-osd : include_tasks scenarios/collocated.yml] ***********************
2019-01-16 16:16:12,113 p=265329 u=root |  task path: /home/stack/ceph-ansible/roles/ceph-osd/tasks/main.yml:36
2019-01-16 16:16:12,113 p=265329 u=root |  Wednesday 16 January 2019  16:16:12 +0000 (0:00:00.054)       0:01:17.768 ***** 
2019-01-16 16:16:12,129 p=265329 u=root |  skipping: [fultonj] => changed=false 
  skip_reason: Conditional result was False
2019-01-16 16:16:12,162 p=265329 u=root |  TASK [ceph-osd : include_tasks scenarios/non-collocated.yml] *******************
2019-01-16 16:16:12,162 p=265329 u=root |  task path: /home/stack/ceph-ansible/roles/ceph-osd/tasks/main.yml:41
2019-01-16 16:16:12,162 p=265329 u=root |  Wednesday 16 January 2019  16:16:12 +0000 (0:00:00.049)       0:01:17.817 ***** 
2019-01-16 16:16:12,180 p=265329 u=root |  skipping: [fultonj] => changed=false 
  skip_reason: Conditional result was False
2019-01-16 16:16:12,212 p=265329 u=root |  TASK [ceph-osd : include_tasks scenarios/lvm.yml] ******************************
2019-01-16 16:16:12,212 p=265329 u=root |  task path: /home/stack/ceph-ansible/roles/ceph-osd/tasks/main.yml:47
2019-01-16 16:16:12,212 p=265329 u=root |  Wednesday 16 January 2019  16:16:12 +0000 (0:00:00.050)       0:01:17.867 ***** 
2019-01-16 16:16:12,230 p=265329 u=root |  skipping: [fultonj] => changed=false 
  skip_reason: Conditional result was False
2019-01-16 16:16:12,262 p=265329 u=root |  TASK [ceph-osd : include_tasks scenarios/lvm-batch.yml] ************************
2019-01-16 16:16:12,263 p=265329 u=root |  task path: /home/stack/ceph-ansible/roles/ceph-osd/tasks/main.yml:55
2019-01-16 16:16:12,263 p=265329 u=root |  Wednesday 16 January 2019  16:16:12 +0000 (0:00:00.050)       0:01:17.918 ***** 
2019-01-16 16:16:12,315 p=265329 u=root |  included: /home/stack/ceph-ansible/roles/ceph-osd/tasks/scenarios/lvm-batch.yml for fultonj
2019-01-16 16:16:12,359 p=265329 u=root |  TASK [ceph-osd : use ceph-volume lvm batch to create bluestore osds] ***********
2019-01-16 16:16:12,359 p=265329 u=root |  task path: /home/stack/ceph-ansible/roles/ceph-osd/tasks/scenarios/lvm-batch.yml:3
2019-01-16 16:16:12,360 p=265329 u=root |  Wednesday 16 January 2019  16:16:12 +0000 (0:00:00.096)       0:01:18.015 ***** 
2019-01-16 16:16:12,545 p=265329 u=root |  Using module file /home/stack/ceph-ansible/library/ceph_volume.py
2019-01-16 16:16:14,325 p=265329 u=root |  The full traceback is:
  File "/tmp/ansible_ceph_volume_payload_2k1mkdqh/__main__.py", line 602, in run_module
    report_result = json.loads(out)
  File "/usr/lib64/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None

2019-01-16 16:16:14,329 p=265329 u=root |  fatal: [fultonj]: FAILED! => changed=true 
  cmd:
  - podman
  - run
  - --rm
  - --privileged
  - --net=host
  - -v
  - /run/lock/lvm:/run/lock/lvm:z
  - -v
  - /var/run/udev/:/var/run/udev/:z
  - -v
  - /dev:/dev
  - -v
  - /etc/ceph:/etc/ceph:z
  - -v
  - /run/lvm/:/run/lvm/
  - -v
  - /var/lib/ceph/:/var/lib/ceph/:z
  - -v
  - /var/log/ceph/:/var/log/ceph/:z
  - --entrypoint=ceph-volume
  - docker.io/ceph/daemon:latest-master
  - --cluster
  - ceph
  - lvm
  - batch
  - --bluestore
  - --yes
  - --prepare
  - /dev/loop3
  - --report
  - --format=json
  invocation:
    module_args:
      action: batch
      batch_devices:
      - /dev/loop3
      block_db_size: '-1'
      cluster: ceph
      containerized: 'False'
      crush_device_class: ''
      data: null
      data_vg: null
      db: null
      db_vg: null
      dmcrypt: false
      journal: null
      journal_size: '5120'
      journal_vg: null
      objectstore: bluestore
      osds_per_device: 1
      report: false
      wal: null
      wal_vg: null
  msg: non-zero return code
  rc: 1
  stderr: '-->  KeyError: ''rotational'''
  stderr_lines:
  - '-->  KeyError: ''rotational'''
  stdout: ''
  stdout_lines: <omitted>
2019-01-16 16:16:14,330 p=265329 u=root |  NO MORE HOSTS LEFT *************************************************************
2019-01-16 16:16:14,330 p=265329 u=root |  PLAY RECAP *********************************************************************
2019-01-16 16:16:14,331 p=265329 u=root |  fultonj                    : ok=197  changed=5    unreachable=0    failed=1

Comment 1 Alfredo Deza 2019-01-16 18:35:45 UTC
I understand the need for using loopback devices, but these aren't supported for ceph-volume and I don't foresee adding that as a feature.

However, there are a couple of things here that should be noted:

1) this is still a bug, where ceph-volume is trusting that device objects will always have the "rotational" flag; the device would still be rejected, but
with an error message (vs. a traceback like today)

2) it is possible to get ceph-volume to work with loop devices and save resources; this is how ceph-volume is able to test rotational+NVMe devices, for example. In short:
 - finds an available loop device
 - creates a sparse file
 - attaches the sparse file onto the loop device
 - tells NVMe to make a target out of it

The last portion is what gets the kernel to recognize the loop device as a new NVMe device. The playbook is at:

https://github.com/ceph/ceph/blob/master/src/ceph-volume/ceph_volume/tests/functional/batch/playbooks/setup_mixed_type.yml
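In shell terms, the first three steps amount to roughly the following (a sketch only; the image path is hypothetical, and the final NVMe target step is what the linked playbook automates):

truncate -s 10G /var/lib/ceph-nvme.img      # create a sparse file
LOOPDEV=$(losetup -f)                       # find the first available loop device
losetup "$LOOPDEV" /var/lib/ceph-nvme.img   # attach the sparse file to it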

Comment 3 John Fulton 2019-02-14 19:49:15 UTC
(In reply to Alfredo Deza from comment #1)
> I understand the need for using loop back devices, but these aren't
> supported for ceph-volume and I don't foresee adding that as a feature.
> 
> However, there are a couple of things here that should be noted:
> 
> 1) this is still a bug, where ceph-volume is trusting that device objects
> will always have the "rotational" flag, the device would still be rejected
> but with an error message (vs. a traceback like today)

OK, I'm fine with you using this bug to solve the above issue if that's what you'd like to do. 

> 2) it is possible to get ceph-volume to work with loop devices and save
> resources, this is how ceph-volume is able to test rotational+NVMe devices
> for example. In short:
>  - finds an available loop device
>  - creates a sparse file
>  - attaches the sparse file onto the loop device
>  - tells NVMe to make a target out of it
> 
> The last portion sets everything right with the kernel recognizing the loop
> device as a new NVMe device. The playbook is at:
> 
> https://github.com/ceph/ceph/blob/master/src/ceph-volume/ceph_volume/tests/
> functional/batch/playbooks/setup_mixed_type.yml

Thanks, that's a nice trick to simulate having NVMe devices so that I could continue to use 'ceph-volume batch' on loopback devices.

For TripleO CI I found another way to use loopback devices without using the deprecated [1] collocated or non-collocated osd_scenarios, which is simply not to use 'ceph-volume batch' mode. In this case I just pass the information about a pre-created LVM volume, and when I do that it doesn't hit this issue.

sudo dd if=/dev/zero of=/var/lib/ceph-osd.img bs=1 count=0 seek=7G
sudo losetup /dev/loop3 /var/lib/ceph-osd.img
sudo pvcreate /dev/loop3
sudo vgcreate vg2 /dev/loop3
sudo lvcreate -n data-lv2 -l 597 vg2
sudo lvcreate -n db-lv2 -l 597 vg2
sudo lvcreate -n wal-lv2 -l 597 vg2
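The resulting layout can be double-checked before running ceph-ansible (illustrative, using the names above):

sudo lvs vg2    # should list data-lv2, db-lv2 and wal-lv2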

and then in my THT pass

parameter_defaults:
  CephAnsibleDisksConfig:
    osd_scenario: lvm
    osd_objectstore: bluestore
    lvm_volumes:
      - data: data-lv2
        data_vg: vg2
        db: db-lv2
        db_vg: vg2
        wal: wal-lv2
        wal_vg: vg2

It worked on my testing VM with a loopback so I'll try having TripleO CI create the LVM structure before running ceph-ansible.

[1] https://github.com/ceph/ceph-ansible/blob/master/docs/source/osds/scenarios.rst#collocated

Comment 4 John Fulton 2019-03-11 18:06:25 UTC
I have received reports of people hitting this issue even when they are not using loopback devices, so I have updated the bug title. I have asked them to update this bug with their lsblk output. The issue is more serious if people are hitting it with real disks.

Comment 5 Keith Plant 2019-03-11 18:26:01 UTC
I am seeing the same behavior with 24 real disks, 20 spinning and 4 solid state. Just for the record, I am using docker instead of podman.

lsblk is able to determine whether or not the disks are rotational:

[heat-admin@overcloud-cephstorage-0 ~]$ lsblk -d -o ROTA $(for i in {a..x}; do echo -n "/dev/sd$i "; done)
ROTA
   0
   0
   0
   0
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
   1
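For reference, the fix tracked by the linked pull requests makes ceph-volume fall back to lsblk when it finds no rotational information under /sys/block/ for the given device name; the flag can also be read per device, for example:

lsblk --nodeps -n -o ROTA /dev/sda    # prints 1 for rotational, 0 for solid state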

Comment 17 John Fulton 2019-03-28 21:07:58 UTC
(In reply to Jeremy from comment #15)
> The mentioned workaround https://access.redhat.com/solutions/3954161 says
> "If you don't want to use the ceph-volume batch feature and have direct
> control of what disk gets picked for what, then you may create LVM volumes
> directly on the devices with an OSPd preboot script" .. Could we get that
> script or some directions to give our customers how to do that.

You can have director run any script on first boot as described here:

 https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/advanced_overcloud_customization/chap-configuration_hooks#sect-Customizing_Configuration_on_First_Boot

Instead of having the embedded bash script in the example above echo a line into /etc/resolv.conf, you could have it create LVMs with the lvcreate command, something like:

  config: |
    #!/bin/bash
    pvcreate {{ ceph_loop_device }}
    vgcreate {{ ceph_logical_volume_group }} {{ ceph_loop_device }}
    lvcreate -n {{ ceph_logical_volume_wal }} -l 375 {{ ceph_logical_volume_group }}
    lvcreate -n {{ ceph_logical_volume_db }} -l 375 {{ ceph_logical_volume_group }}
    lvcreate -n {{ ceph_logical_volume_data }} -l 1041 {{ ceph_logical_volume_group }}
    lvs

Naturally you'll need to change the sizes and the LVM names based on what you choose. So for this example:

parameter_defaults:
  CephAnsibleDisksConfig:
    osd_objectstore: bluestore
    osd_scenario: lvm
    lvm_volumes:
      - data: ceph_lv_data
        data_vg: ceph_vg
        db: ceph_lv_db
        db_vg: ceph_vg
        wal: ceph_lv_wal
        wal_vg: ceph_vg

We could set:

{{ ceph_logical_volume_group }} to ceph_vg
{{ ceph_logical_volume_wal }} to ceph_lv_wal
{{ ceph_logical_volume_data }} to ceph_lv_data
{{ ceph_logical_volume_db }} to ceph_lv_db

That's for ONE PV, which would be {{ ceph_loop_device }}. If the devices list is longer, the above would need to be expanded along the same lines.
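For example (an illustrative sketch only, assuming a hypothetical second device /dev/sdb and the same extent counts):

pvcreate /dev/sdb
vgcreate ceph_vg2 /dev/sdb
lvcreate -n ceph_lv_wal2 -l 375 ceph_vg2
lvcreate -n ceph_lv_db2 -l 375 ceph_vg2
lvcreate -n ceph_lv_data2 -l 1041 ceph_vg2

with a matching entry appended to lvm_volumes:

      - data: ceph_lv_data2
        data_vg: ceph_vg2
        db: ceph_lv_db2
        db_vg: ceph_vg2
        wal: ceph_lv_wal2
        wal_vg: ceph_vg2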

Comment 20 John Fulton 2019-03-29 03:01:26 UTC
Created attachment 1549277 [details]
Example workaround for OSPd

Comment 48 Andrew Schoen 2019-07-18 14:34:30 UTC
(In reply to Siggy Sigwald from comment #46)
> A message from our customer on the support case:
> 
> Looking at the BZ, it looks like it is targeted for ceph 3.3 - we are
> unfortunately not able to wait for that to release due to date constraints
> for our release.
> Would you be able to provide us with a way to patch this fix into the
> existing 3.2; not entirely sure where the fix needs to go, ceph container
> image or the overcloud image, but having this would be much easier for
> us to implement than the previously proposed workaround where we need to
> "manually" create the LVMs.
> thanks
> 
> Please advise.
> Thanks.

Siggy,

Unfortunately, there is just too much change between 3.2 and 3.3, and it is not possible to deliver a simple patch here. Implementing a fix requires many of the changes present in 3.3.

Thanks,
Andrew

Comment 60 errata-xmlrpc 2019-08-21 15:10:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2538

Comment 61 Alfredo Deza 2019-09-26 14:58:53 UTC
*** Bug 1674022 has been marked as a duplicate of this bug. ***

Comment 62 Red Hat Bugzilla 2024-01-06 04:25:58 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

