Bug 1914702 - [OSP16.1] Ceph HCI cluster creating block.db volume group on random device on a cluster with identical storage
Summary: [OSP16.1] Ceph HCI cluster creating block.db volume group on random device on a cluster with identical storage
Keywords:
Status: CLOSED DUPLICATE of bug 1878500
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: ceph-ansible
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Guillaume Abrioux
QA Contact: Yogev Rabl
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-01-10 22:40 UTC by Vadim Khitrin
Modified: 2021-01-12 13:47 UTC (History)
5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-01-12 13:47:21 UTC
Target Upstream Version:
Embargoed:



Description Vadim Khitrin 2021-01-10 22:40:39 UTC
Description of problem:
In a Ceph HCI cluster with identical block devices (type: SSD, model: SSDSC2KG480G8R), volume groups dedicated to block.db were unexpectedly created on one of the HCI compute nodes. As far as I know (and I have deployed successfully on previous composes with the same hardware and topology), if there is no mix of slower and faster disks (i.e. HDDs and SSDs), there should not be a dedicated device for storing block.db.

This does not appear to be the case in my setup. I am not sure whether this is a misconfiguration, a regression (although I doubt it, because we have managed to deploy a Ceph HCI cluster in our CI using this compose on a 'simpler' topology), or an edge case I have hit.

Ceph configuration:
parameter_defaults:
  CephPoolDefaultSize: 2
  CephPoolDefaultPgNum: 64
  CephPools:
    - {"name": backups, "pg_num": 32, "pgp_num": 32, "application": "rbd"}
    - {"name": volumes, "pg_num": 512, "pgp_num": 512, "application": "rbd"}
    - {"name": vms, "pg_num": 128, "pgp_num": 128, "application": "rbd"}
    - {"name": images, "pg_num": 64, "pgp_num": 64, "application": "rbd"}
  CephConfigOverrides:
    osd_recovery_op_priority: 3
    osd_recovery_max_active: 3
    osd_max_backfills: 1
  CephAnsibleExtraConfig:
    nb_retry_wait_osd_up: 60
    delay_wait_osd_up: 20
    is_hci: true
    # 6 OSDs * 2 vCPUs per non nvme SSD = 12 vCPUs (list below not used for VNF)
    # vCPUs from NUMA node 1 will be assigned to Ceph OSD
    ceph_osd_docker_cpuset_cpus: "5,7,9,11,13,15,17,19,23,25,27,29"
    # cpu_limit 0 means no limit as we are limiting CPUs with cpuset above
    ceph_osd_docker_cpu_limit: 0
    # numactl preferred to cross the numa boundary if we have to
    # but try to only use memory from numa node0
    # cpuset-mems would not let it cross numa boundary
    # lots of memory so NUMA boundary crossing unlikely
    ceph_osd_numactl_opts: "-N 1 --preferred=1"
  CephAnsibleDisksConfig:
    # 2 OSD per SSD
    osds_per_device: 2
    osd_scenario: lvm
    osd_objectstore: bluestore
    devices:
      - /dev/sdb
      - /dev/sdc
      - /dev/sdd
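
For context, a minimal sketch (not part of the original report) of how the resulting layout can be previewed on a node; the exact invocation ceph-ansible generates may differ, but with three identical non-rotational SSDs the report is expected to show data LVs only and no dedicated block.db volume group:

# Run as root on the HCI node, wherever ceph-volume is available (e.g. inside
# the ceph container). --report only prints the planned layout; nothing is created.
ceph-volume lvm batch --report --bluestore --osds-per-device 2 \
    /dev/sdb /dev/sdc /dev/sdd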

lsblk output from properly provisioned node (computehciovndpdksriov-0):
NAME                                                                                                 MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                                                                                                    8:0    0 894.3G  0 disk
|-sda1                                                                                                 8:1    0     1M  0 part
`-sda2                                                                                                 8:2    0 894.3G  0 part /
sdb                                                                                                    8:16   0 447.1G  0 disk
|-ceph--ba2e0109--896a--4e31--ae8e--3625d879eafc-osd--data--e2839cb7--2529--4917--b8cc--eb2a07e1a8e2 253:0    0 223.6G  0 lvm
`-ceph--ba2e0109--896a--4e31--ae8e--3625d879eafc-osd--data--bd1f17fa--ad8e--4a25--8052--0251d7145b72 253:1    0 223.6G  0 lvm
sdc                                                                                                    8:32   0 447.1G  0 disk
|-ceph--a0d35eac--0b75--4ed8--b03a--45ad6a4aa694-osd--data--2e39b38b--2bf0--4a8b--b616--0d4c6001cbe1 253:2    0 223.6G  0 lvm
`-ceph--a0d35eac--0b75--4ed8--b03a--45ad6a4aa694-osd--data--ec0168b8--50ad--423b--a85b--fba38b8eb4a6 253:3    0 223.6G  0 lvm
sdd                                                                                                    8:48   0 447.1G  0 disk
|-ceph--5d8407ba--ffdd--4880--b351--c5e3b2b94ea3-osd--data--063ee4d3--1db2--43b1--be82--4c2d3c447b1b 253:4    0 223.6G  0 lvm
`-ceph--5d8407ba--ffdd--4880--b351--c5e3b2b94ea3-osd--data--00daf5d4--67d4--44d9--a322--656b8849563d 253:5    0 223.6G  0 lvm
sr0                                                                                                   11:0    1  1024M  0 rom

lsblk output from not properly provisioned node (computehciovndpdksriov-1):
NAME                                                                                                                  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                                                                                                                     8:0    0 894.3G  0 disk
|-sda1                                                                                                                  8:1    0     1M  0 part
`-sda2                                                                                                                  8:2    0 894.3G  0 part /
sdb                                                                                                                     8:16   0 447.1G  0 disk
`-ceph--block--dbs--66661c38--56e6--4381--9d7e--373ea48d4e17-osd--block--db--af9811a4--67dd--425c--9930--5795ca35b41f 253:1    0 447.1G  0 lvm
sdc                                                                                                                     8:32   0 447.1G  0 disk
|-ceph--block--053a8597--99f2--4518--816a--fb1a20cba79a-osd--block--eab71c9d--ce74--48e9--96ed--27c625131d76          253:0    0 223.6G  0 lvm
`-ceph--block--053a8597--99f2--4518--816a--fb1a20cba79a-osd--block--241657ab--9736--4e5d--b3e7--2b1f88203627          253:2    0 223.6G  0 lvm
sdd                                                                                                                     8:48   0 447.1G  0 disk
`-ceph--block--dbs--66661c38--56e6--4381--9d7e--373ea48d4e17-osd--block--db--d7e156a9--41b6--4af9--b5d8--da09ee53aa61 253:3    0 447.1G  0 lvm
sr0                                                                                                                    11:0    1  1024M  0 rom

vgs output from properly provisioned node:
  VG                                        Attr   Ext   #PV #LV #SN VSize    VFree VG UUID                                VProfile #VMda VMdaFree  VMdaSize  #VMdaUse VG Tags
  ceph-5d8407ba-ffdd-4880-b351-c5e3b2b94ea3 wz--n- 4.00m   1   2   0 <447.13g 4.00m LAPsiR-Ez5K-3orI-zNzP-TdSt-8b37-1z1hVX              1   506.50k  1020.00k        1
  ceph-a0d35eac-0b75-4ed8-b03a-45ad6a4aa694 wz--n- 4.00m   1   2   0 <447.13g 4.00m hFGJX1-ssAN-CO8d-j8aq-3mFT-IoGX-TXI37E              1   506.50k  1020.00k        1
  ceph-ba2e0109-896a-4e31-ae8e-3625d879eafc wz--n- 4.00m   1   2   0 <447.13g 4.00m lNcPeu-FxHt-ffP0-xs7h-fz0o-tBP3-M4Ua30              1   506.50k  1020.00k        1
  Reloading config files

vgs output from not properly provisioned node:
  VG                                                  Attr   Ext   #PV #LV #SN VSize    VFree VG UUID                                VProfile #VMda VMdaFree  VMdaSize  #VMdaUse VG Tags
  ceph-block-053a8597-99f2-4518-816a-fb1a20cba79a     wz--n- 4.00m   1   2   0 <447.13g 4.00m 24YewP-knRa-WdaQ-w9im-v3wF-vzAc-c5iVLb              1   506.00k  1020.00k        1
  ceph-block-dbs-66661c38-56e6-4381-9d7e-373ea48d4e17 wz--n- 4.00m   2   2   0 <894.26g    0  7GinLN-Sgy1-UQgd-CLsO-5JWZ-aj6X-uLh2kQ              2   506.00k  1020.00k        2
  Reloading config files
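
As a sanity check (a suggestion, not something from the original report), ceph-volume batch decides whether a device should hold block.db largely from the kernel's rotational flag, so confirming that all three devices report as non-rotational on the affected node rules out a mixed-media misdetection:

# All three Ceph devices should show ROTA=0 (non-rotational) and the same model.
lsblk -d -o NAME,ROTA,MODEL /dev/sdb /dev/sdc /dev/sdd
# Equivalent per-device check straight from sysfs (expect "0" for each).
cat /sys/block/sd[bcd]/queue/rotational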

Topology consists of:
3 ControllerSriov nodes
2 ComputeHCIOvsDpdkSriov nodes (a custom role based on ComputeHCIOvsDpdk with the required SR-IOV resources enabled; we have verified that this role works on the same setup that was successfully deployed in CI)
1 ComputeOvsDpdkSriov node (a non-Ceph-HCI compute node; the deployment also fails if we do not use this node)
As mentioned earlier, we were able to deploy Ceph HCI on the topology above on earlier composes.

Version-Release number of selected component (if applicable):
compose: RHOS-16.1-RHEL-8-20201214.n.3 (also encountered in 16.1.2 - RHOS-16.1-RHEL-8-20201021.n.0 with the same release of ceph-ansible)
ceph-ansible version: ceph-ansible-4.0.31-1.el8cp.noarch (also encountered with ceph-ansible-4.0.25.2-1.el8cp.noarch)
uname -a output on affected host: Linux computehciovndpdksriov-1 4.18.0-193.29.1.el8_2.x86_64 #1 SMP Thu Oct 22 10:09:53 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:
100%; reproduced on every one of the several attempts I have made.

Steps to Reproduce:
1. Attempt to deploy a Ceph HCI cluster.

Actual results:
Deployment fails.

Expected results:
Deployment is successful.

Additional info:
Will attach sosreport logs in a comment.

Comment 7 John Fulton 2021-01-12 13:47:21 UTC
A version of ceph-volume which fixes this problem will be available in the Fixed In Version of bug 1878500.

This bug is present in ceph container 4-36 [1], which is based on Ceph 14.2.8.
A new ceph container based on 14.2.11 will be released with Ceph 4.2, as tracked in bug 1878500.
If you deploy your overcloud using the new container coming from bug 1878500, you should not hit this issue.

[1] https://catalog.redhat.com/software/containers/rhceph/rhceph-4-rhel8/5e39df7cd70cc54b02baf33f
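
For anyone verifying this, a hypothetical check (the container name below is only an example): the Ceph release reported inside the deployed containers indicates which ceph-volume is in use; 14.2.8 corresponds to the ceph 4-36 container mentioned above, while the 14.2.11-based image tracked in bug 1878500 carries the fix.

# On a deployed HCI node; 'ceph-osd-0' is an example OSD container name.
podman exec ceph-osd-0 ceph --version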

*** This bug has been marked as a duplicate of bug 1878500 ***

