Bug 1793542 - ceph-volume lvm batch errors on OSD systems w/HDDs and multiple NVMe devices
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Ceph-Ansible
Version: 4.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: 4.1
Assignee: Rishabh Dave
QA Contact: Vasishta
URL:
Whiteboard:
Duplicates: 1848556
Depends On:
Blocks: 1750994
 
Reported: 2020-01-21 14:31 UTC by Tim Wilkinson
Modified: 2020-07-27 13:16 UTC (History)
17 users

Fixed In Version: ceph-ansible-4.0.20-1.el8, ceph-ansible-4.0.20-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-08 21:30:04 UTC
Target Upstream Version:


Attachments (Terms of Use)
Installer Logs (242.87 KB, application/zip)
2020-07-08 19:40 UTC, ravic


Links
System ID Private Priority Status Summary Last Updated
Github ceph ceph-ansible pull 5262 0 None closed library/ceph_volume: look for error messages in stderr 2021-02-13 14:39:01 UTC
Red Hat Product Errata RHSA-2020:2231 0 None None None 2020-05-19 17:32:51 UTC

Comment 47 errata-xmlrpc 2020-05-19 17:32:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:2231

Comment 48 ravic 2020-07-08 19:34:34 UTC
I am seeing this exact issue in my environment with the latest Ceph version 14.2.8-59.el8cp (53387608e81e6aa2487c952a604db06faa5b2cd0) nautilus (stable)


Step 1
Installed 3 MON and 4 OSD nodes, each with 23 SAS3 spinning drives (12 TB) and 1 Intel Optane NVMe for metadata.

ok: [mon1 -> mon1] => 
  msg:
  - '  cluster:'
  - '    id:     8182b4ca-ecce-49e8-a98d-c430904eaf7a'
  - '    health: HEALTH_OK'
  - ' '
  - '  services:'
  - '    mon: 3 daemons, quorum mon1,mon2,mon3 (age 11m)'
  - '    mgr: mon3(active, since 24s), standbys: mon1, mon2'
  - '    osd: 92 osds: 92 up (since 3m), 92 in (since 3m)'
  - ' '
  - '  data:'
  - '    pools:   0 pools, 0 pgs'
  - '    objects: 0 objects, 0 B'
  - '    usage:   94 GiB used, 1004 TiB / 1004 TiB avail'
  - '    pgs:     '
  - ' '

PLAY RECAP **************************************************************************************************************************************************
admin                      : ok=141  changed=6    unreachable=0    failed=0    skipped=242  rescued=0    ignored=0   
mon1                       : ok=280  changed=26   unreachable=0    failed=0    skipped=413  rescued=0    ignored=0   
mon2                       : ok=231  changed=17   unreachable=0    failed=0    skipped=367  rescued=0    ignored=0   
mon3                       : ok=239  changed=19   unreachable=0    failed=0    skipped=366  rescued=0    ignored=0   
osd1                       : ok=173  changed=17   unreachable=0    failed=0    skipped=269  rescued=0    ignored=0   
osd2                       : ok=162  changed=15   unreachable=0    failed=0    skipped=261  rescued=0    ignored=0   
osd3                       : ok=162  changed=15   unreachable=0    failed=0    skipped=261  rescued=0    ignored=0   
osd4                       : ok=164  changed=15   unreachable=0    failed=0    skipped=259  rescued=0    ignored=0   


INSTALLER STATUS ********************************************************************************************************************************************
Install Ceph Monitor           : Complete (0:01:01)
Install Ceph Manager           : Complete (0:00:57)
Install Ceph OSD               : Complete (0:06:58)
Install Ceph Dashboard         : Complete (0:01:04)
Install Ceph Grafana           : Complete (0:00:31)
Install Ceph Node Exporter     : Complete (0:01:15)


Step 2 

Added 3 Ceph clients with the --limit clients option

PLAY RECAP **************************************************************************************************************************************************
client1                    : ok=136  changed=8    unreachable=0    failed=0    skipped=274  rescued=0    ignored=0   
client2                    : ok=105  changed=4    unreachable=0    failed=0    skipped=226  rescued=0    ignored=0   
client3                    : ok=105  changed=4    unreachable=0    failed=0    skipped=226  rescued=0    ignored=0   


INSTALLER STATUS ********************************************************************************************************************************************
Install Ceph Client            : Complete (0:00:29)
Install Ceph Node Exporter     : Complete (0:00:31)

Step 3

Added RGWs on all 4 OSD nodes with the --limit rgws option; it failed on each OSD node with the following error in the "use ceph-volume lvm batch to create bluestore osds" task.

TASK [ceph-osd : use ceph-volume lvm batch to create bluestore osds] ****************************************************************************************
Wednesday 08 July 2020  14:40:13 -0400 (0:00:00.899)       0:05:08.220 ******** 

fatal: [osd2]: FAILED! => changed=true 
  cmd:
  - podman
  - run
  - --rm
  - --privileged
  - --net=host
  - --ipc=host
  - --ulimit
  - nofile=1024:4096
  - -v
  - /run/lock/lvm:/run/lock/lvm:z
  - -v
  - /var/run/udev/:/var/run/udev/:z
  - -v
  - /dev:/dev
  - -v
  - /etc/ceph:/etc/ceph:z
  - -v
  - /run/lvm/:/run/lvm/
  - -v
  - /var/lib/ceph/:/var/lib/ceph/:z
  - -v
  - /var/log/ceph/:/var/log/ceph/:z
  - --entrypoint=ceph-volume
  - registry.redhat.io/rhceph/rhceph-4-rhel8:latest
  - --cluster
  - ceph
  - lvm
  - batch
  - --bluestore
  - --yes
  - --prepare
  - /dev/sdc
  - /dev/sdd
  - /dev/sde
  - /dev/sdf
  - /dev/sdg
  - /dev/sdh
  - /dev/sdi
  - /dev/sdj
  - /dev/sdk
  - /dev/sdl
  - /dev/sdm
  - /dev/sdn
  - /dev/sdo
  - /dev/sdp
  - /dev/sdq
  - /dev/sdr
  - /dev/sds
  - /dev/sdt
  - /dev/sdu
  - /dev/sdv
  - /dev/sdw
  - /dev/sdx
  - /dev/sdy
  - --wal-devices
  - /dev/nvme0n1
  - --report
  - --format=json
  msg: non-zero return code
  rc: 1
  stderr: |-
    WARNING: The same type, major and minor should not be used for multiple devices.
    [the WARNING line above was repeated 46 times in total]
    -->  RuntimeError: 1 devices were filtered in non-interactive mode, bailing out
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
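
The repeated LVM warning means two block device nodes report the same major:minor pair, which is often a multipath or udev artifact. A minimal sketch of how that condition can be checked, assuming input in the format of `lsblk -r -o MAJ:MIN,NAME` (the `dupes` helper and the sample device list are illustrative, not taken from this report):

```shell
# Print any major:minor pair that appears on more than one device node --
# the condition the LVM "same type, major and minor" warnings complain about.
dupes() {
  awk '{print $1}' | sort | uniq -d
}

# Hypothetical lsblk output: two NVMe nodes sharing 259:0, one clean SAS disk.
printf '259:0 nvme0n1\n259:0 nvme1n1\n8:0 sda\n' | dupes
```

On a live OSD node the pipeline would be fed from `lsblk -r -o MAJ:MIN,NAME` instead of `printf`; any line of output points at a duplicated device pair.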



Here are my system details.



Ansible Machine

NAME="Red Hat Enterprise Linux"
VERSION="8.2 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.2"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.2 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8.2:GA"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.2
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.2"


[rhel@admin ceph-ansible]$ uname -a
Linux admin.ceph.local 4.18.0-193.6.3.el8_2.x86_64 #1 SMP Mon Jun 1 20:24:55 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux


[rhel@admin ceph-ansible]$ ansible-playbook --version
ansible-playbook 2.8.12
  config file = /usr/share/ceph-ansible/ansible.cfg
  configured module search path = ['/usr/share/ceph-ansible/library']
  ansible python module location = /usr/lib/python3.6/site-packages/ansible
  executable location = /usr/bin/ansible-playbook
  python version = 3.6.8 (default, Dec  5 2019, 15:45:45) [GCC 8.3.1 20191121 (Red Hat 8.3.1-5)]

[rhel@admin ceph-ansible]$ podman version
Version:            1.6.4
RemoteAPI Version:  1
Go Version:         go1.13.4
OS/Arch:            linux/amd64
[rhel@admin ceph-ansible]$



Cluster Machine

NAME="Red Hat Enterprise Linux"
VERSION="8.2 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.2"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.2 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8.2:GA"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.2
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.2"



[rhel@osd1 ~]$ uname -a
Linux osd1.ceph.local 4.18.0-193.6.3.el8_2.x86_64 #1 SMP Mon Jun 1 20:24:55 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux


[root@mon1 ~]# podman exec ceph-mon-mon1 ceph -v
ceph version 14.2.8-59.el8cp (53387608e81e6aa2487c952a604db06faa5b2cd0) nautilus (stable)

Comment 49 ravic 2020-07-08 19:40:47 UTC
Created attachment 1700348 [details]
Installer Logs

Attaching installer logs.

Comment 50 Dimitri Savineau 2020-07-08 21:30:04 UTC
Please open a dedicated bugzilla for this, as the error you mentioned isn't the same as the one reported in this bugzilla.

This issue was about "Aborting because strategy changed from bluestore.MixedType" from the ceph-volume command, managed by the ceph-volume ansible module and fixed in ceph-ansible.

Your issue is a completely different one, "RuntimeError: 1 devices were filtered in non-interactive mode, bailing out", which comes from the ceph-volume command directly.

So it could be something like https://bugzilla.redhat.com/show_bug.cgi?id=1854326
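
The linked fix (PR 5262, "look for error messages in stderr") is about teaching the ceph_volume Ansible module to tell benign LVM chatter apart from real failures on stderr. A hedged sketch of that kind of triage, assuming heuristics like the ones below (the function name and patterns are illustrative, not the module's actual code):

```python
# Illustrative stderr triage: skip LVM "WARNING:" noise, surface lines
# that look like real ceph-volume errors (e.g. the RuntimeError above).
def extract_errors(stderr: str) -> list:
    errors = []
    for line in stderr.splitlines():
        line = line.strip()
        if not line or line.startswith("WARNING:"):
            continue  # benign LVM chatter, safe to ignore
        if "Error" in line or line.startswith("-->"):
            errors.append(line)  # likely a genuine failure message
    return errors

sample = (
    "WARNING: The same type, major and minor should not be used for multiple devices.\n"
    "-->  RuntimeError: 1 devices were filtered in non-interactive mode, bailing out\n"
)
print(extract_errors(sample))
```

With the sample stderr above, only the RuntimeError line survives the filter, which is the behavior the comment describes: the strategy-change error was swallowed by the module, while this RuntimeError comes straight from ceph-volume.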

Comment 51 John Fulton 2020-07-27 13:16:01 UTC
*** Bug 1848556 has been marked as a duplicate of this bug. ***

