Bug 1650306

Summary: unable to use ceph-volume lvm batch on OSD systems w/HDDs and multiple NVMe devices
Product: [Red Hat Storage] Red Hat Ceph Storage    Reporter: John Harrigan <jharriga>
Component: Ceph-Volume    Assignee: Alfredo Deza <adeza>
Status: CLOSED ERRATA    QA Contact: Tiffany Nguyen <tunguyen>
Severity: high    Docs Contact:
Priority: high
Version: 3.2    CC: adeza, agunn, aschoen, bengland, ceph-eng-bugs, ceph-qe-bugs, dfuller, gmeno, hnallurv, jbrier, jharriga, kdreyer, mhackett, pasik, seb, shan, tserlin, tunguyen, vakulkar, vashastr
Target Milestone: rc   
Target Release: 3.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: RHEL: ceph-ansible-3.2.0-0.1.rc5.el7cp; Ubuntu: ceph-ansible_3.2.0~rc5-2redhat1    Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-01-03 19:02:22 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1641792    
Attachments (flags: none on all):
  ceph-ansible runlog
  text 'ceph lvm batch' report
  ceph volume log
  purge runlog using ceph-ansibleRC2
  deploy runlog using ceph-ansibleRC2
  Success using ceph-ansibleRC2 w/one NVMe
  ansible -vv using ceph-ansibleRC4 - FAILED
  ansible runlog using -vvv

Description John Harrigan 2018-11-15 19:21:01 UTC
Created attachment 1506222 [details]
ceph-ansible runlog

Description of problem:
ceph-ansible deployment fails when specifying multiple NVMe devices 

Version-Release number of selected component (if applicable):
  RHEL 7.6
  Ceph Version: 12.2.8-34.el7cp
  Ceph Ansible Version: ceph-ansible-3.2.0-0.1.rc1.el7cp.noarch
  ceph-volume Version: 1.0.0

Steps to Reproduce:
1. Running on Supermicro 6048r systems with 36x HDDs and two NVMe devices
2. Specified this in osds.yml (configuration limited to 4x HDDs and 2x NVMe):
osd_objectstore: bluestore
# use 'ceph-volume lvm batch' mode
osd_scenario: lvm
devices:
  - /dev/sdc
  - /dev/sdd
  - /dev/nvme0n1
  - /dev/sdq
  - /dev/sdr
  - /dev/nvme1n1

3. # ansible-playbook site.yml 2>&1 | tee -a Deploy2nvme.Nov15

Actual results:
ceph-ansible fails with the following (see the complete output in the attached logfile):
TASK [ceph-config : run 'ceph-volume lvm batch --report' to see how many osds are to be created] ***
Thursday 15 November 2018  19:11:38 +0000 (0:00:01.628)       0:34:45.946 ***** 
fatal: [c07-h01-6048r]: FAILED! => {"changed": true, "cmd": ["ceph-volume", "--cluster", "ceph", "lvm", "batch", "--bluestore", "--yes", "/dev/sdc", "/dev/sdd", "/dev/nvme0n1", "/dev/sdq", "/dev/sdr", "/dev/nvme1n1", "--report", "--format=json"], "msg": "non-zero return code", "rc": 1, "stderr": "", "stderr_lines": [], "stdout": "--> Aborting because strategy changed from bluestore.MixedType to bluestore.SingleType after filtering", "stdout_lines": ["--> Aborting because strategy changed from bluestore.MixedType to bluestore.SingleType after filtering"]}

Expected results:
ceph-ansible deploys cluster with two HDDs paired to each of the two NVMe devices

Additional info (from one of the OSD systems after the ceph-ansible run):

# ssh c07-h01-6048r ceph-volume lvm list

====== osd.0 =======

  [block]    /dev/ceph-block-1df28d7e-c3d9-47e6-9d30-71ff1ec22128/osd-block-6d15f9de-ef13-4eb6-8a4e-d39366072bd9

      type                      block
      osd id                    0
      cluster fsid              a0b25557-9b93-48bc-b23d-7b6ae75c46eb
      cluster name              ceph
      osd fsid                  96d9a471-de1b-44ed-9cf0-7dc1c688edf9
      db device                 /dev/ceph-block-dbs-f9277f5e-9b73-4e41-805e-b9c07d09e594/osd-block-db-665bfc0b-3c5d-4167-a7a1-1915dcdb625b
      encrypted                 0
      db uuid                   6exXcU-vBo1-AeHs-JBGY-a5Uz-v3Q7-2sQc2Y
      cephx lockbox secret      
      block uuid                UyafDc-Y7UQ-WCCL-sD7O-8S5B-CI9w-7NjLVz
      block device              /dev/ceph-block-1df28d7e-c3d9-47e6-9d30-71ff1ec22128/osd-block-6d15f9de-ef13-4eb6-8a4e-d39366072bd9
      vdo                       0
      crush device class        None
      devices                   /dev/sdc

  [  db]    /dev/ceph-block-dbs-f9277f5e-9b73-4e41-805e-b9c07d09e594/osd-block-db-665bfc0b-3c5d-4167-a7a1-1915dcdb625b

      type                      db
      osd id                    0
      cluster fsid              a0b25557-9b93-48bc-b23d-7b6ae75c46eb
      cluster name              ceph
      osd fsid                  96d9a471-de1b-44ed-9cf0-7dc1c688edf9
      db device                 /dev/ceph-block-dbs-f9277f5e-9b73-4e41-805e-b9c07d09e594/osd-block-db-665bfc0b-3c5d-4167-a7a1-1915dcdb625b
      encrypted                 0
      db uuid                   6exXcU-vBo1-AeHs-JBGY-a5Uz-v3Q7-2sQc2Y
      cephx lockbox secret      
      block uuid                UyafDc-Y7UQ-WCCL-sD7O-8S5B-CI9w-7NjLVz
      block device              /dev/ceph-block-1df28d7e-c3d9-47e6-9d30-71ff1ec22128/osd-block-6d15f9de-ef13-4eb6-8a4e-d39366072bd9
      vdo                       0
      crush device class        None
      devices                   /dev/nvme0n1

====== osd.26 ======

  [block]    /dev/ceph-block-1108ff83-a82a-466a-94f5-7b51eb6061e7/osd-block-0fa15758-5870-4df3-8d24-237673c995e6

      type                      block
      osd id                    26
      cluster fsid              a0b25557-9b93-48bc-b23d-7b6ae75c46eb
      cluster name              ceph
      osd fsid                  7840a23f-9829-4c7e-a401-c38da530ab8b
      db device                 /dev/ceph-block-dbs-f9277f5e-9b73-4e41-805e-b9c07d09e594/osd-block-db-f4cdcf01-a6c5-4620-b096-7f2d8d1afd12
      encrypted                 0
      db uuid                   5MyXhu-V3sj-B3CR-1Pes-VWxM-jGbu-w8APQD
      cephx lockbox secret      
      block uuid                hww6LU-KT3L-eSiK-iSAC-xxDt-Y5Js-6s1v3c
      block device              /dev/ceph-block-1108ff83-a82a-466a-94f5-7b51eb6061e7/osd-block-0fa15758-5870-4df3-8d24-237673c995e6
      vdo                       0
      crush device class        None
      devices                   /dev/sdq

  [  db]    /dev/ceph-block-dbs-f9277f5e-9b73-4e41-805e-b9c07d09e594/osd-block-db-f4cdcf01-a6c5-4620-b096-7f2d8d1afd12

      type                      db
      osd id                    26
      cluster fsid              a0b25557-9b93-48bc-b23d-7b6ae75c46eb
      cluster name              ceph
      osd fsid                  7840a23f-9829-4c7e-a401-c38da530ab8b
      db device                 /dev/ceph-block-dbs-f9277f5e-9b73-4e41-805e-b9c07d09e594/osd-block-db-f4cdcf01-a6c5-4620-b096-7f2d8d1afd12
      encrypted                 0
      db uuid                   5MyXhu-V3sj-B3CR-1Pes-VWxM-jGbu-w8APQD
      cephx lockbox secret      
      block uuid                hww6LU-KT3L-eSiK-iSAC-xxDt-Y5Js-6s1v3c
      block device              /dev/ceph-block-1108ff83-a82a-466a-94f5-7b51eb6061e7/osd-block-0fa15758-5870-4df3-8d24-237673c995e6
      vdo                       0
      crush device class        None
      devices                   /dev/nvme1n1

====== osd.12 ======

  [block]    /dev/ceph-block-69ac31a2-65e2-40f2-84d9-0f00720e03c9/osd-block-618522db-46b0-4b24-aec5-cc5cee180210

      type                      block
      osd id                    12
      cluster fsid              a0b25557-9b93-48bc-b23d-7b6ae75c46eb
      cluster name              ceph
      osd fsid                  0cfdf0cd-854f-4a81-b433-b7b7b5b164dd
      db device                 /dev/ceph-block-dbs-f9277f5e-9b73-4e41-805e-b9c07d09e594/osd-block-db-d2f397c7-3d0a-4a19-bac9-bb23164a6b5c
      encrypted                 0
      db uuid                   1tejrb-6XpV-0XIO-ybMW-wTEs-ejmL-b23R2d
      cephx lockbox secret      
      block uuid                UPIHR7-125E-L9lG-1501-GY5s-eZnU-cZRk6N
      block device              /dev/ceph-block-69ac31a2-65e2-40f2-84d9-0f00720e03c9/osd-block-618522db-46b0-4b24-aec5-cc5cee180210
      vdo                       0
      crush device class        None
      devices                   /dev/sdd

  [  db]    /dev/ceph-block-dbs-f9277f5e-9b73-4e41-805e-b9c07d09e594/osd-block-db-d2f397c7-3d0a-4a19-bac9-bb23164a6b5c

      type                      db
      osd id                    12
      cluster fsid              a0b25557-9b93-48bc-b23d-7b6ae75c46eb
      cluster name              ceph
      osd fsid                  0cfdf0cd-854f-4a81-b433-b7b7b5b164dd
      db device                 /dev/ceph-block-dbs-f9277f5e-9b73-4e41-805e-b9c07d09e594/osd-block-db-d2f397c7-3d0a-4a19-bac9-bb23164a6b5c
      encrypted                 0
      db uuid                   1tejrb-6XpV-0XIO-ybMW-wTEs-ejmL-b23R2d
      cephx lockbox secret      
      block uuid                UPIHR7-125E-L9lG-1501-GY5s-eZnU-cZRk6N
      block device              /dev/ceph-block-69ac31a2-65e2-40f2-84d9-0f00720e03c9/osd-block-618522db-46b0-4b24-aec5-cc5cee180210
      vdo                       0
      crush device class        None
      devices                   /dev/nvme0n1

====== osd.37 ======

  [block]    /dev/ceph-block-015114af-dc99-472f-8a11-5abe40fa780e/osd-block-c70b2ad9-3101-491a-a2e0-a4ac45c4bad0

      type                      block
      osd id                    37
      cluster fsid              a0b25557-9b93-48bc-b23d-7b6ae75c46eb
      cluster name              ceph
      osd fsid                  87c7553e-d27c-4e77-a1f2-bb284bbc18ce
      db device                 /dev/ceph-block-dbs-f9277f5e-9b73-4e41-805e-b9c07d09e594/osd-block-db-ef348a39-2d05-4b96-9b00-9f32b0053d20
      encrypted                 0
      db uuid                   RJEKem-BtyO-hz75-nu22-pCFI-XESq-KkoFwI
      cephx lockbox secret      
      block uuid                IdeGHL-ZhPl-uoqu-m3oB-A9l8-IO2W-2ovKok
      block device              /dev/ceph-block-015114af-dc99-472f-8a11-5abe40fa780e/osd-block-c70b2ad9-3101-491a-a2e0-a4ac45c4bad0
      vdo                       0
      crush device class        None
      devices                   /dev/sdr

  [  db]    /dev/ceph-block-dbs-f9277f5e-9b73-4e41-805e-b9c07d09e594/osd-block-db-ef348a39-2d05-4b96-9b00-9f32b0053d20

      type                      db
      osd id                    37
      cluster fsid              a0b25557-9b93-48bc-b23d-7b6ae75c46eb
      cluster name              ceph
      osd fsid                  87c7553e-d27c-4e77-a1f2-bb284bbc18ce
      db device                 /dev/ceph-block-dbs-f9277f5e-9b73-4e41-805e-b9c07d09e594/osd-block-db-ef348a39-2d05-4b96-9b00-9f32b0053d20
      encrypted                 0
      db uuid                   RJEKem-BtyO-hz75-nu22-pCFI-XESq-KkoFwI
      cephx lockbox secret      
      block uuid                IdeGHL-ZhPl-uoqu-m3oB-A9l8-IO2W-2ovKok
      block device              /dev/ceph-block-015114af-dc99-472f-8a11-5abe40fa780e/osd-block-c70b2ad9-3101-491a-a2e0-a4ac45c4bad0
      vdo                       0
      crush device class        None
      devices                   /dev/nvme1n1

# ssh c07-h01-6048r lsblk
NAME                                                                                                                  MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sdc                                                                                                                     8:32   0   1.8T  0 disk  
└─ceph--block--1df28d7e--c3d9--47e6--9d30--71ff1ec22128-osd--block--6d15f9de--ef13--4eb6--8a4e--d39366072bd9          253:0    0   1.8T  0 lvm   
sdd                                                                                                                     8:48   0   1.8T  0 disk  
└─ceph--block--69ac31a2--65e2--40f2--84d9--0f00720e03c9-osd--block--618522db--46b0--4b24--aec5--cc5cee180210          253:2    0   1.8T  0 lvm   
sdq                                                                                                                    65:0    0   1.8T  0 disk  
└─ceph--block--1108ff83--a82a--466a--94f5--7b51eb6061e7-osd--block--0fa15758--5870--4df3--8d24--237673c995e6          253:4    0   1.8T  0 lvm   
sdr                                                                                                                    65:16   0   1.8T  0 disk  
└─ceph--block--015114af--dc99--472f--8a11--5abe40fa780e-osd--block--c70b2ad9--3101--491a--a2e0--a4ac45c4bad0          253:6    0   1.8T  0 lvm   
nvme0n1                                                                                                               259:1    0 745.2G  0 disk  
├─ceph--block--dbs--f9277f5e--9b73--4e41--805e--b9c07d09e594-osd--block--db--665bfc0b--3c5d--4167--a7a1--1915dcdb625b 253:1    0   372G  0 lvm   
└─ceph--block--dbs--f9277f5e--9b73--4e41--805e--b9c07d09e594-osd--block--db--d2f397c7--3d0a--4a19--bac9--bb23164a6b5c 253:3    0   372G  0 lvm   
nvme1n1                                                                                                               259:0    0 745.2G  0 disk  
├─ceph--block--dbs--f9277f5e--9b73--4e41--805e--b9c07d09e594-osd--block--db--f4cdcf01--a6c5--4620--b096--7f2d8d1afd12 253:5    0   372G  0 lvm   
└─ceph--block--dbs--f9277f5e--9b73--4e41--805e--b9c07d09e594-osd--block--db--ef348a39--2d05--4b96--9b00--9f32b0053d20 253:7    0   372G  0 lvm

Comment 3 Alfredo Deza 2018-11-15 20:08:18 UTC
@John, the message:

>  Aborting because strategy changed from bluestore.MixedType to bluestore.SingleType after filtering

This means that ceph-volume looked at the devices and determined that the strategy would be to place data on the spinning drives and block.db on the NVMe devices.

However, *filtering* happened, which I am assuming removed the NVMe devices, leaving just the spinning ones. This change is detected, and ceph-volume correctly refuses to continue, since the end result would be to ignore the NVMe devices and consume the spinning drives fully.

There are a few possible reasons why the NVMe devices were excluded; could you run the report command to get more information? For example:

> ceph-volume --cluster ceph lvm batch --bluestore --yes /dev/sdc /dev/sdd /dev/nvme0n1 /dev/sdq /dev/sdr /dev/nvme1n1 --report


And, in case the JSON report is richer:

> ceph-volume --cluster ceph lvm batch --bluestore --yes /dev/sdc /dev/sdd /dev/nvme0n1 /dev/sdq /dev/sdr /dev/nvme1n1 --report --format=json

Comment 4 John Harrigan 2018-11-15 20:24:03 UTC
Created attachment 1506227 [details]
text 'ceph lvm batch' report

Comment 5 John Harrigan 2018-11-15 20:27:04 UTC
(In reply to John Harrigan from comment #4)
> Created attachment 1506227 [details]
> text 'ceph lvm batch' report

Both reports failed to provide any information
$ cat report.txt 
--> Aborting because strategy changed from bluestore.MixedType to bluestore.SingleType after filtering

$ cat report.json
--> Aborting because strategy changed from bluestore.MixedType to bluestore.SingleType after filtering

Comment 6 Alfredo Deza 2018-11-15 20:31:09 UTC
John, that sounds like something we need to improve; thanks for catching that. Can you add the /var/log/ceph/ceph-volume.log file to this ticket? The reasons for filtering the devices should all be logged there.

Comment 7 John Harrigan 2018-11-15 20:57:43 UTC
The ceph-ansible run leaves this cluster, all HDD-based. How should I specify
the configuration in osds.yml to get the intended result: a bluestore cluster
with two HDDs paired to each of the two NVMe devices?

# ceph osd tree
ID  CLASS WEIGHT   TYPE NAME              STATUS REWEIGHT PRI-AFF 
 -1       87.30176 root default                                   
-17        7.27515     host c05-h29-6048r                         
  7   hdd  1.81879         osd.7              up  1.00000 1.00000 
 19   hdd  1.81879         osd.19             up  1.00000 1.00000 
 31   hdd  1.81879         osd.31             up  1.00000 1.00000 
 43   hdd  1.81879         osd.43             up  1.00000 1.00000 
-21        7.27515     host c06-h01-6048r                         
  8   hdd  1.81879         osd.8              up  1.00000 1.00000 
 20   hdd  1.81879         osd.20             up  1.00000 1.00000 
 32   hdd  1.81879         osd.32             up  1.00000 1.00000 
 45   hdd  1.81879         osd.45             up  1.00000 1.00000 
-19        7.27515     host c06-h05-6048r                         
  9   hdd  1.81879         osd.9              up  1.00000 1.00000 
 21   hdd  1.81879         osd.21             up  1.00000 1.00000 
 33   hdd  1.81879         osd.33             up  1.00000 1.00000 
 46   hdd  1.81879         osd.46             up  1.00000 1.00000 
-25        7.27515     host c06-h09-6048r                         
 10   hdd  1.81879         osd.10             up  1.00000 1.00000 
 22   hdd  1.81879         osd.22             up  1.00000 1.00000 
 34   hdd  1.81879         osd.34             up  1.00000 1.00000 
 44   hdd  1.81879         osd.44             up  1.00000 1.00000 
-23        7.27515     host c06-h13-6048r                         
 11   hdd  1.81879         osd.11             up  1.00000 1.00000 
 23   hdd  1.81879         osd.23             up  1.00000 1.00000 
 35   hdd  1.81879         osd.35             up  1.00000 1.00000 
 47   hdd  1.81879         osd.47             up  1.00000 1.00000 
 -3        7.27515     host c07-h01-6048r                         
  0   hdd  1.81879         osd.0              up  1.00000 1.00000 
 12   hdd  1.81879         osd.12             up  1.00000 1.00000 
 26   hdd  1.81879         osd.26             up  1.00000 1.00000 
 37   hdd  1.81879         osd.37             up  1.00000 1.00000 
 -7        7.27515     host c07-h05-6048r                         
  3   hdd  1.81879         osd.3              up  1.00000 1.00000 
 13   hdd  1.81879         osd.13             up  1.00000 1.00000 
 25   hdd  1.81879         osd.25             up  1.00000 1.00000 
 38   hdd  1.81879         osd.38             up  1.00000 1.00000 
 -5        7.27515     host c07-h09-6048r                         
  1   hdd  1.81879         osd.1              up  1.00000 1.00000 
 14   hdd  1.81879         osd.14             up  1.00000 1.00000 
 24   hdd  1.81879         osd.24             up  1.00000 1.00000 
 36   hdd  1.81879         osd.36             up  1.00000 1.00000 
-15        7.27515     host c07-h13-6048r                         
  2   hdd  1.81879         osd.2              up  1.00000 1.00000 
 15   hdd  1.81879         osd.15             up  1.00000 1.00000 
 30   hdd  1.81879         osd.30             up  1.00000 1.00000 
 40   hdd  1.81879         osd.40             up  1.00000 1.00000 
-11        7.27515     host c07-h17-6048r                         
  4   hdd  1.81879         osd.4              up  1.00000 1.00000 
 16   hdd  1.81879         osd.16             up  1.00000 1.00000 
 27   hdd  1.81879         osd.27             up  1.00000 1.00000 
 41   hdd  1.81879         osd.41             up  1.00000 1.00000 
 -9        7.27515     host c07-h21-6048r                         
  5   hdd  1.81879         osd.5              up  1.00000 1.00000 
 17   hdd  1.81879         osd.17             up  1.00000 1.00000 
 28   hdd  1.81879         osd.28             up  1.00000 1.00000 
 42   hdd  1.81879         osd.42             up  1.00000 1.00000 
-13        7.27515     host c07-h25-6048r                         
  6   hdd  1.81879         osd.6              up  1.00000 1.00000 
 18   hdd  1.81879         osd.18             up  1.00000 1.00000 
 29   hdd  1.81879         osd.29             up  1.00000 1.00000 
 39   hdd  1.81879         osd.39             up  1.00000 1.00000

Comment 8 John Harrigan 2018-11-15 20:58:19 UTC
Created attachment 1506268 [details]
ceph volume log

Comment 9 Alfredo Deza 2018-11-15 21:14:20 UTC
What configuration are you using so that ceph-ansible deploys only to the HDDs?

Comment 10 John Harrigan 2018-11-16 00:38:15 UTC
I extended the /root/wipefs_6048r.sh script to include this:
  dd if=/dev/zero of=/dev/$device bs=1M count=1
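
For reference, a minimal sketch of what such a wipe script might look like (the actual /root/wipefs_6048r.sh was not attached, so the loop and device list here are assumptions):

  #!/bin/bash
  # Hypothetical reconstruction: clear FS/LVM signatures, then zero the
  # first MiB of each device so ceph-volume sees the drives as clean.
  for device in sdc sdd sdq sdr nvme0n1 nvme1n1; do
      wipefs --all /dev/$device
      dd if=/dev/zero of=/dev/$device bs=1M count=1
  done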

Then I ran this sequence of commands and got the same result:
  # ansible-playbook purge-cluster.yml 
  # ansible osds -m script -a "/root/wipefs_6048.sh"
  # ansible-playbook site.yml 2>&1 | tee -a Deploy2nvme.Nov15zap

TASK [ceph-config : run 'ceph-volume lvm batch --report' to see how many osds are to be created] ***
Friday 16 November 2018  00:02:14 +0000 (0:00:02.430)       0:34:58.752 ******* 
fatal: [c07-h01-6048r]: FAILED! => {"changed": true, "cmd": ["ceph-volume", "--cluster", "ceph", "lvm", "batch", "--bluestore", "--yes", "/dev/sdc", "/dev/sdd", "/dev/nvme0n1", "/dev/sdq", "/dev/sdr", "/dev/nvme1n1", "--report", "--format=json"], "msg": "non-zero return code", "rc": 1, "stderr": "", "stderr_lines": [], "stdout": "--> Aborting because strategy changed from bluestore.MixedType to bluestore.SingleType after filtering", "stdout_lines": ["--> Aborting because strategy changed from bluestore.MixedType to bluestore.SingleType after filtering"]}

so no change there.

Next up, I will update to the latest ceph-ansible, since this commit:
  purge-cluster: zap devices used with the lvm scenario
  https://github.com/ceph/ceph-ansible/commit/9747f3dbd5a2eada543a6f61e482e005b6660016
is in ceph-ansible-3.2.0-0.1.rc2.el7cp.noarch.rpm.

Comment 11 John Harrigan 2018-11-16 16:13:55 UTC
I upgraded to ceph-ansible rc2 and ran the purge-cluster and deploy.
Unfortunately I had the same result: an early exit from ceph-ansible
with this message:

TASK [ceph-config : run 'ceph-volume lvm batch --report' to see how many osds are to be created] ***
Friday 16 November 2018  15:54:33 +0000 (0:00:01.615)       0:34:51.107 ******* 
fatal: [c07-h01-6048r]: FAILED! => {"changed": true, "cmd": ["ceph-volume", "--cluster", "ceph", "lvm", "batch", "--bluestore", "--yes", "/dev/sdc", "/dev/sdd", "/dev/nvme0n1", "/dev/sdq", "/dev/sdr", "/dev/nvme1n1", "--report", "--format=json"], "msg": "non-zero return code", "rc": 1, "stderr": "", "stderr_lines": [], "stdout": "--> Aborting because strategy changed from bluestore.MixedType to bluestore.SingleType after filtering", "stdout_lines": ["--> Aborting because strategy changed from bluestore.MixedType to bluestore.SingleType after filtering"]}

I have attached the logs from the purge and deploy, both performed using the rc2 version of ceph-ansible.

Here are the commands I used to perform this test:
---------------------------------------------
# cd /root; wget http://download.eng.bos.redhat.com/composes/auto/ceph-3.2-rhel-7/RHCEPH-3.2-RHEL-7-20181115.ci.1/compose/Tools/x86_64/os/Packages/ceph-ansible-3.2.0-0.1.rc2.el7cp.noarch.rpm
# rpm -Uvh ceph-ansible-3.2.0-0.1.rc2.el7cp.noarch.rpm
# yum list ceph-ansible
  ceph-ansible.noarch                3.2.0-0.1.rc2.el7cp
Purge and redeploy using ceph-ansible.rc2
# ssh c07-h01-6048r ceph -s        ← 48 OSDs
# ansible-playbook purge-cluster.yml 2>&1 | tee -a PurgeRC2.Nov16
# ansible-playbook site.yml 2>&1 | tee -a DeployRC2.Nov16

Comment 12 John Harrigan 2018-11-16 16:14:42 UTC
Created attachment 1506461 [details]
purge runlog using ceph-ansibleRC2

Comment 13 John Harrigan 2018-11-16 16:15:23 UTC
Created attachment 1506462 [details]
deploy runlog using ceph-ansibleRC2

Comment 14 John Harrigan 2018-11-16 19:43:26 UTC
I decided to try this same sequence of commands, but this time specifying only ONE NVMe device.

Sure enough, the deploy worked like a charm.

# yum list ceph-ansible
  ceph-ansible.noarch                3.2.0-0.1.rc2.el7cp
# ansible-playbook purge-cluster.yml
# ansible-playbook site.yml 2>&1 | tee -a DeployOneNVMe.Nov16
No Errors
TASK [show ceph status for cluster ceph] ***************************************
Friday 16 November 2018  19:37:58 +0000 (0:00:00.542)       0:55:45.555 ******* 
ok: [c05-h33-6018r -> c05-h33-6018r] => {
    "msg": [
        "  cluster:", 
        "    id:     2f9e9148-125e-4783-ab30-6fcd121aca01", 
        "    health: HEALTH_OK", 
        " ", 
        "  services:", 
        "    mon: 3 daemons, quorum c05-h33-6018r,c06-h29-6018r,c07-h29-6018r", 
        "    mgr: c07-h30-6018r(active)", 
        "    osd: 144 osds: 144 up, 144 in", 
        "    rgw: 12 daemons active", 
        " ", 
        "  data:", 
        "    pools:   4 pools, 32 pgs", 
        "    objects: 209 objects, 12.1KiB", 
        "    usage:   147GiB used, 262TiB / 262TiB avail", 
        "    pgs:     32 active+clean", 
        " "
    ]
}

INSTALLER STATUS ***************************************************************
Install Ceph Monitor        : Complete (0:05:10)
Install Ceph Manager        : Complete (0:03:38)
Install Ceph OSD            : Complete (0:16:18)
Install Ceph RGW            : Complete (0:05:06)
Install Ceph Client         : Complete (0:18:34)

---------------------------------
The osds.yml file looks like this
---------------------------------
osd_objectstore: bluestore
# use 'ceph-volume lvm batch' mode
osd_scenario: lvm
devices:
  - /dev/sdc
  - /dev/sdd
  - /dev/sde
  - /dev/sdf
  - /dev/sdg
  - /dev/sdh
  - /dev/sdi
  - /dev/sdj
  - /dev/sdk
  - /dev/sdl
  - /dev/sdm
  - /dev/sdn
  - /dev/nvme0n1

Comment 15 John Harrigan 2018-11-16 19:47:52 UTC
Created attachment 1506531 [details]
Success using ceph-ansibleRC2 w/one NVMe

Comment 16 Alfredo Deza 2018-11-16 20:03:46 UTC
When using two NVMe devices, in the case that fails, you've mentioned that the steps are:

* purge         -> ansible-playbook purge-cluster.yml 
* wipefs script -> ansible osds -m script -a "/root/wipefs_6048.sh"
* deploy        -> ansible-playbook site.yml

The way the ansible implementation works is by checking whether the input devices change the deployment strategy after filtering. For example:

input: /dev/sda /dev/sdb /dev/nvme0n1

If /dev/nvme0n1 gets filtered out, this changes the strategy from "mixed devices" (spinning and solid) to "single type" (only one type of device). That causes an immediate halt of the playbook.

If devices that get filtered out don't change the strategy, then *there should not be an error*. For example:

input: /dev/sda /dev/sdb /dev/nvme0n1 /dev/nvme1n1

If /dev/nvme0n1 gets filtered out, the strategy doesn't change, because there is still a mixed-type group of devices.

So something is out of whack here: you try with 2 NVMe devices and things don't work, then try with 1 NVMe device and it works.
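
One quick way to see how each device will be classified is to inspect the rotational flag in sysfs, the attribute the batch classification is based on (a quick sketch; run it on an OSD node):

  # 1 = rotational (HDD), 0 = non-rotational (NVMe/SSD)
  for dev in sdc sdd sdq sdr nvme0n1 nvme1n1; do
      echo -n "$dev: "; cat /sys/block/$dev/queue/rotational
  done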

One thing I would try is calling the report after purging+wipefs, to see whether devices are getting filtered out:

> ceph-volume lvm batch --report --format=json  /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/nvme0n1

If that runs *before* deployment, and *after* purging+wipefs, it should work.

Comment 17 Alfredo Deza 2018-11-16 20:08:20 UTC
The example call I used was with one NVMe device, but I would try with two as well and experiment with what the reporting says. The JSON reporting is more verbose for us (and useful for this BZ), but you might find it easier to read the normal (pretty) reporting.
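
For example, mirroring the failing six-device call:

> ceph-volume lvm batch --report --format=json /dev/sdc /dev/sdd /dev/nvme0n1 /dev/sdq /dev/sdr /dev/nvme1n1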

Comment 18 John Harrigan 2018-11-16 20:19:00 UTC
How do I run ceph-volume after purging?
The command will no longer be installed on the OSD nodes.

Am I understanding the syntax (and ordering) in osds.yml correctly?
For example, using this:
  osd_objectstore: bluestore
  # use 'ceph-volume lvm batch' mode
  osd_scenario: lvm
  devices:
    - /dev/sdc
    - /dev/sdd
    - /dev/nvme0n1
    - /dev/sdq
    - /dev/sdr
    - /dev/nvme1n1

I would expect the first two HDDs to hold the 'block' portions of OSDs,
with /dev/nvme0n1 housing their 'db'. The next two HDDs (sdq and sdr) would
be another set of 'block' OSDs, with their 'db' on /dev/nvme1n1.
Should I be specifying devices in a different order?
Again, the configuration I am trying to reach has four HDDs in use as OSDs,
with two of their 'db' volumes on one NVMe and the other two on the other NVMe.

thanks

Comment 19 Alfredo Deza 2018-11-16 20:31:09 UTC
Another thing that would help the output is to increase the verbosity with -vv
and export:

> ANSIBLE_STDOUT_CALLBACK=debug

Comment 20 Alfredo Deza 2018-11-16 20:34:31 UTC
John, the ordering doesn't matter; the batch sub-command will create one large VG spanning both NVMe devices.

You cannot split them like that in one go. If you really want to, you could try a first run with half of the devices and a second run with the other half (see the sketch below).

"batch" mode is not meant to be super flexible; it lets you avoid manual LV creation, which takes a great deal of code to implement correctly, so the constraints are there to allow robust execution.

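A sketch of that two-pass idea, reusing the flags from the failing invocation (the per-pass device split is an assumption):

> ceph-volume --cluster ceph lvm batch --bluestore --yes /dev/sdc /dev/sdd /dev/nvme0n1
> ceph-volume --cluster ceph lvm batch --bluestore --yes /dev/sdq /dev/sdr /dev/nvme1n1
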
Comment 21 Ben England 2018-11-20 17:35:04 UTC
But with LVM it is possible to state that you want a particular LV created on a particular PV within the VG.  So it is possible to round-robin the RocksDB and journaling LVs across the available SSD PVs, even if all SSDs are in the same VG.  For example, if you have an array of PVs, you can index into the SSD PV list using the LV's array index modulo N where N is the number of SSD PVs.
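
For illustration, lvcreate accepts trailing PV arguments that pin the new LV to a specific device within the VG; a sketch with hypothetical LV names and the 372G db size seen in the lsblk output above:

  # One VG spans both NVMe PVs, but each db LV is pinned to one PV,
  # round-robin style (index modulo 2).
  pvcreate /dev/nvme0n1 /dev/nvme1n1
  vgcreate ceph-block-dbs /dev/nvme0n1 /dev/nvme1n1
  lvcreate -L 372G -n osd-block-db-0 ceph-block-dbs /dev/nvme0n1
  lvcreate -L 372G -n osd-block-db-1 ceph-block-dbs /dev/nvme1n1
  lvcreate -L 372G -n osd-block-db-2 ceph-block-dbs /dev/nvme0n1
  lvcreate -L 372G -n osd-block-db-3 ceph-block-dbs /dev/nvme1n1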

This is really critical functionality for ceph-ansible. For example, one Ceph site that I know of has 60 HDDs/host; they will definitely have to balance their Bluestore RocksDB and journal space evenly across the SSD devices, and this will not be possible with this release of ceph-ansible if I understand John H correctly. Failure to implement this results in very substandard performance, where the SSD becomes the bottleneck instead of the HDD, particularly for sequential writes or large random writes.

Comment 22 Alfredo Deza 2018-11-20 19:54:07 UTC
@Ben, you are right that LVM does allow all of these configurations. The `lvm batch` sub-command was *not* meant to allow customization beyond what it already offers.

That is why we have the ability to receive pre-made LVs to produce OSDs: so that ceph-volume doesn't need to accommodate every configuration possible with LVM.

In the end it is a decision between an easy deployment and a highly configurable one; we can't do both in `lvm batch`.

Having said that, if we really want to push forward with more configurable scenarios and LVM, then that should probably go into ceph-ansible; that way the LVs could just be consumed in whatever way they were produced.
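
For reference, a sketch of what consuming pre-made LVs looks like in osds.yml via ceph-ansible's lvm_volumes list (the LV and VG names are hypothetical, matching the lvcreate sketch above):

  osd_objectstore: bluestore
  osd_scenario: lvm
  lvm_volumes:
    - data: osd-block-0
      data_vg: ceph-block-0
      db: osd-block-db-0
      db_vg: ceph-block-dbs
    - data: osd-block-1
      data_vg: ceph-block-1
      db: osd-block-db-1
      db_vg: ceph-block-dbs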

Comment 23 John Harrigan 2018-11-20 21:49:02 UTC
The purge-cluster.yml output indicates that it *did* run the osd lvm zap
commands on the OSD nodes. The complete output from the purge run is in
attachment "purge runlog using ceph-ansible RC2"

Here is an excerpt

TASK [zap and destroy osds created by ceph-volume with devices] ****************
Friday 16 November 2018  15:10:22 +0000 (0:00:01.692)       0:01:23.648 ******* 
changed: [c07-h09-6048r] => (item=/dev/sdc)
changed: [c07-h05-6048r] => (item=/dev/sdc)
changed: [c07-h01-6048r] => (item=/dev/sdc)
changed: [c07-h17-6048r] => (item=/dev/sdc)
changed: [c07-h13-6048r] => (item=/dev/sdc)
changed: [c07-h09-6048r] => (item=/dev/sdd)
changed: [c07-h25-6048r] => (item=/dev/sdc)
changed: [c07-h05-6048r] => (item=/dev/sdd)
changed: [c07-h21-6048r] => (item=/dev/sdc)
changed: [c07-h01-6048r] => (item=/dev/sdd)
changed: [c07-h17-6048r] => (item=/dev/sdd)
changed: [c06-h01-6048r] => (item=/dev/sdc)
changed: [c07-h13-6048r] => (item=/dev/sdd)
changed: [c05-h29-6048r] => (item=/dev/sdc)
changed: [c06-h05-6048r] => (item=/dev/sdc)
changed: [c07-h25-6048r] => (item=/dev/sdd)
changed: [c07-h21-6048r] => (item=/dev/sdd)
changed: [c07-h09-6048r] => (item=/dev/nvme0n1)
changed: [c06-h09-6048r] => (item=/dev/sdc)
changed: [c07-h05-6048r] => (item=/dev/nvme0n1)
changed: [c07-h01-6048r] => (item=/dev/nvme0n1)
changed: [c06-h01-6048r] => (item=/dev/sdd)
changed: [c06-h13-6048r] => (item=/dev/sdc)
changed: [c05-h29-6048r] => (item=/dev/sdd)
changed: [c07-h17-6048r] => (item=/dev/nvme0n1)
changed: [c06-h05-6048r] => (item=/dev/sdd)
changed: [c07-h13-6048r] => (item=/dev/nvme0n1)
changed: [c07-h09-6048r] => (item=/dev/sdq)
changed: [c07-h05-6048r] => (item=/dev/sdq)
changed: [c06-h09-6048r] => (item=/dev/sdd)
changed: [c07-h21-6048r] => (item=/dev/nvme0n1)
changed: [c07-h01-6048r] => (item=/dev/sdq)
changed: [c06-h13-6048r] => (item=/dev/sdd)
changed: [c07-h17-6048r] => (item=/dev/sdq)
changed: [c07-h13-6048r] => (item=/dev/sdq)
changed: [c07-h25-6048r] => (item=/dev/nvme0n1)
changed: [c07-h09-6048r] => (item=/dev/sdr)
changed: [c07-h05-6048r] => (item=/dev/sdr)
changed: [c06-h01-6048r] => (item=/dev/nvme0n1)
changed: [c05-h29-6048r] => (item=/dev/nvme0n1)
changed: [c07-h21-6048r] => (item=/dev/sdq)
changed: [c07-h01-6048r] => (item=/dev/sdr)
changed: [c06-h05-6048r] => (item=/dev/nvme0n1)
changed: [c07-h17-6048r] => (item=/dev/sdr)
changed: [c07-h09-6048r] => (item=/dev/nvme1n1)
changed: [c07-h13-6048r] => (item=/dev/sdr)
changed: [c06-h09-6048r] => (item=/dev/nvme0n1)
changed: [c07-h05-6048r] => (item=/dev/nvme1n1)
changed: [c05-h29-6048r] => (item=/dev/sdq)
changed: [c06-h01-6048r] => (item=/dev/sdq)
changed: [c07-h01-6048r] => (item=/dev/nvme1n1)
changed: [c07-h21-6048r] => (item=/dev/sdr)
changed: [c07-h25-6048r] => (item=/dev/sdq)
changed: [c06-h05-6048r] => (item=/dev/sdq)
changed: [c06-h13-6048r] => (item=/dev/nvme0n1)
changed: [c07-h17-6048r] => (item=/dev/nvme1n1)
changed: [c07-h13-6048r] => (item=/dev/nvme1n1)
changed: [c06-h09-6048r] => (item=/dev/sdq)
changed: [c07-h21-6048r] => (item=/dev/nvme1n1)
changed: [c05-h29-6048r] => (item=/dev/sdr)
changed: [c06-h01-6048r] => (item=/dev/sdr)
changed: [c07-h25-6048r] => (item=/dev/sdr)
changed: [c06-h05-6048r] => (item=/dev/sdr)
changed: [c06-h13-6048r] => (item=/dev/sdq)
changed: [c06-h09-6048r] => (item=/dev/sdr)
changed: [c05-h29-6048r] => (item=/dev/nvme1n1)
changed: [c06-h01-6048r] => (item=/dev/nvme1n1)
changed: [c07-h25-6048r] => (item=/dev/nvme1n1)
changed: [c06-h05-6048r] => (item=/dev/nvme1n1)
changed: [c06-h13-6048r] => (item=/dev/sdr)
changed: [c06-h09-6048r] => (item=/dev/nvme1n1)
changed: [c06-h13-6048r] => (item=/dev/nvme1n1)

I remain concerned that purge is not fully cleaning up to allow a clean
redeploy when using ceph-volume lvm batch mode.

- John

Comment 24 Alfredo Deza 2018-11-20 21:52:09 UTC
Could you ensure you are running with:

> ANSIBLE_STDOUT_CALLBACK=debug

And with the -vv flag in ansible?

Comment 25 Andrew Schoen 2018-11-20 22:41:23 UTC
John,

This PR should handle the issue we found with subsequent deploys not being idempotent because of the strategy change.

https://github.com/ceph/ceph-ansible/pull/3348

Comment 26 Christina Meno 2018-11-27 16:20:49 UTC
maybe blocker -- investigating now

Comment 27 Alfredo Deza 2018-11-28 15:44:35 UTC
We can't reproduce this by deploying on a new system with several NVMe devices. There is a chance that the purge+redeploy workflow might hit this, and there is already a fix for the idempotency issue that was found.

We don't think this is a blocker.

Comment 28 John Harrigan 2018-11-28 18:54:32 UTC
Did some additional testing today using RC4. I am still not able to deploy
using ceph-volume lvm batch.

Tested the deploy/purge cycle first with ceph-disk (successful) and then with ceph-volume (failed).
I will attach the full ceph-ansible logfile "DeployLVMbatchRC4.Nov28".


RHCS 3.2 CEPH-DISK : ceph-ansible.noarch         3.2.0-0.1.rc2.el7cp
==================
1) Purged existing RHCS 3.2 cluster (ceph-disk non-collocated, bluestore)

2) Deployed RHCS 3.2 cluster (ceph-disk non-collocated, bluestore)
SUCCESS - no failed tasks and running cluster with expected number OSDs and RGWs

3) Purged existing RHCS 3.2 cluster (ceph-disk non-collocated, bluestore)

===================================> CEPH-VOLUME <======================
4) Installed latest ceph-ansible (RC4)
# yum update ceph-ansible
  ceph-ansible.noarch         3.2.0-0.1.rc4.el7cp

5) Deployed RHCS 3.2 cluster (ceph-volume lvm batch, bluestore)
# cat osds.yml
  #--------------------------------------------------------------------
  osd_objectstore: bluestore
  # use 'ceph-volume lvm batch' mode
  osd_scenario: lvm
  devices:
    - /dev/sdc
    - /dev/sdd
    - /dev/nvme0n1
    - /dev/sdq
    - /dev/sdr
    - /dev/nvme1n1
# export ANSIBLE_STDOUT_CALLBACK=debug
# ansible-playbook -vv site.yml 2>&1 | tee -a DeployLVMbatchRC4.Nov28 
<...SNIP...>
TASK [ceph-config : generate ceph configuration file: ceph.conf] ***************
task path: /usr/share/ceph-ansible/roles/ceph-config/tasks/main.yml:77
Wednesday 28 November 2018  18:20:33 +0000 (0:00:01.020)       0:15:19.542 **** 
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: 	[line 16]: u' # non_hci_safety_factor is the safety factor for dedicated nodes\n'
fatal: [c07-h01-6048r]: FAILED! => {}

MSG:

Unexpected failure during module execution.

PLAY RECAP *********************************************************************
c03-h15-r620               : ok=22   changed=1    unreachable=0    failed=0   
c03-h17-r620               : ok=22   changed=1    unreachable=0    failed=0   
c03-h19-r620               : ok=22   changed=1    unreachable=0    failed=0   
c03-h21-r620               : ok=22   changed=1    unreachable=0    failed=0   
c04-h33-6018r              : ok=22   changed=1    unreachable=0    failed=0   
c05-h33-6018r              : ok=93   changed=12   unreachable=0    failed=0   
c06-h29-6018r              : ok=83   changed=10   unreachable=0    failed=0   
c07-h01-6048r              : ok=60   changed=6    unreachable=0    failed=1   
c07-h05-6048r              : ok=57   changed=6    unreachable=0    failed=1   
c07-h09-6048r              : ok=57   changed=6    unreachable=0    failed=1   
c07-h13-6048r              : ok=57   changed=6    unreachable=0    failed=1   
c07-h17-6048r              : ok=57   changed=6    unreachable=0    failed=1   
c07-h21-6048r              : ok=57   changed=6    unreachable=0    failed=1   
c07-h25-6048r              : ok=57   changed=6    unreachable=0    failed=1   
c07-h29-6018r              : ok=85   changed=13   unreachable=0    failed=0   
c07-h30-6018r              : ok=83   changed=11   unreachable=0    failed=0   


INSTALLER STATUS ***************************************************************
Install Ceph Monitor        : Complete (0:04:40)
Install Ceph Manager        : Complete (0:03:29)
Install Ceph OSD            : In Progress (0:03:59)
	This phase can be restarted by running: roles/ceph-osd/tasks/main.yml

Wednesday 28 November 2018  18:20:43 +0000 (0:00:09.903)       0:15:29.446 **** 
=============================================================================== 
<...SNIP...>

Comment 29 John Harrigan 2018-11-28 18:58:16 UTC
Created attachment 1509615 [details]
ansible -vv using ceph-ansibleRC4 - FAILED

Comment 30 Alfredo Deza 2018-11-28 19:13:47 UTC
> An exception occurred during task execution. To see the full traceback, use 
> -vvv. The error was: 	[line 16]: u' # non_hci_safety_factor is the safety factor for dedicated nodes\n'


This looks like it is unrelated to ceph-volume? The ceph configuration is failing to be generated.

What is "non_hci_safety_factor"? Maybe Sebastien can help here.

Comment 31 Ben England 2018-11-28 20:00:36 UTC
I think they were trying to conditionalize how much memory is reserved for Bluestore OSDs (i.e., the OSD caching layer), depending on whether or not the Ceph OSD host also has to run other things (for example, hyperconverged OpenStack). Since you are running dedicated OSD hosts, you need the non_hci_safety_factor parameter.

Comment 32 Alfredo Deza 2018-11-28 20:21:17 UTC
@John, I suspect that Sebastien might want to see the traceback that is hidden from the log output at the verbose levels used. Can you re-run the ansible-playbook command with the -vvv flag and paste the full task failure?

Hopefully that will have the traceback, and enough information to see what is going on there.

Comment 33 John Harrigan 2018-11-28 21:10:57 UTC
I reran using the -vvv flag; here is the full task failure, repeated for
each of the OSD nodes...

MSG:

Unexpected failure during module execution.
The full traceback is:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", line 139, in run
    res = self._execute()
  File "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", line 584, in _execute
    result = self._handler.run(task_vars=variables)
  File "/usr/share/ceph-ansible/plugins/actions/config_template.py", line 641, in run
    default_section=_vars.get('default_section', 'DEFAULT')
  File "/usr/share/ceph-ansible/plugins/actions/config_template.py", line 330, in return_config_overrides_ini
    config.readfp(config_object)
  File "/usr/lib64/python2.7/ConfigParser.py", line 324, in readfp
    self._read(fp, filename)
  File "/usr/share/ceph-ansible/plugins/actions/config_template.py", line 289, in _read
    raise e
ParsingError: File contains parsing errors: <???>
	[line 16]: u' # non_hci_safety_factor is the safety factor for dedicated nodes\n'
fatal: [c07-h25-6048r]: FAILED! => {}

MSG:

Unexpected failure during module execution.
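
For context, the parse failure is reproducible outside ansible: to Python 2's ConfigParser, a comment line with a leading space is neither a comment (the first character is not '#' or ';') nor a valid option line. A minimal sketch, not taken from the playbook run:

  $ python2 - <<'EOF'
  import ConfigParser, StringIO
  cfg = ConfigParser.ConfigParser()
  # The leading space before '#' makes this line unparseable.
  cfg.readfp(StringIO.StringIO(
      "[global]\n # non_hci_safety_factor is the safety factor for dedicated nodes\n"))
  EOF
  ...
  ConfigParser.ParsingError: File contains parsing errors: <???>
          [line  2]: ' # non_hci_safety_factor is the safety factor for dedicated nodes\n'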

Comment 34 John Harrigan 2018-11-28 21:12:37 UTC
Created attachment 1509646 [details]
ansible runlog using -vvv

Comment 35 seb 2018-11-28 21:39:48 UTC
John, which version of Ansible are you using?

I just tried enabling this, and I'm not able to reproduce your issue.
I suspect it's a matter of Ansible version; I'm running 2.7.2.

Comment 36 John Harrigan 2018-11-29 00:08:38 UTC
# ansible --version
ansible 2.6.7
  config file = /usr/share/ceph-ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Sep 12 2018, 05:31:16) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]

What ansible version is required for RHCS 3.2 ?

Comment 37 Ken Dreyer (Red Hat) 2018-11-29 01:23:20 UTC
> What ansible version is required for RHCS 3.2 ?

Ansible 2.6 (documented in bug 1613941)

Comment 38 seb 2018-11-29 08:47:58 UTC
The same failure has been reported here: https://bugzilla.redhat.com/show_bug.cgi?id=1654441; let's move that conversation to the other BZ.

I'm moving this one to POST again, as the original issue has been solved.

Comment 43 Tiffany Nguyen 2018-12-05 06:53:30 UTC
Verified with ceph-ansible 3.2.0-0.1.rc8.el7cp. Ceph-ansible deploys a cluster with two NVMe devices successfully.

Comment 45 errata-xmlrpc 2019-01-03 19:02:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0020