.Installation can fail when putting a dedicated journal on an NVMe device
When the `dedicated_devices` setting contains an NVMe device that has existing partitions or signatures on it, the Ansible installation might fail with an error similar to the following:
----
journal check: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match expected c325f439-6849-47ef-ac43-439d9909d391, invalid (someone else's?) journal
----
To work around this issue, ensure there are no partitions or signatures on the NVMe device.
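For example, a minimal cleanup might look like the following sketch. It assumes the journal device is `/dev/nvme0n1`, as in the report below, and the wipe commands destroy any data on that device, so confirm the device name before running them.

----
# Inspect the device for leftover partitions and on-disk signatures.
lsblk /dev/nvme0n1
wipefs /dev/nvme0n1

# Remove all filesystem and partition-table signatures, destroy the
# GPT/MBR data structures, and re-read the partition table.
wipefs --all /dev/nvme0n1
sgdisk --zap-all /dev/nvme0n1
partprobe /dev/nvme0n1
----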
This error is reported by Ceph itself when it runs mkfs. Was the journal device purged correctly, and was any Ceph metadata on it removed?
Thanks.
This does not seem like a ceph-ansible issue, although I suppose we could run checks for this.
osd_auto_discovery is false, so I believe the device specified in the inventory should be used and cleaned up by ansible. teuthology also does its own cleanup at the beginning, and I need to check that.
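As a rough illustration of the kind of check mentioned above (hypothetical; `/dev/nvme0n1` is just the device from this report), a pre-flight test could fail early when the journal device is not clean:

----
#!/bin/sh
# Hypothetical pre-flight check: refuse to proceed if the journal device
# still carries partitions or on-disk signatures.
dev=/dev/nvme0n1

# Any "part" row from lsblk means leftover partitions on the device.
if lsblk --noheadings --output TYPE "$dev" | grep -qw part; then
    echo "ERROR: $dev still has partitions; wipe it before deploying" >&2
    exit 1
fi

# wipefs without options only reports signatures; non-empty output means a
# filesystem, RAID, or partition-table signature is still present.
if [ -n "$(wipefs "$dev")" ]; then
    echo "ERROR: $dev still has on-disk signatures; wipe it before deploying" >&2
    exit 1
fi

echo "$dev looks clean"
----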
John,
I was not using the lv_create option in ceph-ansible at that time. I think we can add guidance to manually clean up any old partitions and retry if an admin hits this issue.
Thanks
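For reference, a manual cleanup-and-retry along those lines could look like this sketch. The device name and inventory file are just the ones from this report, and the playbook name assumes a standard ceph-ansible site.yml run.

----
# On each affected OSD node, wipe the old journal device (destroys its contents).
sgdisk --zap-all /dev/nvme0n1
wipefs --all /dev/nvme0n1
partprobe /dev/nvme0n1

# Then rerun the deployment from the admin node.
ansible-playbook -i hosts site.yml
----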
Description of problem: I am not sure whether this is a ceph-ansible or a core Ceph OSD issue; please feel free to change the component after first-level analysis.

Specify a dedicated journal on NVMe, with the rest of the configuration as in the following inventory:

----
[clients]
pluto005.ceph.redhat.com dedicated_devices='["/dev/nvme0n1"]' devices='["/dev/sdb"]' monitor_interface='eno1' public_network='10.8.128.0/21' radosgw_interface='eno1'

[mdss]
pluto008.ceph.redhat.com dedicated_devices='["/dev/nvme0n1"]' devices='["/dev/sdb"]' monitor_interface='eno1' public_network='10.8.128.0/21' radosgw_interface='eno1'

[mgrs]
pluto004.ceph.redhat.com dedicated_devices='["/dev/nvme0n1"]' devices='["/dev/sdb"]' monitor_interface='eno1' public_network='10.8.128.0/21' radosgw_interface='eno1'

[mons]
pluto004.ceph.redhat.com dedicated_devices='["/dev/nvme0n1"]' devices='["/dev/sdb"]' monitor_interface='eno1' public_network='10.8.128.0/21' radosgw_interface='eno1'
pluto009.ceph.redhat.com dedicated_devices='["/dev/nvme0n1"]' devices='["/dev/sdb"]' monitor_interface='eno1' public_network='10.8.128.0/21' radosgw_interface='eno1'

[osds]
pluto005.ceph.redhat.com dedicated_devices='["/dev/nvme0n1"]' devices='["/dev/sdb"]' monitor_interface='eno1' public_network='10.8.128.0/21' radosgw_interface='eno1'
pluto006.ceph.redhat.com dedicated_devices='["/dev/nvme0n1"]' devices='["/dev/sdb"]' monitor_interface='eno1' public_network='10.8.128.0/21' radosgw_interface='eno1'
pluto010.ceph.redhat.com dedicated_devices='["/dev/nvme0n1"]' devices='["/dev/sdb"]' monitor_interface='eno1' public_network='10.8.128.0/21' radosgw_interface='eno1'
----

Running the ansible playbook, the following issue is seen:

----
2018-08-18T17:04:26.885 INFO:teuthology.orchestra.run.pluto009.stdout:got monmap epoch 1
2018-08-18T17:04:26.885 INFO:teuthology.orchestra.run.pluto009.stdout:2018-08-18 21:04:24.837653 7f4463512d80 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
2018-08-18T17:04:26.885 INFO:teuthology.orchestra.run.pluto009.stdout:2018-08-18 21:04:24.837685 7f4463512d80 -1 journal check: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match expected c325f439-6849-47ef-ac43-439d9909d391, invalid (someone else's?) journal
2018-08-18T17:04:26.885 INFO:teuthology.orchestra.run.pluto009.stdout:2018-08-18 21:04:24.837727 7f4463512d80 -1 filestore(/var/lib/ceph/tmp/mnt.7k5fVX) mkjournal(1068): error creating journal on /var/lib/ceph/tmp/mnt.7k5fVX/journal: (22) Invalid argument
2018-08-18T17:04:26.885 INFO:teuthology.orchestra.run.pluto009.stdout:2018-08-18 21:04:24.837783 7f4463512d80 -1 OSD::mkfs: ObjectStore::mkfs failed with error (22) Invalid argument
2018-08-18T17:04:26.885 INFO:teuthology.orchestra.run.pluto009.stdout:2018-08-18 21:04:24.837855 7f4463512d80 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.7k5fVX: (22) Invalid argument
2018-08-18T17:04:26.886 INFO:teuthology.orchestra.run.pluto009.stdout:mount_activate: Failed to activate
2018-08-18T17:04:26.886 INFO:teuthology.orchestra.run.pluto009.stdout:Traceback (most recent call last):
2018-08-18T17:04:26.886 INFO:teuthology.orchestra.run.pluto009.stdout:  File "/sbin/ceph-disk", line 9, in <module>
2018-08-18T17:04:26.886 INFO:teuthology.orchestra.run.pluto009.stdout:    load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
2018-08-18T17:04:26.886 INFO:teuthology.orchestra.run.pluto009.stdout:  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5735, in run
2018-08-18T17:04:26.886 INFO:teuthology.orchestra.run.pluto009.stdout:    main(sys.argv[1:])
2018-08-18T17:04:26.886 INFO:teuthology.orchestra.run.pluto009.stdout:  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5688, in main
2018-08-18T17:04:26.886 INFO:teuthology.orchestra.run.pluto009.stdout:    main_catch(args.func, args)
2018-08-18T17:04:26.887 INFO:teuthology.orchestra.run.pluto009.stdout:  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5713, in main_catch
2018-08-18T17:04:26.887 INFO:teuthology.orchestra.run.pluto009.stdout:    func(args)
2018-08-18T17:04:26.887 INFO:teuthology.orchestra.run.pluto009.stdout:  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3776, in main_activate
2018-08-18T17:04:26.887 INFO:teuthology.orchestra.run.pluto009.stdout:    reactivate=args.reactivate,
2018-08-18T17:04:26.887 INFO:teuthology.orchestra.run.pluto009.stdout:  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3539, in mount_activate
2018-08-18T17:04:26.887 INFO:teuthology.orchestra.run.pluto009.stdout:    (osd_id, cluster) = activate(path, activate_key_template, init)
2018-08-18T17:04:26.887 INFO:teuthology.orchestra.run.pluto009.stdout:  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3716, in activate
2018-08-18T17:04:26.887 INFO:teuthology.orchestra.run.pluto009.stdout:    keyring=keyring,
2018-08-18T17:04:26.888 INFO:teuthology.orchestra.run.pluto009.stdout:  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3183, in mkfs
2018-08-18T17:04:26.888 INFO:teuthology.orchestra.run.pluto009.stdout:    '--setgroup', get_ceph_group(),
2018-08-18T17:04:26.888 INFO:teuthology.orchestra.run.pluto009.stdout:  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 566, in command_check_call
2018-08-18T17:04:26.888 INFO:teuthology.orchestra.run.pluto009.stdout:    return subprocess.check_call(arguments)
2018-08-18T17:04:26.888 INFO:teuthology.orchestra.run.pluto009.stdout:  File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
2018-08-18T17:04:26.888 INFO:teuthology.orchestra.run.pluto009.stdout:    raise CalledProcessError(retcode, cmd)
2018-08-18T17:04:26.888 INFO:teuthology.orchestra.run.pluto009.stdout:subprocess.CalledProcessError: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', '-i', u'0', '--monmap', '/var/lib/ceph/tmp/mnt.7k5fVX/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.7k5fVX', '--osd-journal', '/var/lib/ceph/tmp/mnt.7k5fVX/journal', '--osd-uuid', u'c325f439-6849-47ef-ac43-439d9909d391', '--setuser', 'ceph', '--setgroup', 'ceph']' returned non-zero exit status 1
----

Full logs: http://magna002.ceph.redhat.com/rakesh-2018-08-17_07:29:56-smoke-luminous-distro-basic-pluto/306626/teuthology.log