Description of problem:
ceph osd containers are not coming up. During deployment, the ceph-ansible task named in the subject retries 60 times until eventually failing [1]. When examining the nodes, the osd containers do not stay up. A look into journalctl [2] shows the same entries as the containers fail to restart.

Version-Release number of selected component (if applicable):
RHOS_TRUNK-15.0-RHEL-8-20190819.n.1

How reproducible:
100%

Steps to Reproduce:
1. Deploy rhos15 with ceph

Actual results:
Deployment fails on the ceph-ansible step.

Expected results:
Deployment should succeed and the osd containers should be up.

Additional info:
[1] http://pastebin.test.redhat.com/790224
[2] http://pastebin.test.redhat.com/790222
Created attachment 1606233 [details] Undercloud folders
[2019-08-20 18:22:36,358][ceph_volume.process][INFO ] stdout ceph.block_device=/dev/ceph-d58085a4-d014-4dd8-b3ed-e73ef994d7c8/osd-data-b3ce8c3d-742b-4d88-88b1-7c951e7a6bc4,ceph.block_uuid=1V1DKt-WQi5-ndXO-Yc3e-K9Hp-gcUV-ib8XiU,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=b24ad492-c351-11e9-9e81-525400c13f4a,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=b5cb48bd-e897-44bc-b646-c493d590eebb,ceph.osd_id=11,ceph.type=block,ceph.vdo=0";"/dev/ceph-d58085a4-d014-4dd8-b3ed-e73ef994d7c8/osd-data-b3ce8c3d-742b-4d88-88b1-7c951e7a6bc4";"osd-data-b3ce8c3d-742b-4d88-88b1-7c951e7a6bc4";"ceph-d58085a4-d014-4dd8-b3ed-e73ef994d7c8";"1V1DKt-WQi5-ndXO-Yc3e-K9Hp-gcUV-ib8XiU";"10.00g
[2019-08-20 18:22:36,360][ceph_volume.process][INFO ] Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-1
[2019-08-20 18:22:36,363][ceph_volume.process][INFO ] Running command: /usr/sbin/restorecon /var/lib/ceph/osd/ceph-1
[2019-08-20 18:22:36,364][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 148, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 205, in dispatch
    instance.main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py", line 40, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 205, in dispatch
    instance.main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py", line 341, in main
    self.activate(args)
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py", line 265, in activate
    activate_bluestore(lvs, no_systemd=args.no_systemd)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py", line 139, in activate_bluestore
    prepare_utils.create_osd_path(osd_id, tmpfs=True)
  File "/usr/lib/python3.6/site-packages/ceph_volume/util/prepare.py", line 223, in create_osd_path
    mount_tmpfs(path)
  File "/usr/lib/python3.6/site-packages/ceph_volume/util/prepare.py", line 216, in mount_tmpfs
    system.set_context(path)
  File "/usr/lib/python3.6/site-packages/ceph_volume/util/system.py", line 284, in set_context
    process.run(['restorecon', path])
  File "/usr/lib/python3.6/site-packages/ceph_volume/process.py", line 121, in run
    terminal.write(command_msg)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 134, in write
    return _Write().raw(msg)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 117, in raw
    self.write(string)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 120, in write
    self._writer.write(self.prefix + line + self.suffix)
ValueError: I/O operation on closed file.
[root@ceph-2 ceph]#
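For illustration only, here is a minimal Python sketch of the failure mode shown in the traceback above: a terminal writer keeps a reference to an output stream that has since been closed, so the next write raises "ValueError: I/O operation on closed file." This is not ceph-volume's actual code; the Write class, file path, and messages below are hypothetical stand-ins, assuming the root cause is the writer's captured stream being closed before terminal.write() is called.

    # Toy stand-in (NOT ceph_volume.terminal) demonstrating the error class.
    import sys


    class Write:
        """Writes a prefixed line to whatever stream it captured at construction."""

        def __init__(self, writer=None, prefix='Running command: ', suffix='\n'):
            self.writer = writer or sys.stdout  # captured once, never re-checked
            self.prefix = prefix
            self.suffix = suffix

        def raw(self, msg):
            self.write(msg)

        def write(self, line):
            # Raises ValueError if the captured stream was closed in the meantime.
            self.writer.write(self.prefix + line + self.suffix)


    if __name__ == '__main__':
        stream = open('/tmp/ceph-volume-demo.log', 'w')  # hypothetical log target
        out = Write(writer=stream)
        out.raw('/usr/sbin/restorecon /var/lib/ceph/osd/ceph-1')   # succeeds
        stream.close()                                             # stream goes away
        out.raw('/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-1')
        # ValueError: I/O operation on closed file.

Running the sketch reproduces the same ValueError seen at the bottom of the ceph-volume traceback, which is why the error surfaces from terminal.py rather than from the mount/restorecon commands themselves.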
Please specify the severity of this bug. If severity is not set, "medium" will be assumed. Severity is defined here: https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.
We had this problem with 14.2.2-431.gb28c939.el8cp but not with 14.2.2-339.g1fd0f60.el8cp
*** Bug 1749465 has been marked as a duplicate of this bug. ***
ceph-volume team, there are a lot of comments on this bug about using different containers as workarounds, but I hope that doesn't distract you. This is a ceph-volume bug, and comment #3 shows the problem we encountered while trying to use ceph-volume. It's been 3 weeks; would you please triage the bug?
I think this is not
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0312