
Who When What Removed Added
subhash 2020-04-08 11:38:32 UTC Keywords AutomationBlocker
Christina Meno 2020-04-13 16:45:57 UTC CC aschoen
Assignee gmeno aschoen
Flags needinfo?(aschoen)
Andrew Schoen 2020-04-13 18:30:58 UTC Flags needinfo?(vpoliset)
Tejas 2020-04-15 10:22:43 UTC Target Release 5.* 4.1
CC tchandra
Severity medium high
subhash 2020-04-15 13:05:13 UTC Flags needinfo?(aschoen) needinfo?(vpoliset)
subhash 2020-04-15 22:46:21 UTC Flags needinfo?(aschoen)
Andrew Schoen 2020-04-16 15:36:12 UTC Flags needinfo?(aschoen) needinfo?(vpoliset)
Andrew Schoen 2020-04-16 16:28:56 UTC Flags needinfo?(vpoliset)
subhash 2020-04-17 05:22:02 UTC Flags needinfo?(vpoliset) needinfo?(vpoliset)
subhash 2020-04-17 18:05:13 UTC CC akupczyk, bhubbard, dzafman, kchai, nojha, rzarzyns, sseshasa
Component Ceph-Volume RADOS
Assignee aschoen nojha
QA Contact vashastr mmurthy
Josh Durgin 2020-04-17 18:30:55 UTC CC jdurgin
Josh Durgin 2020-04-22 15:55:10 UTC Component RADOS Ceph-Volume
Assignee nojha gmeno
QA Contact mmurthy vashastr
subhash 2020-04-22 23:07:58 UTC Flags needinfo?(aschoen)
Hemanth Kumar 2020-04-23 12:00:35 UTC CC hyelloji
QA Contact vashastr vpoliset
Christina Meno 2020-04-23 14:27:47 UTC Priority unspecified high
Status NEW ASSIGNED
Assignee gmeno aschoen
Andrew Schoen 2020-04-23 14:39:28 UTC Flags needinfo?(aschoen)
Tejas 2020-04-24 10:55:49 UTC Flags needinfo?(aschoen)
Andrew Schoen 2020-04-24 13:33:37 UTC Flags needinfo?(aschoen)
subhash 2020-04-27 12:01:23 UTC Flags needinfo?(aschoen)
Andrew Schoen 2020-04-27 15:09:02 UTC Flags needinfo?(aschoen)
Michal Sekletar 2020-05-04 05:23:23 UTC CC msekleta
Drew Harris 2020-05-04 14:44:13 UTC CC anharris
Andrew Schoen 2020-05-04 16:16:39 UTC Blocks 1831105
Andrew Schoen 2020-05-04 16:23:22 UTC Target Milestone rc z1
Andrew Schoen 2020-05-07 14:54:39 UTC Doc Text Cause:
Using partitions for the --block.db and --block.wal arguments of the ceph-volume lvm create command. The db and wal options of the `lvm_volumes` config option in ceph-ansible are used to set those arguments during a deployment.

Consequence:
Occasionally the OSD will not start because udev resets the partitions' permissions back to root:disk after ceph-volume creates them.

Workaround (if any):
Manually start the ceph-volume systemd unit for the failed OSD. For example, if the failed OSD has an ID of 8, the workaround is to run `systemctl start 'ceph-volume@lvm-8-*'`. If you also know the failed OSD's UUID, you can use the service command: `service ceph-volume@lvm-8-4c6ddc44-9037-477d-903c-63b5a789ade5 start`, where 4c6ddc44-9037-477d-903c-63b5a789ade5 is the UUID of osd.8.

Result:
Permissions on the affected partitions are changed back to ceph:ceph, and the OSD restarts and joins the cluster.
Doc Type If docs needed, set a value Known Issue
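
The workaround amounts to a couple of shell commands. A minimal sketch, assuming the failed OSD is osd.8 and that the standard `ceph-osd@<id>` unit is available for verification (both the ID and the unit name are illustrative; substitute your own):

# Start the ceph-volume unit for the failed OSD; the wildcard matches any OSD UUID.
systemctl start 'ceph-volume@lvm-8-*'
# Optionally confirm the OSD came back up and joined the cluster.
systemctl status ceph-osd@8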
Aron Gunn 2020-05-08 15:01:06 UTC Blocks 1816167
Aron Gunn 2020-05-08 17:16:25 UTC CC agunn
Doc Text
.Ceph OSD fails to start because `udev` resets the permissions for BlueStore DB and WAL devices

Specifying the BlueStore DB and WAL partitions for an OSD using the `ceph-volume lvm create` command, or specifying the partitions using the `lvm_volumes` option with Ceph Ansible, can cause those devices to fail on startup. The `udev` subsystem resets the partition permissions back to `root:disk`.

To work around this issue, manually start the systemd `ceph-volume` service. For example, to start the OSD with an ID of 8, run the following: `systemctl start 'ceph-volume@lvm-8-*'`. You can also use the `service` command, for example: `service ceph-volume@lvm-8-4c6ddc44-9037-477d-903c-63b5a789ade5 start`. Manually starting the OSD results in the partitions having the correct permissions, `ceph:ceph`.
Karen Norteman 2020-05-08 18:00:12 UTC CC knortema
Doc Text
.Ceph OSD fails to start because `udev` resets the permissions for BlueStore DB and WAL devices

Specifying the BlueStore DB and WAL partitions for an OSD using the `ceph-volume lvm create` command, or specifying the partitions using the `lvm_volumes` option with Ceph Ansible, can cause those devices to fail on startup. The `udev` subsystem resets the partition permissions back to `root:disk`.

To work around this issue, manually start the systemd `ceph-volume` service. For example, to start the OSD with an ID of 8, run the following: `systemctl start 'ceph-volume@lvm-8-*'`. You can also use the `service` command, for example: `service ceph-volume@lvm-8-4c6ddc44-9037-477d-903c-63b5a789ade5 start`. Manually starting the OSD results in the partitions having the correct permissions, `ceph:ceph`.
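
If the failed OSD's UUID is needed for the exact unit name, it can be looked up first. A hedged sketch, again assuming osd.8; the `osd fsid` field in the `ceph-volume lvm list` output and the partition path /dev/sdb2 are illustrative assumptions, not values from this bug:

# Find the OSD's UUID: look for the "osd fsid" line in the osd.8 section.
ceph-volume lvm list
# Start the unit for exactly that OSD (the UUID below is the example used above).
systemctl start ceph-volume@lvm-8-4c6ddc44-9037-477d-903c-63b5a789ade5
# Check that ownership of the DB/WAL partition is back to ceph:ceph.
ls -l /dev/sdb2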
Christina Meno 2020-06-10 15:03:37 UTC CC gmeno
Christina Meno 2020-06-17 15:36:08 UTC Flags needinfo?(aschoen)
Andrew Schoen 2020-06-17 16:52:31 UTC Flags needinfo?(aschoen)
Christina Meno 2020-06-24 13:59:50 UTC Target Milestone z1 z2
Alasdair Kergon 2020-07-14 22:34:03 UTC CC agk, prajnoha
Andrew Schoen 2020-08-06 14:57:50 UTC Flags needinfo?(agk)
Christina Meno 2020-08-26 20:19:33 UTC Target Release 4.1 4.2
Target Milestone z2 rc
Andrew Schoen 2020-09-22 19:10:39 UTC Link ID Github ceph/ceph/pull/37319
Vikhyat Umrao 2020-09-22 19:16:00 UTC CC vumrao
Veera Raghava Reddy 2020-10-20 09:25:15 UTC CC vereddy
Yaniv Kaul 2020-10-27 12:32:31 UTC Flags needinfo?(aschoen)
Andrew Schoen 2020-10-27 14:34:57 UTC Flags needinfo?(aschoen)
PnT Account Manager 2020-11-04 18:40:07 UTC QA Contact vpoliset amsyedha
Drew Harris 2020-11-09 16:49:59 UTC Target Milestone rc ---
Yaniv Kaul 2020-11-12 19:42:19 UTC Flags needinfo?(aschoen)
Andrew Schoen 2020-11-13 19:48:51 UTC Flags needinfo?(aschoen)
Drew Harris 2020-11-16 21:44:07 UTC Target Release 4.2 4.2z1
Andrew Schoen 2020-11-25 15:52:44 UTC CC vashastr
Flags needinfo?(vashastr)
Vasishta 2020-12-01 16:52:43 UTC Flags needinfo?(vashastr)
Andrew Schoen 2020-12-03 14:57:57 UTC Flags needinfo?(vashastr)
Vasishta 2020-12-09 06:00:56 UTC CC amsyedha
Flags needinfo?(vashastr) needinfo?(amsyedha)
Ameena Suhani S H 2020-12-11 10:11:31 UTC Flags needinfo?(amsyedha)
Veera Raghava Reddy 2021-01-27 09:05:55 UTC Status ASSIGNED CLOSED
Resolution --- WONTFIX
Last Closed 2021-01-27 09:05:55 UTC
Red Hat Bugzilla 2023-09-15 00:30:54 UTC Flags needinfo?(agk)
