Bug 1413159
Summary: | Ceph - Using service command to start OSD in 1.3.x terminates other OSD processes on the node. | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | jquinn <jquinn> |
Component: | Ceph-Disk | Assignee: | Kefu Chai <kchai> |
Status: | CLOSED WONTFIX | QA Contact: | ceph-qe-bugs <ceph-qe-bugs> |
Severity: | medium | Docs Contact: | Erin Donnelly <edonnell> |
Priority: | high | ||
Version: | 1.3.3 | CC: | asriram, edonnell, flucifre, hnallurv, jquinn, kchai, kdreyer, mmurthy, nlevine, vumrao |
Target Milestone: | rc | Keywords: | Reopened |
Target Release: | 1.3.4 | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | RHEL: ceph-0.94.10-2.el7cp Ubuntu: ceph_0.94.10-3redhat1xenial | Doc Type: | Known Issue |
Doc Text: |
.Running `service ceph start osd.x` on a Red Hat Ceph Storage 1.3.x cluster causes the other OSDs on that node to stop
`service` uses `systemd` to manage the lifecycle of services, but `ceph` is a `systemd` service automatically generated from its `sysv` counterpart. Therefore, the generated `systemd` service unit cannot differentiate osd.0 from the other services managed by the `ceph` service, and `systemd` believes that the `ceph` service "exited" when the system reboots. This is why the `ceph` service stops all services before starting a certain OSD instance.
After `ceph` stops all services, the status of the `ceph` service is marked `dead`, and therefore `systemd` does not bother to kill it again when the user tries to restart a certain OSD instance. This is why the user is able to start other OSD instances again once they are killed by the first `service ceph start osd.x`.
For more information about this issue, including a workaround, refer to https://access.redhat.com/solutions/2877891.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2018-03-16 05:45:50 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1372735, 1561412 |
Description
jquinn
2017-01-13 18:43:32 UTC
Changing it to the 1.3.z release, as sysvinit is deprecated in RHCS 2.x (Jewel). This could be related to how we use the pid file to manage the life cycle of a daemon. Will try to reproduce it with a more verbose log locally.

I printed "ps aux|grep ceph-osd" at the beginning of "/etc/init.d/ceph", like:

```diff
diff --git a/src/init-ceph.in b/src/init-ceph.in
index 7bcfda4..fb57406 100644
--- a/src/init-ceph.in
+++ b/src/init-ceph.in
@@ -21,6 +21,8 @@ fi
 SYSTEMD_RUN=$(which systemd-run 2>/dev/null)
 grep -qs systemd /proc/1/comm || SYSTEMD_RUN=""
 
+ps aux|grep ceph-osd
+
 # if we start up as ./init-ceph, assume everything else is in the
 # current directory too.
 if [ `dirname $0` = "." ] && [ $PWD != "/etc/init.d" ]; then
```

and I got:

```
[root]# ps aux|grep ceph-osd
root 2622 0.0 0.0 115244  1460 ?     Ss 15:44 0:00 /usr/bin/bash -c ulimit -n 32768; TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=128MB /usr/bin/ceph-osd -i 30 --pid-file /var/run/ceph/osd.30.pid -c /etc/ceph/ceph.conf --cluster ceph -f
root 2630 0.0 0.1 911132 18624 ?     Sl 15:44 0:00 /usr/bin/ceph-osd -i 30 --pid-file /var/run/ceph/osd.30.pid -c /etc/ceph/ceph.conf --cluster ceph -f
root 2718 0.0 0.0 115244  1456 ?     Ss 15:44 0:00 /usr/bin/bash -c ulimit -n 32768; TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=128MB /usr/bin/ceph-osd -i 32 --pid-file /var/run/ceph/osd.32.pid -c /etc/ceph/ceph.conf --cluster ceph -f
root 2726 0.0 0.1 911128 18616 ?     Sl 15:44 0:00 /usr/bin/ceph-osd -i 32 --pid-file /var/run/ceph/osd.32.pid -c /etc/ceph/ceph.conf --cluster ceph -f
root 2875 0.0 0.0 115244  1456 ?     Ss 15:44 0:00 /usr/bin/bash -c ulimit -n 32768; TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=128MB /usr/bin/ceph-osd -i 31 --pid-file /var/run/ceph/osd.31.pid -c /etc/ceph/ceph.conf --cluster ceph -f
root 2882 0.0 0.1 911128 22560 ?     Sl 15:44 0:00 /usr/bin/ceph-osd -i 31 --pid-file /var/run/ceph/osd.31.pid -c /etc/ceph/ceph.conf --cluster ceph -f
root 6241 0.0 0.0 112648   960 pts/0 S+ 15:53 0:00 grep --color=auto ceph-osd

[root]# service ceph start osd.30
+ '[' -e /lib/lsb/init-functions ']'
+ . /lib/lsb/init-functions
++ which systemd-run
+ SYSTEMD_RUN=/bin/systemd-run
+ grep -qs systemd /proc/1/comm
+ ps aux
+ grep ceph-osd
root 6826 0.0 0.0 9032 656 pts/0 S+ 15:54 0:00 grep ceph-osd
++ dirname /etc/init.d/ceph
```

In other words, all ceph-osd processes are killed before "/etc/init.d/ceph" kicks in. In /usr/sbin/service:

```shell
if [ -f "${SERVICEDIR}/${SERVICE}" ]; then
   # LSB daemons that dies abnormally in systemd looks alive in systemd's eyes due to RemainAfterExit=yes
   # lets reap them before next start
   if [ "${ACTION}" = "start" ] && \
      systemctl show -p ActiveState ${SERVICE}.service | grep -q '=active$' && \
      systemctl show -p SubState ${SERVICE}.service | grep -q '=exited$' ; then
      /bin/systemctl stop ${SERVICE}.service
   fi
```

it stops "ceph" if the "ActiveState" is "active" and the "SubState" is "exited".

```
# systemctl show -p ActiveState -p SubState ceph
ActiveState=active
SubState=exited

# systemctl status ceph -l
● ceph.service - LSB: Start Ceph distributed file system daemons at boot time
   Loaded: loaded (/etc/rc.d/init.d/ceph; bad; vendor preset: disabled)
   Active: active (exited) since Mon 2017-03-13 16:04:19 IST; 24min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 1312 ExecStart=/etc/rc.d/init.d/ceph start (code=exited, status=0/SUCCESS)

Mar 13 16:04:18 dell-per630-9.gsslab.pnq2.redhat.com ceph[1312]: + '[' 0 -eq 0 ']'
Mar 13 16:04:18 dell-per630-9.gsslab.pnq2.redhat.com ceph[1312]: + touch /var/lock/subsys/ceph
Mar 13 16:04:18 dell-per630-9.gsslab.pnq2.redhat.com ceph[1312]: + '[' start = start -a /usr/bin '!=' . ']'
Mar 13 16:04:18 dell-per630-9.gsslab.pnq2.redhat.com ceph[1312]: + '[' '' = '' ']'
Mar 13 16:04:18 dell-per630-9.gsslab.pnq2.redhat.com ceph[1312]: + '[' -x /usr/sbin/ceph-disk ']'
Mar 13 16:04:18 dell-per630-9.gsslab.pnq2.redhat.com ceph[1312]: + ceph-disk activate-all
Mar 13 16:04:19 dell-per630-9.gsslab.pnq2.redhat.com ceph[1312]: ERROR:ceph-disk:Failed to activate
Mar 13 16:04:19 dell-per630-9.gsslab.pnq2.redhat.com ceph[1312]: ceph-disk: Error: another ceph osd.30 already mounted in position (old/different cluster instance?); unmounting ours.
Mar 13 16:04:19 dell-per630-9.gsslab.pnq2.redhat.com ceph[1312]: ceph-disk: Error: One or more partitions failed to activate
Mar 13 16:04:19 dell-per630-9.gsslab.pnq2.redhat.com ceph[1312]: + exit 0
```

So, when the system boots, "ceph-disk activate-all" failed with the above error, and thus systemd marks the SubState as "exited".

Vikhyat, why do we have a 4fbd7e29-9d25-41b8-afd0-062c0ceff05d.ec76bcd2-d53c-40dc-a649-47b0d99a7baf in /dev/disk/by-parttypeuuid/? The name starts with the prefix "4fbd7e29-9d25-41b8-afd0-062c0ceff05d", which is a mark used by ceph-disk to note that "it is a ready-to-use ceph osd partition". That's why we failed to mount it to /var/lib/ceph/osd/ceph-30, and hence ceph-disk failed.

```
# ceph-disk -v activate-all
DEBUG:ceph-disk:Scanning /dev/disk/by-parttypeuuid
INFO:ceph-disk:Activating /dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.ec76bcd2-d53c-40dc-a649-47b0d99a7baf
INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.ec76bcd2-d53c-40dc-a649-47b0d99a7baf
INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
DEBUG:ceph-disk:Mounting /dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.ec76bcd2-d53c-40dc-a649-47b0d99a7baf on /var/lib/ceph/tmp/mnt.DWsjYq with options noatime,inode64
INFO:ceph-disk:Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.ec76bcd2-d53c-40dc-a649-47b0d99a7baf /var/lib/ceph/tmp/mnt.DWsjYq
INFO:ceph-disk:Running command: /usr/sbin/restorecon /var/lib/ceph/tmp/mnt.DWsjYq
DEBUG:ceph-disk:Cluster uuid is 444f54b1-f97f-43d8-85b7-d5a02daac39a
INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
DEBUG:ceph-disk:Cluster name is ceph
DEBUG:ceph-disk:OSD uuid is ec76bcd2-d53c-40dc-a649-47b0d99a7baf
DEBUG:ceph-disk:OSD id is 30
INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup init
DEBUG:ceph-disk:Marking with init system sysvinit
DEBUG:ceph-disk:ceph osd.30 data dir is ready at /var/lib/ceph/tmp/mnt.DWsjYq
INFO:ceph-disk:/var/lib/ceph/osd/ceph-30 is not empty, won't override
ERROR:ceph-disk:Failed to activate
DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.DWsjYq
INFO:ceph-disk:Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.DWsjYq
ceph-disk: Error: another ceph osd.30 already mounted in position (old/different cluster instance?); unmounting ours.
ceph-disk: Error: One or more partitions failed to activate
```

(In reply to Kefu Chai from comment #22)
> Vikhyat, why do we have a
> 4fbd7e29-9d25-41b8-afd0-062c0ceff05d.ec76bcd2-d53c-40dc-a649-47b0d99a7baf in
> /dev/disk/by-parttypeuuid/, the name is started with the prefix of
> "4fbd7e29-9d25-41b8-afd0-062c0ceff05d", which is a mark used by ceph-disk to
> note that "it is a ready-to-use ceph osd partition". that's why we failed to
> mount it to /var/lib/ceph/osd/ceph-30.

Hi Kefu,

Thanks for your input, but I am not sure why this prefix is there; from what I checked, it looks like it comes from ceph-disk prepare or create?
Because it is same in jewel and hammer. In Jewel: # cd /dev/disk/by-parttypeuuid/ [root@kilo1 by-parttypeuuid]# ll total 0 lrwxrwxrwx 1 root root 10 Feb 23 06:16 45b0969e-9b03-4f30-b4c6-b4b80ceff106.c4d58c1a-acd5-4ba6-89f6-e19b10e6ed5c -> ../../vdc2 lrwxrwxrwx 1 root root 10 Feb 23 06:16 45b0969e-9b03-4f30-b4c6-b4b80ceff106.cbd34227-62a2-41b7-ac28-1e57a8b9d561 -> ../../vdb2 lrwxrwxrwx 1 root root 10 Feb 23 05:14 4fbd7e29-9d25-41b8-afd0-062c0ceff05d.115ab8f0-f8d3-43fa-bc57-fb38ba55bbef -> ../../vdb1 lrwxrwxrwx 1 root root 10 Feb 23 05:14 4fbd7e29-9d25-41b8-afd0-062c0ceff05d.3b236137-dcf9-41f6-9981-b123c96f89ab -> ../../vdc1 # ceph-disk list /dev/dm-0 other, xfs, mounted on / /dev/dm-1 swap, swap /dev/dm-2 other, xfs, mounted on /home /dev/sr0 other, unknown /dev/vda : /dev/vda2 other, LVM2_member /dev/vda1 other, xfs, mounted on /boot /dev/vdb : /dev/vdb2 ceph journal, for /dev/vdb1 /dev/vdb1 ceph data, active, cluster ceph, osd.1, journal /dev/vdb2 /dev/vdc : /dev/vdc2 ceph journal, for /dev/vdc1 /dev/vdc1 ceph data, active, cluster ceph, osd.3, journal /dev/vdc2 In Hammer: # cd /dev/disk/by-parttypeuuid/ # ll total 0 lrwxrwxrwx 1 root root 10 May 23 2016 0fc63daf-8483-4772-8e79-3d69d8477de4.0cff6bf0-596b-453b-ae61-4147717d1047 -> ../../sda3 lrwxrwxrwx 1 root root 10 Oct 3 15:04 45b0969e-9b03-4f30-b4c6-b4b80ceff106.a3fe4d60-ebe2-4829-8d0e-b12d50e089e2 -> ../../sda1 lrwxrwxrwx 1 root root 10 Oct 3 15:04 45b0969e-9b03-4f30-b4c6-b4b80ceff106.ed8e9371-f219-42be-a677-356dbe910097 -> ../../sda2 lrwxrwxrwx 1 root root 10 May 23 2016 4fbd7e29-9d25-41b8-afd0-062c0ceff05d.d5d2a10e-4b48-4e8e-a88f-19cc7b17f5ad -> ../../sdc1 # ceph-disk list /dev/sda : /dev/sda1 ceph journal, for /dev/sdc1 /dev/sda2 ceph journal /dev/sda3 other, xfs, mounted on /var/lib/ceph/osd/ceph-11 /dev/sdb : /dev/sdb1 other, xfs, mounted on /boot /dev/sdb2 other, LVM2_member /dev/sdb3 other, xfs, mounted on /var/lib/ceph/osd/ceph-10 /dev/sdc : /dev/sdc1 ceph data, active, cluster ceph, osd.9, journal 
/dev/sda1 /dev/sr0 other, unknown This is the symlink to the journal partition for osds. > hence ceph-disk failed. > > # ceph-disk -v activate-all > DEBUG:ceph-disk:Scanning /dev/disk/by-parttypeuuid > INFO:ceph-disk:Activating > /dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.ec76bcd2-d53c- > 40dc-a649-47b0d99a7baf > INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- > /dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.ec76bcd2-d53c- > 40dc-a649-47b0d99a7baf > INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph > --name=osd. --lookup osd_mount_options_xfs > INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph > --name=osd. --lookup osd_fs_mount_options_xfs > DEBUG:ceph-disk:Mounting > /dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.ec76bcd2-d53c- > 40dc-a649-47b0d99a7baf on /var/lib/ceph/tmp/mnt.DWsjYq with options > noatime,inode64 > INFO:ceph-disk:Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- > /dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.ec76bcd2-d53c- > 40dc-a649-47b0d99a7baf /var/lib/ceph/tmp/mnt.DWsjYq > INFO:ceph-disk:Running command: /usr/sbin/restorecon > /var/lib/ceph/tmp/mnt.DWsjYq > DEBUG:ceph-disk:Cluster uuid is 444f54b1-f97f-43d8-85b7-d5a02daac39a > INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph > --show-config-value=fsid > DEBUG:ceph-disk:Cluster name is ceph > DEBUG:ceph-disk:OSD uuid is ec76bcd2-d53c-40dc-a649-47b0d99a7baf > DEBUG:ceph-disk:OSD id is 30 > INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph > --name=osd. 
--lookup init
> DEBUG:ceph-disk:Marking with init system sysvinit
> DEBUG:ceph-disk:ceph osd.30 data dir is ready at /var/lib/ceph/tmp/mnt.DWsjYq
> INFO:ceph-disk:/var/lib/ceph/osd/ceph-30 is not empty, won't override
> ERROR:ceph-disk:Failed to activate
> DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.DWsjYq
> INFO:ceph-disk:Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.DWsjYq
> ceph-disk: Error: another ceph osd.30 already mounted in position
> (old/different cluster instance?); unmounting ours.
> ceph-disk: Error: One or more partitions failed to activate

Right, and this setup is different from the customer environments. The first customer who reported this issue is running encrypted OSDs.

Data disks:

/dev/sdc1: UUID="358c9657-c1e7-4eac-9c2f-7b6a8f52bcad" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="bf0c758f-7bb4-4471-a4d0-10ce056cb825"
/dev/sdd1: UUID="e62a4eea-44d0-467e-9bd5-afaad2e2917f" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="7c1d6551-2233-4b9a-94dc-27f04a9d1e07"
/dev/sde1: UUID="660c310a-e944-4ed7-9a6b-760077808dbf" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="4e973be4-4c75-4a94-b6bd-d1034c21de7e"
/dev/sdf1: UUID="9e72deee-1f7b-4065-8a47-9c43fb7b193a" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="bd310495-909c-4c32-8c06-5bd9a859b5f1"
/dev/sdg1: UUID="f7914ee5-26d9-4dd0-a146-3e715ab09207" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="9608d14d-c484-4214-9c91-b8cc6d8c3e5d"
/dev/sdh1: UUID="08efc109-2d55-4efe-9c3f-0b4d08ec91d7" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="26fe4375-4f17-4187-92c6-536803b2900a"
/dev/sdi1: UUID="dac83874-a2fe-449c-9190-4680dc0d3139" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="a9d46267-533f-4605-a2cc-8d225b258819"
/dev/sdj1: UUID="19a7cf86-31b8-4efc-a83b-3b2525fc4053" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="1bed3f52-d36d-459a-ada6-b37158987782"
/dev/sdk1: UUID="dbe69cd2-601a-4905-b594-3d855520e90e" TYPE="crypto_LUKS" PARTLABEL="primary"
PARTUUID="9a21d91c-b2db-4cf5-b592-f35f1c754999" /dev/sdl1: UUID="80109003-cd0f-409c-a9c9-34d1a07ace03" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="17c08c18-8808-4a56-81a0-d2f47ff9739f" /dev/sdm1: UUID="3e689239-0949-4986-8f8a-ad860b0d4077" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="8dd0414d-6a5e-4c74-aed0-3cd8d3fca68a" /dev/sdn1: UUID="5ff2aa5d-843d-49a6-893a-21d40c1682e7" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="b7e63d5b-d511-4b72-9458-c5d6905d1673" /dev/sdp1: UUID="18b8b4c1-8966-4035-b5c5-c3e2ad953914" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="ab375bfc-e5fd-4702-b110-e99405fc3b25" /dev/sdo1: UUID="58a4483d-b008-4112-b748-a18d5039579d" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="a23106d8-bf28-4bd6-995b-80c4b28a61d7" /dev/sdq1: UUID="c488b4ec-1c5c-46c2-8b0f-613b7c94c197" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="afe7df59-af5e-41d3-b5f8-1937c14621d3" /dev/sds1: UUID="442e4e20-7b49-45a3-ac7d-d8c52d4c2112" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="f93e8168-4428-4a24-928c-068a84c5625b" /dev/sdr1: UUID="7ec7075e-2886-4b57-85af-b157b591203c" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="80f72448-387a-4a2b-b553-16f844b57283" /dev/sdt1: UUID="a84fb5c4-e564-4d9e-81f0-cfea6435d696" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="ca3d9abd-0fb9-45de-a7d8-e734035c2234" /dev/sdu1: UUID="778a2285-c60d-4953-99d7-e0a8eea31ecb" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="56539e45-cb93-456d-aa9a-85762b1f821f" /dev/mapper/osd-29: UUID="32119356-8d68-4849-93d4-5a0fac953d1a" TYPE="xfs" /dev/mapper/osd-93: UUID="806f1422-d889-4a97-a7eb-4aedc9c2d7a4" TYPE="xfs" /dev/mapper/osd-148: UUID="bbebafb0-e615-4a09-a1ed-1157869fff77" TYPE="xfs" /dev/mapper/osd-85: UUID="b5cb4024-3033-437f-9f5c-7ce890f3d937" TYPE="xfs" /dev/mapper/osd-108: UUID="a29c2923-5c33-4e8b-8806-78df58455bda" TYPE="xfs" /dev/mapper/osd-99: UUID="4938c6f2-c3f0-4d81-8df9-64f0f8aee98d" TYPE="xfs" /dev/mapper/osd-133: UUID="f68ad6b5-3bf2-47ba-9891-479e4052fffc" TYPE="xfs" 
/dev/mapper/osd-20: UUID="b2fb3584-1b51-4fcb-91f6-831f2ec4acdc" TYPE="xfs" /dev/mapper/osd-124: UUID="f2d51ce5-0c82-4155-aebf-af5d2e89c65f" TYPE="xfs" /dev/mapper/osd-0: UUID="40fa897d-b43b-4d1b-ab0d-c16913838f4d" TYPE="xfs" /dev/mapper/osd-77: UUID="dba7c331-2d0a-496b-9d1e-dce41dbe7ffc" TYPE="xfs" /dev/mapper/osd-154: UUID="ed4e9b66-bb2e-4142-93b3-309cd0072f5f" TYPE="xfs" /dev/mapper/osd-38: UUID="08dbc954-e1b3-40ce-b47a-a461d0091e77" TYPE="xfs" /dev/mapper/osd-105: UUID="4ed80f18-cc3a-4f68-ac2b-6f802e50cabf" TYPE="xfs" /dev/mapper/osd-128: UUID="8c357f77-92c3-4305-9a14-fe1679f905be" TYPE="xfs" /dev/mapper/osd-15: UUID="fe7b0a5c-0c25-4220-b2f1-86524ee2195c" TYPE="xfs" /dev/mapper/osd-65: UUID="d0155357-37f4-4a9b-bd09-1ee2293eb672" TYPE="xfs" /dev/mapper/osd-141: UUID="01366708-51f4-42b5-8bdf-caedcb72fda1" TYPE="xfs" /dev/mapper/osd-51: UUID="0f8536b8-ad61-499d-ba29-e142676dbf4e" TYPE="xfs" /dev/mapper/osd-33: UUID="8b0ee655-d23e-4270-91b8-c939df439f6b" TYPE="xfs" Journal disks: /dev/sdv1: PARTLABEL="primary" PARTUUID="aece715b-8b88-40e5-a845-254cd6004e58" /dev/sdv2: PARTLABEL="extended" PARTUUID="f76caef2-11a9-4a3f-8b0a-515289b0809f" /dev/sdv3: PARTLABEL="extended" PARTUUID="a46fdd4c-cd0a-4fd7-ac6e-76a1a0ce2cf0" /dev/sdv4: PARTLABEL="extended" PARTUUID="d04b04b9-02d0-4538-9d69-4f2f66a929d6" /dev/sdv5: PARTLABEL="extended" PARTUUID="39d3fb93-28f8-4647-8165-b2f260dcb233" /dev/sdw1: PARTLABEL="primary" PARTUUID="63662a7d-b95d-4d92-b83a-ba6a03f3fe02" /dev/sdw2: PARTLABEL="extended" PARTUUID="e03a6e2c-e626-4e12-bc49-38e381b0e05f" /dev/sdw3: PARTLABEL="extended" PARTUUID="5f1c4814-c20e-40da-a8b0-6edf9a436915" /dev/sdw4: PARTLABEL="extended" PARTUUID="7f96a220-a43f-481c-ac95-e7f21caf47e0" /dev/sdw5: PARTLABEL="extended" PARTUUID="3c6fee3e-0625-45e1-9c9c-55f587b791da" /dev/sdx1: PARTLABEL="primary" PARTUUID="ef29c94b-a447-4d46-a1e2-3f44ddfaff8f" /dev/sdx2: PARTLABEL="extended" PARTUUID="92022058-ae86-4d5d-b138-889d2b422ee9" /dev/sdx3: PARTLABEL="extended" 
PARTUUID="dc5104d5-48b6-4ce8-b58e-68204e8a05c0" /dev/sdx4: PARTLABEL="extended" PARTUUID="1c9690aa-c05e-44b4-b13e-7c505d40689a" /dev/sdx5: PARTLABEL="extended" PARTUUID="bc256f85-7532-46c7-942c-78967ffb973a" /dev/sdy1: PARTLABEL="primary" PARTUUID="c90815e6-1bc4-43c9-a6cd-c4120fde7662" /dev/sdy2: PARTLABEL="extended" PARTUUID="e37d5c62-c33c-4529-b605-9a5801da05cb" /dev/sdy3: PARTLABEL="extended" PARTUUID="2da35649-f523-49dc-b1a3-35b0bb6dded8" /dev/sdy4: PARTLABEL="extended" PARTUUID="9ce6452b-3ee9-441c-ace1-796dc4bc52a3" /dev/sdy5: PARTLABEL="extended" PARTUUID="2df30063-5873-4e40-9f70-98ec57b5e684" # cat sos_commands/block/lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 136.7G 0 disk |-sda1 8:1 0 200M 0 part /boot/efi |-sda2 8:2 0 512M 0 part /boot |-sda3 8:3 0 54G 0 part | |-vg00alt-auditvol 253:2 0 256M 0 lvm | |-vg00alt-homevol 253:4 0 1G 0 lvm | |-vg00alt-rootvol 253:7 0 15G 0 lvm | |-vg00alt-tmpvol 253:8 0 5G 0 lvm | `-vg00alt-varvol 253:9 0 14G 0 lvm `-sda4 8:4 0 82G 0 part |-vg00-rootvol 253:0 0 15G 0 lvm / |-vg00-swapvol 253:1 0 2G 0 lvm [SWAP] |-vg00-homevol 253:3 0 1G 0 lvm /home |-vg00-tmpvol 253:5 0 5G 0 lvm /tmp |-vg00-auditvol 253:6 0 256M 0 lvm /var/log/audit |-vg00-crashvol 253:10 0 26G 0 lvm /var/crash `-vg00-varvol 253:11 0 14G 0 lvm /var sdb 8:16 0 1.7T 0 disk `-sdb1 8:17 0 1.7T 0 part `-osd-93 253:13 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-93 sdc 8:32 0 1.7T 0 disk `-sdc1 8:33 0 1.7T 0 part `-osd-99 253:17 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-99 sdd 8:48 0 1.7T 0 disk `-sdd1 8:49 0 1.7T 0 part `-osd-51 253:30 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-51 sde 8:64 0 1.7T 0 disk `-sde1 8:65 0 1.7T 0 part `-osd-65 253:28 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-65 sdf 8:80 0 1.7T 0 disk `-sdf1 8:81 0 1.7T 0 part `-osd-77 253:22 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-77 sdg 8:96 0 1.7T 0 disk `-sdg1 8:97 0 1.7T 0 part `-osd-85 253:15 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-85 sdh 8:112 0 1.7T 0 disk `-sdh1 8:113 0 1.7T 0 part `-osd-133 253:18 0 1.7T 0 
crypt /var/lib/ceph/osd/ceph-133 sdi 8:128 0 1.7T 0 disk `-sdi1 8:129 0 1.7T 0 part `-osd-141 253:29 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-141 sdj 8:144 0 1.7T 0 disk `-sdj1 8:145 0 1.7T 0 part `-osd-148 253:14 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-148 sdk 8:160 0 1.7T 0 disk `-sdk1 8:161 0 1.7T 0 part `-osd-154 253:23 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-154 sdl 8:176 0 1.7T 0 disk `-sdl1 8:177 0 1.7T 0 part `-osd-105 253:25 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-105 sdm 8:192 0 1.7T 0 disk `-sdm1 8:193 0 1.7T 0 part `-osd-108 253:16 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-108 sdn 8:208 0 1.7T 0 disk `-sdn1 8:209 0 1.7T 0 part `-osd-124 253:20 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-124 sdo 8:224 0 1.7T 0 disk `-sdo1 8:225 0 1.7T 0 part `-osd-128 253:26 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-128 sdp 8:240 0 1.7T 0 disk `-sdp1 8:241 0 1.7T 0 part `-osd-20 253:19 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-20 sdq 65:0 0 1.7T 0 disk `-sdq1 65:1 0 1.7T 0 part `-osd-29 253:12 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-29 sdr 65:16 0 1.7T 0 disk `-sdr1 65:17 0 1.7T 0 part `-osd-33 253:31 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-33 sds 65:32 0 1.7T 0 disk `-sds1 65:33 0 1.7T 0 part `-osd-38 253:24 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-38 sdt 65:48 0 1.7T 0 disk `-sdt1 65:49 0 1.7T 0 part `-osd-0 253:21 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-0 sdu 65:64 0 1.7T 0 disk `-sdu1 65:65 0 1.7T 0 part `-osd-15 253:27 0 1.7T 0 crypt /var/lib/ceph/osd/ceph-15 sdv 65:80 0 372.6G 0 disk |-sdv1 65:81 0 74.5G 0 part |-sdv2 65:82 0 74.5G 0 part |-sdv3 65:83 0 74.5G 0 part |-sdv4 65:84 0 74.5G 0 part `-sdv5 65:85 0 74.6G 0 part sdw 65:96 0 372.6G 0 disk |-sdw1 65:97 0 74.5G 0 part |-sdw2 65:98 0 74.5G 0 part |-sdw3 65:99 0 74.5G 0 part |-sdw4 65:100 0 74.5G 0 part `-sdw5 65:101 0 74.6G 0 part sdx 65:112 0 372.6G 0 disk |-sdx1 65:113 0 74.5G 0 part |-sdx2 65:114 0 74.5G 0 part |-sdx3 65:115 0 74.5G 0 part |-sdx4 65:116 0 74.5G 0 part `-sdx5 65:117 0 74.6G 0 part sdy 65:128 0 372.6G 0 disk |-sdy1 65:129 0 
74.5G 0 part |-sdy2 65:130 0 74.5G 0 part |-sdy3 65:131 0 74.5G 0 part |-sdy4 65:132 0 74.5G 0 part `-sdy5 65:133 0 74.6G 0 part - The Second customer is using normal installation: # cat sos_commands/block/lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 111.8G 0 disk |-sda1 8:1 0 200M 0 part /boot/efi |-sda2 8:2 0 500M 0 part /boot `-sda3 8:3 0 111.1G 0 part |-rhel-swap 253:0 0 11.2G 0 lvm [SWAP] |-rhel-root 253:1 0 50G 0 lvm / `-rhel-home 253:2 0 49.9G 0 lvm /home sdb 8:16 0 186.3G 0 disk |-sdb1 8:17 0 9.3G 0 part |-sdb2 8:18 0 9.3G 0 part |-sdb3 8:19 0 9.3G 0 part |-sdb4 8:20 0 9.3G 0 part `-sdb5 8:21 0 9.3G 0 part sdc 8:32 0 186.3G 0 disk |-sdc1 8:33 0 9.3G 0 part |-sdc2 8:34 0 9.3G 0 part |-sdc3 8:35 0 9.3G 0 part |-sdc4 8:36 0 9.3G 0 part `-sdc5 8:37 0 9.3G 0 part sdd 8:48 0 1.8T 0 disk `-sdd1 8:49 0 1.8T 0 part /var/lib/ceph/osd/ceph-30 sde 8:64 0 1.8T 0 disk `-sde1 8:65 0 1.8T 0 part /var/lib/ceph/osd/ceph-31 sdf 8:80 0 1.8T 0 disk `-sdf1 8:81 0 1.8T 0 part /var/lib/ceph/osd/ceph-32 sdg 8:96 0 1.8T 0 disk `-sdg1 8:97 0 1.8T 0 part /var/lib/ceph/osd/ceph-33 sdh 8:112 0 1.8T 0 disk `-sdh1 8:113 0 1.8T 0 part /var/lib/ceph/osd/ceph-34 sdi 8:128 0 1.8T 0 disk `-sdi1 8:129 0 1.8T 0 part /var/lib/ceph/osd/ceph-35 sdj 8:144 0 1.8T 0 disk `-sdj1 8:145 0 1.8T 0 part /var/lib/ceph/osd/ceph-36 sdk 8:160 0 1.8T 0 disk `-sdk1 8:161 0 1.8T 0 part /var/lib/ceph/osd/ceph-37 sdl 8:176 0 1.8T 0 disk `-sdl1 8:177 0 1.8T 0 part /var/lib/ceph/osd/ceph-38 sdm 8:192 0 1.8T 0 disk `-sdm1 8:193 0 1.8T 0 part /var/lib/ceph/osd/ceph-39 sdn 8:208 0 7.4G 0 disk `-sdn1 8:209 0 7.4G 0 part sdo 8:224 0 1G 0 disk Journal disk 1: /dev/sdb1: PARTLABEL="primary" PARTUUID="adb985db-2b10-4311-a6be-b694ca8484ee" /dev/sdb2: PARTLABEL="primary" PARTUUID="b6a38b51-f3cc-4f6f-b1cf-52e786110a13" /dev/sdb3: PARTLABEL="primary" PARTUUID="3de8106a-56bc-4d69-a268-68f7eb180b22" /dev/sdb4: PARTLABEL="primary" PARTUUID="3bbfc0c5-0826-4725-86d1-9310bc8bf42d" /dev/sdb5: PARTLABEL="primary" 
PARTUUID="6a46003b-5e6e-43b4-b352-0cc92bbd7131"

Journal disk 2:

/dev/sdc1: PARTLABEL="primary" PARTUUID="5a71bc80-4b45-45f3-8888-e68f999ecbec"
/dev/sdc2: PARTLABEL="primary" PARTUUID="6e9b67e0-dec6-44c2-9650-76f886a1d89f"
/dev/sdc3: PARTLABEL="primary" PARTUUID="ea054dde-d772-459f-9d66-1c5fd8a879d7"
/dev/sdc4: PARTLABEL="primary" PARTUUID="c0b5960f-4f71-4122-8b63-4c61cad94ff9"
/dev/sdc5: PARTLABEL="primary" PARTUUID="6378fbd5-5beb-43b4-80bf-99c53969d56e"

OSD data disks:

/dev/sdd1: UUID="02b1cb56-efb9-44c2-8fec-a5f6bf5c2f26" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="d790041f-d7ea-4c03-a01f-9801e2c23f6d"
/dev/sde1: UUID="8152faa6-f8b9-4637-b378-a773be5c0ed2" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="6871b2bb-6d05-4e7f-b9e9-5b95d1c44dd1"
/dev/sdf1: UUID="4b75f68b-d241-47b1-a14b-1ca6a79af778" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="8d9a27a3-eab8-459a-9e5b-bfb68b2ce789"
/dev/sdg1: UUID="cd4cc6d8-6cbb-4a24-b4da-8d11b3839c6d" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="357c8863-d51f-459a-88be-22afdfb13cde"
/dev/sdh1: UUID="9f7a3496-b31c-4ec9-be41-271bb27dd2bc" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="19554bdf-055c-4520-9270-c3e845cf153e"
/dev/sdi1: UUID="44d875bb-3ad8-4728-aa6e-6b8c6d647a42" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="33a00fd9-3ca1-4e2e-836a-8469486e92fc"
/dev/sdl1: UUID="8b1926a3-8bf7-4eaf-8d2a-d2430e30563b" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="817c14da-77bf-4625-ba6d-00991d48bf32"
/dev/sdk1: UUID="61b08eb1-f2b6-4eba-a08f-992b0eabee7b" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="7813779a-c44d-4383-a88e-1ab39f017caa"
/dev/sdj1: UUID="caf8c7a4-de7c-42a7-a032-d194e299e421" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="b028cbfe-1413-4269-a66b-7da0fdc6625c"
/dev/sdm1: UUID="0ecf229b-4120-4a85-b36b-0d8b11e293ca" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="2fabe133-6c9e-4aae-b8ff-9073a2a8c81f"

- And our testbed is nearly identical to the second customer's, except that one OSD in our testbed is co-located on the journal disk, as an SSD-disk OSD.
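As an aside on the naming discussed in the comments above: each entry in /dev/disk/by-parttypeuuid/ is named "<parttype-guid>.<partition-guid>", and ceph-disk activate-all keys on the parttype prefix (4fbd7e29-9d25-41b8-afd0-062c0ceff05d marks a ready-to-use ceph OSD data partition). A minimal sketch of splitting such a name; `parttype_of` and `partguid_of` are hypothetical helper names for illustration, not part of ceph-disk:

```shell
#!/bin/sh
# Split a /dev/disk/by-parttypeuuid name into its two GUID components.
# The part before the first dot is the partition *type* GUID that
# ceph-disk matches on; the part after it identifies the partition itself.
parttype_of() {
    printf '%s\n' "${1%%.*}"    # everything before the first dot
}
partguid_of() {
    printf '%s\n' "${1#*.}"     # everything after the first dot
}

name=4fbd7e29-9d25-41b8-afd0-062c0ceff05d.ec76bcd2-d53c-40dc-a649-47b0d99a7baf
parttype_of "$name"    # -> 4fbd7e29-9d25-41b8-afd0-062c0ceff05d (OSD data type GUID)
partguid_of "$name"    # -> ec76bcd2-d53c-40dc-a649-47b0d99a7baf (the "OSD uuid" from the log)
```

This matches the values in the activate-all log above, where the OSD uuid reported by ceph-disk is the suffix of the by-parttypeuuid name it activated.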
```
[root@magna077 ubuntu]# pgrep osd -a
4755 /usr/bin/ceph-osd -i 0 --pid-file /var/run/ceph/osd.0.pid -c /etc/ceph/ceph.conf --cluster ceph -f
5551 /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f
6000 /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/ceph/ceph.conf --cluster ceph -f

# ======= stop osd.0
[root@magna077 ubuntu]# service ceph stop osd.0
=== osd.0 ===
Stopping Ceph osd.0 on magna077...kill 4755...kill 4755...done

# ======= and osd.0 is stopped, osd.1 and osd.2 are still up and running
[root@magna077 ubuntu]# service ceph status
=== osd.2 ===
osd.2: running {"version":"0.94.10-2.el7cp"}
=== osd.0 ===
osd.0: not running.
=== osd.1 ===
osd.1: running {"version":"0.94.10-2.el7cp"}
[root@magna077 ubuntu]# pgrep osd -a
5551 /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f
6000 /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/ceph/ceph.conf --cluster ceph -f

# ======= and start osd.0
[root@magna077 ubuntu]# service ceph start osd.0
=== osd.0 ===
ERROR:calamari_osd_location:Error 1 running ceph config-key get:'2018-03-15 08:37:02.402916 7f3d0ebb3700 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
2018-03-15 08:37:02.402919 7f3d0ebb3700  0 librados: client.admin initialization error (2) No such file or directory
Error connecting to cluster: ObjectNotFound'
create-or-move updated item name 'osd.0' weight 0.9 at location {host=magna077} to crush map
Starting Ceph osd.0 on magna077...
Running as unit ceph-osd.0.1521103018.987203963.service.

# ===== all OSDs are running, including osd.1 and osd.2
[root@magna077 ubuntu]# pgrep osd -a
5551 /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f
5937 /usr/bin/ceph-osd -i 0 --pid-file /var/run/ceph/osd.0.pid -c /etc/ceph/ceph.conf --cluster ceph -f
6000 /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/ceph/ceph.conf --cluster ceph -f
[root@magna077 ubuntu]# service ceph status
=== osd.2 ===
osd.2: running {"version":"0.94.10-2.el7cp"}
=== osd.0 ===
osd.0: running {"version":"0.94.10-2.el7cp"}
=== osd.1 ===
osd.1: running {"version":"0.94.10-2.el7cp"}
```

@Manohar, could you share with me the steps to reproduce? Or am I missing something?

After applying the fix of #1275636, the SubState of ceph.service after rebooting is still "exited". I think it's related to how systemd understands the lifecycle of a SysV/LSB service. For a non-native systemd service, systemctl cannot know whether it is running or not; that's why the SubState of ceph.service is "exited" once the SysV script exits. And we are using a single /etc/rc.d/init.d/ceph for managing all Ceph services, so we cannot use the "pidfile:"[1] tag to help systemd-sysv-generator generate a PIDFile line in /run/systemd/generator.late/ceph.service.

So I'd suggest continuing to use the workaround documented at https://access.redhat.com/solutions/2877891, or just using the ceph-osd@ service directly. For instance, "systemctl start ceph-osd@0".

---
[1] https://www.freedesktop.org/wiki/Software/systemd/Incompatibilities/

The reason why we are able to start other OSDs after they are killed by "service ceph start osd.0" after a reboot is that the SubState is set to "dead" by the first "service ceph start osd.0" command. It calls "/bin/systemctl stop ${SERVICE}.service" to reap the "dead" ceph SysV daemons, and that command changes the state of ceph.service to:

```
# systemctl show -p ActiveState -p SubState ceph
ActiveState=inactive
SubState=dead
```
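The state check described in the comments above can be condensed into a small predicate; `should_reap` is a hypothetical helper name, but the two comparisons mirror the ActiveState/SubState greps that RHEL 7's /usr/sbin/service wrapper performs before a "start":

```shell
#!/bin/sh
# Sketch of the reap check in /usr/sbin/service: before starting an LSB
# service, the wrapper stops the whole generated unit if systemd reports
# it as active/exited (an artifact of RemainAfterExit=yes).
should_reap() {
    active_state=$1   # ActiveState= value from "systemctl show"
    sub_state=$2      # SubState= value from "systemctl show"
    [ "$active_state" = "active" ] && [ "$sub_state" = "exited" ]
}

# After boot, ceph-disk activate-all failed but the SysV script exited 0,
# so ceph.service is active/exited -> the wrapper stops every OSD first:
should_reap active exited && echo "wrapper runs: systemctl stop ceph.service"

# That stop leaves the unit inactive/dead, so later starts skip the reap:
should_reap inactive dead || echo "no reap; other OSDs keep running"
```

This is why only the first "service ceph start osd.x" after a reboot kills the sibling OSDs: the stop it triggers moves ceph.service to inactive/dead, and the predicate never fires again until the next boot.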