Bug 1780688 - /etc/systemd/system/ceph-osd@.service contains the wrong OSD container names
Summary: /etc/systemd/system/ceph-osd@.service contains the wrong OSD container names
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 3.2
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: z2
Target Release: 3.3
Assignee: Dimitri Savineau
QA Contact: Yogev Rabl
URL:
Whiteboard:
Depends On:
Blocks: 1578730
Reported: 2019-12-06 16:03 UTC by Matt Flusche
Modified: 2019-12-19 17:59 UTC (History)

Fixed In Version: RHEL: ceph-ansible-3.2.38-1.el7cp Ubuntu: ceph-ansible_3.2.38-2redhat1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-12-19 17:58:49 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | ceph ceph-ansible pull 4834 | 0 | None | closed | ceph-osd: update systemd unit script | 2021-01-18 14:15:27 UTC
Red Hat Product Errata | RHSA-2019:4353 | 0 | None | None | None | 2019-12-19 17:59:02 UTC

Description Matt Flusche 2019-12-06 16:03:21 UTC
Description of problem:

OSP 13 deployment managing containerized ceph.

The service file: /etc/systemd/system/ceph-osd\@.service contains the wrong container name format.

ceph-osd-overcloud-cephstorage-0-%i

versus the actual container name:

ceph-osd-%i
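
For illustration: systemd replaces the %i specifier with the instance name of the template unit, so the two forms above resolve to different container names for the same OSD. A minimal shell sketch of that substitution follows. expand_instance is a hypothetical helper (not part of ceph-ansible or systemd), and the instance is assumed to be a plain numeric OSD ID.

```shell
#!/bin/sh
# Hypothetical helper: mimic how systemd expands %i in a template unit.
# $1: a string containing %i, $2: the instance name (assumed to be a
# simple OSD ID with no characters that are special to sed).
expand_instance() {
    printf '%s\n' "$1" | sed "s/%i/$2/g"
}
```

For example, expand_instance 'ceph-osd-overcloud-cephstorage-0-%i' 9 gives the name the unit file passes to docker, while expand_instance 'ceph-osd-%i' 9 gives the name the container actually has.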

This is a non-collocated filestore deployment, if that makes a difference.

ceph-ansible-3.2.30.1-1.el7cp.noarch

# cat /etc/systemd/system/ceph-osd\@.service 
# Please do not change this file directly since it is managed by Ansible and will be overwritten
[Unit]
Description=Ceph OSD
After=docker.service

[Service]
EnvironmentFile=-/etc/environment
ExecStartPre=-/usr/bin/docker stop ceph-osd-overcloud-cephstorage-0-%i
ExecStartPre=-/usr/bin/docker rm -f ceph-osd-overcloud-cephstorage-0-%i
ExecStart=/usr/share/ceph-osd-run.sh %i
ExecStop=-/usr/bin/docker stop ceph-osd-overcloud-cephstorage-0-%i
Restart=always
RestartSec=10s
TimeoutStartSec=120
TimeoutStopSec=15

[Install]
WantedBy=multi-user.target


Where the containers are named:

# docker ps
CONTAINER ID        IMAGE                                             COMMAND                  CREATED             STATUS              PORTS               NAMES
a29a9302b27b        172.16.6.1:8787/rhceph/rhceph-3-rhel7:3-37        "/entrypoint.sh"         19 hours ago        Up 19 hours                             ceph-osd-1
f9ce7a071301        172.16.6.1:8787/rhceph/rhceph-3-rhel7:3-37        "/entrypoint.sh"         2 days ago          Up 2 days                               ceph-osd-12
bc7c5ca9157c        172.16.6.1:8787/rhceph/rhceph-3-rhel7:3-37        "/entrypoint.sh"         2 days ago          Up 2 days                               ceph-osd-9
976d58d8387a        172.16.6.1:8787/rhceph/rhceph-3-rhel7:3-37        "/entrypoint.sh"         2 days ago          Up 2 days                               ceph-osd-6
9854a4a629ec        172.16.6.1:8787/rhceph/rhceph-3-rhel7:3-37        "/entrypoint.sh"         2 days ago          Up 2 days                               ceph-osd-3
9967ecd9f026        172.16.6.1:8787/rhosp13/openstack-cron:13.0-102   "dumb-init --singl..."   3 days ago          Up 3 days                               logrotate_crond

# grep name /usr/share/ceph-osd-run.sh
  --name=ceph-osd-"$1" \
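
The comparison above can be scripted. The following is only a sketch, assuming the unit file and run script have the line layout shown in this report; the function names are hypothetical and the file paths are passed in as arguments.

```shell
#!/bin/sh
# Sketch: compare the container name used by the unit file's docker
# stop/rm lines with the name the run script gives the container.
# Assumes the line formats shown in this bug report.

# unit_container_names FILE
# Prints each distinct name used in "docker stop" / "docker rm -f" lines.
unit_container_names() {
    sed -n -E 's#^Exec(StartPre|Stop)=-?/usr/bin/docker (stop|rm -f) (.*)$#\3#p' "$1" | sort -u
}

# run_script_container_name FILE
# Prints the --name= value from the run script, with "$1" rewritten
# as %i so it can be compared with the unit file.
run_script_container_name() {
    sed -n -E 's/^[[:space:]]*--name=([^[:space:]]+).*$/\1/p' "$1" | sed 's/"\$1"/%i/'
}

# check_names UNIT_FILE RUN_SCRIPT
# Prints one line per mismatching name; returns 1 if any mismatch is found.
check_names() {
    expected=$(run_script_container_name "$2")
    status=0
    for name in $(unit_container_names "$1"); do
        if [ "$name" != "$expected" ]; then
            echo "mismatch: unit uses $name, run script creates $expected"
            status=1
        fi
    done
    return "$status"
}
```

On an affected node this could be invoked as check_names /etc/systemd/system/ceph-osd@.service /usr/share/ceph-osd-run.sh.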


Version-Release number of selected component (if applicable):
ceph-ansible-3.2.30.1-1.el7cp.noarch

OSP 13 deployment

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP 13 with ceph and non-colocated filestore OSD disks

The OSDs still start normally, but the following errors appear in the logs:

# systemctl stop ceph-osd
# systemctl start ceph-osd

# tail -1000 /var/log/messages |grep 'No such container'
Dec  6 09:58:07 overcloud-cephstorage-0 dockerd-current: time="2019-12-06T09:58:07.470652767-06:00" level=error msg="Handler for POST /v1.26/containers/ceph-osd-overcloud-cephstorage-0-9/stop returned error: No such container: ceph-osd-overcloud-cephstorage-0-9"
Dec  6 09:58:07 overcloud-cephstorage-0 dockerd-current: time="2019-12-06T09:58:07.473639251-06:00" level=error msg="Handler for POST /v1.26/containers/ceph-osd-overcloud-cephstorage-0-9/stop returned error: No such container: ceph-osd-overcloud-cephstorage-0-9"
Dec  6 09:58:07 overcloud-cephstorage-0 docker: Error response from daemon: No such container: ceph-osd-overcloud-cephstorage-0-9
Dec  6 09:58:19 overcloud-cephstorage-0 dockerd-current: time="2019-12-06T09:58:19.271731266-06:00" level=error msg="Handler for POST /v1.26/containers/ceph-osd-overcloud-cephstorage-0-9/stop returned error: No such container: ceph-osd-overcloud-cephstorage-0-9"
Dec  6 09:58:19 overcloud-cephstorage-0 dockerd-current: time="2019-12-06T09:58:19.273934833-06:00" level=error msg="Handler for POST /v1.26/containers/ceph-osd-overcloud-cephstorage-0-9/stop returned error: No such container: ceph-osd-overcloud-cephstorage-0-9"
Dec  6 09:58:19 overcloud-cephstorage-0 docker: Error response from daemon: No such container: ceph-osd-overcloud-cephstorage-0-9
Dec  6 09:58:19 overcloud-cephstorage-0 dockerd-current: time="2019-12-06T09:58:19.314878544-06:00" level=error msg="Handler for DELETE /v1.26/containers/ceph-osd-overcloud-cephstorage-0-9?force=1 returned error: No such container: ceph-osd-overcloud-cephstorage-0-9"
Dec  6 09:58:19 overcloud-cephstorage-0 dockerd-current: time="2019-12-06T09:58:19.316382848-06:00" level=error msg="Handler for DELETE /v1.26/containers/ceph-osd-overcloud-cephstorage-0-9 returned error: No such container: ceph-osd-overcloud-cephstorage-0-9"
Dec  6 09:58:19 overcloud-cephstorage-0 docker: Error response from daemon: No such container: ceph-osd-overcloud-cephstorage-0-9
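
To see which container names the failing docker calls refer to, the client-side error lines can be tallied. This is a rough sketch that assumes the syslog line format shown above; count_missing_containers is a hypothetical helper and the log path is passed as an argument.

```shell
#!/bin/sh
# Sketch: count docker client "No such container" errors per container name.
# Only the "docker: Error response from daemon" lines are counted, so the
# matching dockerd-current handler lines are not counted as well.

# count_missing_containers LOGFILE
# Prints "<count> <container-name>" for each name found.
count_missing_containers() {
    grep 'docker: Error response from daemon: No such container: ' "$1" |
        sed 's/.*No such container: //' |
        sort | uniq -c | awk '{print $1, $2}'
}
```

For the excerpt above, count_missing_containers /var/log/messages would report the name built from the unit file (ceph-osd-overcloud-cephstorage-0-9), not the ceph-osd-9 name that docker ps shows.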

Comment 3 Matt Flusche 2019-12-09 14:53:23 UTC
I'm doing another deployment to provide the requested information.

Note:  This seems to only impact filestore deployments.  For bluestore, /etc/systemd/system/ceph-osd\@.service was correct.

Comment 7 Dimitri Savineau 2019-12-09 21:15:53 UTC
> Note:  This seems to only impact filestore deployments.  For bluestore, /etc/systemd/system/ceph-osd\@.service was correct.

The behavior should be the same with filestore and bluestore; the issue is limited to the collocated and non-collocated OSD scenarios.

The lvm osd scenario is not impacted.

Comment 13 Yogev Rabl 2019-12-17 21:24:39 UTC
Verified on ceph-ansible-3.2.38-1.el7cp

Comment 14 errata-xmlrpc 2019-12-19 17:58:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:4353

