Bug 1493298 - OSP11 -> OSP12 upgrade: swift_rsync container on controller nodes is in Restarting state post upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: 12.0 (Pike)
Assignee: Christian Schwede (cschwede)
QA Contact: Mike Abrams
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-09-19 20:51 UTC by Marius Cornea
Modified: 2018-02-05 19:15 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-7.0.3-11.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-13 22:10:20 UTC
Target Upstream Version:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1718403 | 0 | None | None | None | 2017-09-20 11:50:26 UTC
OpenStack gerrit | 515448 | 0 | None | MERGED | Remove rsync from xinetd when upgrading to containerized deployment | 2020-07-15 08:24:01 UTC
Red Hat Product Errata | RHEA-2017:3462 | 0 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 12.0 Enhancement Advisory | 2018-02-16 01:43:25 UTC

Description Marius Cornea 2017-09-19 20:51:54 UTC
Description of problem:
OSP11 -> OSP12 upgrade: swift_rsync container running on controller nodes is in Restarting state post upgrade

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-7.0.0-0.20170913050523.0rc2.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy monolithic OSP11 with 3 controllers, 2 compute, 3 ceph nodes
2. Upgrade environment to OSP12
3. Log in to one of the controllers and check swift_rsync container status:

sudo docker inspect --format="{{.State.Status }}" swift_rsync

Actual results:
restarting

Expected results:
running
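
The check in step 3 can be wrapped in a small helper so that it is easy to repeat on every controller. This is only a sketch: report_state and check_container are hypothetical names, not part of any TripleO tooling, and the docker call is the same one used in step 3.

```shell
#!/usr/bin/env bash
# Sketch only: report_state and check_container are hypothetical helpers.

# Print a verdict for a container name and a status string.
# Returns 0 only when the status is exactly "running".
report_state() {
    local name="$1" status="$2"
    if [ "$status" = "running" ]; then
        echo "$name: OK ($status)"
    else
        echo "$name: NOT OK ($status)"
        return 1
    fi
}

# Ask docker for the container state (same command as in step 3)
# and pass it to report_state. Returns 2 if docker inspect itself fails.
check_container() {
    local name="$1" status
    status="$(sudo docker inspect --format='{{.State.Status}}' "$name")" || return 2
    report_state "$name" "$status"
}
```

On a controller, `check_container swift_rsync` would then print the state and return non-zero for anything other than running, such as the restarting state reported here.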

Additional info:

[root@controller-0 ~]# docker ps | grep swift_rsync
1a6e2ef615b2        192.168.24.1:8787/rhosp12/openstack-swift-object-docker:2017-09-15.1         "kolla_start"            3 hours ago         Restarting (10) About an hour ago                       swift_rsync


Output of docker logs swift_rsync:
http://paste.openstack.org/show/621479/

Comment 3 Christian Schwede (cschwede) 2017-09-20 11:50:04 UTC
Thanks Marius, I was able to find the reason for this.

After upgrading from a non-containerized deployment to a containerized one, xinetd is still running and using port 873 (rsync). Thus the swift_rsync container can't start, because the port is still in use.

Therefore the upgrade tasks need to stop the xinetd service as well.
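
To confirm this kind of port conflict on a controller, one can check whether something is already listening on port 873 before the container starts. The following is a diagnostic sketch, not the upstream fix: port_is_listening is a hypothetical helper, and it assumes input in the column layout printed by `ss -ltn` (local address and port in the fourth column).

```shell
#!/usr/bin/env bash
# Sketch only: port_is_listening is a hypothetical helper.

# Read `ss -ltn` style output on stdin and return 0 if any LISTEN
# socket has the given local port, 1 otherwise.
port_is_listening() {
    awk -v p="$1" '
        $1 == "LISTEN" {
            # The port is the last colon-separated piece of column 4,
            # which also covers IPv6 forms such as :::873 or [::]:873.
            n = split($4, a, ":")
            if (a[n] == p) found = 1
        }
        END { exit !found }
    '
}
```

A possible use on the host would be `ss -ltn | port_is_listening 873 && systemctl is-active xinetd`, which prints the xinetd unit state only when the rsync port is already taken.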

Proposed patch: https://review.openstack.org/#/c/505606/
Upstream bug report: https://bugs.launchpad.net/tripleo/+bug/1718403

Comment 4 Christian Schwede (cschwede) 2017-10-25 07:13:33 UTC
Upstream patch merged, moving to POST.

Comment 6 Jon Schlueter 2017-11-01 20:41:44 UTC
stable/pike cherry-pick is proposed but not yet landed

Comment 7 Christian Schwede (cschwede) 2017-11-15 12:34:30 UTC
Upstream backport just merged.

Comment 8 Jon Schlueter 2017-11-21 21:25:44 UTC
openstack-tripleo-heat-templates-7.0.3-11.el7ost

Comment 14 Marius Cornea 2017-11-26 13:16:04 UTC
Mike,

The converge step basically does a stack update of the nova upgrade_levels. At this point the services have been upgraded and migrated into containers. If the swift_rsync container gets into the Restarting state at that point, I suspect the same issue would show up while doing a stack update of a fresh OSP12 deployment, so it's probably not related to the patch which addresses BZ#1493298. Checking the logs on your machine we can see:

[root@controller-0 heat-admin]# docker logs --tail 5 swift_rsync
INFO:__main__:Deleting /etc/rsyncd.conf
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/rsyncd.conf to /etc/rsyncd.conf
INFO:__main__:Writing out command to execute
failed to create pid file /var/run/rsyncd.pid: File exists
Running command: '/usr/bin/rsync --daemon --no-detach --config=/etc/rsyncd.conf'

After removing the existing rsync pid file the container is able to start:
[root@controller-0 heat-admin]# mv /var/run/rsyncd.pid /var/run/rsyncd.pid.orig
[root@controller-0 heat-admin]# docker restart swift_rsync
swift_rsync
[root@controller-0 heat-admin]# docker ps | grep swift_rsync
a2e2c07ef6e8        rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp12/openstack-swift-object-docker:20171122.1              "kolla_start"            18 minutes ago      Up About a minute (healthy)                       swift_rsync

Based on this data, I'd say this is a new issue, unrelated to the initial report in BZ#1493298, where the log output of the restarting container is different.
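
Before moving a pid file out of the way as above, it can help to check whether the PID recorded in it still belongs to a live process. The sketch below is an assumption-laden illustration: pidfile_is_stale is a hypothetical helper, and it assumes the check runs in the same PID namespace as the process that wrote the file.

```shell
#!/usr/bin/env bash
# Sketch only: pidfile_is_stale is a hypothetical helper.

# Return 0 if the pid file exists but does not point at a live process,
# 1 if the file is missing or the recorded process is still alive.
pidfile_is_stale() {
    local file="$1" pid
    [ -f "$file" ] || return 1            # no pid file, nothing stale
    pid="$(tr -d '[:space:]' < "$file")"
    case "$pid" in
        ''|*[!0-9]*) return 0 ;;          # empty or non-numeric content
    esac
    # kill -0 sends no signal, it only tests for existence; the /proc
    # check covers processes we are not permitted to signal.
    if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
        return 1
    fi
    return 0
}
```

For example, `pidfile_is_stale /var/run/rsyncd.pid && echo "stale pid file"` would flag the file before it is renamed and the container restarted.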

Comment 19 errata-xmlrpc 2017-12-13 22:10:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462

