Bug 1493298

Summary: OSP11 -> OSP12 upgrade: swift_rsync container on controller nodes is in Restarting state post upgrade
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: openstack-tripleo-heat-templates
Assignee: Christian Schwede (cschwede) <cschwede>
Status: CLOSED ERRATA
QA Contact: Mike Abrams <mabrams>
Severity: urgent
Priority: high
Version: 12.0 (Pike)
CC: cschwede, dbecker, jschluet, mburns, mcornea, morazi, pgrist, rhel-osp-director-maint, scohen, thiago
Target Milestone: rc
Keywords: Triaged
Target Release: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-tripleo-heat-templates-7.0.3-11.el7ost
Doc Type: If docs needed, set a value
Last Closed: 2017-12-13 22:10:20 UTC
Type: Bug

Description Marius Cornea 2017-09-19 20:51:54 UTC

Description of problem:
OSP11 -> OSP12 upgrade: swift_rsync container running on controller nodes is in Restarting state post upgrade

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-7.0.0-0.20170913050523.0rc2.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy monolithic OSP11 with 3 controllers, 2 compute, 3 ceph nodes
2. Upgrade environment to OSP12
3. Log in to one of the controllers and check swift_rsync container status:

sudo docker inspect --format="{{.State.Status }}" swift_rsync

Actual results:
restarting

Expected results:
running
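
The check from step 3 can be wrapped in a small helper for repeated use on each controller. This is only a sketch: the function name check_container_running is hypothetical, and it assumes the invoking user can run docker directly (for example as root) rather than through sudo.

```shell
# Hypothetical helper: report whether a container is in the "running" state.
# Returns 0 if running, 1 if in any other state, 2 if docker inspect fails.
check_container_running() {
    local name="$1"
    local state

    state=$(docker inspect --format '{{.State.Status}}' "$name") || return 2

    if [ "$state" = "running" ]; then
        echo "$name: running"
        return 0
    fi

    echo "$name: $state (expected running)" >&2
    return 1
}

# Usage on a controller:
#   check_container_running swift_rsync
```

On an affected controller the helper would print the non-running state to stderr and return 1, which makes it usable in a loop over all controllers.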

Additional info:

[root@controller-0 ~]# docker ps | grep swift_rsync
1a6e2ef615b2        192.168.24.1:8787/rhosp12/openstack-swift-object-docker:2017-09-15.1         "kolla_start"            3 hours ago         Restarting (10) About an hour ago                       swift_rsync


Output of docker logs swift_rsync:
http://paste.openstack.org/show/621479/

Comment 3 Christian Schwede (cschwede) 2017-09-20 11:50:04 UTC

Thanks Marius, I was able to find the reason for this.

After upgrading from a non-containerized deployment to a containerized one, xinetd is still running and still holds port 873 (rsync). The swift_rsync container therefore can't start, because the port is already in use.

Therefore the upgrade tasks need to stop the xinetd service as well.
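
To confirm this diagnosis on an upgraded controller, one can look up which process holds TCP port 873. The following is a minimal sketch for a manual check, not the content of the proposed patch. It assumes ss from iproute2 is available, that it is run as root so process names are shown, and that its output has the state in the first column and the local address in the fourth. The helper name port_owner is hypothetical.

```shell
# Hypothetical helper: print the name of the process listening on a TCP port,
# based on `ss -ltnp` output read from stdin. Prints nothing if no listener
# on that port is found.
port_owner() {
    local port="$1"
    awk -v port="$port" '
        $1 == "LISTEN" && $4 ~ (":" port "$") {
            # The last field looks like: users:(("xinetd",pid=1234,fd=5))
            if (match($NF, /\(\("[^"]+"/)) {
                # Skip the leading ((" and drop the trailing quote.
                print substr($NF, RSTART + 3, RLENGTH - 4)
            }
        }'
}

# Usage on a controller:
#   ss -ltnp | port_owner 873
# If this prints xinetd, stopping the service by hand frees the port:
#   systemctl stop xinetd
```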

Proposed patch: https://review.openstack.org/#/c/505606/
Upstream bug report: https://bugs.launchpad.net/tripleo/+bug/1718403

Comment 4 Christian Schwede (cschwede) 2017-10-25 07:13:33 UTC

Upstream patch merged, moving to POST.

Comment 6 Jon Schlueter 2017-11-01 20:41:44 UTC

stable/pike cherry-pick is proposed but not yet landed

Comment 7 Christian Schwede (cschwede) 2017-11-15 12:34:30 UTC

Upstream backport just merged.

Comment 8 Jon Schlueter 2017-11-21 21:25:44 UTC

openstack-tripleo-heat-templates-7.0.3-11.el7ost

Comment 14 Marius Cornea 2017-11-26 13:16:04 UTC

Mike,

The converge step basically does a stack update of the nova upgrade_levels. At this point the services have already been upgraded and migrated into containers. If the swift_rsync container gets into the Restarting state at that point, I suspect the same issue would show up during a stack update of a fresh OSP12 deployment, so it's probably not related to the patch that addresses BZ#1493298. Checking the logs on your machine, we can see:

[root@controller-0 heat-admin]# docker logs --tail 5 swift_rsync
INFO:__main__:Deleting /etc/rsyncd.conf
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/rsyncd.conf to /etc/rsyncd.conf
INFO:__main__:Writing out command to execute
failed to create pid file /var/run/rsyncd.pid: File exists
Running command: '/usr/bin/rsync --daemon --no-detach --config=/etc/rsyncd.conf'

After removing the existing rsync pid file, the container is able to start:
[root@controller-0 heat-admin]# mv /var/run/rsyncd.pid /var/run/rsyncd.pid.orig
[root@controller-0 heat-admin]# docker restart swift_rsync
swift_rsync
[root@controller-0 heat-admin]# docker ps | grep swift_rsync
a2e2c07ef6e8        rhos-qe-mirror-tlv.usersys.redhat.com:5000/rhosp12/openstack-swift-object-docker:20171122.1              "kolla_start"            18 minutes ago      Up About a minute (healthy)                       swift_rsync

Based on this data, I'd say this is a new issue unrelated to the initial report in BZ#1493298, where the log output of the restarting container is different.
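
The manual recovery shown in this comment (moving the leftover pid file aside and restarting the container) could be scripted roughly as follows. This is a sketch of the workaround only, not a fix for the underlying cause. The function name recover_swift_rsync is hypothetical, and it assumes the pid file is visible on the host at /var/run/rsyncd.pid, as in the session above, and that docker can be run directly.

```shell
# Hypothetical helper: set aside a leftover rsync pid file and restart the
# swift_rsync container, mirroring the manual steps in this comment.
# Returns 2 if docker inspect fails, 1 if the pid file cannot be moved.
recover_swift_rsync() {
    local pidfile="${1:-/var/run/rsyncd.pid}"
    local state

    state=$(docker inspect --format '{{.State.Status}}' swift_rsync) || return 2

    # Leave a container that is already running untouched.
    if [ "$state" = "running" ]; then
        echo "swift_rsync is already running; nothing to do"
        return 0
    fi

    # Rename instead of deleting, as in the manual workaround.
    if [ -e "$pidfile" ]; then
        mv -- "$pidfile" "$pidfile.orig" || return 1
    fi

    docker restart swift_rsync
}

# Usage on a controller (as root):
#   recover_swift_rsync
```

Renaming rather than deleting keeps the old pid file available for later inspection.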

Comment 19 errata-xmlrpc 2017-12-13 22:10:20 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462