Bug 1589346

Summary: Minor Update runs common_deploy_steps_tasks.yaml twice
Product: Red Hat OpenStack Reporter: Tim Rozet <trozet>
Component: openstack-tripleo-heat-templates Assignee: Jiri Stransky <jstransk>
Status: CLOSED ERRATA QA Contact: Raviv Bar-Tal <rbartal>
Severity: high Docs Contact:
Priority: high    
Version: 13.0 (Queens) CC: ccamacho, dbecker, jchhatba, jstransk, lmarsh, mandreou, mburns, mcornea, morazi, rbartal
Target Milestone: z1 Keywords: Triaged, ZStream
Target Release: 13.0 (Queens)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: python-tripleoclient-9.2.1-13.el7ost openstack-tripleo-heat-templates-8.0.2-43.el7ost Doc Type: Bug Fix
Doc Text:
Service deployment tasks within the minor-update workflow were run twice because of superfluous entries in the list of playbooks. This update removes the superfluous playbook entries and includes host preparation tasks directly in the updated playbook. Actions in minor version updates now run once, in the desired order.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-07-19 14:27:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Tim Rozet 2018-06-08 18:21:30 UTC
Description of problem:
When issuing a minor update in OSP13, the service container is started multiple times during the update. For example, looking at the docker logs for the opendaylight_api container:

Jun 08 13:25:01 controller-1 dockerd-current[19336]: + sudo -E kolla_set_configs
Jun 08 16:54:41 controller-1 dockerd-current[19336]: + sudo -E kolla_set_configs
Jun 08 17:41:42 controller-1 dockerd-current[19336]: + sudo -E kolla_set_configs

We can see that the first entry is from the original deployment, while the next two are part of the update. Note that these do not include puppet containers. So something is triggering the container to go down and then come back up a second time during the update.
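The same check can be scripted. Below is a minimal sketch, not part of the original report, that counts container starts in the log excerpt above and prints the time between them. It assumes each start logs one `kolla_set_configs` line, that the year is 2018 (the syslog timestamps carry none), and that month names parse under an English/C locale. The function name `container_starts` is hypothetical.

```python
from datetime import datetime

# The three log lines quoted in the description above.
LOG_LINES = [
    "Jun 08 13:25:01 controller-1 dockerd-current[19336]: + sudo -E kolla_set_configs",
    "Jun 08 16:54:41 controller-1 dockerd-current[19336]: + sudo -E kolla_set_configs",
    "Jun 08 17:41:42 controller-1 dockerd-current[19336]: + sudo -E kolla_set_configs",
]


def container_starts(lines, marker="kolla_set_configs", year=2018):
    """Return the timestamp of every line containing the start marker."""
    starts = []
    for line in lines:
        if marker in line:
            # The first three whitespace-separated fields are month, day, time.
            stamp = " ".join(line.split()[:3])
            # The year is an assumption; syslog-style timestamps omit it.
            starts.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return starts


starts = container_starts(LOG_LINES)
gaps = [later - earlier for earlier, later in zip(starts, starts[1:])]

print(len(starts), "container starts")
for gap in gaps:
    print("time since previous start:", gap)
```

To check a live node, the same function could be fed lines captured from the journal instead of the hard-coded list.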

Comment 1 Tim Rozet 2018-06-08 18:42:33 UTC
Jiri suggested that:
<jistr> so what i *think* right now, with the double-execution issue, it probably looks like:
<jistr> 1) update_tasks
<jistr> 2) deploy tasks (puppet, containers)
<jistr> 3) host_prep_tasks
<jistr> 4) deploy tasks again
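To illustrate the suspected ordering, here is a small sketch that flags steps appearing more than once in an ordered list. The step labels are shorthand for the four items in Jiri's hypothesis, not actual playbook file names, and `repeated_steps` is a hypothetical helper rather than anything from tripleo.

```python
from collections import Counter

# Shorthand labels for the suspected order; not real playbook file names.
observed_order = [
    "update_tasks",
    "deploy_tasks",
    "host_prep_tasks",
    "deploy_tasks",
]


def repeated_steps(steps):
    """Return the steps that occur more than once, in first-seen order."""
    counts = Counter(steps)
    # dict.fromkeys keeps the first occurrence of each step and preserves order.
    return [step for step in dict.fromkeys(steps) if counts[step] > 1]


print(repeated_steps(observed_order))
```

The sketch only detects the repetition. It does not say which occurrence should be dropped or where host preparation belongs in the corrected order.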

This looks to be causing a problem with the ODL update:
https://bugzilla.redhat.com/show_bug.cgi?id=1586171#c23

I'm trying to work around it as mentioned in the above comment.

Comment 2 Jiri Stransky 2018-06-08 20:35:52 UTC
Based on comments at:

https://bugzilla.redhat.com/show_bug.cgi?id=1586171#c24
https://bugzilla.redhat.com/show_bug.cgi?id=1586171#c25

fixing this bug would likely not fix bug 1586171 fully, and we haven't observed any other breakages due to starting the containers twice, so i'd like to lower the severity rating for now. In case my assessment isn't accurate, feel free to re-triage.

Either way i'd still like to investigate/fix this soon.

Comment 4 Tim Rozet 2018-06-09 02:05:07 UTC
Hi Jiri, I prioritized it as high not because of bug 158671, but because it adds an extra 45min to the update process from what I can see in the logs.  I'm fine with a fix in Z.

Comment 5 Tim Rozet 2018-06-09 02:05:44 UTC
Sorry entered wrong bug in previous comment, bug 1586171

Comment 6 Jiri Stransky 2018-06-11 10:59:11 UTC
Ack, i didn't realize it would add that much time, i'll move the priority back to high. Also linking upstream bug report.

Comment 11 Jiri Stransky 2018-06-18 12:46:26 UTC
*** Bug 1590597 has been marked as a duplicate of this bug. ***

Comment 12 Jiri Stransky 2018-06-29 14:21:50 UTC
Merged to stable/queens.

Comment 24 errata-xmlrpc 2018-07-19 14:27:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2214