Bug 1674517 - Paunch does not start stopped containers
Summary: Paunch does not start stopped containers
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-paunch
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Luke Short
QA Contact: nlevinki
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-02-11 14:23 UTC by Lukas Bezdicka
Modified: 2020-02-05 21:40 UTC (History)
9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-05 21:40:53 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 642910 0 'None' MERGED Handle defined containers that are stopped. 2021-01-13 20:43:59 UTC

Description Lukas Bezdicka 2019-02-11 14:23:58 UTC
During a minor update of OSP13 I noticed that containers I had manually stopped weren't started back up. This means that if a container is stopped for any reason, paunch will not start it back up.

Comment 1 Steve Baker 2019-02-11 22:21:47 UTC
Paunch does not manage the lifecycle of containers. For OSP13 and OSP14 the docker service manages the lifecycle of containers via the restart policy set in the paunch config.

Did manually stopping the containers change the restart policy? Running "docker inspect <container>" should show what state it is in.

For OSP15 paunch writes out systemd unit files which manage containers via podman.

Comment 2 Lukas Bezdicka 2019-02-12 10:38:00 UTC
I'm sorry, my description wasn't clear. The issue is:

If "docker stop <service>" happens for any reason (e.g. on memcached), a subsequent stack update for scale-up, minor update, or upgrade will not start the container back up. This is because paunch checks whether the container exists (docker ps -a) to decide whether to start it, not whether it is actually running.
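A minimal Python sketch of the existence-only check described above (the function and data names are illustrative, not the actual paunch code): a stopped container still shows up in `docker ps -a`, so an existence check treats it as already handled.

```python
# Hypothetical sketch of the existence-only check; names are
# illustrative, not the actual paunch implementation.

def containers_to_start(defined, existing):
    """Return the defined containers an existence-only check would create.

    `existing` maps container name -> state ('running' or 'exited'),
    i.e. everything `docker ps -a` would list, running or not.
    """
    # A stopped ('exited') container still exists, so it is treated
    # as "already handled" and never started back up.
    return [name for name in defined if name not in existing]

defined = ["memcached", "haproxy"]
existing = {"memcached": "exited", "haproxy": "running"}
print(containers_to_start(defined, existing))  # -> []
```

Note the asymmetry: if memcached were deleted instead of stopped (absent from `existing`), the same check would return it and paunch would recreate it.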

Comment 3 Jiri Stransky 2019-02-18 14:03:19 UTC
Just to clarify further: this is not about lifecycle management, but about re-asserting state via Paunch, which doesn't work as we'd expect:

1) Stop the container

2) Run paunch, which has the container defined

3) Expect the container to be present and running (paunch asserting the defined state); instead, the container remains stopped


If we had deleted the container in step 1 instead of stopping it, step 2 would have started it. It's counter-intuitive that after deleting the container Paunch re-asserts the state to match what's defined in the config files, but after merely stopping it, it doesn't.

In theory this isn't only about updates. The `overcloud deploy` action should ideally put the overcloud into the state defined by t-h-t. So if the user manually stopped some containers, I'd expect them to be started again. E.g. Puppet or Ansible would similarly re-assert that services are running.

Does starting stopped containers make sense within Paunch's scope? I think addressing it anywhere else would be quite hacky.

Comment 4 Sofer Athlan-Guyot 2019-02-25 17:19:46 UTC
Re-assigning to dfg:df to foster the discussion.

Comment 5 Steve Baker 2019-02-25 20:21:46 UTC
Yes, I think paunch should delete stopped containers that are expected to be running, then recreate them. We'll discuss at the triage meeting who will be assigned to this.
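The delete-and-recreate reconciliation suggested above can be sketched as follows (again a hypothetical sketch, not the merged paunch code): anything defined but not running is scheduled for creation, and anything that exists in a stopped state is first scheduled for deletion.

```python
# Hypothetical sketch of the proposed fix: delete stopped containers
# that are expected to be running, then recreate them. Illustrative
# names only, not the actual paunch implementation.

def reconcile(defined, existing):
    """Return (to_delete, to_create) so every defined container ends up running.

    `existing` maps container name -> state ('running' or 'exited').
    """
    # Stopped containers must be removed before they can be recreated.
    to_delete = [n for n in defined if existing.get(n) == "exited"]
    # Anything defined that is not currently running gets (re)created.
    to_create = [n for n in defined if existing.get(n) != "running"]
    return to_delete, to_create

defined = ["memcached", "haproxy"]
existing = {"memcached": "exited", "haproxy": "running"}
print(reconcile(defined, existing))  # -> (['memcached'], ['memcached'])
```

This also covers the deleted-container case: a container absent from `existing` lands in `to_create` only, with nothing to delete first.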

Comment 6 Luke Short 2019-03-04 20:11:36 UTC
There appears to be an upstream bug in podman when querying for containers in the `stopped` state. I have opened an upstream issue about it here: https://github.com/containers/libpod/issues/2526

Comment 7 Luke Short 2019-03-25 16:07:49 UTC
Steve has shown me an alternative way to query for stopped containers. We are working on getting the new code and tests merged in.

