Description of problem:
The patch for bug #1877815 blocks the minor upgrade procedure for RHOSP 13 clusters with TripleO-controlled Ceph if a new docker RPM is available.

A customer reported that the minor upgrade procedure fails with error [1] on one of the controller nodes. The workaround is simple: manually start the appropriate services and repeat the minor upgrade procedure. The next run then fails on the next controller node. As a result, this problem is not a complete blocker, but it requires 4 executions of the "openstack overcloud update run" command.

The customer provided sosreports from the affected controller node and the director node, as well as a complete set of mistral logs.

From the logs, it looks like the ceph-mon container (and the other docker containers) was originally stopped by the "Stop docker" play. "Double check the mon systemd unit is not consistent with the current mon" then finds that the ceph-mon unit is not running, so ansible runs the "Stop mons to make them consistent with systemd" play, which fails because the container no longer exists. Again, the full set of logs is available in the case attachments.

[1]
2020-12-29 16:10:52,071 p=12272 u=mistral | fatal: [controller01]: FAILED! => {"changed": true, "cmd": "docker stop ceph-mon-controller01", "delta": "0:00:00.032422", "end": "2020-12-29 16:10:52.048778", "msg": "non-zero return code", "rc": 1, "start": "2020-12-29 16:10:52.016356", "stderr": "Error response from daemon: No such container: ceph-mon-controller01", "stderr_lines": ["Error response from daemon: No such container: ceph-mon-controller01"], "stdout": "", "stdout_lines": []}
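To check whether other nodes hit the same failure, the mistral/ansible log can be scanned for "fatal" lines like the one in [1]. The following is only a minimal sketch, not part of TripleO or ceph-ansible: the function names `parse_fatal` and `is_missing_container` are hypothetical, and it assumes the log keeps the `fatal: [host]: FAILED! => {JSON}` layout seen in [1], with the whole record on a single line.

```python
import json
import re

# Ansible prints failed tasks as: fatal: [<host>]: FAILED! => {<JSON result>}
# Assumption: the whole record sits on one line, as in error [1].
FATAL_RE = re.compile(
    r"fatal: \[(?P<host>[^\]]+)\]: FAILED! => (?P<result>\{.*\})\s*$"
)


def parse_fatal(line):
    """Return (host, result dict) for a 'fatal' log line, or None if it does not match."""
    match = FATAL_RE.search(line)
    if match is None:
        return None
    try:
        result = json.loads(match.group("result"))
    except ValueError:
        # The payload after "=>" was not valid JSON (e.g. a truncated line).
        return None
    return match.group("host"), result


def is_missing_container(result):
    """True when the command failed because the docker container does not exist."""
    return result.get("rc", 0) != 0 and "No such container" in result.get("stderr", "")


# The log line from error [1], copied verbatim.
LOG_LINE = (
    '2020-12-29 16:10:52,071 p=12272 u=mistral | fatal: [controller01]: FAILED! => '
    '{"changed": true, "cmd": "docker stop ceph-mon-controller01", '
    '"delta": "0:00:00.032422", "end": "2020-12-29 16:10:52.048778", '
    '"msg": "non-zero return code", "rc": 1, "start": "2020-12-29 16:10:52.016356", '
    '"stderr": "Error response from daemon: No such container: ceph-mon-controller01", '
    '"stderr_lines": ["Error response from daemon: No such container: ceph-mon-controller01"], '
    '"stdout": "", "stdout_lines": []}'
)

host, result = parse_fatal(LOG_LINE)
print(host, result["cmd"], is_missing_container(result))
```

Running the same two functions over every line of the ansible log would list each controller where a `docker stop` was attempted against a container that had already been removed.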
Please use https://access.redhat.com/solutions/5679791 to work around this issue. This looks like a duplicate of bug 1910842.