My typo: the last sentence of comment #1 should read: The issue happens with A+B+C and B+C, and does not reproduce with any of the other combinations.
I know we've seen this behavior once before. Is this something that occurs every time we run Aggregated Logging 3.2.1 on an OSE 3.3 master? We've removed triggers in the 3.3 DCs, so that may be why we don't see this happening then. Can we check if using Aggregated Logging 3.3 with scenarios B+C causes issues? You'll probably need to update the image versions of the IS and pull it in too.
I can reproduce this with logging 3.2.1 and 3.2.0 deployed on OSE 3.3.0. I suspect that the two ImageChange triggers are both attempting a deploy at the same time and tripping over each other. I'm seeing errors like this in the journal:
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:404] found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:226] Error instantiating deployment config logging/logging-kibana-ops: couldn't retrieve deployment for deployment config "logging/logging-kibana-ops": replicationcontrollers "logging-kibana-ops-91" not found
controller.go:226] Error instantiating deployment config logging/logging-kibana-ops: couldn't retrieve deployment for deployment config "logging/logging-kibana-ops": replicationcontrollers "logging-kibana-ops-91" not found
controller.go:226] Error instantiating deployment config logging/logging-kibana-ops: couldn't retrieve deployment for deployment config "logging/logging-kibana-ops": replicationcontrollers "logging-kibana-ops-91" not found
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:226] Error instantiating deployment config logging/logging-kibana-ops: couldn't retrieve deployment for deployment config "logging/logging-kibana-ops": replicationcontrollers "logging-kibana-ops-91" not found
controller.go:404] found previous inflight deployment for logging/logging-kibana - requeuing
replication_controller.go:498] Too many "logging"/"logging-kibana-97" replicas, need 0, deleting 1
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
event.go:216] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"logging", Name:"logging-kibana-97", UID:"9521e1e7-6323-11e6-976c-5254002ddfb5", APIVersion:"v1", ResourceVersion:"271783", FieldPath:""}): type: 'Normal' reason: 'SuccessfulDelete' Deleted pod: logging-kibana-97-qkm0a
controller.go:399] Error syncing deployment config logging/logging-kibana: found previous inflight deployment for logging/logging-kibana - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana: found previous inflight deployment for logging/logging-kibana - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana: found previous inflight deployment for logging/logging-kibana - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana: found previous inflight deployment for logging/logging-kibana - requeuing
controller.go:226] Error instantiating deployment config logging/logging-kibana-ops: couldn't retrieve deployment for deployment config "logging/logging-kibana-ops": replicationcontrollers "logging-kibana-ops-91" not found
controller.go:399] Error syncing deployment config logging/logging-kibana: found previous inflight deployment for logging/logging-kibana - requeuing
controller.go:404] found previous inflight deployment for logging/logging-kibana - requeuing
This may not actually be a problem for customers, as I think it only occurs when the images are imported and kick off the deploy. I think the path of install 3.2 -> upgrade OSE -> upgrade logging will probably work fine (though this would be good to test). However, this is a change in behavior, and I think it's worth having the platform team take a look at it and see if they can make it behave better.
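For triage, the journal messages quoted above can be tallied per deployment config to see which DCs are stuck requeuing. The following is a minimal sketch, assuming one journal entry per line; the two regular expressions are derived only from the message text pasted in this comment, not from the controller source, and `summarize` and `sample` are hypothetical names.

```python
import re
from collections import Counter

# Patterns inferred from the messages quoted in this bug (an assumption,
# not taken from the origin source code).
INFLIGHT = re.compile(r"found previous inflight deployment for (\S+) - requeuing")
MISSING_RC = re.compile(r'replicationcontrollers "([^"]+)" not found')


def summarize(journal_text):
    """Count inflight-requeue messages per DC and 'not found' messages per RC."""
    inflight = Counter()
    missing = Counter()
    for line in journal_text.splitlines():
        match = INFLIGHT.search(line)
        if match:
            inflight[match.group(1)] += 1
        match = MISSING_RC.search(line)
        if match:
            missing[match.group(1)] += 1
    return inflight, missing


# Three entries copied from the journal excerpt above.
sample = """\
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:404] found previous inflight deployment for logging/logging-kibana - requeuing
controller.go:226] Error instantiating deployment config logging/logging-kibana-ops: couldn't retrieve deployment for deployment config "logging/logging-kibana-ops": replicationcontrollers "logging-kibana-ops-91" not found
"""

inflight_counts, missing_counts = summarize(sample)
```

A high inflight count for a DC, combined with a replication controller that is repeatedly reported as not found, would be consistent with the two-trigger race suspected above, though it does not prove it.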
This is pretty serious - it could certainly explain bugs we've seen. It should resolve all triggers at once.
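To illustrate what "resolve all triggers at once" would mean, here is a toy model, not the origin trigger controller. `ToyDC`, `apply_one_by_one`, `apply_all_at_once`, and the container and image names are all hypothetical; the sketch only shows that handling each fired ImageChange trigger separately produces one rollout per trigger, while resolving them together produces a single rollout.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDC:
    """Toy stand-in for a deployment config: container name -> image reference."""
    images: dict
    latest_version: int = 0
    history: list = field(default_factory=list)

    def rollout(self):
        # Each rollout bumps the version and records the images it deployed.
        self.latest_version += 1
        self.history.append(dict(self.images))


def apply_one_by_one(dc, updates):
    """Handle each fired trigger on its own: one rollout per changed image."""
    for container, image in updates.items():
        if dc.images.get(container) != image:
            dc.images[container] = image
            dc.rollout()


def apply_all_at_once(dc, updates):
    """Resolve every fired trigger first, then request a single rollout."""
    changed = False
    for container, image in updates.items():
        if dc.images.get(container) != image:
            dc.images[container] = image
            changed = True
    if changed:
        dc.rollout()


# Hypothetical container and image names, for illustration only.
old = {"kibana": "kibana:old", "kibana-proxy": "proxy:old"}
updates = {"kibana": "kibana:new", "kibana-proxy": "proxy:new"}

separate = ToyDC(images=dict(old))
apply_one_by_one(separate, updates)

batched = ToyDC(images=dict(old))
apply_all_at_once(batched, updates)
```

Both toy DCs end with the same images, but `separate` has gone through two rollouts (the first with a mix of old and new images) while `batched` has gone through one.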
(In reply to ewolinet from comment #3)
> I know we've seen this behavior once before, is this something that occurs
> every time we run Aggregated Logging 3.2.1 on an OSE 3.3 master?
Yes, currently it reproduces 100% of the time for me.
> We've removed triggers in the 3.3 DCs, so that may be why we don't see this
> happening then. Can we check if using Aggregated Logging 3.3 with scenarios
> B+C causes issues? You'll probably need to update the image versions of the
> IS and pull it in too.
Yes, I added the B+C image triggers, with image versions = 3.3.0, to logging 3.3.0, and the issue reproduces there.
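When comparing DCs with and without image triggers, it can help to list the ImageChange triggers a DC actually carries. A minimal sketch, assuming the DC has been exported as JSON (for example with `oc get dc <name> -o json`); the `example_dc` fragment, its container names, and its image stream tags are hypothetical and are not taken from the logging templates.

```python
import json


def image_change_triggers(dc):
    """Return (container names, ImageStreamTag name) for each ImageChange trigger."""
    found = []
    for trigger in dc.get("spec", {}).get("triggers") or []:
        if trigger.get("type") != "ImageChange":
            continue
        params = trigger.get("imageChangeParams") or {}
        containers = tuple(params.get("containerNames") or [])
        source = (params.get("from") or {}).get("name")
        found.append((containers, source))
    return found


# Hypothetical DC fragment with two ImageChange triggers.
example_dc = json.loads("""
{
  "metadata": {"name": "logging-kibana"},
  "spec": {"triggers": [
    {"type": "ConfigChange"},
    {"type": "ImageChange", "imageChangeParams": {
      "automatic": true, "containerNames": ["kibana"],
      "from": {"kind": "ImageStreamTag", "name": "logging-kibana:latest"}}},
    {"type": "ImageChange", "imageChangeParams": {
      "automatic": true, "containerNames": ["kibana-proxy"],
      "from": {"kind": "ImageStreamTag", "name": "logging-auth-proxy:latest"}}}
  ]}
}
""")

triggers = image_change_triggers(example_dc)
```

A DC for which this returns more than one entry is one with multiple ImageChange triggers, which is the kind of configuration discussed in this bug.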
Fix here: https://github.com/openshift/origin/pull/10444
Commit pushed to master at https://github.com/openshift/origin
https://github.com/openshift/origin/commit/90fc4171e146646eb38b7973fc20ae49b84eafb8
Bug 1366936: fix ICT matching in the trigger controller
The commit from Comment 8 has been merged into OSE and is in OSE v3.3.0.23 or newer.
Verified on openshift v3.3.0.23, the 3.2.1 level logging stacks are stable and working fine there:
$ oc get po
NAME                          READY     STATUS      RESTARTS   AGE
logging-deployer-0n8h5        0/1       Completed   0          4m
logging-es-7igyqphg-1-2muff   1/1       Running     0          2m
logging-fluentd-1-ixyt2       1/1       Running     0          2m
logging-kibana-1-vrg6o        2/2       Running     0          2m
# openshift version
openshift v3.3.0.23-dirty
kubernetes v1.3.0+507d3a7
etcd 2.3.0+git
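The same check can be scripted when re-verifying. A minimal sketch, assuming the default `oc get po` table layout shown above; `pods_settled` is a hypothetical helper, and its definition of "settled" (Completed, or Running with all containers ready) is an assumption chosen to match this verification, not an official health check.

```python
def pods_settled(oc_get_po_output):
    """True if every pod row is Completed, or Running with all containers ready."""
    rows = [line.split() for line in oc_get_po_output.strip().splitlines()]
    if len(rows) < 2:
        return False  # no header or no pods listed
    header, pods = rows[0], rows[1:]
    ready_col = header.index("READY")
    status_col = header.index("STATUS")
    for row in pods:
        ready, total = row[ready_col].split("/")
        status = row[status_col]
        if status == "Completed":
            continue  # finished pods such as the deployer report 0/1 ready
        if status != "Running" or ready != total:
            return False
    return True


# Output copied from the verification comment above.
verified_output = """\
NAME                          READY     STATUS      RESTARTS   AGE
logging-deployer-0n8h5        0/1       Completed   0          4m
logging-es-7igyqphg-1-2muff   1/1       Running     0          2m
logging-fluentd-1-ixyt2       1/1       Running     0          2m
logging-kibana-1-vrg6o        2/2       Running     0          2m
"""

settled = pods_settled(verified_output)
```

Running it again after a few minutes and comparing the pod names would also show whether the DCs have stopped redeploying, since a redeploy changes the deployment number in the pod name.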
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1933