Bug 1366936 - The 3.2.1 level kibana pod kept on redeploying itself when logging is deployed on OSE 3.3.0 master
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Pod
Version: 3.3.0
Hardware: Unspecified OS: Unspecified
Priority: high Severity: high
Assigned To: Michail Kargakis
QA Contact: chunchen
Reported: 2016-08-14 23:20 EDT by Xia Zhao
Modified: 2017-03-08 13 EST
Doc Type: Bug Fix
Doc Text:
The trigger controller that handles deployment triggers did not correctly handle ImageChangeTriggers from different namespaces, which resulted in hot-looping between deployments.
Last Closed: 2016-09-27 05:44:02 EDT
Type: Bug
External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Github | openshift/origin/pull/10444 | None | None | None | 2016-08-17 16:14 EDT
Red Hat Product Errata | RHBA-2016:1933 | normal | SHIPPED_LIVE | Red Hat OpenShift Container Platform 3.3 Release Advisory | 2016-09-27 09:24:36 EDT

Comment 2 Xia Zhao 2016-08-14 23:27:10 EDT
My typo, the last sentence of comment #1 should be:
The issue happens with A+B+C and B+C; it does not reproduce with any of the other combinations.
Comment 3 ewolinet 2016-08-15 10:05:34 EDT
I know we've seen this behavior once before. Is this something that occurs every time we run Aggregated Logging 3.2.1 on an OSE 3.3 master?

We've removed triggers in the 3.3 DCs, so that may be why we don't see this happening then. Can we check if using Aggregated Logging 3.3 with scenarios B+C causes issues? You'll probably need to update the image versions of the IS and pull it in too.
Comment 4 Luke Meyer 2016-08-15 16:13:02 EDT
I can reproduce this with 3.2.1 and 3.2.0 logging deploying on OSE 3.3.0. I suspect that the two ImageChange triggers are both attempting a deploy at the same time and tripping over each other. I'm seeing errors like this in the journal:

controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:404] found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:226] Error instantiating deployment config logging/logging-kibana-ops: couldn't retrieve deployment for deployment config "logging/logging-kibana-ops": replicationcontrollers "logging-kibana-ops-91" not found
controller.go:226] Error instantiating deployment config logging/logging-kibana-ops: couldn't retrieve deployment for deployment config "logging/logging-kibana-ops": replicationcontrollers "logging-kibana-ops-91" not found
controller.go:226] Error instantiating deployment config logging/logging-kibana-ops: couldn't retrieve deployment for deployment config "logging/logging-kibana-ops": replicationcontrollers "logging-kibana-ops-91" not found
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:226] Error instantiating deployment config logging/logging-kibana-ops: couldn't retrieve deployment for deployment config "logging/logging-kibana-ops": replicationcontrollers "logging-kibana-ops-91" not found
controller.go:404] found previous inflight deployment for logging/logging-kibana - requeuing
replication_controller.go:498] Too many "logging"/"logging-kibana-97" replicas, need 0, deleting 1
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
event.go:216] Event(api.ObjectReference{Kind:"ReplicationController", Namespace:"logging", Name:"logging-kibana-97", UID:"9521e1e7-6323-11e6-976c-5254002ddfb5", APIVersion:"v1", ResourceVersion:"271783", FieldPath:""}): type: 'Normal' reason: 'SuccessfulDelete' Deleted pod: logging-kibana-97-qkm0a
controller.go:399] Error syncing deployment config logging/logging-kibana: found previous inflight deployment for logging/logging-kibana - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana: found previous inflight deployment for logging/logging-kibana - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana: found previous inflight deployment for logging/logging-kibana - requeuing
controller.go:399] Error syncing deployment config logging/logging-kibana: found previous inflight deployment for logging/logging-kibana - requeuing
controller.go:226] Error instantiating deployment config logging/logging-kibana-ops: couldn't retrieve deployment for deployment config "logging/logging-kibana-ops": replicationcontrollers "logging-kibana-ops-91" not found
controller.go:399] Error syncing deployment config logging/logging-kibana: found previous inflight deployment for logging/logging-kibana - requeuing
controller.go:404] found previous inflight deployment for logging/logging-kibana - requeuing

This may not actually be a problem for customers, as I think the problem occurs only when the images are imported and kick off the deploy. I think the path of install 3.2 -> upgrade OSE -> upgrade logging will probably work fine (though this would be good to test). However, this is a change in behavior, and I think it's worth having the platform team take a look at it and see if they can make it behave better.
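
To see at a glance which deployment configs are looping, journal lines like the ones quoted above can be tallied per config. The following is a minimal Python sketch; the regular expression is an assumption derived only from the sample lines in this comment, and the function name `summarize` is hypothetical.

```python
import re
from collections import Counter

# Assumed message shape, inferred only from the journal lines quoted above.
PATTERN = re.compile(r"Error (syncing|instantiating) deployment config (\S+?):")


def summarize(lines):
    """Count (action, deployment config) pairs across journal lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# A few lines copied from the journal excerpt above.
SAMPLE = """\
controller.go:399] Error syncing deployment config logging/logging-kibana-ops: found previous inflight deployment for logging/logging-kibana-ops - requeuing
controller.go:226] Error instantiating deployment config logging/logging-kibana-ops: couldn't retrieve deployment for deployment config "logging/logging-kibana-ops": replicationcontrollers "logging-kibana-ops-91" not found
controller.go:399] Error syncing deployment config logging/logging-kibana: found previous inflight deployment for logging/logging-kibana - requeuing
controller.go:404] found previous inflight deployment for logging/logging-kibana - requeuing
"""

if __name__ == "__main__":
    for (action, config), count in sorted(summarize(SAMPLE.splitlines()).items()):
        print(f"{config}\t{action}\t{count}")
```

Lines that do not start with "Error syncing" or "Error instantiating" (such as the controller.go:404 line) are skipped rather than counted.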
Comment 5 Clayton Coleman 2016-08-15 17:32:46 EDT
This is pretty serious - it could certainly explain bugs we've seen.  It should resolve all triggers at once.
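
One way to read "resolve all triggers at once" is that a single sync pass should apply every image change trigger to the pod template before deciding whether to start a deployment, so that two triggers cannot each start their own. The Python sketch below illustrates that idea on a simplified in-memory model; it is not the origin implementation, and `resolve_triggers`, the dict layout, the container names, and the image values are all hypothetical.

```python
def resolve_triggers(config, latest_images):
    """Apply every image change trigger in one pass.

    config: {"containers": {name: image},
             "triggers": [{"container": name, "tag": "stream:tag"}]}
    latest_images: {"stream:tag": resolved image}
    Returns (new_containers, changed) so the caller can start at most one
    deployment per sync instead of one per trigger.
    """
    containers = dict(config["containers"])  # copy; do not mutate the input
    changed = False
    for trigger in config["triggers"]:
        image = latest_images.get(trigger["tag"])
        if image is None:
            continue  # tag not imported yet; leave this container untouched
        if containers.get(trigger["container"]) != image:
            containers[trigger["container"]] = image
            changed = True
    return containers, changed


# Hypothetical config with two triggers, one per container.
CONFIG = {
    "containers": {"kibana": "kibana:old", "kibana-proxy": "auth-proxy:old"},
    "triggers": [
        {"container": "kibana", "tag": "logging-kibana:3.2.1"},
        {"container": "kibana-proxy", "tag": "logging-auth-proxy:3.2.1"},
    ],
}
LATEST = {
    "logging-kibana:3.2.1": "registry.example/kibana@sha256:aaa",
    "logging-auth-proxy:3.2.1": "registry.example/auth-proxy@sha256:bbb",
}

if __name__ == "__main__":
    containers, changed = resolve_triggers(CONFIG, LATEST)
    print(containers, changed)
```

Running the function a second time on its own output reports no change, which is the property that prevents a redeploy loop in this simplified model.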
Comment 6 Xia Zhao 2016-08-16 03:00:19 EDT
(In reply to ewolinet from comment #3)
> I know we've seen this behavior once before, is this something that occurs
> every time we run Aggregated Logging 3.2.1 on an OSE 3.3 master?

Yes, currently it reproduces 100% of the time for me.

> We've removed triggers in the 3.3 DCs, so that may be why we don't see this
> happening then. Can we check if using Aggregated Logging 3.3 with scenarios
> B+C causes issues? You'll probably need to update the image versions of the
> IS and pull it in too.

Yes, I placed image triggers B+C with image versions = 3.3.0 in logging 3.3.0, and the issue is reproducible there.
Comment 7 Michal Fojtik 2016-08-17 10:23:47 EDT
Fix here: https://github.com/openshift/origin/pull/10444
Comment 8 openshift-github-bot 2016-08-17 17:10:07 EDT
Commit pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/90fc4171e146646eb38b7973fc20ae49b84eafb8
Bug 1366936: fix ICT matching in the trigger controller
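
For readers unfamiliar with ICT (ImageChangeTrigger) matching, the sketch below shows what a namespace-aware comparison could look like. It is an illustration in Python under stated assumptions, not the code from the commit: the function name `trigger_matches` and its arguments are hypothetical, and it assumes that a trigger reference without a namespace defaults to the deployment config's own namespace.

```python
def trigger_matches(trigger_from, stream_namespace, stream_name, tag, config_namespace):
    """Return True if a trigger's ImageStreamTag reference points at the given tag.

    trigger_from: {"name": "<stream>:<tag>", "namespace": optional str}
    Assumption: an empty or missing namespace means the deployment config's
    own namespace.
    """
    namespace = trigger_from.get("namespace") or config_namespace
    name, _, trigger_tag = trigger_from["name"].partition(":")
    # Compare namespace, stream name, and tag together; comparing the name
    # alone would confuse same-named streams in different namespaces.
    return (namespace, name, trigger_tag) == (stream_namespace, stream_name, tag)


if __name__ == "__main__":
    ref = {"name": "logging-kibana:3.2.1", "namespace": "openshift"}
    print(trigger_matches(ref, "logging", "logging-kibana", "3.2.1", "logging"))
```

In this model, a stream named logging-kibana in the "logging" namespace does not satisfy a trigger that points at the same stream name in another namespace.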
Comment 12 Troy Dawson 2016-08-19 16:55:46 EDT
The commit from comment 8 has been merged into OSE and is in OSE v3.3.0.23 or newer.
Comment 14 Xia Zhao 2016-08-23 00:34:24 EDT
Verified on openshift v3.3.0.23; the 3.2.1-level logging stack is stable and working fine there:

$ oc get po
NAME                          READY     STATUS      RESTARTS   AGE
logging-deployer-0n8h5        0/1       Completed   0          4m
logging-es-7igyqphg-1-2muff   1/1       Running     0          2m
logging-fluentd-1-ixyt2       1/1       Running     0          2m
logging-kibana-1-vrg6o        2/2       Running     0          2m

# openshift version
openshift v3.3.0.23-dirty
kubernetes v1.3.0+507d3a7
etcd 2.3.0+git
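
A check like the one above can be scripted by reading the deployment number out of each pod name: in this report the looping kibana pod had reached logging-kibana-97, while the verified pods are all at deployment 1. The Python sketch below assumes pod names of the form `<config>-<deployment number>-<suffix>`; the helper names and the threshold of 1 are hypothetical.

```python
def deployment_number(pod_name):
    """Return the deployment number embedded in a pod name, or None.

    Assumes names shaped like <config>-<number>-<suffix>; pods without a
    numeric second-to-last segment (e.g. the deployer pod) return None.
    """
    parts = pod_name.split("-")
    if len(parts) >= 3 and parts[-2].isdigit():
        return int(parts[-2])
    return None


def redeployed_pods(table, max_deployment=1):
    """List pods from `oc get po` output whose deployment number exceeds the limit."""
    names = [line.split()[0] for line in table.strip().splitlines()[1:]]
    return [n for n in names if (deployment_number(n) or 0) > max_deployment]


# Output copied from the verification above.
OC_GET_PO = """\
NAME                          READY     STATUS      RESTARTS   AGE
logging-deployer-0n8h5        0/1       Completed   0          4m
logging-es-7igyqphg-1-2muff   1/1       Running     0          2m
logging-fluentd-1-ixyt2       1/1       Running     0          2m
logging-kibana-1-vrg6o        2/2       Running     0          2m
"""

if __name__ == "__main__":
    print(redeployed_pods(OC_GET_PO))
```

The RESTARTS column alone would not reveal this bug, because each redeploy creates a new pod rather than restarting a container in an existing one.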
Comment 16 errata-xmlrpc 2016-09-27 05:44:02 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933
