Bug 1694172

Summary: Client received empty watch event type during e2e test
Product: OpenShift Container Platform
Reporter: Clayton Coleman <ccoleman>
Component: Master
Assignee: Tomáš Nožička <tnozicka>
Status: CLOSED CURRENTRELEASE
QA Contact: Xingxing Xia <xxia>
Severity: medium
Priority: unspecified
Version: 4.1.0
CC: aos-bugs, jokerman, mmccomas
Target Milestone: ---
Target Release: 4.1.0
Hardware: Unspecified
OS: Unspecified
Last Closed: 2019-04-03 16:13:00 UTC
Type: Bug

Description Clayton Coleman 2019-03-29 17:09:18 UTC
A watch event from the server had an empty event type, which should never happen. It might just be part of the test scenario, but we need to prove it isn't some subtle bug lurking in kube (for instance, maybe on restart we send back invalid watch events).

fail [github.com/openshift/origin/test/extended/deployments/util.go:749]: Expected error:
    <*errors.errorString | 0xc420325410>: {
        s: "unexpected event: watch.Event{Type:\"\", Object:runtime.Object(nil)}",
    }
    unexpected event: watch.Event{Type:"", Object:runtime.Object(nil)}
not to have occurred

https://openshift-gce-devel.appspot.com/build/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-4.0/6281

Happening in roughly 1 of 15 runs.
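For context on how an empty event type can appear at all: receiving from a closed Go channel yields the zero value of the element type, so if the server side closes the watch stream and the client keeps reading without the comma-ok form, it sees exactly watch.Event{Type:"", Object:nil}. A minimal self-contained sketch of that mechanism (the Event struct here is a stand-in for the real k8s.io/apimachinery type, not the actual API):

```go
package main

import "fmt"

// Event is an illustrative stand-in for k8s.io/apimachinery's watch.Event.
type Event struct {
	Type   string
	Object interface{}
}

func main() {
	results := make(chan Event)
	close(results) // simulate the apiserver dropping the watch connection

	// A plain receive from a closed channel returns the zero value,
	// i.e. Event{Type:"", Object:nil} -- the "unexpected event" above.
	ev := <-results
	fmt.Printf("Type=%q Object=%v\n", ev.Type, ev.Object)

	// The comma-ok form distinguishes a closed stream from a real event.
	if _, ok := <-results; !ok {
		fmt.Println("watch closed; caller should re-establish the watch")
	}
}
```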

Comment 2 Tomáš Nožička 2019-04-01 13:27:57 UTC
It was actually a watch closed prematurely, so the real cause was a kube-apiserver restart during the test:

https://github.com/openshift/origin/pull/22440


I do wonder why the overall control-plane stability invariant test didn't go red...

The kube-apiserver was rolling out during e2e:

https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-4.0/6281/artifacts/e2e-aws/pods/openshift-kube-apiserver-operator_kube-apiserver-operator-7d7794457d-4588c_operator.log.gz

I0329 00:22:56.699922       1 installer_controller.go:314] "ip-10-0-151-80.ec2.internal" is in transition to 5, but has not made progress
I0329 00:23:00.510078       1 status_controller.go:156] clusteroperator/kube-apiserver diff {"status":{"conditions":[{"lastTransitionTime":"2019-03-29T00:20:20Z","reason":"AsExpected","status":"False","type":"Failing"},{"lastTransitionTime":"2019-03-29T00:13:55Z","message":"Progressing: 2 nodes are at revision 5; 1 nodes are at revision 2","reason":"Progressing","status":"True","type":"Progressing"},{"lastTransitionTime":"2019-03-29T00:14:35Z","message":"Available: 3 nodes are active; 2 nodes are at revision 5; 1 nodes are at revision 2","reason":"AsExpected","status":"True","type":"Available"},{"lastTransitionTime":"2019-03-29T00:12:51Z","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
I0329 00:23:00.523831       1 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-kube-apiserver-operator", Name:"kube-apiserver-operator", UID:"c99e40a6-51b6-11e9-b977-126e0dd8934a", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for operator kube-apiserver changed: Failing message changed from "StaticPodsFailing: nodes/ip-10-0-151-80.ec2.internal pods/kube-apiserver-ip-10-0-151-80.ec2.internal container=\"kube-apiserver-5\" is not ready" to ""
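Since the apiserver legitimately rolls out during e2e, a test (or client) should treat a closed watch stream as a signal to re-establish the watch rather than as a protocol error. A hedged, self-contained sketch of that pattern, with a fake watcher standing in for the client-go interfaces (names here are illustrative, not the real API; client-go's RetryWatcher implements the production version of this idea):

```go
package main

import "fmt"

// Event and watcher are illustrative stand-ins for the client-go watch types.
type Event struct {
	Type   string
	Object interface{}
}

type watcher struct {
	ch chan Event
}

func (w *watcher) ResultChan() <-chan Event { return w.ch }

// startWatch simulates opening a watch that delivers n events and is then
// closed, as happens when the kube-apiserver restarts mid-test.
func startWatch(n int) *watcher {
	w := &watcher{ch: make(chan Event)}
	go func() {
		for i := 0; i < n; i++ {
			w.ch <- Event{Type: "MODIFIED", Object: i}
		}
		close(w.ch) // server went away
	}()
	return w
}

func main() {
	total := 0
	sessions := 0
	// Tolerate premature closures by re-establishing the watch a bounded
	// number of times, instead of failing on the zero-value Event.
	for attempt := 0; attempt < 3; attempt++ {
		w := startWatch(2)
		for range w.ResultChan() { // range exits cleanly when the channel closes
			total++
		}
		sessions++
	}
	fmt.Printf("observed %d events across %d watch sessions\n", total, sessions)
}
```

In the real client this loop would also resume from the last seen resourceVersion so no events are dropped across reconnects.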

Comment 4 Xingxing Xia 2019-04-03 08:45:20 UTC
The above https://github.com/openshift/origin/pull/22440 is still open.

Comment 5 Tomáš Nožička 2019-04-03 16:13:00 UTC
merged