Created attachment 1522468 [details]
BAD node log

I have found numerous clusters where kube-proxy just ignores services. One time, all 3 masters were completely ignoring the openshift-api service: no iptables rules were created for the service, which caused the install to fail. Another time it was a different service, and only 1 of the 6 nodes was affected.

$ oc get service -n openshift-marketplace redhat-operators
NAME               TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
redhat-operators   ClusterIP   172.30.220.173   <none>        50051/TCP   57m

$ oc get endpoints -n openshift-marketplace redhat-operators
NAME               ENDPOINTS          AGE
redhat-operators   10.131.0.7:50051   57m

On a GOOD node:

# iptables-save | grep -E '(172.30.220.173)|(10.131.0.7)'
-A KUBE-SEP-P65J3XIIYVKZSJY3 -s 10.131.0.7/32 -m comment --comment "openshift-marketplace/redhat-operators:grpc" -j KUBE-MARK-MASQ
-A KUBE-SEP-P65J3XIIYVKZSJY3 -p tcp -m comment --comment "openshift-marketplace/redhat-operators:grpc" -m tcp -j DNAT --to-destination 10.131.0.7:50051
-A KUBE-SERVICES -d 172.30.220.173/32 -p tcp -m comment --comment "openshift-marketplace/redhat-operators:grpc cluster IP" -m tcp --dport 50051 -j KUBE-SVC-SGDZNVXMHJCPEAE2

On the BAD node:

# iptables-save | grep -E '(172.30.220.173)|(10.131.0.7)'
[nothing]
Created attachment 1522469 [details] GOOD node log
relevant logs:

I0122 15:47:10.365571 5902 vnids.go:148] Associate netid 11809244 to namespace "openshift-marketplace" with mcEnabled false
...
E0122 15:48:01.633045 5902 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=69, ErrCode=NO_ERROR, debug=""
E0122 15:48:01.633646 5902 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=69, ErrCode=NO_ERROR, debug=""
etc etc etc
...
E0122 15:48:01.659629 5902 reflector.go:253] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:129: Failed to watch *core.Endpoints: Get https://eparis3-api.devcluster.openshift.com:6443/api/v1/endpoints?resourceVersion=9416&timeoutSeconds=496&watch=true: dial tcp 10.0.148.179:6443: connect: connection refused
...
W0122 15:49:05.099259 5902 proxy.go:306] Got OnEndpointsUpdate for unknown Endpoints &core.Endpoints{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"redhat-operators", GenerateName:"", Namespace:"openshift-marketplace", ...

So, the apiserver fell over while the service was being created, and in the process of reconnecting we missed the OnEndpointsAdded() event for the service.

The "Got OnEndpointsUpdate for unknown Endpoints" warning comes from the SDN proxy filter code, but that code actually deals fine with this case (treating it as an Add). So it's one of the two underlying proxies (unidler or iptables) that's screwing up here (among other things, by never resyncing). But really, it's mostly "when the apiserver falls over, bad things happen".
> So, the apiserver fell over

Rather, we transitioned from the bootstrap apiserver to the real one. From another cluster:

I0123 19:15:07.678996 5849 roundrobin.go:240] Delete endpoint 10.0.3.83:6443 for service "default/kubernetes:https"
E0123 19:15:08.053883 5849 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=73, ErrCode=NO_ERROR, debug=""

Not clear why we're failing to fully re-sync.
(In reply to Dan Winship from comment #5)
> ... The "Got OnEndpointsUpdate for unknown Endpoints" warning comes
> from the SDN proxy filter code, but it actually deals fine with this case
> (treating it as an Add)...

Does it? https://github.com/openshift/origin/blob/master/pkg/proxy/hybrid/proxy.go#L246 seems to suggest it "helpfully" drops the update on the floor.
Never mind, that's a different and uninteresting message.
No, wait, I think that's it. We miss the OnServiceAdd, then get the OnEndpointsUpdate, which we drop on the floor because of line 246. We shouldn't do that. Really, we shouldn't have separate OnEndpointsUpdate and OnEndpointsAdd code in the hybrid proxy wrapper.
No, the warning that gets logged is from the SDN proxy filter, not from the hybrid proxier, and along with logging that message, it ends up treating the update as an add, so it's OnEndpointsAdded() that gets called in the hybrid proxier (which is why we never see the hybrid proxier's own "unknown endpoints" message).
(In reply to Eric Paris from comment #0)
> On the BAD node:
> # iptables-save | grep -E '(172.30.220.173)|(10.131.0.7)'
> [nothing]

Oh, so kube-proxy isn't just missing the Endpoints, it's missing the Service too. (If it had seen the Service but not the Endpoints, there'd be a rule for 172.30.220.173 going to -j REJECT.)

But the top-level kube-proxy code does use Informers, and explicitly passes a resync interval... so we really ought to get informed about the missing Service *eventually*, even if we missed the original notification. The openshift-sdn proxy filter doesn't deal with Services at all (only Endpoints), and the hybrid proxier just passes all Service events through to the underlying iptables proxier (in the case where no idling is occurring, which I assume is the case here). And FWIW, the iptables proxier wouldn't care if it got the Endpoints and Service events in the wrong order. So I can't figure out how it is that we *never* end up finding out about the missing Service...
https://github.com/openshift/origin/pull/21851 is a debugging pull request (you may want to gate it) that wires up a printer directly to the informer's event handling for Services and Endpoints. If you run it and see a glog message about the missing service/endpoints, then the hybrid proxy is receiving the event and losing it somewhere. If the message doesn't show up in glog, then there's an issue with the informer, the reflector, and/or the apiserver. It's heavy, but it's a simple way to be sure.
Marking as a dupe of bug 1689690 since I'm actively working on that one. The summary is "userspace proxy sucks".

*** This bug has been marked as a duplicate of bug 1689690 ***