Bug 1668414 - kube-proxy sometimes completely ignores services
Summary: kube-proxy sometimes completely ignores services
Keywords:
Status: CLOSED DUPLICATE of bug 1689690
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 4.1.0
Assignee: Dan Williams
QA Contact: Meng Bo
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-01-22 16:55 UTC by Eric Paris
Modified: 2019-03-28 21:04 UTC
5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-03-28 21:04:33 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
BAD node log (55.41 KB, text/plain), 2019-01-22 16:55 UTC, Eric Paris
GOOD node log (51.17 KB, text/plain), 2019-01-22 16:56 UTC, Eric Paris

Description Eric Paris 2019-01-22 16:55:41 UTC
Created attachment 1522468 [details]
BAD node log

I have found numerous clusters where kube-proxy just ignores services. One time, all three masters were completely ignoring the openshift-api service: no iptables rules were created for it, which caused the install to fail. Another time it was a different service, and only one of the six nodes was affected.

$ oc get service -n openshift-marketplace redhat-operators
NAME               TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
redhat-operators   ClusterIP   172.30.220.173   <none>        50051/TCP   57m

$ oc get endpoints -n openshift-marketplace redhat-operators
NAME               ENDPOINTS          AGE
redhat-operators   10.131.0.7:50051   57m

On a GOOD node:
# iptables-save | grep -E '(172.30.220.173)|(10.131.0.7)'
-A KUBE-SEP-P65J3XIIYVKZSJY3 -s 10.131.0.7/32 -m comment --comment "openshift-marketplace/redhat-operators:grpc" -j KUBE-MARK-MASQ
-A KUBE-SEP-P65J3XIIYVKZSJY3 -p tcp -m comment --comment "openshift-marketplace/redhat-operators:grpc" -m tcp -j DNAT --to-destination 10.131.0.7:50051
-A KUBE-SERVICES -d 172.30.220.173/32 -p tcp -m comment --comment "openshift-marketplace/redhat-operators:grpc cluster IP" -m tcp --dport 50051 -j KUBE-SVC-SGDZNVXMHJCPEAE2

On the BAD node:
# iptables-save | grep -E '(172.30.220.173)|(10.131.0.7)'
[nothing]

Comment 1 Eric Paris 2019-01-22 16:56:25 UTC
Created attachment 1522469 [details]
GOOD node log

Comment 5 Dan Winship 2019-01-23 16:48:23 UTC
relevant logs:

I0122 15:47:10.365571    5902 vnids.go:148] Associate netid 11809244 to namespace "openshift-marketplace" with mcEnabled false
...
E0122 15:48:01.633045    5902 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=69, ErrCode=NO_ERROR, debug=""
E0122 15:48:01.633646    5902 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=69, ErrCode=NO_ERROR, debug=""
etc etc etc
...
E0122 15:48:01.659629    5902 reflector.go:253] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:129: Failed to watch *core.Endpoints: Get https://eparis3-api.devcluster.openshift.com:6443/api/v1/endpoints?resourceVersion=9416&timeoutSeconds=496&watch=true: dial tcp 10.0.148.179:6443: connect: connection refused
...
W0122 15:49:05.099259    5902 proxy.go:306] Got OnEndpointsUpdate for unknown Endpoints &core.Endpoints{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"redhat-operators", GenerateName:"", Namespace:"openshift-marketplace", ...


So, the apiserver fell over while the service was being created, and in the process of reconnecting, we missed the OnEndpointsAdded() event for the service. The "Got OnEndpointsUpdate for unknown Endpoints" warning comes from the SDN proxy filter code, but it actually deals fine with this case (treating it as an Add). So it's one of the two underlying proxies (unidler or iptables) that's screwing up here (among other things, by never resyncing).

But really, it's mostly "when the apiserver falls over, bad things happen".
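The filter behavior described above can be sketched roughly like this (hypothetical types, not the actual openshift-sdn filter code): on an Update for Endpoints it has never seen, the filter logs a warning but converts the event into an Add for the downstream proxier.

```go
package main

import "fmt"

// Endpoints is a stand-in for the real core.Endpoints type.
type Endpoints struct {
	Namespace, Name string
	Addrs           []string
}

// Proxier is a stand-in for the downstream proxy interface.
type Proxier interface {
	OnEndpointsAdd(ep Endpoints)
	OnEndpointsUpdate(old, ep Endpoints)
}

func key(ep Endpoints) string { return ep.Namespace + "/" + ep.Name }

// filterProxy tracks which Endpoints it has seen; an Update for an unknown
// object is treated as an Add, so a missed watch event doesn't poison state.
type filterProxy struct {
	known map[string]Endpoints
	next  Proxier
}

func (f *filterProxy) OnEndpointsUpdate(old, ep Endpoints) {
	k := key(ep)
	if _, ok := f.known[k]; !ok {
		fmt.Printf("Got OnEndpointsUpdate for unknown Endpoints %s; treating as Add\n", k)
		f.known[k] = ep
		f.next.OnEndpointsAdd(ep)
		return
	}
	f.known[k] = ep
	f.next.OnEndpointsUpdate(old, ep)
}

// recorder is a trivial downstream proxier for demonstration.
type recorder struct{ events []string }

func (r *recorder) OnEndpointsAdd(ep Endpoints)       { r.events = append(r.events, "add "+key(ep)) }
func (r *recorder) OnEndpointsUpdate(_, ep Endpoints) { r.events = append(r.events, "update "+key(ep)) }

func main() {
	r := &recorder{}
	f := &filterProxy{known: map[string]Endpoints{}, next: r}
	// The Add was missed; the first thing the filter sees is an Update.
	ep := Endpoints{Namespace: "openshift-marketplace", Name: "redhat-operators", Addrs: []string{"10.131.0.7:50051"}}
	f.OnEndpointsUpdate(Endpoints{}, ep)
	fmt.Println(r.events)
}
```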

Comment 6 Dan Winship 2019-01-23 21:36:26 UTC
> So, the apiserver fell over

rather, we transitioned from the bootstrap apiserver to the real one. From another cluster:

I0123 19:15:07.678996    5849 roundrobin.go:240] Delete endpoint 10.0.3.83:6443 for service "default/kubernetes:https"
E0123 19:15:08.053883    5849 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=73, ErrCode=NO_ERROR, debug=""

Not clear why we're failing to fully re-sync

Comment 7 Casey Callendrello 2019-01-24 11:36:46 UTC
(In reply to Dan Winship from comment #5)
> ... The "Got OnEndpointsUpdate for unknown Endpoints" warning comes
> from the SDN proxy filter code, but it actually deals fine with this case
> (treating it as an Add)...

Does it? https://github.com/openshift/origin/blob/master/pkg/proxy/hybrid/proxy.go#L246 seems to suggest it "helpfully" drops the update on the floor.

Comment 8 Casey Callendrello 2019-01-24 11:42:01 UTC
Nevermind, that's a different and uninteresting message.

Comment 9 Casey Callendrello 2019-01-24 11:51:35 UTC
No, wait, I think that's it. We miss the OnServiceAdd, then get the OnEndpointsUpdate, which we drop on the floor because of line 246. We shouldn't do that.

Really, we shouldn't have separate OnEndpointsUpdate and OnEndpointsAdd code in the hybrid proxy wrapper.
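A minimal illustration of that suggestion, using hypothetical types (the real hybrid proxier lives in openshift/origin): if Add and Update both funnel into a single upsert path, a missed Add can no longer cause a later Update to be dropped on the floor.

```go
package main

import "fmt"

// Endpoints is a stand-in for the real core.Endpoints type.
type Endpoints struct{ Namespace, Name string }

func key(ep Endpoints) string { return ep.Namespace + "/" + ep.Name }

// hybrid keeps one code path for Add and Update instead of two.
type hybrid struct{ state map[string]Endpoints }

func (h *hybrid) upsert(ep Endpoints) { h.state[key(ep)] = ep }

func (h *hybrid) OnEndpointsAdd(ep Endpoints)       { h.upsert(ep) }
func (h *hybrid) OnEndpointsUpdate(_, ep Endpoints) { h.upsert(ep) }

func main() {
	h := &hybrid{state: map[string]Endpoints{}}
	// Simulate missing the Add and seeing only the Update:
	h.OnEndpointsUpdate(Endpoints{}, Endpoints{Namespace: "openshift-marketplace", Name: "redhat-operators"})
	fmt.Println(len(h.state)) // the Endpoints object is tracked anyway
}
```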

Comment 10 Dan Winship 2019-01-24 11:54:42 UTC
No, the warning that gets logged is from the SDN proxy filter, not from the hybrid proxier, and along with logging that message, it ends up treating the update as an add, so it's OnEndpointsAdded() that gets called in the hybrid proxier (which is why we never see the hybrid proxier's own "unknown endpoints" message).

Comment 11 Dan Winship 2019-01-24 20:16:58 UTC
(In reply to Eric Paris from comment #0)
> On the BAD node:
> # iptables-save | grep -E '(172.30.220.173)|(10.131.0.7)'
> [nothing]

Oh, so kube-proxy isn't just missing the Endpoints, it's missing the Service too. (If it had seen the Service but not the Endpoints, there'd be a rule for 172.30.220.173 going to -j REJECT.)

But the toplevel kube-proxy code does use Informers, and explicitly passes a resync interval... so we really ought to get informed about the missing Service *eventually* even if we missed the original notification. The openshift-sdn proxy filter doesn't work with Services at all (only Endpoints), and the hybrid proxier just passes through all Service events to the underlying iptables proxier (in the case where no idling is occurring, which I assume is the case here). And FWIW, the iptables proxier wouldn't care if it got the Endpoints and Service events in the wrong order. So I can't figure out how it is that we *never* end up finding out about the missing Service...

Comment 12 David Eads 2019-01-24 21:01:47 UTC
https://github.com/openshift/origin/pull/21851 is a debugging pull (you may want to gate it) that wires up a printer directly to the informer's event handling for services and endpoints. If you run it and see a glog message about the missing service/endpoints, then the hybrid proxy is receiving the event and losing it somewhere. If it doesn't show up in glog, then there's an issue with the informer, reflector, and/or apiserver. It's heavy, but it's a simple way to be sure.
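The shape of that debugging approach, sketched with hypothetical types (the actual PR hooks client-go informer handlers): interpose a handler that only logs and forwards, so you can tell whether events reach the process at all before the proxy chain can lose them.

```go
package main

import "fmt"

// handler is a stand-in for a service event handler interface.
type handler interface{ OnServiceAdd(name string) }

// printer logs every event it sees, then forwards it unchanged.
type printer struct{ next handler }

func (p printer) OnServiceAdd(name string) {
	fmt.Printf("informer delivered service add: %s\n", name)
	if p.next != nil {
		p.next.OnServiceAdd(name)
	}
}

// proxy is a trivial downstream consumer for demonstration.
type proxy struct{ seen []string }

func (pr *proxy) OnServiceAdd(name string) { pr.seen = append(pr.seen, name) }

func main() {
	pr := &proxy{}
	printer{next: pr}.OnServiceAdd("openshift-marketplace/redhat-operators")
	fmt.Println(pr.seen)
}
```

If the printed line appears but the proxy chain never programs iptables, the loss is downstream of the informer; if it never appears, the informer/reflector/apiserver path is at fault.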

Comment 14 Dan Williams 2019-03-28 21:04:33 UTC
Marking as dupe of bug 1689690 since I'm actively working on that one. The summary is "userspace proxy sucks".

*** This bug has been marked as a duplicate of bug 1689690 ***

