Bug 1881319
Summary: [3.11] Service does not work for a while if the pod is running before the service is created

| Field | Value | Field | Value |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | zhaozhanqi <zzhao> |
| Component: | Networking | Assignee: | Surya Seetharaman <surya> |
| Networking sub component: | openshift-sdn | QA Contact: | zhaozhanqi <zzhao> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | | |
| Priority: | medium | CC: | aconstan, bbennett, jdesousa, obockows, surya, yapei |
| Version: | 3.11.0 | | |
| Target Milestone: | --- | | |
| Target Release: | 3.11.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-01-20 16:52:47 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description by zhaozhanqi, 2020-09-22 07:46:43 UTC
OK, so I was able to reproduce this on a quicklab cluster:

[quicklab@master-1 ~]$ oc version
oc v3.11.286
kubernetes v1.11.0+d4cacc0
openshift v3.11.286
kubernetes v1.11.0+d4cacc0
[quicklab@master-1 ~]$ oc new-project z6
[quicklab@master-1 ~]$ oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/caddy-docker.json -n z6
pod/caddy-docker created
[quicklab@master-1 ~]$ oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/unsecure/service_unsecure.json -n z6
service/service-unsecure created
[quicklab@master-1 ~]$ curl 172.30.94.83:27017
curl: (7) Failed connect to 172.30.94.83:27017; Connection refused
[quicklab@master-1 ~]$ oc describe svc service-unsecure
Name:              service-unsecure
Namespace:         z6
Labels:            name=service-unsecure
Annotations:       <none>
Selector:          name=caddy-docker
Type:              ClusterIP
IP:                172.30.94.83
Port:              http  27017/TCP
TargetPort:        8080/TCP
Endpoints:         10.128.2.4:8080
Session Affinity:  None
Events:            <none>
[quicklab@master-1 ~]$ curl 10.128.2.4:8080
Hello-OpenShift-1 http-8080
[quicklab@master-1 ~]$ curl 172.30.94.83:27017
curl: (7) Failed connect to 172.30.94.83:27017; Connection refused

The service has an endpoint, yet only a REJECT rule is installed for it:

[root@infra-2 origin]# iptables-save | grep z6
-A KUBE-SERVICES -d 172.30.94.83/32 -p tcp -m comment --comment "z6/service-unsecure:http has no endpoints" -m tcp --dport 27017 -j REJECT --reject-with icmp-port-unreachable

SDN logs:

I1022 19:59:40.554923   20011 proxier.go:369] userspace proxy: processing 0 service events
I1022 19:59:40.554958   20011 proxier.go:348] userspace syncProxyRules took 32.998776ms
I1022 19:59:46.696021   20011 proxy.go:353] sdn proxy: add svc z6/service-unsecure: &{{ } {service-unsecure z6 /api/v1/namespaces/z6/services/service-unsecure 2288fc19-14a1-11eb-8303-fa163e3ff84c 259062 0 2020-10-22 19:59:46 +0000 UTC <nil> <nil> map[name:service-unsecure] map[] [] nil [] } {ClusterIP [{http TCP 27017 {0 8080 } 0}] map[name:caddy-docker] 172.30.94.83 [] None <nil> [] 0 false} {{[]}}}
I1022 19:59:46.696894   20011 service.go:319] Adding new service port "z6/service-unsecure:http" at 172.30.94.83:27017/TCP
I1022 19:59:46.784983   20011 roundrobin.go:276] LoadBalancerRR: Setting endpoints for z6/service-unsecure:http to [10.128.2.4:8080]

After about ~8 minutes the correct rules were created:

[root@infra-2 origin]# iptables-save | grep z6
-A KUBE-SEP-S3L2DEEHCSAW76OY -s 10.128.2.4/32 -m comment --comment "z6/service-unsecure:http" -j KUBE-MARK-MASQ
-A KUBE-SEP-S3L2DEEHCSAW76OY -p tcp -m comment --comment "z6/service-unsecure:http" -m tcp -j DNAT --to-destination 10.128.2.4:8080
-A KUBE-SERVICES ! -s 10.128.0.0/14 -d 172.30.94.83/32 -p tcp -m comment --comment "z6/service-unsecure:http cluster IP" -m tcp --dport 27017 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 172.30.94.83/32 -p tcp -m comment --comment "z6/service-unsecure:http cluster IP" -m tcp --dport 27017 -j KUBE-SVC-DVQDYDX75OQJM2LY
-A KUBE-SVC-DVQDYDX75OQJM2LY -m comment --comment "z6/service-unsecure:http" -j KUBE-SEP-S3L2DEEHCSAW76OY

Okay, so clearly the REJECT rule is added because kube-proxy is handling this as the case of a service without an endpoint/backing pod. Somewhere the signal for when an endpoint exists is fuzzy.
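For context on where that REJECT rule comes from: on each sync, the iptables proxier generates rules from its internal endpointsMap, and any service port with no entries in that map gets the "has no endpoints" REJECT rule, regardless of what the API server already shows under Endpoints. A minimal Go sketch of that decision (hypothetical names and types, not the actual vendored proxier code):

```go
package main

import "fmt"

// svcPortName identifies a service port, e.g. "z6/service-unsecure:http".
type svcPortName string

// writeServiceRules mimics the decision the iptables proxier makes during
// syncProxyRules: a service port whose endpoints slice is empty gets a
// REJECT rule; one with endpoints gets cluster-IP and DNAT rules. The chain
// names "KUBE-SVC-XXX"/"KUBE-SEP-XXX" are placeholders.
func writeServiceRules(svc svcPortName, clusterIP string, port int, endpoints []string) []string {
	if len(endpoints) == 0 {
		return []string{fmt.Sprintf(
			"-A KUBE-SERVICES -d %s/32 -p tcp -m comment --comment %q -m tcp --dport %d -j REJECT",
			clusterIP, string(svc)+" has no endpoints", port)}
	}
	rules := []string{fmt.Sprintf(
		"-A KUBE-SERVICES -d %s/32 -p tcp -m comment --comment %q -m tcp --dport %d -j KUBE-SVC-XXX",
		clusterIP, string(svc)+" cluster IP", port)}
	for _, ep := range endpoints {
		rules = append(rules, fmt.Sprintf(
			"-A KUBE-SEP-XXX -p tcp -m comment --comment %q -m tcp -j DNAT --to-destination %s",
			string(svc), ep))
	}
	return rules
}

func main() {
	// With a stale (empty) endpointsMap the REJECT rule is emitted even
	// though the endpoint 10.128.2.4:8080 exists in the API.
	for _, r := range writeServiceRules("z6/service-unsecure:http", "172.30.94.83", 27017, nil) {
		fmt.Println(r)
	}
	for _, r := range writeServiceRules("z6/service-unsecure:http", "172.30.94.83", 27017, []string{"10.128.2.4:8080"}) {
		fmt.Println(r)
	}
}
```

The point is that rule generation trusts the proxier's in-memory map; a stale map yields a REJECT rule even while `oc describe svc` lists an endpoint.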
However, this bug appears to be more of a race condition, because the second time I tried it, the correct rules were installed:

[quicklab@master-1 ~]$ oc delete project z6
project.project.openshift.io "z6" deleted
[quicklab@master-1 ~]$ oc new-project z6
[quicklab@master-1 ~]$ oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/caddy-docker.json -n z6
pod/caddy-docker created
[quicklab@master-1 ~]$ oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/unsecure/service_unsecure.json -n z6
service/service-unsecure created
[quicklab@master-1 ~]$ oc get svc
NAME               TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
service-unsecure   ClusterIP   172.30.146.113   <none>        27017/TCP   44s
[quicklab@master-1 ~]$ curl 172.30.146.113:27017
Hello-OpenShift-1 http-8080

SDN logs:

I1022 19:47:04.432998   20011 proxy.go:353] sdn proxy: add svc z5/service-unsecure: &{{ } {service-unsecure z5 /api/v1/namespaces/z5/services/service-unsecure 5c2d8221-149f-11eb-8303-fa163e3ff84c 256937 0 2020-10-22 19:47:04 +0000 UTC <nil> <nil> map[name:service-unsecure] map[] [] nil [] } {ClusterIP [{http TCP 27017 {0 8080 } 0}] map[name:caddy-docker] 172.30.177.231 [] None <nil> [] 0 false} {{[]}}}
I1022 19:47:04.433508   20011 service.go:319] Adding new service port "z5/service-unsecure:http" at 172.30.177.231:27017/TCP
I1022 19:47:04.529429   20011 proxier.go:369] userspace proxy: processing 0 service events
I1022 19:47:04.529467   20011 proxier.go:348] userspace syncProxyRules took 21.21411ms
I1022 19:47:04.537124   20011 roundrobin.go:276] LoadBalancerRR: Setting endpoints for z5/service-unsecure:http to [10.128.2.3:8080]
I1022 19:47:04.537299   20011 roundrobin.go:240] Delete endpoint 10.128.2.3:8080 for service "z5/service-unsecure:http"

[root@infra-2 origin]# iptables-save | grep z6
-A KUBE-SEP-33HRY4FFZEYPMEW2 -s 10.128.2.5/32 -m comment --comment "z6/service-unsecure:http" -j KUBE-MARK-MASQ
-A KUBE-SEP-33HRY4FFZEYPMEW2 -p tcp -m comment --comment "z6/service-unsecure:http" -m tcp -j DNAT --to-destination 10.128.2.5:8080
-A KUBE-SERVICES ! -s 10.128.0.0/14 -d 172.30.146.113/32 -p tcp -m comment --comment "z6/service-unsecure:http cluster IP" -m tcp --dport 27017 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 172.30.146.113/32 -p tcp -m comment --comment "z6/service-unsecure:http cluster IP" -m tcp --dport 27017 -j KUBE-SVC-DVQDYDX75OQJM2LY
-A KUBE-SVC-DVQDYDX75OQJM2LY -m comment --comment "z6/service-unsecure:http" -j KUBE-SEP-33HRY4FFZEYPMEW2

So, looking at the logs, it is visible that in the correct case the endpoint is first set and then deleted, whereas in the wrong scenario where the REJECT rule is installed, the "Delete endpoint" log line isn't visible. Need to look into the code more to find out why this is happening.

(In reply to Surya Seetharaman from comment #4)
> After about ~8 minutes the correct rules were created.
>
> Okay, so clearly the REJECT rule is added because kube-proxy is handling
> this as the case of a service without an endpoint/backing pod. Somewhere
> the signal for when an endpoint exists is fuzzy.
>
> However, this bug appears to be more of a race condition, because the
> second time I tried it, the correct rules were installed.

Not true, I take back my words. This is not a race condition but actually a timing issue. For instance, in some cases it takes ~8 minutes for the correct rules to be installed, while in other cases the REJECT rule gets converted to the correct rules within seconds. However, the REJECT rule is always added first, even if a pod for the service already exists.
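To put numbers on that window, a small probe can time how long the cluster IP refuses connections after the service is created. This is a throwaway diagnostic helper, not part of the bug's test files; the address is an example and should be replaced with the cluster IP printed by `oc get svc`:

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// Polls a cluster-IP:port until a TCP connect succeeds and reports how long
// the service stayed unreachable. Start it right after
// `oc create -f service_unsecure.json`, from a node or pod in the cluster.
func main() {
	const addr = "172.30.94.83:27017" // example; use the IP from `oc get svc`
	start := time.Now()
	for {
		conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
		if err == nil {
			conn.Close()
			fmt.Printf("service reachable after %v\n", time.Since(start))
			return
		}
		time.Sleep(time.Second)
	}
}
```

In the good cases above this should report seconds; in the bad ones, several minutes.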
But once the endpoint is set, the correct rules should get saved, and that often doesn't happen. /o\

> So, looking at the logs, it is visible that in the correct case the
> endpoint is first set and then deleted, whereas in the wrong scenario where
> the REJECT rule is installed, the "Delete endpoint" log line isn't visible.
> Need to look into the code more to find out why this is happening.

This is just the updateAffinityMap stuff that happens; not related.

OK, so this is interesting: on some sdn pods the correct rules appear immediately, while on others it's still the REJECT rule:

[quicklab@master-1 ~]$ oc exec -ti -n openshift-sdn sdn-2cd8n /bin/bash
[root@master-2 origin]# iptables-save | grep z5
-A KUBE-SERVICES -d 172.30.71.115/32 -p tcp -m comment --comment "z5/service-unsecure:http has no endpoints" -m tcp --dport 27017 -j REJECT --reject-with icmp-port-unreachable
[root@master-2 origin]# curl 172.30.71.115:27017
curl: (7) Failed connect to 172.30.71.115:27017; Connection refused

[quicklab@master-1 ~]$ oc exec -ti -n openshift-sdn sdn-7nwhr /bin/bash
[root@master-0 origin]# iptables-save | grep z5
-A KUBE-SEP-HWZGWL7RJUA5PWOL -s 10.128.2.7/32 -m comment --comment "z5/service-unsecure:http" -j KUBE-MARK-MASQ
-A KUBE-SEP-HWZGWL7RJUA5PWOL -p tcp -m comment --comment "z5/service-unsecure:http" -m tcp -j DNAT --to-destination 10.128.2.7:8080
-A KUBE-SERVICES ! -s 10.128.0.0/14 -d 172.30.71.115/32 -p tcp -m comment --comment "z5/service-unsecure:http cluster IP" -m tcp --dport 27017 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 172.30.71.115/32 -p tcp -m comment --comment "z5/service-unsecure:http cluster IP" -m tcp --dport 27017 -j KUBE-SVC-EVP4CM6LNBXKGCJM
-A KUBE-SVC-EVP4CM6LNBXKGCJM -m comment --comment "z5/service-unsecure:http" -j KUBE-SEP-HWZGWL7RJUA5PWOL

New update: on some sdn/ovs pods, the REJECT rule gets added because the endpointsMap is not populated correctly. There are three categories of behaviour; the sequences of events are as follows (a small race demo follows the three types).

1) Type I: it takes approximately 8 minutes for the rules to appear correctly; until then only the REJECT rule is installed.
I1023 15:46:33.656981  106483 proxy.go:353] sdn proxy: add svc z5/service-unsecure: &{{ } {service-unsecure z5 /api/v1/namespaces/z5/services/service-unsecure ed287840-1546-11eb-af2e-fa163e3ff84c 450857 0 2020-10-23 15:46:33 +0000 UTC <nil> <nil> map[name:service-unsecure] map[] [] nil [] } {ClusterIP [{http TCP 27017 {0 8080 } 0}] map[name:caddy-docker] 172.30.85.244 [] None <nil> [] 0 false} {{[]}}}
I1023 15:46:33.657998  106483 service.go:319] Adding new service port "z5/service-unsecure:http" at 172.30.85.244:27017/TCP
I1023 15:46:33.658252  106483 proxier.go:644] tssurya Syncing iptables rules
I1023 15:46:33.792704  106483 roundrobin.go:252] tssurya is inside buildPortsToEndpointsMap
I1023 15:46:33.792768  106483 roundrobin.go:270] tssurya is inside OnEndpointsAdd map[http:[{10.128.2.15 8080}]]
I1023 15:46:33.792978  106483 roundrobin.go:279] tssurya state is &{[] 0 { map[] 10800}} and exists is true
I1023 15:46:33.793105  106483 roundrobin.go:282] LoadBalancerRR: Setting endpoints for z5/service-unsecure:http to [10.128.2.15:8080]
I1023 15:46:33.793145  106483 roundrobin.go:243] Delete endpoint 10.128.2.15:8080 for service "z5/service-unsecure:http"
I1023 15:46:33.793178  106483 roundrobin.go:97] tssurya ttlSeconds 10800
I1023 15:46:33.793186  106483 roundrobin.go:106] tssurya returning from newServiceInternal &{[] 0 { map[] 10800}}
I1023 15:46:33.828067  106483 proxier.go:763] tssurya print endpointsMap: []
I1023 15:46:33.828162  106483 proxier.go:810] tssurya service z5/service-unsecure:http has no endpoints :(!

On the next run of syncProxyRules, the rules get updated:

I1023 15:58:17.244061  106483 proxier.go:763] tssurya print endpointsMap: [10.128.2.15:8080]
I1023 15:58:17.244078  106483 proxier.go:790] tssurya service z5/service-unsecure:http has endpoints

For approximately 8 minutes, connections are refused.

2) Type II: it takes a few milliseconds for the rules to appear correctly:

I1023 15:46:33.655399  100474 proxy.go:353] sdn proxy: add svc z5/service-unsecure: &{{ } {service-unsecure z5 /api/v1/namespaces/z5/services/service-unsecure ed287840-1546-11eb-af2e-fa163e3ff84c 450857 0 2020-10-23 15:46:33 +0000 UTC <nil> <nil> map[name:service-unsecure] map[] [] nil [] } {ClusterIP [{http TCP 27017 {0 8080 } 0}] map[name:caddy-docker] 172.30.85.244 [] None <nil> [] 0 false} {{[]}}}
I1023 15:46:33.656185  100474 service.go:319] Adding new service port "z5/service-unsecure:http" at 172.30.85.244:27017/TCP
I1023 15:46:33.706806  100474 proxier.go:763] tssurya print endpointsMap: []
I1023 15:46:33.706900  100474 proxier.go:810] tssurya service z5/service-unsecure:http has no endpoints :(!
I1023 15:46:33.797681  100474 roundrobin.go:252] tssurya is inside buildPortsToEndpointsMap
I1023 15:46:33.798271  100474 roundrobin.go:270] tssurya is inside OnEndpointsAdd map[http:[{10.128.2.15 8080}]]
I1023 15:46:33.798611  100474 roundrobin.go:279] tssurya state is &{[] 0 { map[] 10800}} and exists is true
I1023 15:46:33.799141  100474 roundrobin.go:282] LoadBalancerRR: Setting endpoints for z5/service-unsecure:http to [10.128.2.15:8080]
I1023 15:46:33.799348  100474 roundrobin.go:243] Delete endpoint 10.128.2.15:8080 for service "z5/service-unsecure:http"
I1023 15:46:33.799506  100474 roundrobin.go:97] tssurya ttlSeconds 10800
I1023 15:46:33.799623  100474 roundrobin.go:106] tssurya returning from newServiceInternal &{[] 0 { map[] 10800}}
I1023 15:46:33.861359  100474 proxier.go:369] userspace proxy: processing 0 service events
I1023 15:46:33.861403  100474 proxier.go:348] userspace syncProxyRules took 138.139971ms
I1023 15:46:33.861520  100474 proxier.go:644] tssurya Syncing iptables rules
I1023 15:46:33.990342  100474 proxier.go:763] tssurya print endpointsMap: [10.128.2.15:8080]
I1023 15:46:33.990472  100474 proxier.go:790] tssurya service z5/service-unsecure:http has endpoints

So even if the REJECT rule is added first, it gets replaced by the correct rules almost immediately.

3) Type III: the correct rules get added right away:

I1023 15:46:33.894978   99723 roundrobin.go:252] tssurya is inside buildPortsToEndpointsMap
I1023 15:46:33.895036   99723 roundrobin.go:270] tssurya is inside OnEndpointsAdd map[http:[{10.128.2.15 8080}]]
I1023 15:46:33.895372   99723 roundrobin.go:279] tssurya state is &{[] 0 { map[] 10800}} and exists is true
I1023 15:46:33.895488   99723 roundrobin.go:282] LoadBalancerRR: Setting endpoints for z5/service-unsecure:http to [10.128.2.15:8080]
I1023 15:46:33.895532   99723 roundrobin.go:243] Delete endpoint 10.128.2.15:8080 for service "z5/service-unsecure:http"
I1023 15:46:33.895568   99723 roundrobin.go:97] tssurya ttlSeconds 10800
I1023 15:46:33.895609   99723 roundrobin.go:106] tssurya returning from newServiceInternal &{[] 0 { map[] 10800}}
I1023 15:46:33.895902   99723 proxier.go:644] tssurya Syncing iptables rules
I1023 15:46:33.896169   99723 proxy.go:353] sdn proxy: add svc z5/service-unsecure: &{{ } {service-unsecure z5 /api/v1/namespaces/z5/services/service-unsecure ed287840-1546-11eb-af2e-fa163e3ff84c 450857 0 2020-10-23 15:46:33 +0000 UTC <nil> <nil> map[name:service-unsecure] map[] [] nil [] } {ClusterIP [{http TCP 27017 {0 8080 } 0}] map[name:caddy-docker] 172.30.85.244 [] None <nil> [] 0 false} {{[]}}}
I1023 15:46:34.036885   99723 proxier.go:369] userspace proxy: processing 0 service events
I1023 15:46:34.036910   99723 proxier.go:348] userspace syncProxyRules took 17.657405ms
I1023 15:46:35.811948   99723 service.go:319] Adding new service port "z5/service-unsecure:http" at 172.30.85.244:27017/TCP
I1023 15:46:35.812025   99723 proxier.go:644] tssurya Syncing iptables rules
I1023 15:46:35.847119   99723 proxier.go:763] tssurya print endpointsMap: [10.128.2.15:8080]
I1023 15:46:35.847143   99723 proxier.go:790] tssurya service z5/service-unsecure:http has endpoints
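These three types look like three different interleavings of OnServiceAdd, OnEndpointsAdd, and the iptables sync on each node, gated by a dirty flag that is written without synchronization (detailed in the next comment). A toy demonstration of that hazard, which `go run -race` will typically flag as a data race (all names here are illustrative, not the proxier's actual fields):

```go
package main

import (
	"fmt"
	"time"
)

// changesPending stands in for the proxier's dirty flag. It is written by one
// goroutine and read/cleared by another with no lock, so whether the sync
// sees the endpoint update depends entirely on scheduling; the race detector
// reports this as a data race.
var changesPending bool

func main() {
	// Endpoint event handler: mark work as pending.
	go func() {
		changesPending = true
	}()

	// Sync loop: skip the expensive rule rewrite if nothing is pending.
	go func() {
		if !changesPending {
			fmt.Println("sync skipped: REJECT rule stays installed")
			return
		}
		changesPending = false
		fmt.Println("sync ran: correct rules installed")
	}()

	time.Sleep(100 * time.Millisecond)
}
```

Depending on which goroutine runs first, the output flips between the two lines, which mirrors the per-node Type I/II/III behaviour.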
All right, so I think I found the root cause of this race condition: it was introduced in September through this PR (https://github.com/openshift/origin/commit/0c9e44a08d12c115c72900af6c7513a7d178ea2b) and exists only in the newer versions of 3.11. That's why I couldn't reproduce this bug in older 3.11 versions or in 4.x.

The variable proxier.changesPending (https://github.com/openshift/origin/blob/56691f4e623af6d9f7c659cf41d71a2a28fb0ed6/vendor/k8s.io/kubernetes/pkg/proxy/iptables/proxier.go#L185) that was introduced in that patch is not locked properly, and hence it does not reliably get set to true at this line when an EndpointsUpdate is processed: https://github.com/openshift/origin/blob/56691f4e623af6d9f7c659cf41d71a2a28fb0ed6/vendor/k8s.io/kubernetes/pkg/proxy/iptables/proxier.go#L505. Because of this, on entering the SyncProxyRules function we return without running the sync, since both the force and changesPending variables are false (https://github.com/openshift/origin/blob/56691f4e623af6d9f7c659cf41d71a2a28fb0ed6/vendor/k8s.io/kubernetes/pkg/proxy/iptables/proxier.go#L618-L624). So until another OnEndpointsUpdate arrives in which changesPending does get set to true, the sync won't run and we are blocked: the REJECT rule is not replaced, and we can stay stuck for up to ~10 minutes. The fix should be easy: take proxier.mu.Lock() before changing the variable (sketched at the end of this report).

Verified this bug on v3.11.346.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 3.11.374 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0079
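To illustrate the shape of the fix described above (take proxier.mu.Lock() before flipping changesPending), here is a pared-down sketch; the types are simplified stand-ins, not the vendored kube-proxy code:

```go
package main

import (
	"fmt"
	"sync"
)

// proxier is a pared-down stand-in for the iptables proxier: mu already
// guards its service/endpoints maps, and after the fix it guards
// changesPending as well.
type proxier struct {
	mu             sync.Mutex
	changesPending bool
}

// onEndpointsUpdate records that the iptables rules are stale. Taking mu
// here is the essence of the fix: without it, this write races with
// syncProxyRules and can effectively be missed.
func (p *proxier) onEndpointsUpdate() {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.changesPending = true
}

// syncProxyRules skips the expensive iptables rewrite when nothing changed
// and the sync was not forced, mirroring the early return linked above.
func (p *proxier) syncProxyRules(force bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if !force && !p.changesPending {
		fmt.Println("no changes pending, skipping sync")
		return
	}
	p.changesPending = false
	fmt.Println("rewriting iptables rules")
}

func main() {
	p := &proxier{}
	p.syncProxyRules(false) // skipped: nothing pending yet
	p.onEndpointsUpdate()   // endpoint appears for the service
	p.syncProxyRules(false) // now runs and replaces the REJECT rule
}
```

With the flag under the same mutex as the rest of the proxier state, an endpoint update can no longer be lost between the handler and the sync path.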