Description of problem:
Create a route with multiple backends, then set weights for the backends. When the route is accessed, the traffic distribution does not exactly match the weights.

Version-Release number of selected component (if applicable):
dev-preview-stg
atomic-openshift-3.3.0.33-1.git.0.8601ee7.el7.x86_64
docker-1.10.3-46.el7.14.x86_64
kernel-3.10.0-327.36.1.el7.x86_64

How reproducible:
Sometimes

Steps to Reproduce:
1. Create pods/services:
   PodA & serviceA
   # oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/abrouting/caddy-docker.json
   # oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/abrouting/unseucre/service_unsecure.json
   PodB & serviceB
   # oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/abrouting/caddy-docker-2.json
   # oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/abrouting/unseucre/service_unsecure-2.json
   PodC & serviceC
   # oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/abrouting/caddy-docker-3.json
   # oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/abrouting/unseucre/service_unsecure-3.json
3. Create unsecure route
   # oc expose svc service-unsecure --name=unsecure-route
4. Set the route to roundrobin mode
   # oc annotate route unsecure-route --overwrite haproxy.router.openshift.io/balance=roundrobin
5. Set backend weights for the route
   # oc set route-backends unsecure-route service-unsecure=2 service-unsecure-2=3 service-unsecure-3=5
   # oc set route-backends unsecure-route
   NAME                   KIND     TO                  WEIGHT
   routes/unsecure-route  Service  service-unsecure    2 (20%)
   routes/unsecure-route  Service  service-unsecure-2  3 (30%)
   routes/unsecure-route  Service  service-unsecure-3  5 (50%)
6. Access the route 100 times
   # for i in {1..100}; do curl unsecure-route-d1.b795.dev-preview-stg.openshiftapps.com >> a.log; done
   # cat a.log | grep 'Hello-OpenShift-3' | wc -l
   52
   # cat a.log | grep 'Hello-OpenShift-2' | wc -l
   31
   # cat a.log | grep 'Hello-OpenShift-1' | wc -l
   17
7. Access the route again
   # for i in {1..100}; do curl unsecure-route-d1.b795.dev-preview-stg.openshiftapps.com >> b.log; done
   # cat b.log | grep 'Hello-OpenShift-1' | wc -l
   17
   # cat b.log | grep 'Hello-OpenShift-2' | wc -l
   32
   # cat b.log | grep 'Hello-OpenShift-3' | wc -l
   51

Actual results:
Refer to step 6 and step 7.

Expected results:
Routing to the backends should match the weights we set.

Additional info:
Sometimes the issue can be reproduced with two backends too, but it reproduces more often with more than two backends.
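The per-backend counts in steps 6 and 7 can be gathered in one pass instead of three separate grep pipelines. The following is a minimal sketch, assuming every response contains a Hello-OpenShift-N marker as in the logs above; the function name tally_backends is hypothetical and not part of oc or the router.

```shell
# tally_backends LOGFILE
# Prints "<marker> <count> <percent>" for every Hello-OpenShift-N marker
# found in LOGFILE. Hypothetical helper for illustration only.
tally_backends() {
    grep -o 'Hello-OpenShift-[0-9]*' "$1" | sort | uniq -c |
        awk '{ count[$2] = $1; total += $1 }
             END {
                 # One output line per backend marker, with its share of all hits.
                 for (b in count)
                     printf "%s %d %.1f%%\n", b, count[b], 100 * count[b] / total
             }' |
        sort
}
```

Usage would be `tally_backends a.log`, which prints one line per backend with its hit count and its share of all responses in the log.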
Those are awfully close to the weights both times... what's the problem?
I just tried the same test steps in my env and got perfect results, as expected. Note that a.log is appended to on every run, so the counts below are cumulative:

[root@dhcp-41-178 ~]# for i in {1..100}; do curl --resolve unsecure-route-https.router.default.svc.cluster.local:80:10.18.41.181 http://unsecure-route-https.router.default.svc.cluster.local>> a.log; done
[root@dhcp-41-178 ~]# cat a.log | grep 'Hello-OpenShift-1' | wc -l
20
[root@dhcp-41-178 ~]# cat a.log | grep 'Hello-OpenShift-2' | wc -l
30
[root@dhcp-41-178 ~]# cat a.log | grep 'Hello-OpenShift-3' | wc -l
50
[root@dhcp-41-178 ~]# for i in {1..100}; do curl --resolve unsecure-route-https.router.default.svc.cluster.local:80:10.18.41.181 http://unsecure-route-https.router.default.svc.cluster.local>> a.log; done
[root@dhcp-41-178 ~]# cat a.log | grep 'Hello-OpenShift-1' | wc -l
40
[root@dhcp-41-178 ~]# cat a.log | grep 'Hello-OpenShift-2' | wc -l
60
[root@dhcp-41-178 ~]# cat a.log | grep 'Hello-OpenShift-3' | wc -l
100
[root@dhcp-41-178 ~]# for i in {1..100}; do curl --resolve unsecure-route-https.router.default.svc.cluster.local:80:10.18.41.181 http://unsecure-route-https.router.default.svc.cluster.local>> a.log; done
[root@dhcp-41-178 ~]# cat a.log | grep 'Hello-OpenShift-1' | wc -l
60
[root@dhcp-41-178 ~]# cat a.log | grep 'Hello-OpenShift-2' | wc -l
90
[root@dhcp-41-178 ~]# cat a.log | grep 'Hello-OpenShift-3' | wc -l
150
[root@dhcp-41-178 ~]#
Thanks Weibin. I don't see a bug here.
I still hit the issue in the latest dev-preview-stg env (3.3.1.3). It is easier to reproduce on the stg env than on the ose env.

[root@yanshost jsonfile]# oc set route-backends unsecure-route
NAME                   KIND     TO                  WEIGHT
routes/unsecure-route  Service  service-unsecure    20 (20%)
routes/unsecure-route  Service  service-unsecure-2  80 (80%)
[root@yanshost jsonfile]# for i in {1..10}; do curl http://unsecure-route-d1.b795.dev-preview-stg.openshiftapps.com ; done
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-1 http-8080
Hello-OpenShift-2 http-8080
[root@yanshost jsonfile]# for i in {1..10}; do curl http://unsecure-route-d1.b795.dev-preview-stg.openshiftapps.com ; done
Hello-OpenShift-1 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-1 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-1 http-8080
[root@yanshost jsonfile]# for i in {1..10}; do curl http://unsecure-route-d1.b795.dev-preview-stg.openshiftapps.com ; done
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-1 http-8080
Hello-OpenShift-2 http-8080
[root@yanshost jsonfile]# for i in {1..10}; do curl http://unsecure-route-d1.b795.dev-preview-stg.openshiftapps.com ; done
Hello-OpenShift-1 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-1 http-8080
Hello-OpenShift-2 http-8080
Hello-OpenShift-1 http-8080
For two backends, if I send 100 requests, I sometimes get the result below:

[root@yanshost jsonfile]# oc set route-backends unsecure-route
NAME                   KIND     TO                  WEIGHT
routes/unsecure-route  Service  service-unsecure    20 (20%)
routes/unsecure-route  Service  service-unsecure-2  80 (80%)
[root@yanshost jsonfile]# cat a.log | grep 'Hello-OpenShift-1' | wc -l
18
[root@yanshost jsonfile]# cat a.log | grep 'Hello-OpenShift-2' | wc -l
82

For multiple backends:

[root@yanshost jsonfile]# oc set route-backends unsecure-route
NAME                   KIND     TO                  WEIGHT
routes/unsecure-route  Service  service-unsecure    2 (20%)
routes/unsecure-route  Service  service-unsecure-2  3 (30%)
routes/unsecure-route  Service  service-unsecure-3  5 (50%)
[root@yanshost jsonfile]# cat d.log | grep 'Hello-OpenShift-3' | wc -l
52
[root@yanshost jsonfile]# cat d.log | grep 'Hello-OpenShift-2' | wc -l
30
[root@yanshost jsonfile]# cat d.log | grep 'Hello-OpenShift-1' | wc -l
18

btw, the distribution is close to the weights we set but not exact, and the bug is not 100% reproducible. Feel free to contact me if you can't reproduce it. Thanks
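To put a number on "close but not exact", the observed counts can be compared with the configured weights in percentage points. This is a small sketch only; weight_deviation is a hypothetical helper, and the backend-N labels simply follow the order of the arguments.

```shell
# weight_deviation "WEIGHTS" "COUNTS"
# Both arguments are space-separated lists in the same backend order.
# Prints expected share, observed share, and the difference in percentage points.
weight_deviation() {
    awk -v weights="$1" -v counts="$2" 'BEGIN {
        n = split(weights, w, " ")
        m = split(counts, c, " ")
        for (i = 1; i <= n; i++) wsum += w[i]
        for (i = 1; i <= m; i++) csum += c[i]
        if (n != m || wsum == 0 || csum == 0) {
            print "need matching, non-empty weights and counts" > "/dev/stderr"
            exit 1
        }
        for (i = 1; i <= n; i++) {
            expected = 100 * w[i] / wsum
            observed = 100 * c[i] / csum
            printf "backend-%d expected=%.1f%% observed=%.1f%% diff=%+.1f\n",
                   i, expected, observed, observed - expected
        }
    }'
}
```

For the three-backend run above, `weight_deviation "2 3 5" "18 30 52"` reports differences of -2.0, +0.0 and +2.0 percentage points.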
Yan: Can you pull the stats from the router to see what they show for the proportions? I also contend that the weights won't be perfect, but they should be within a few percent, as we see. But I'll look into that after we get the stats.
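One way to read the proportions from the router is HAProxy's CSV statistics (the output of the `show stat` command on the stats socket, or the CSV export of the stats page). The sketch below only parses such CSV from stdin. How the CSV is obtained, and what the backend is called inside the router, are assumptions to check against your deployment. stats_share is a hypothetical helper name.

```shell
# stats_share BACKEND_NAME < haproxy-stats.csv
# Reads HAProxy stats CSV and prints, for each server in the given backend,
# its total session count (stot) and its share of the backend's sessions.
stats_share() {
    awk -F, -v px="$1" '
        # The header line starts with "# "; map column names to positions.
        /^# / { sub(/^# /, ""); for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Skip the FRONTEND/BACKEND aggregate rows, keep per-server rows.
        $col["pxname"] == px && $col["svname"] != "FRONTEND" && $col["svname"] != "BACKEND" {
            sessions[$col["svname"]] = $col["stot"]
            total += $col["stot"]
        }
        END {
            for (s in sessions)
                printf "%s %d %.1f%%\n", s, sessions[s], total ? 100 * sessions[s] / total : 0
        }
    ' | sort
}
```

Running this against the stats from each router separately would show whether every router individually honors the weights.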
Doesn't the stage env have multiple routers? That may explain the slight discrepancies. Checking the stats would be the right way.
The logs from this particular router do suggest an exact 20%/80% split. It is possible that multiple routers cause a slight variation (the weighted roundrobin resets its counters on a reload). I would consider this bug NOT_A_BUG. To be sure, please include stats logs from all the routers.
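To illustrate how a counter reset can skew a sample, here is a sketch of a generic smooth weighted round-robin. It is not HAProxy's actual implementation, and the RESET_EVERY parameter is a hypothetical, exaggerated stand-in for router reloads. The function name wrr_sample is also an assumption.

```shell
# wrr_sample "WEIGHTS" REQUESTS [RESET_EVERY]
# Simulates REQUESTS picks of a generic smooth weighted round-robin and prints
# the hits per backend. If RESET_EVERY > 0, the scheduler state is cleared
# after every RESET_EVERY requests (a stand-in for a reload).
wrr_sample() {
    awk -v weights="$1" -v requests="$2" -v reset="${3:-0}" 'BEGIN {
        n = split(weights, w, " ")
        for (i = 1; i <= n; i++) total += w[i]
        for (r = 1; r <= requests; r++) {
            # Raise every counter by its weight and pick the largest one.
            best = 1
            for (i = 1; i <= n; i++) {
                cur[i] += w[i]
                if (cur[i] > cur[best]) best = i
            }
            cur[best] -= total
            hits[best]++
            # Simulated reload: forget all scheduler state.
            if (reset > 0 && r % reset == 0)
                for (i = 1; i <= n; i++) cur[i] = 0
        }
        for (i = 1; i <= n; i++) printf "%s%d", (i > 1 ? " " : ""), hits[i]
        print ""
    }'
}
```

Without resets, `wrr_sample "2 3 5" 100` prints `20 30 50`. With a reset after every 7 requests, `wrr_sample "2 3 5" 100 7` prints `14 29 57`, because each restart replays the start of the cycle, which favors the heaviest backend in this scheme.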