Bug 1378392

Summary: Cannot wake up the service when using F5 router
Product: OpenShift Container Platform Reporter: Hongan Li <hongli>
Component: Networking Assignee: Ben Bennett <bbennett>
Networking sub component: router QA Contact: zhaozhanqi <zzhao>
Status: CLOSED NOTABUG Docs Contact:
Severity: medium    
Priority: unspecified CC: aos-bugs, bmeng
Version: 3.3.0   
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-10-28 16:49:16 UTC Type: Bug

Description Hongan Li 2016-09-22 10:33:20 UTC
Description of problem:
Cannot wake up the service when using F5 router

Version-Release number of selected component (if applicable):
oc v3.3.0.32
kubernetes v1.3.0+52492b4


How reproducible:
always

Steps to Reproduce:
1. Create F5 router
2. Create test-rc
# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/list_for_pods.json
3. # oc expose svc service-unsecure
4. Make sure the route can be accessed via F5:
# curl --resolve service-unsecure-u1p1.router.default.svc.cluster.local:80:52.6.227.66 http://service-unsecure-u1p1.router.default.svc.cluster.local
5. # oc idle service-unsecure
6. Access the route again via F5.
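
Before step 6 it can help to confirm that step 5 really idled the service. The following is a minimal sketch, assuming that "oc idle" records an idling.alpha.openshift.io/idled-at annotation on the service's endpoints; the annotation name and the is_idled helper are assumptions and were not part of the original reproduction.

```shell
#!/bin/sh
# Sketch: check whether a service is currently idled.
# Assumption: "oc idle" records the annotation below on the service's endpoints.
NAMESPACE="${NAMESPACE:-u1p1}"
SERVICE="${SERVICE:-service-unsecure}"
IDLE_ANNOTATION='idling.alpha.openshift.io/idled-at'

# Hypothetical helper: succeeds (exit status 0) if the endpoints object
# mentions the idling annotation.
is_idled() {
    oc get endpoints "$2" -n "$1" -o yaml | grep -qF "$IDLE_ANNOTATION"
}

if command -v oc >/dev/null 2>&1; then
    if is_idled "$NAMESPACE" "$SERVICE"; then
        echo "$SERVICE is idled"
    else
        echo "$SERVICE is not idled"
    fi
else
    echo "oc not found; run this where the cluster is reachable" >&2
fi
```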

Actual results:
The service is not woken up (unidled) when a request arrives on the route.

Expected results:
The service should be woken up (unidled) when a request arrives on the route.

Additional info:
# oc logs f5router-1-mmhhp
E0922 06:20:57.307871       1 controller.go:123] Encountered an error on DELETE request to URL https://10.3.88.53/mgmt/tm/ltm/pool/openshift_u1p1_service-unsecure: HTTP code: 400; error from F5: 01070265:3: The Pool (/Common/openshift_u1p1_service-unsecure) cannot be deleted because it is in use by a policy action (/Common/openshift_insecure_routes openshift_route_u1p1_service-unsecure 0).
E0922 06:21:22.482973       1 controller.go:123] Encountered an error on DELETE request to URL https://10.3.88.53/mgmt/tm/ltm/pool/openshift_u1p1_service-unsecure: HTTP code: 400; error from F5: 01070265:3: The Pool (/Common/openshift_u1p1_service-unsecure) cannot be deleted because it is in use by a policy action (/Common/openshift_insecure_routes openshift_route_u1p1_service-unsecure 0).

Comment 1 Hongan Li 2016-09-23 07:41:21 UTC
Workaround to unidle the service:

Since "curl service:port" can also unidle the service, you can look up the service's cluster IP and port and connect to it directly, as below:

# oc get svc -n u1p1
NAME               CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
service-secure     172.30.239.190   <none>        27443/TCP   21h
service-unsecure   172.30.69.154    <none>        27017/TCP   21h

# curl 172.30.69.154:27017
Hello-OpenShift-1 http-8080

After that, accessing the route via F5 works.
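
The two commands above can be combined so the IP and port do not have to be copied by hand. This is only a sketch of the same workaround: the service_addr helper is hypothetical, it assumes the service's first port is the one to hit, and the 60-second timeout is an arbitrary choice to give the pods time to start.

```shell
#!/bin/sh
# Sketch: wake an idled service by connecting to its cluster IP.
# Run from a host that can reach the service network (e.g. a cluster node).
NAMESPACE="${NAMESPACE:-u1p1}"
SERVICE="${SERVICE:-service-unsecure}"

# Hypothetical helper: print "<clusterIP>:<port>" for the service's first port.
service_addr() {
    oc get svc "$2" -n "$1" -o jsonpath='{.spec.clusterIP}:{.spec.ports[0].port}'
}

if command -v oc >/dev/null 2>&1; then
    addr="$(service_addr "$NAMESPACE" "$SERVICE")"
    # The first connection triggers unidling; allow time for pods to start.
    curl --max-time 60 "http://$addr"
else
    echo "oc not found; run this where the cluster is reachable" >&2
fi
```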

Comment 2 Ben Bennett 2016-09-23 14:07:37 UTC
hongli: Thanks.  You can also scale it up manually.

Unidling is not supported with the F5 router at the moment. I'll see if we can add it easily, but we missed it in the docs.
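
For reference, scaling up manually might look like the sketch below. It assumes the replication controller is named test-rc (taken from step 2 of the description) and lives in the u1p1 project; the replica count of 1 and the scale_up helper are illustrative choices, not values from this report.

```shell
#!/bin/sh
# Sketch: bring an idled workload back by scaling its replication controller.
# Assumptions: the RC is named test-rc and one replica is enough.
NAMESPACE="${NAMESPACE:-u1p1}"
RC_NAME="${RC_NAME:-test-rc}"
REPLICAS="${REPLICAS:-1}"

# Hypothetical helper: scale the given RC to the given replica count.
scale_up() {
    oc scale rc "$2" --replicas="$3" -n "$1"
}

if command -v oc >/dev/null 2>&1; then
    scale_up "$NAMESPACE" "$RC_NAME" "$REPLICAS"
else
    echo "oc not found; run this where the cluster is reachable" >&2
fi
```

Once the pods are running again, the service has endpoints and the route should be reachable through the F5 as before.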

Comment 3 Ben Bennett 2016-09-23 14:11:22 UTC
To add more information, we had initially planned to implement unidling only in haproxy.  The problem with the F5 unidling is that the F5 would need to access the service IP address, and that may be tricky with the ramp node.  We are working on redoing the way the F5 talks to our network, to get rid of the ramp nodes.  We will add the unidling support to the F5 solution soon, but it should not block the 3.3 release since it was not intended to work there.