Created attachment 1210124 [details]
events and routes

Description of problem:
A user reported HTTP 503 when accessing their app on the dev-preview public cluster. The app appears to be working otherwise (no errors while deploying, no errors in the logs):
http://windward-php-windward.44fs.preview.openshiftapps.com/

The user's events and route descriptions are attached.

I was able to induce this by deploying the cakephp example app from source (https://github.com/openshift/cakephp-ex.git):
http://cakephp-pub.44fs.preview.openshiftapps.com

The app itself seems to be working just fine: when I rsh into the pod, the content is delivered. When accessed through the route, it returns 503.

Aside from repeated warnings that the service is having trouble creating a load balancer:
-----
18s 4h 116 cakephp Service Warning CreatingLoadBalancerFailed {service-controller } (events with common reason combined)
-----
...only the following warning was presented for the deploy pod:
-----
1h 1h 1 cakephp-2-deploy Pod Warning FailedSync {kubelet ip-172-31-8-217.ec2.internal} Error syncing pod, skipping: failed to "TeardownNetwork" for "cakephp-2-deploy_pub" with TeardownNetworkError: "Failed to teardown network for pod \"61c98211-9138-11e6-ba92-0e63b9c1c48f\" using network plugins \"redhat/openshift-ovs-multitenant\": Error running network teardown script: Could not find IP address for container a4b042079b8dfb79c860881e7ca78c1a077935ce80426d02810264942d6a727d"
-----
Here are all recent events:
-----
$ oc get events | grep cakephp
1h 1h 1 cakephp-1-93ox8 Pod spec.containers{cakephp} Normal Killing {kubelet ip-172-31-3-39.ec2.internal} Killing container with docker id 042e3178a31f: Need to kill pod.
1h 1h 1 cakephp-1 ReplicationController Normal SuccessfulDelete {replication-controller } Deleted pod: cakephp-1-93ox8
1h 1h 1 cakephp-2-845mq Pod Normal Scheduled {default-scheduler } Successfully assigned cakephp-2-845mq to ip-172-31-10-174.ec2.internal
1h 1h 1 cakephp-2-845mq Pod spec.containers{cakephp} Normal Pulling {kubelet ip-172-31-10-174.ec2.internal} pulling image "172.30.47.227:5000/pub/cakephp@sha256:c9da248b4bbe412e1a5be51fc708a80b8046d953172a567bb422240e71a4477a"
1h 1h 1 cakephp-2-845mq Pod spec.containers{cakephp} Normal Pulled {kubelet ip-172-31-10-174.ec2.internal} Successfully pulled image "172.30.47.227:5000/pub/cakephp@sha256:c9da248b4bbe412e1a5be51fc708a80b8046d953172a567bb422240e71a4477a"
1h 1h 1 cakephp-2-845mq Pod spec.containers{cakephp} Normal Created {kubelet ip-172-31-10-174.ec2.internal} Created container with docker id 2011f6b0b2ac
1h 1h 1 cakephp-2-845mq Pod spec.containers{cakephp} Normal Started {kubelet ip-172-31-10-174.ec2.internal} Started container with docker id 2011f6b0b2ac
40m 40m 1 cakephp-2-845mq Pod spec.containers{cakephp} Normal Killing {kubelet ip-172-31-10-174.ec2.internal} Killing container with docker id 2011f6b0b2ac: Need to kill pod.
40m 40m 1 cakephp-2-crktz Pod Normal Scheduled {default-scheduler } Successfully assigned cakephp-2-crktz to ip-172-31-8-230.ec2.internal
40m 40m 1 cakephp-2-crktz Pod spec.containers{cakephp} Normal Pulling {kubelet ip-172-31-8-230.ec2.internal} pulling image "172.30.47.227:5000/pub/cakephp@sha256:c9da248b4bbe412e1a5be51fc708a80b8046d953172a567bb422240e71a4477a"
40m 40m 1 cakephp-2-crktz Pod spec.containers{cakephp} Normal Pulled {kubelet ip-172-31-8-230.ec2.internal} Successfully pulled image "172.30.47.227:5000/pub/cakephp@sha256:c9da248b4bbe412e1a5be51fc708a80b8046d953172a567bb422240e71a4477a"
40m 40m 1 cakephp-2-crktz Pod spec.containers{cakephp} Normal Created {kubelet ip-172-31-8-230.ec2.internal} Created container with docker id 69bd880b993e
40m 40m 1 cakephp-2-crktz Pod spec.containers{cakephp} Normal Started {kubelet ip-172-31-8-230.ec2.internal} Started container with docker id 69bd880b993e
1h 1h 1 cakephp-2-deploy Pod Normal Scheduled {default-scheduler } Successfully assigned cakephp-2-deploy to ip-172-31-8-217.ec2.internal
1h 1h 1 cakephp-2-deploy Pod spec.containers{deployment} Normal Pulling {kubelet ip-172-31-8-217.ec2.internal} pulling image "registry.ops.openshift.com/openshift3/ose-deployer:v3.3.0.33"
1h 1h 1 cakephp-2-deploy Pod spec.containers{deployment} Normal Pulled {kubelet ip-172-31-8-217.ec2.internal} Successfully pulled image "registry.ops.openshift.com/openshift3/ose-deployer:v3.3.0.33"
1h 1h 1 cakephp-2-deploy Pod spec.containers{deployment} Normal Created {kubelet ip-172-31-8-217.ec2.internal} Created container with docker id 93142e66fe31
1h 1h 1 cakephp-2-deploy Pod spec.containers{deployment} Normal Started {kubelet ip-172-31-8-217.ec2.internal} Started container with docker id 93142e66fe31
1h 1h 1 cakephp-2-deploy Pod spec.containers{deployment} Normal Killing {kubelet ip-172-31-8-217.ec2.internal} Killing container with docker id 93142e66fe31: Need to kill pod.
1h 1h 1 cakephp-2-deploy Pod Warning FailedSync {kubelet ip-172-31-8-217.ec2.internal} Error syncing pod, skipping: failed to "TeardownNetwork" for "cakephp-2-deploy_pub" with TeardownNetworkError: "Failed to teardown network for pod \"61c98211-9138-11e6-ba92-0e63b9c1c48f\" using network plugins \"redhat/openshift-ovs-multitenant\": Error running network teardown script: Could not find IP address for container a4b042079b8dfb79c860881e7ca78c1a077935ce80426d02810264942d6a727d"
1h 1h 1 cakephp-2 ReplicationController Normal SuccessfulCreate {replication-controller } Created pod: cakephp-2-845mq
40m 40m 1 cakephp-2 ReplicationController Normal SuccessfulCreate {replication-controller } Created pod: cakephp-2-crktz
18s 4h 116 cakephp Service Warning CreatingLoadBalancerFailed {service-controller } (events with common reason combined)
1h 1h 1 cakephp DeploymentConfig Normal DeploymentCreated {deploymentconfig-controller } Created new deployment "cakephp-2" for version 2
-----

Version-Release number of selected component (if applicable):
OpenShift Master: v3.3.0.33
Kubernetes Master: v1.3.0+52492b4

How reproducible:
always

Steps to Reproduce:
1. Create pod, service and route
2. Access the service via the HTTP route

Actual results:
HTTP 503

Expected results:
content delivery from the service

Additional info:
seems similar to bug 1372619
Can you get more logs for the service-controller? That error seems like the culprit.

I would also be interested in seeing the output from:
oc get svc cakephp -o yaml

And:
oc get ep cakephp

Thanks
Could you please provide steps on how to get more logs for the service-controller? The CreatingLoadBalancerFailed warning occurred many times, so the messages are combined together:

$ oc get events
LASTSEEN FIRSTSEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
26s 5h 182 cakephp Service Warning CreatingLoadBalancerFailed {service-controller } (events with common reason combined)
5s 5h 135 nodejsex Service Warning CreatingLoadBalancerFailed {service-controller } (events with common reason combined)

$ oc get svc cakephp -o yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    openshift.io/generated-by: OpenShiftWebConsole
  creationTimestamp: 2016-10-13T07:41:35Z
  labels:
    app: cakephp
  name: cakephp
  namespace: pub
  resourceVersion: "233880834"
  selfLink: /api/v1/namespaces/pub/services/cakephp
  uid: 77b64a42-9118-11e6-ae04-0ebeb1070c7f
spec:
  clusterIP: 172.30.23.123
  portalIP: 172.30.23.123
  ports:
  - name: 8080-tcp
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    deploymentconfig: cakephp
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

$ oc get ep cakephp
NAME ENDPOINTS AGE
cakephp 10.1.60.2:8080 6h
Hello Ben, I reported the issue originally. Here are my findings so far:

Going directly to the pod works, going through the service works as well, going through the route is broken. You can check my namespace on Dev Preview (tschloss) to see the problem; it's the EAP application.

$ oc get route
NAME HOST/PORT PATH SERVICE TERMINATION LABELS
eap-app eap-app-tschloss.44fs.preview.openshiftapps.com eap-app:http app=eap-app,application=eap-app,template=eap70-mysql-persistent-s2i,xpaas=1.3.1

$ oc get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
database-mysql 172.30.166.247 <none> 3306/TCP 2h
eap-app 172.30.84.68 <none> 8080/TCP 2h

$ oc get ep
NAME ENDPOINTS AGE
database-mysql 10.1.60.6:3306 2h
eap-app 10.1.87.2:8080 2h

$ oc get svc eap-app -o yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    description: The web server's http and https ports.
    openshift.io/generated-by: OpenShiftNewApp
  creationTimestamp: 2016-10-13T12:18:19Z
  labels:
    app: eap-app
    application: eap-app
    template: eap70-mysql-persistent-s2i
    xpaas: 1.3.1
  name: eap-app
  namespace: tschloss
  resourceVersion: "234438053"
  selfLink: /api/v1/namespaces/tschloss/services/eap-app
  uid: 20b5b31f-913f-11e6-ad3a-0e3d364e19a5
spec:
  clusterIP: 172.30.84.68
  portalIP: 172.30.84.68
  ports:
  - name: http
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    deploymentConfig: eap-app
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

$ oc rsh database-mysql-1-ec8ue
sh-4.2$ curl -s -o /dev/null -D - 10.1.87.2:8080
HTTP/1.1 200 OK
Connection: keep-alive
X-Powered-By: Undertow/1
Server: JBoss-EAP/7
Content-Type: text/html;charset=UTF-8
Content-Length: 2005
Date: Thu, 13 Oct 2016 15:08:10 GMT

sh-4.2$ curl -s -o /dev/null -D - 172.30.84.68:8080
HTTP/1.1 200 OK
Connection: keep-alive
X-Powered-By: Undertow/1
Server: JBoss-EAP/7
Content-Type: text/html;charset=UTF-8
Content-Length: 2005
Date: Thu, 13 Oct 2016 15:08:32 GMT

sh-4.2$ curl -s -o /dev/null -D - eap-app-tschloss.44fs.preview.openshiftapps.com
HTTP/1.0 503 Service Unavailable
Cache-Control: no-cache
Connection: close
Content-Type: text/html
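
The pod, service, route comparison in the comment above can be expressed as a small helper. This is only a sketch: the function name `first_failing_layer` is hypothetical, and it assumes the HTTP status of each hop has already been collected (for example from the `curl -D -` output shown), with `None` standing for "no response at all".

```python
from typing import Optional


def first_failing_layer(pod_status: Optional[int],
                        service_status: Optional[int],
                        route_status: Optional[int]) -> str:
    """Return the first hop that did not answer with a 2xx/3xx status.

    Hops are checked in the order traffic is narrowed down in the thread:
    pod IP, then service cluster IP, then the public route.
    """
    checks = [
        ("pod", pod_status),
        ("service", service_status),
        ("route", route_status),
    ]
    for name, status in checks:
        if status is None or not (200 <= status < 400):
            return name
    return "none"


# Status codes taken from the curl output quoted above.
print(first_failing_layer(200, 200, 503))
```

With the values from this report the helper points at the route, which is where the investigation in the rest of the thread focuses.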
What does:
oc get route eap-app -o yaml

Return?
$ oc get route eap-app -o yaml
apiVersion: v1
kind: Route
metadata:
  annotations:
    description: Route for application's http service.
    openshift.io/generated-by: OpenShiftNewApp
    openshift.io/host.generated: "true"
  creationTimestamp: 2016-10-13T12:18:20Z
  labels:
    app: eap-app
    application: eap-app
    template: eap70-mysql-persistent-s2i
    xpaas: 1.3.1
  name: eap-app
  namespace: tschloss
  resourceVersion: "234438065"
  selfLink: /oapi/v1/namespaces/tschloss/routes/eap-app
  uid: 20ca8be4-913f-11e6-ad3a-0e3d364e19a5
spec:
  host: eap-app-tschloss.44fs.preview.openshiftapps.com
  port:
    targetPort: http
  to:
    kind: Service
    name: eap-app
status:
  ingress:
  - conditions:
    - lastTransitionTime: 2016-10-13T12:18:20Z
      status: "True"
      type: Admitted
    host: eap-app-tschloss.44fs.preview.openshiftapps.com
    routerName: router
I can reproduce this issue in Online prod, and the prod environment is extremely slow now.

Error syncing pod, skipping: failed to "TeardownNetwork" for "cakephp-mysql-example-1-deploy_bingli4" with TeardownNetworkError: "Failed to teardown network for pod \"00e8e93e-91fd-11e6-ba92-0e63b9c1c48f\" using network plugins \"redhat/openshift-ovs-multitenant\": Error running network teardown script: Could not find IP address for container f5583130ec15e0143b9a9fe01204b955ea44ab928e8df9538a1c0874c931bb06"
➜ ~ oc get svc cake -o yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    openshift.io/generated-by: OpenShiftWebConsole
  creationTimestamp: 2016-10-14T11:06:26Z
  labels:
    app: cake
  name: cake
  namespace: nifty
  resourceVersion: "237197024"
  selfLink: /api/v1/namespaces/nifty/services/cake
  uid: 40352961-91fe-11e6-ba92-0e63b9c1c48f
spec:
  clusterIP: 172.30.196.123
  portalIP: 172.30.196.123
  ports:
  - name: 8080-tcp
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    deploymentconfig: cake
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

➜ ~ oc get ep cake
NAME ENDPOINTS AGE
cake 10.1.96.6:8080 13m
It looks like the problem is that you have a targetPort of http (i.e. 80) specified in the route, but the service is on 8080.
1. But all my steps are the same as before today, and they worked fine before.
2. 80 and 8080 can't be changed in my steps.
This target port will route to Service Port 8080 → Container Port 8080 (TCP).
11:07:24 PM Warning Creating load balancer failed Error creating load balancer (will retry): Error getting LB for service react/cake: AccessDenied: User: arn:aws:iam::507479335359:user/cloud_provider is not authorized to perform: elasticloadbalancing:DescribeLoadBalancers status code: 403, request id: e9b826bb-921f-11e6-aeea-75cc499f5b37
-> Cgroups memory limit is set, using HTTPD_MAX_REQUEST_WORKERS=34
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 10.1.5.6. Set the 'ServerName' directive globally to suppress this message
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 10.1.5.6. Set the 'ServerName' directive globally to suppress this message
[Fri Oct 14 10:54:24.207285 2016] [auth_digest:notice] [pid 1] AH01757: generating secret for digest authentication ...
[Fri Oct 14 10:54:24.216426 2016] [http2:warn] [pid 1] AH02951: mod_ssl does not seem to be enabled
[Fri Oct 14 10:54:24.217053 2016] [lbmethod_heartbeat:notice] [pid 1] AH02282: No slotmem from mod_heartmonitor
[Fri Oct 14 10:54:24.396521 2016] [mpm_prefork:notice] [pid 1] AH00163: Apache/2.4.18 (Red Hat) configured -- resuming normal operations
[Fri Oct 14 10:54:24.396553 2016] [core:notice] [pid 1] AH00094: Command line: 'httpd -D FOREGROUND'
ljladmin: OK, that "Creating load balancer" message is for creating a service of type LoadBalancer. But those aren't supported in Online, and the service above is not of that type. So that's probably unrelated.

Tomas: Sorry, I was looking at the wrong service. I see that yours names a port http, so the route is fine. I think we need to look at the logs for the router to see what's going on.
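
To illustrate why a route with `targetPort: http` can be fine even though the service listens on 8080, here is a rough sketch of the name lookup. It is a simplification and not the router's actual code: `resolve_route_port` is a hypothetical helper, and it assumes a string target is matched against the service port's `name` while an integer is matched against its `targetPort`. The port data is copied from the eap-app service quoted earlier in this thread.

```python
from typing import Dict, List, Optional, Union


def resolve_route_port(target_port: Union[int, str],
                       service_ports: List[Dict]) -> Optional[Dict]:
    """Return the service port entry that a route's targetPort refers to.

    Assumption (simplified): a string is treated as a port name,
    an integer as the pod-side target port number.
    """
    for port in service_ports:
        if isinstance(target_port, str) and port.get("name") == target_port:
            return port
        if isinstance(target_port, int) and port.get("targetPort") == target_port:
            return port
    return None


# Ports of the eap-app service as shown in its YAML above.
eap_app_ports = [
    {"name": "http", "port": 8080, "protocol": "TCP", "targetPort": 8080},
]

match = resolve_route_port("http", eap_app_ports)
print(match["port"] if match else "no matching port")
```

Here `http` is only a label on the port entry, so the lookup lands on port 8080 rather than implying port 80.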
ljladmin: Miciah points out your issues are the same as https://bugzilla.redhat.com/show_bug.cgi?id=1367229 But Tomas' are separate and I'll keep looking at those.
Miciah got the logs, and we see:
W1014 04:24:08.510172 1 router.go:690] a edge terminated route with host cakephp-mysql-persistent-cakephp.44fs.preview.openshiftapps.com does not have the required certificates. The route will still be created but no certificates will be written

Jiří Fiala: Can you post the yaml for your route please?
Oh, and Tomas, are your routes still present? There's nothing in the logs about eap-app.
Ben, I have recreated them a few times throughout the day to see if the issue is fixed. I wanted to use OSO in live demo. Some other time perhaps. Right now, I have two routes deployed in my namespace (eap-app and secure-eap-app).
Hi Guys,

Thank you for looking into this matter... I was the fellow who first reported the error. My OpenShift Preview site is essentially broken until this is resolved. Is there something I should be doing, or should I just sit tight and be patient? I'm totally cool with that, just not sure of next steps. Thanks in advance for your help.

- Googs
I am having exactly the same problem with my preview.openshift.com account. I've tried with a number of the demo applications and in each case any attempt to access the route fails with the same "503 Service Unavailable No server is available to handle this request." error.

I am also seeing the LoadBalancer failure messages, e.g.:
Error creating load balancer (will retry): Error getting LB for service tdgsandf-cmd-nodejs/nodejs-mongodb-example: AccessDenied: User: arn:aws:iam::507479335359:user/cloud_provider is not authorized to perform: elasticloadbalancing:DescribeLoadBalancers status code: 403, request id: d3c8de66-92c8-11e6-b538-f3dcdae1fb63

I've also seen the TeardownNetwork message but don't have an example in my current logs/events.

As Matt Googinis says above, this makes the OpenShift Preview effectively useless at present, as there is no means to interact with a running service from outside.

Reproducible: Yes, always

Steps to reproduce:
clear any existing project
follow the steps in the "basic walkthrough" - https://docs.openshift.com/online/getting_started/basic_walkthrough.html until step "Viewing your Running Application"

Expected result: as per walkthrough documentation

Actual result: "503 Service Unavailable No server is available to handle this request."
This appears to have been fixed (for me at least) overnight.
Me too! :-) My Service is now available! Thank you Red Hat!
Me too! Thank you!
We saw this in the logs:
E1013 23:52:33.630700 1 ratelimiter.go:52] error reloading router: wait: no child processes

Restarting the router fixed it, but we don't know the root cause yet.
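
For anyone scanning router logs for lines like the ones quoted in this thread, the following is a minimal sketch of a filter. It assumes the logs use the glog-style prefix visible in the two quoted lines (severity letter, month and day, time, a process/thread id, then `file:line]`); the names `GLOG_LINE` and `warnings_and_errors` are hypothetical.

```python
import re
from typing import Dict, Iterable, List

# Matches e.g. "E1013 23:52:33.630700 1 ratelimiter.go:52] message"
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<pid>\d+) (?P<source>[^\]]+)\] (?P<message>.*)$"
)


def warnings_and_errors(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Keep only lines whose severity is W (warning), E (error) or F (fatal)."""
    found = []
    for line in lines:
        match = GLOG_LINE.match(line.strip())
        if match and match.group("level") in ("W", "E", "F"):
            found.append(match.groupdict())
    return found


# The two router log lines quoted in this thread.
sample = [
    "E1013 23:52:33.630700 1 ratelimiter.go:52] error reloading router: "
    "wait: no child processes",
    "W1014 04:24:08.510172 1 router.go:690] a edge terminated route with host "
    "cakephp-mysql-persistent-cakephp.44fs.preview.openshiftapps.com does not "
    "have the required certificates. The route will still be created but no "
    "certificates will be written",
]

for entry in warnings_and_errors(sample):
    print(entry["level"], entry["source"], entry["message"])
```

Lines that do not carry this prefix are simply skipped, so the filter can be run over mixed output.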
(In reply to Ben Bennett from comment #16)
> Jiří Fiala: Can you post the yaml for your route please?

I'm sorry for the delay, here's the route in question:
---
$ oc get route/cakephp -o yaml
apiVersion: v1
kind: Route
metadata:
  annotations:
    openshift.io/host.generated: "true"
  creationTimestamp: 2016-10-13T12:17:02Z
  labels:
    app: cakephp
  name: cakephp
  namespace: pub
  resourceVersion: "234435454"
  selfLink: /oapi/v1/namespaces/pub/routes/cakephp
  uid: f2ca8b63-913e-11e6-ae04-0ebeb1070c7f
spec:
  host: cakephp-pub.44fs.preview.openshiftapps.com
  port:
    targetPort: 8080-tcp
  to:
    kind: Service
    name: cakephp
    weight: 100
status:
  ingress:
  - conditions:
    - lastTransitionTime: 2016-10-13T12:17:02Z
      status: "True"
      type: Admitted
    host: cakephp-pub.44fs.preview.openshiftapps.com
    routerName: router
---
I'm closing this because the router issue was resolved by Miciah.