Bug 1230483 - Get 503 error when connecting the JavaScript console to a Java pod that has the Jolokia agent running if the master is outside the SDN
Summary: Get 503 error when connect the javascript console for java pod which has jolo...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 3.0.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Brenton Leanhardt
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-06-11 03:25 UTC by zhou ying
Modified: 2015-11-23 14:44 UTC
CC List: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-23 14:44:13 UTC
Target Upstream Version:
Embargoed:


Attachments
The screenshot of error. (345.87 KB, image/png)
2015-06-11 03:25 UTC, zhou ying
console log (287.55 KB, image/png)
2015-06-22 04:36 UTC, chunchen
console log (204.35 KB, image/png)
2015-06-22 04:37 UTC, chunchen
OK for java console (136.38 KB, image/png)
2015-06-22 07:30 UTC, chunchen

Description zhou ying 2015-06-11 03:25:38 UTC
Created attachment 1037476
The screenshot of error.

Description of problem:
After creating a Java pod from image:fabric8/fabric8/quickstart-java-simple-mainclass:2.1.9 and waiting until the pod and container are running, connecting to the Java pod from the web console does not show the JVM details.
The POST request https://master.cluster.local:8443/api/v1beta3/namespaces/zhouy/pods/quickstart-java-simple-mainclass-6ngb3:8778/proxy/jolokia/?maxDepth=7&maxCollectionSize=500&ignoreErrors=true&canonicalNaming=false gets the response: 503 Service Unavailable
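
For reference, the failing request goes through the master's pod proxy, and the URL appears to follow the pattern /api/<version>/namespaces/<namespace>/pods/<pod>:<port>/proxy/<path>. The following is a minimal Python sketch that rebuilds that URL from its parts. The function name `jolokia_proxy_url` and its parameters are hypothetical, and the pattern is inferred only from the URL quoted in this report, not from API documentation.

```python
from urllib.parse import urlencode


def jolokia_proxy_url(master, namespace, pod, port=8778, api_version="v1beta3"):
    """Rebuild the pod-proxy URL seen in this report (pattern inferred, not authoritative)."""
    # Query parameters copied from the request in the description.
    query = urlencode({
        "maxDepth": 7,
        "maxCollectionSize": 500,
        "ignoreErrors": "true",
        "canonicalNaming": "false",
    })
    return (
        f"https://{master}/api/{api_version}/namespaces/{namespace}"
        f"/pods/{pod}:{port}/proxy/jolokia/?{query}"
    )
```

Because the request is proxied, the master itself has to open a connection to the pod IP on port 8778, so a 503 here can mean the master cannot reach the pod rather than that Jolokia is down.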

Version-Release number of selected component (if applicable):
openshift v0.6.0.0-55-g733cf86
kubernetes v0.17.1-804-g496be63

How reproducible:
always

Steps to Reproduce:
1. Use this command to create a Java container:
`oc create -f http://repo1.maven.org/maven2/io/fabric8/jube/images/fabric8/quickstart-java-simple-mainclass/2.1.9/quickstart-java-simple-mainclass-2.1.9-kubernetes.json`
2. Wait until the pod and container are in running status, log in to the web console, select the right project, and connect to the Java pod.


Actual results:
Nothing is shown in the JavaScript console.

Expected results:
Should show the JVM trees and other Java plugin details.

Additional info:
When entering the container, the jolokia agent process can be seen. This works well on a Fedora instance.

Comment 2 stlewis@redhat.com 2015-06-11 19:16:38 UTC
Just to double-check, does the camel-spring or camel-cdi quickstart work alright for you?

Comment 3 stlewis@redhat.com 2015-06-11 20:44:58 UTC
Tried reproducing using the steps provided; it works here.  Can you make sure you've cleared your browser cache, just to be on the safe side?  Also, while in the hawtio console you can clear your local storage: click on the 'User' menu at the top-right, select 'Preferences', select the 'Reset' tab on the preferences page, and click the (I think) red button.

Comment 4 zhou ying 2015-06-12 05:42:45 UTC
@Stan Lewis, I've tried your suggestion and still get the 503 error in the OSE env, but the Fedora instance does not have this issue.  Could you please try on an OSE env?

Comment 5 zhou ying 2015-06-12 06:24:52 UTC
@Stan Lewis, tested with camel-spring, activemq, and java-simple-mainclass.

Comment 6 stlewis@redhat.com 2015-06-12 12:36:23 UTC
@zhou ying, any docs you can link me to for setting that up?  Thanks!

Comment 7 stlewis@redhat.com 2015-06-12 12:39:35 UTC
Also, could it be this PR hasn't made it into the build you're testing -> https://github.com/openshift/origin/pull/2802 as that's the behavior seen before that fix was put in.

Comment 9 openshift-github-bot 2015-06-15 22:14:22 UTC
Commit pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/641a56b7250bc5350f74877fd5859d3699b09439
Updates to address part of bug 1230483

Update openshift-jvm to 1.0.20

Change icon/title attribute and move to the right place

Remove absolute positioning

Comment 10 stlewis@redhat.com 2015-06-16 12:12:02 UTC
Ah, the above should have been for 1231605, sorry

Comment 11 stlewis@redhat.com 2015-06-17 13:17:17 UTC
K, have a VM with RHEL 7 and the openshift RPMs you linked me to.  Installed the java-simple-mainclass quickstart, clicked the connect button and the console was able to successfully establish a connection to Jolokia.  So I don't seem to be able to reproduce it locally still.

Could you send along a disk image of what you've got set up so I can try running it here and we're on the same page...

Comment 12 zhou ying 2015-06-17 14:45:33 UTC

@Stan Lewis, I just sent you an email about today's OSE env; you can use it to reproduce this issue.

Comment 13 stlewis@redhat.com 2015-06-17 15:04:46 UTC
Response from proxy ->

Error: 'dial tcp 10.1.0.40:8778: i/o timeout'
Trying to reach: 'http://10.1.0.40:8778/jolokia/?maxDepth=7&maxCollectionSize=500&ignoreErrors=true&canonicalNaming=false'

From master.cluster.local directly trying to curl:

curl "http://10.1.0.40:8778/jolokia/?maxDepth=7&maxCollectionSize=500&ignoreErrors=true&canonicalNaming=false"
curl: (7) Failed to connect to 10.1.0.40 port 8778: Connection timed out
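
The curl check above can also be scripted. A minimal Python sketch of a TCP reachability probe follows; the helper name `can_reach` is hypothetical, and it only tests whether a TCP connection can be opened, not whether Jolokia answers correctly.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (like the "i/o timeout" above) and refused connections.
        return False
```

Run on the master, `can_reach("10.1.0.40", 8778)` would presumably return False in the environment described here, matching the curl timeout.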

Comment 14 Dan McPherson 2015-06-17 19:22:04 UTC
Investigation showed the environment didn't have the master on the same network (SDN) as the nodes.

Comment 15 zhou ying 2015-06-18 06:06:08 UTC
The 503 error only occurs when the env is multi-node. The reason is that the nodes have VXLAN, so pods on different nodes can connect to each other.
But the master machine has no VXLAN, so the master cannot connect to pods deployed on the nodes. To access a pod from the master, we should use a service and a route. I tried the JSON at http://fpaste.org/233414/43460686/ to create a pod, service, and route. After that, curl on the master gets a response; please see the details: http://fpaste.org/233415/07068143/

Comment 16 zhou ying 2015-06-18 06:16:16 UTC
@Stan Lewis, when I add a service and route for the pod, curl on the master gets a response, but on the web console the 503 error still exists. So I think the connect URL may need to be updated.

Here is the doc about routes: https://github.com/openshift/origin/blob/master/docs/routing.md

Comment 17 stlewis@redhat.com 2015-06-18 12:26:02 UTC
This would take a bit more work to support this use-case, as I suspect we'd have to define a reasonable annotation on the service that the console would need to use for discovery.  Work would also require relating the pod to the service then as well.  That being said it also then requires exposing the jolokia port for each pod to the outside, which I suspect most users would not want to do really as a safety precaution, even with jolokia secured.

Comment 18 Scott Dodson 2015-06-19 13:12:24 UTC
Xiaoli,

The way we're going to address this for now is to make the master also be a node, but make it unschedulable so it doesn't get any pods. In the future we'll make the master join the SDN without the additional overhead of running the node components. So for testing purposes, provision the master as if it were a node and then make it unschedulable. We're working to automate this via ansible.

# oadm manage-node ose3-master.example.com --schedulable=false

Comment 20 chunchen 2015-06-22 04:36:02 UTC
Created attachment 1041579
console log

Comment 21 chunchen 2015-06-22 04:37:45 UTC
Created attachment 1041580
console log

Comment 22 chunchen 2015-06-22 04:45:25 UTC
(In reply to chunchen from comment #19)

After making the master a node, I get the messages below:

[root@master ~]# ps -ef |grep node
root       6965      1  3 11:33 ?        00:02:14 /usr/bin/openshift start node --config=/etc/openshift/node/node-config.yaml --loglevel=4

1. For schedulable=false:
[root@master ~]# oc get node
NAME                    LABELS                                         STATUS
master.cluster.local    kubernetes.io/hostname=master.cluster.local    Ready,SchedulingDisabled
minion1.cluster.local   kubernetes.io/hostname=minion1.cluster.local   Ready
minion2.cluster.local   kubernetes.io/hostname=minion2.cluster.local   Ready

2. For schedulable=true:
[root@master ~]# oc get node
NAME                    LABELS                                         STATUS
master.cluster.local    kubernetes.io/hostname=master.cluster.local    Ready
minion1.cluster.local   kubernetes.io/hostname=minion1.cluster.local   Ready
minion2.cluster.local   kubernetes.io/hostname=minion2.cluster.local   Ready
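
The check above can be automated. A minimal Python sketch that reads `oc get node` output and lists the nodes marked SchedulingDisabled follows. The function name `unschedulable_nodes` is hypothetical, and the parsing assumes the three-column NAME / LABELS / STATUS layout shown in this comment, with no spaces inside the LABELS column.

```python
def unschedulable_nodes(oc_get_node_output):
    """Return names of nodes whose STATUS column contains SchedulingDisabled."""
    names = []
    # Skip the header row, then split each row on whitespace.
    for line in oc_get_node_output.strip().splitlines()[1:]:
        fields = line.split()
        # STATUS is the last column and may hold comma-separated conditions.
        if len(fields) >= 3 and "SchedulingDisabled" in fields[-1].split(","):
            names.append(fields[0])
    return names
```

With the schedulable=false output above, this would return only master.cluster.local; with the schedulable=true output it would return an empty list.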

Comment 24 chunchen 2015-06-22 07:30:17 UTC
Created attachment 1041610
OK for java console

Comment 26 zhou ying 2015-06-23 03:09:10 UTC
Also confirmed on the env set up according to https://trello.com/c/KtD20Ei8/341-2-admin-can-install-openshift-using-a-python-wrapper-around-the-ansible-tooling-with-no-additional-dependencies-beyond-base-rhel.

[root@master ~]# oc get nodes
NAME                   LABELS                                        STATUS
master.cluster.local   kubernetes.io/hostname=master.cluster.local   Ready,SchedulingDisabled
minion.cluster.local   kubernetes.io/hostname=minion.cluster.local   Ready

openshift version
openshift v3.0.0.0-32-g3ae1d27
kubernetes v0.17.1-804-g496be63

Comment 27 Ben Parees 2015-07-22 20:02:08 UTC
*** Bug 1243317 has been marked as a duplicate of this bug. ***

