Bug 844026 - on rhc app create, getting error: RESULT: 705: unexpected token at 'org.freedesktop.DBus.Error.NoReply: Did not receive a reply.
Status: CLOSED CURRENTRELEASE
Product: OpenShift Origin
Classification: Red Hat
Component: Pod
Version: 2.x
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: high
Assigned To: Krishna Raman
libra bugs
Reported: 2012-07-27 20:43 EDT by Nam Duong
Modified: 2015-05-14 22:01 EDT (History)
Doc Type: Bug Fix
Last Closed: 2012-08-07 16:42:50 EDT
Type: Bug
Description Nam Duong 2012-07-27 20:43:07 EDT
Description of problem:
This is based on forum thread:  
https://openshift.redhat.com/community/forums/openshift/can-you-increase-the-timeout-value-for-when-running-rhc-app-create#comment-22765

Kevin is getting this intermittent error when running "rhc app create -a AppName -t jbossas-7 --config ~/.openshift/openshiftlocal.conf -d" and entering the password at the "Password: *****" prompt:

Submitting form:
debug: true
rhlogin: admin
Contacting https://openshift-local
Creating application: AppName in demo
Contacting https://openshift-local
Problem reported from server. Response code was 500.

DEBUG: 705: unexpected token at 'org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. '
/usr/lib/ruby/gems/1.8/gems/json-1.4.6/lib/json/common.rb:146:in `parse'
/usr/lib/ruby/gems/1.8/gems/json-1.4.6/lib/json/common.rb:146:in `parse'
/usr/lib/ruby/gems/1.8/gems/gearchanger-oddjob-plugin-0.8.4/lib/gearchanger-oddjob-plugin/gearchanger/oddjob_application_container_proxy.rb:423:in `exec_command'
/usr/lib/ruby/gems/1.8/gems/gearchanger-oddjob-plugin-0.8.4/lib/gearchanger-oddjob-plugin/gearchanger/oddjob_application_container_proxy.rb:367:in `run_cartridge_command'
/usr/lib/ruby/gems/1.8/gems/gearchanger-oddjob-plugin-0.8.4/lib/gearchanger-oddjob-plugin/gearchanger/oddjob_application_container_proxy.rb:114:in `configure_cartridge'
/usr/lib/ruby/gems/1.8/gems/stickshift-controller-0.10.2/lib/stickshift-controller/app/models/gear.rb:57:in `configure'

Exit Code: 1
api_c: placeholder
broker_c: namespace, rhlogin, ssh, app_uuid, debug, alter, cartridge, cart_type, action, app_name, api
API version: 1.1.3

RESULT: 705: unexpected token at 'org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. '



NOTE:  
Adding various values for --timeout doesn't help.
Comment 1 John (J5) Palmieri 2012-07-30 10:05:21 EDT
This is an issue with D-Bus timing out during a long OddJob operation (most likely setting up JBoss, which can take a bit of time).  I'm guessing our default timeouts do not take into account running on a less powerful machine via OpenShift Origin.

I'm not sure what --timeout affects.  It might just be the local connection and not the various components on the broker that can time out. The solution is for us to not rely on default D-Bus timeouts and instead explicitly set them with every call in which we expect a reply.
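
A minimal sketch of that idea in Ruby, not the plugin's actual code: the caller picks the deadline instead of inheriting a default, and a reply that is not JSON (such as the NoReply text above) is reported as a call failure rather than surfacing as "unexpected token". `call_with_deadline`, `RemoteCallTimeout`, `RemoteCallFailed`, and the block standing in for the D-Bus call are all hypothetical names. `Timeout.timeout` only bounds how long the caller waits; the D-Bus reply timeout itself would still have to be set through whatever binding makes the call.

```ruby
require 'json'
require 'timeout'

# Hypothetical error types for this sketch.
class RemoteCallTimeout < StandardError; end
class RemoteCallFailed < StandardError; end

# Runs the block (a stand-in for the remote call) with an explicit deadline
# and parses its reply as JSON. Non-JSON replies, such as a D-Bus error
# string, are reported as a failure instead of leaking a parse error.
def call_with_deadline(seconds)
  raw = Timeout.timeout(seconds) { yield }
  JSON.parse(raw)
rescue Timeout::Error
  raise RemoteCallTimeout, "no reply within #{seconds}s"
rescue JSON::ParserError
  raise RemoteCallFailed, "non-JSON reply: #{raw.to_s[0, 80]}"
end
```

A caller would wrap each call that expects a reply, for example `call_with_deadline(300) { run_remote_command }`, where `run_remote_command` is again a placeholder and the deadline is chosen per operation.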

The other fix is to make this all async and have OddJob send back a signal when a job is done.  Then we can control the timeouts.  I'm not sure if OddJob has that feature.  Nalin, the original author, should know.
Comment 2 John (J5) Palmieri 2012-07-30 10:09:03 EDT
Nalin.  Adding you to this bug to see if you have any insight on it. If someone else is maintaining OddJob these days, please feel free to remove yourself and add them.  Thanks.
Comment 3 Nalin Dahyabhai 2012-07-30 10:28:52 EDT
The service doesn't currently provide for returning the result of a call via a signal.  It could be added, but that doesn't seem to me to be how D-Bus expects to be used, and most clients wouldn't handle it properly, so it'd have to be added as a non-default option.

Or did you mean to have the daemon send a unicast signal when a job terminates, in addition to the method reply, and have your client listen for that?  That could be done.  What identifying information from the call would the signal need to include in order for it to be distinguishable from other calls that might also be in-progress?
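
One possible answer to the identification question, sketched in Ruby under the assumption that the client generates a unique job id, passes it with the call, and the daemon echoes it back in the completion signal. Nothing here is an existing oddjob feature; `JobTracker` and its methods are hypothetical, and the D-Bus signal handler is represented only by a call to `complete`.

```ruby
require 'securerandom'
require 'thread'
require 'timeout'

# Hypothetical client-side bookkeeping: each call gets a unique job id, and a
# completion signal is matched to its waiting caller by that id.
class JobTracker
  def initialize
    @pending = {}
    @lock = Mutex.new
  end

  # Registers a new job and returns its id, to be sent along with the call.
  def register
    id = SecureRandom.uuid
    @lock.synchronize { @pending[id] = Queue.new }
    id
  end

  # Called from the signal handler; signals for unknown ids are ignored.
  def complete(id, result)
    queue = @lock.synchronize { @pending[id] }
    queue << result if queue
  end

  # Blocks until the job's signal arrives or the caller's deadline passes.
  def wait(id, seconds)
    queue = @lock.synchronize { @pending[id] }
    raise ArgumentError, "unknown job #{id}" unless queue
    Timeout.timeout(seconds) { queue.pop }
  ensure
    @lock.synchronize { @pending.delete(id) }
  end
end
```

With this shape the client controls the deadline per job, and signals that arrive out of order or for other in-progress calls do not get mixed up.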
Comment 4 Krishna Raman 2012-07-30 13:31:12 EDT
We had faced similar issues with oddjob, so we have recently moved away from oddjob to m-collective.

Will be spinning a new version of the Fedora remix soon.
Comment 5 Nalin Dahyabhai 2012-07-30 15:25:08 EDT
(In reply to comment #4)
> We had faced similar issues with oddjob, so we have recently moved away from
> oddjob to m-collective.

I'd appreciate more detail on what those were, or a pointer to discussion about them, as I'd like to make oddjob more useful for more cases.  If not immediately, then as things to add to its to-do list.
Comment 6 Krishna Raman 2012-07-30 16:18:15 EDT
(In reply to comment #5)
> (In reply to comment #4)
> > We had faced similar issues with oddjob, so we have recently moved away from
> > oddjob to m-collective.
> 
> I'd appreciate more detail on what those were, or a pointer to discussion
> about them, as I'd like to make oddjob more useful for more cases.  If not
> immediately, then as things to add to its to-do list.

2 common issues I ran into:
1) Size of data returned from any command seemed to be limited to 8K
2) Oddjob was timing out in some cases and I did not find a good place to adjust the timeouts
Comment 7 Gaoyun Pei 2012-08-03 03:07:03 EDT
Verified.

We are using m-collective instead of oddjob in the livecd now.

[liveuser@broker ~]$ rhc app create -a app2 -t jbossas-7 --config openshift.conf
Password: *****

Creating application: app2 in pgy
Now your new domain name is being propagated worldwide (this might take a minute)...
The authenticity of host 'app2-pgy.example.com (10.66.10.69)' can't be established.
RSA key fingerprint is a3:55:db:5e:37:db:de:3d:bc:44:c4:34:1f:59:6c:92.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'app2-pgy.example.com' (RSA) to the list of known hosts.
Confirming application 'app2' is available:  Success!

app2 published:  http://app2-pgy.example.com/
git url:  ssh://36d5fe13adb34e28b0b928cc9e1f836b@app2-pgy.example.com/~/git/app2.git/
Successfully created application: app2
Comment 8 Nalin Dahyabhai 2012-08-08 17:54:38 EDT
(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #4)
> > > We had faced similar issues with oddjob, so we have recently moved away from
> > > oddjob to m-collective.
> > 
> > I'd appreciate more detail on what those were, or a pointer to discussion
> > about them, as I'd like to make oddjob more useful for more cases.  If not
> > immediately, then as things to add to its to-do list.
> 
> 2 common issues I ran into:
> 1) Size of data returned from any command seemed to be limited to 8K

I wasn't able to reproduce this when I configured the daemon to use a trivial shell script as the helper for a method.
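
For comparison, a baseline check that leaves oddjob out entirely can be sketched in Ruby: it runs a trivial helper that prints more than 8K and counts what comes back over a plain pipe. This is only an assumption about how one might narrow the problem down; `helper_output_size` is a hypothetical name, and a full result here says nothing about the D-Bus path itself.

```ruby
require 'open3'
require 'rbconfig'

# Runs a trivial helper that prints `bytes` characters and returns how many
# bytes came back over the pipe.
def helper_output_size(bytes)
  script = "print 'x' * #{Integer(bytes)}"
  out, status = Open3.capture2(RbConfig.ruby, '-e', script)
  raise "helper failed: #{status.exitstatus}" unless status.success?
  out.bytesize
end
```

If the same helper returns its full output here but is cut short when invoked through the daemon, the limit lies somewhere in the transport rather than in the helper.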

> 2) Oddjob was timing out in some cases and I did not find a good place to
> adjust the timeouts

How were you calling it?
