Bug 1009257 - Creating application fails with Node execution failure (invalid exit code from node)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OKD
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Krishna Raman
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-09-18 05:23 UTC by Jan Pazdziora
Modified: 2022-08-04 02:52 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-30 00:46:43 UTC
Target Upstream Version:
Embargoed:



Description Jan Pazdziora 2013-09-18 05:23:47 UTC
Description of problem:

On my OpenShift Origin all-in-one installation on Fedora 19, an attempt to create an application fails.

Version-Release number of selected component (if applicable):

# rpm -qa 'openshift*' | sort
openshift-origin-broker-1.15.1-1.git.0.c9e7efd.fc19.noarch
openshift-origin-broker-util-1.15.0-1.git.72.adf3f1f.fc19.noarch
openshift-origin-cartridge-10gen-mms-agent-1.28.1-1.git.0.90178ea.fc19.noarch
openshift-origin-cartridge-cron-1.15.1-1.git.0.151d653.fc19.noarch
openshift-origin-cartridge-diy-1.15.1-1.git.0.38dd66a.fc19.noarch
openshift-origin-cartridge-haproxy-1.15.1-1.git.0.fda4f4c.fc19.noarch
openshift-origin-cartridge-jenkins-1.15.0-1.git.11.238a070.fc19.noarch
openshift-origin-cartridge-jenkins-client-1.15.1-1.git.0.70e8836.fc19.noarch
openshift-origin-cartridge-mariadb-1.15.1-1.git.0.d162f22.fc19.noarch
openshift-origin-cartridge-mock-1.15.1-1.git.0.a950dfe.fc19.noarch
openshift-origin-cartridge-mock-plugin-1.15.1-1.git.0.d7f3ca7.fc19.noarch
openshift-origin-cartridge-mongodb-1.15.1-1.git.0.20dd053.fc19.noarch
openshift-origin-cartridge-nodejs-1.16.0-1.git.3.238a070.fc19.noarch
openshift-origin-cartridge-perl-1.15.1-1.git.0.5f169b6.fc19.noarch
openshift-origin-cartridge-php-1.15.1-1.git.0.55909c2.fc19.noarch
openshift-origin-cartridge-phpmyadmin-1.15.0-1.git.1.238a070.fc19.noarch
openshift-origin-cartridge-postgresql-1.15.1-1.git.0.2f20430.fc19.noarch
openshift-origin-cartridge-python-1.15.1-1.git.0.0eb3e95.fc19.noarch
openshift-origin-cartridge-ruby-1.15.1-1.git.0.b36bf57.fc19.noarch
openshift-origin-console-1.15.1-1.git.0.7d1fa87.fc19.noarch
openshift-origin-msg-common-1.15.0-1.git.228.238a070.fc19.noarch
openshift-origin-msg-node-mcollective-1.15.0-1.git.12.238a070.fc19.noarch
openshift-origin-node-proxy-1.15.1-1.git.0.ce0a2d8.fc19.noarch
openshift-origin-node-util-1.15.0-1.git.50.238a070.fc19.noarch
openshift-origin-port-proxy-1.8.1-1.git.0.81dc98c.fc19.noarch
openshift-origin-util-1.15.1-1.git.0.36a339b.fc19.noarch

How reproducible:

Deterministic.

Steps to Reproduce:
1. Have OpenShift Origin on Fedora 19.
2. Run rhc create-app -n test -a perl25788 -t perl-5.16

Actual results:

# rhc create-app -n test -a perl25788 -t perl-5.16
Application Options
-------------------
  Namespace:  test
  Cartridges: perl-5.16
  Gear Size:  default
  Scaling:    no

Creating application 'perl25788' ... 
Unable to complete the requested operation due to: Node execution failure (invalid exit code from node)..
Reference ID: edf932bf7dc78761c8af6e906f118831

Expected results:

No error, application created, started, and available.

Additional info:

grep rc= /var/log/openshift/node/platform.log | grep -v rc=0

shows

September 17 07:00:08 INFO Shell command 'ip link show dev eth0' ran. rc=1 out=
September 17 07:00:08 INFO Shell command 'ip link show dev eth0' ran. rc=1 out=
September 17 07:16:46 INFO Shell command 'ip link show dev eth0' ran. rc=1 out=
September 17 07:16:46 INFO Shell command '/usr/bin/pkill -9 -u 500' ran. rc=1 out=
September 17 07:16:46 INFO Shell command '/usr/bin/pgrep -u 500' ran. rc=1 out=
September 17 07:16:46 INFO Shell command 'ip link show dev eth0' ran. rc=1 out=
September 17 07:16:46 INFO Shell command '/usr/bin/pkill -9 -u 500' ran. rc=1 out=
September 17 07:16:46 INFO Shell command '/usr/bin/pgrep -u 500' ran. rc=1 out=
September 17 07:16:47 INFO Shell command 'setquota --always-resolve -u 52383a186892df30a1000007 0 0 0 0 -a /' ran. rc=1 out=
September 18 01:19:21 INFO Shell command 'ip link show dev eth0' ran. rc=1 out=
September 18 01:19:21 INFO Shell command '/usr/bin/pkill -9 -u 500' ran. rc=1 out=
September 18 01:19:21 INFO Shell command '/usr/bin/pgrep -u 500' ran. rc=1 out=
September 18 01:19:21 INFO Shell command 'ip link show dev eth0' ran. rc=1 out=
September 18 01:19:21 INFO Shell command '/usr/bin/pkill -9 -u 500' ran. rc=1 out=
September 18 01:19:21 INFO Shell command '/usr/bin/pgrep -u 500' ran. rc=1 out=
September 18 01:19:22 INFO Shell command 'setquota --always-resolve -u 523937d36892df30a100001b 0 0 0 0 -a /' ran. rc=1 out=

Comment 2 Jhon Honce 2013-09-18 15:17:27 UTC
Thank you for providing the logs.  It appears that disk quotas are not set up on your image:

September 18 01:19:22 INFO Shell command 'setquota --always-resolve -u 523937d36892df30a100001b 0 0 0 0 -a /' ran. rc=1 out

Additionally, 

1) Does user 52383a186892df30a1000007 have uid == 500?
2) eth0 is not what I would expect for a Fedora Ethernet interface; you can verify the interface names with ifconfig -a.
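
The two checks asked for above can be sketched as small shell helpers. This is only an illustration: the function names passwd_uid and link_names are hypothetical, and it assumes getent and a reasonably recent iproute2 (for ip -o) are available on the node host.

```shell
# Print the numeric UID (third field) of passwd-format lines read from stdin.
passwd_uid() {
    awk -F: '{ print $3 }'
}

# Print interface names from `ip -o link show` output read from stdin.
# Any "@peer" suffix (as shown for veth or VLAN devices) is stripped.
link_names() {
    awk -F': ' '{ sub(/@.*/, "", $2); print $2 }'
}

# Usage on the node host (the user only exists while the gear exists):
#   getent passwd 52383a186892df30a1000007 | passwd_uid
#   ip -o link show | link_names
```

If eth0 does not appear in the second list, the 'ip link show dev eth0' failures in platform.log are expected.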

Comment 3 Jan Pazdziora 2013-09-19 22:34:47 UTC
(In reply to Jhon Honce from comment #2)
> Thank you for providing the logs.  It appears that disk quotas are not
> set up on your image
> 
> September 18 01:19:22 INFO Shell command 'setquota --always-resolve -u
> 523937d36892df30a100001b 0 0 0 0 -a /' ran. rc=1 out
> 
> Additionally, 
> 
> 1) Does user 52383a186892df30a1000007 have uid == 500?

It's hard to tell for sure because after the failed application creation, the user is no longer in /etc/passwd to test. But this being Fedora, I would assume 1000 or 1001 or something similar, unless OpenShift explicitly sets this to 500.

> 2) eth0 is not what I would expect for a Fedora Ethernet interface; you can
> verify the interface names with ifconfig -a.

There is no eth0 interface on the machine, if that is what you are asking.

Comment 4 Krishna Raman 2013-12-09 20:55:50 UTC
You need to set conf_node_external_eth_dev in the puppet script to the correct interface to link against.

Additional docs here: https://github.com/openshift/puppet-openshift_origin

Does this fix the installation for you?

Comment 5 Jan Pazdziora 2013-12-10 13:52:27 UTC
When using http://openshift.github.io/documentation/oo_deployment_guide_puppet.html#configuring-an-all-in-one-host, things no longer fail with this error and the application gets created.

However, please note that the existence of eth0 is by no means guaranteed on the latest Fedora releases, so the puppet scripts might want to use some method other than assuming eth0 to pick the best default network interface.

Comment 7 Krishna Raman 2013-12-11 05:59:58 UTC
You are right, eth0 is not reliable, but unfortunately I don't think there is a good way for the puppet script to guess what the correct interface is.

We also have the oo-install script, which already has logic to introspect the machine and create an appropriate puppet script. Perhaps that is a better way to go than building this logic into the puppet module itself.

Are you ok with me marking this bug as fixed?

Comment 8 Jan Pazdziora 2013-12-11 06:35:01 UTC
(In reply to Krishna Raman from comment #7)
> You are right, eth0 is not reliable, but unfortunately I don't think there
> is a good way for the puppet script to guess what the correct interface is.

The default does not necessarily need to be correct in 100% of cases, just good enough not to fail in typical scenarios.

Can't puppet run the equivalent of

  ip route | perl -lane 'if ($F[0] eq "default") { print $F[4]; exit }'

to get the default?

> We also have the oo-install script which already has logic to introspect the
> machine and create an appropriate puppet script. Perhaps that is a better
> way to go rather than build this logic into puppet module itself.

But oo-install is just for all-in-one cases (and adding a node); it does not support individual components on separate machines like puppet does, does it?

If oo-install / install.openshift.com is now the preferred way, could you amend http://openshift.github.io/ to have it as the first option instead of "You can also build your own machine using Puppet"? I'll gladly switch to oo-install with my testing but I'd like to use what typical external users are using (if they don't download the whole images).

> Are you ok with me marking this bug as fixed?

Sure.
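
The default-route lookup suggested in this comment can also be sketched without perl. This is an illustration only, not what the puppet module does: the function name default_iface is hypothetical, and unlike the one-liner above it looks for the "dev" keyword rather than taking the fifth field, so it also handles default routes that have no "via" gateway.

```shell
# Print the interface of the first default route found in `ip route`
# output read from stdin. Prints nothing if there is no default route.
default_iface() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++) {
            if ($i == "dev") { print $(i + 1); exit }
        }
    }'
}

# Usage on a live machine:
#   ip route | default_iface
```

The result could then be used as the value for conf_node_external_eth_dev instead of a hard-coded eth0.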

Comment 9 Meng Bo 2013-12-13 07:36:19 UTC
# rhc create-app -a perl25788 -t perl-5.16 --no-git
Application Options
-------------------
Domain:     bmeng
Cartridges: perl-5.16
Gear Size:  default
Scaling:    no

Creating application 'perl25788' ... done


Waiting for your DNS name to be available ... done

Your application 'perl25788' is now available.

  URL:        http://perl25788-bmeng.example.com/
  SSH to:     52aab8616892df4b7d0000a3.com
  Git remote: ssh://52aab8616892df4b7d0000a3.com/~/git/perl25788.git/

Run 'rhc show-app perl25788' for more details about your app.


# rpm -qa  openshift* | sort
openshift-origin-broker-1.15.1-1.git.1671.ed159d4.fc19.noarch
openshift-origin-broker-util-1.18.0-1.git.112.114cfe1.fc19.noarch
openshift-origin-cartridge-10gen-mms-agent-1.29.1-1.git.0.cf31fb6.fc19.noarch
openshift-origin-cartridge-cron-1.17.0-1.git.38.caadcbc.fc19.noarch
openshift-origin-cartridge-diy-1.16.1-1.git.244.dec301c.fc19.noarch
openshift-origin-cartridge-haproxy-1.18.0-1.git.16.2070d75.fc19.noarch
openshift-origin-cartridge-jenkins-1.16.1-1.git.0.4a9d30c.fc19.noarch
openshift-origin-cartridge-jenkins-client-1.17.1-1.git.0.d44b524.fc19.noarch
openshift-origin-cartridge-mariadb-1.15.1-1.git.1188.1c787e7.fc19.noarch
openshift-origin-cartridge-mock-1.16.1-1.git.0.c2ab507.fc19.noarch
openshift-origin-cartridge-mock-plugin-1.16.1-1.git.0.a2d62d0.fc19.noarch
openshift-origin-cartridge-mongodb-1.17.1-1.git.0.bf4143e.fc19.noarch
openshift-origin-cartridge-nodejs-1.19.0-1.git.86.c257194.fc19.noarch
openshift-origin-cartridge-perl-1.17.0-1.git.60.dec301c.fc19.noarch
openshift-origin-cartridge-php-1.18.0-1.git.61.dec301c.fc19.noarch
openshift-origin-cartridge-phpmyadmin-1.17.1-1.git.0.5836a90.fc19.noarch
openshift-origin-cartridge-postgresql-1.18.0-1.git.58.bbe5a74.fc19.noarch
openshift-origin-cartridge-python-1.18.0-1.git.59.dec301c.fc19.noarch
openshift-origin-cartridge-ruby-1.18.0-1.git.62.dec301c.fc19.noarch
openshift-origin-console-1.15.1-1.git.1620.0550ff7.fc19.noarch
openshift-origin-msg-common-1.17.0-1.git.203.caadcbc.fc19.noarch
openshift-origin-msg-node-mcollective-1.18.0-1.git.72.70fc181.fc19.noarch
openshift-origin-node-proxy-1.17.0-1.git.230.63afa8c.fc19.noarch
openshift-origin-node-util-1.18.0-1.git.130.772cb11.fc19.noarch
openshift-origin-util-1.15.1-1.git.54.ea95274.fc19.noarch

Comment 11 Jan Pazdziora 2013-12-19 06:08:50 UTC
What does

   ifconfig eth0

return on your Fedora 19 OpenShift machine?

