Bug 906751 - "Node execution failure" problem when creating new scalable apps after server upgrade and migration
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OKD
Classification: Red Hat
Component: Pod
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Rajat Chopra
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-02-01 11:33 UTC by Jianwei Hou
Modified: 2015-05-15 02:13 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-13 23:00:00 UTC
Target Upstream Version:
Embargoed:


Attachments
development.log (88.31 KB, text/x-log)
2013-02-01 11:33 UTC, Jianwei Hou
mcollective log (56.91 KB, text/x-log)
2013-02-01 11:33 UTC, Jianwei Hou

Description Jianwei Hou 2013-02-01 11:33:13 UTC
Created attachment 691550
development.log

Description of problem:
After the server is updated and migrated, the "Node execution failure" error occurs every time a new scalable application is created.
Creating non-scalable applications still works.

Version-Release number of selected component (if applicable):
Upgrading devenv-stage_281 to latest devenv
rhc-broker-1.4.3-1.el6_3.noarch
rhc-node-1.4.3-1.el6_3.x86_64

How reproducible:
Always (tested 3 times, reproduced 3 times)

Steps to Reproduce:
1. Launch a devenv-stage_281 instance and create scalable apps against it.
2. SSH into the instance and switch devenv.repo from stage to candidate so that the instance upgrades to the latest available devenv:
sed -i 's/stage/candidate/g' /etc/yum.repos.d/devenv.repo
3. yum -y update
4. Restart rhc-datastore, since mongodb-server is updated as well.
5. cd /var/www/openshift/broker; rake tmp:clear
6. Execute migrate-mongo-2.0.23 and migrate-dynect-2.0.23.
7. rhc-admin-migrate --version 2.0.23 (this script is currently broken, but that does not affect reproducing the bug)
8. Create a new scalable application.

  
Actual results:
[hjw@hjwlaptop devenv]$ rhc app create php2s php-5.3 -s -px
Application Options
-------------------
  Namespace:  281t1
  Cartridges: php-5.3
  Gear Size:  default
  Scaling:    yes

Creating application 'php2s' ... Node execution failure (invalid exit code from node).  If the problem persists please contact Red Hat support.


Expected results:
Creating the scalable application should not fail.

Additional info:
development.log and mcollective.log are attached.

Comment 1 Jianwei Hou 2013-02-01 11:33:38 UTC
Created attachment 691551
mcollective log

Comment 2 Dan Mace 2013-02-01 19:30:00 UTC
The sequence of events in the MCollective logs indicates some sort of broker issue:

1. The broker sends an app-create message which is processed successfully by the node library, and a response is sent back to the broker.
2. The broker then sends a duplicate app-create message. The node library fails to create the application as the gear user already exists as of step 1, which is correct behavior.
3. The broker receives the failed reply from step 2 and rolls back app creation, deleting the skeletal app and user created in step 1 due to the message payload duplication.

The overall failure appears to be caused by the duplicate app-create messages sent in sequence, which is why I am reassigning this to the broker team for further analysis.
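
The three steps above can be illustrated with a toy model. This is only a sketch of the described behavior, not OpenShift code: node_app_create, node_app_destroy, broker_create_app and GEAR_DIR are hypothetical names, and a file in a temporary directory stands in for the gear user.

```shell
#!/bin/sh
# Toy model of the duplicate app-create sequence. All names are hypothetical.
GEAR_DIR="$(mktemp -d)"

# Node side: creating a gear fails if the gear user already exists.
node_app_create() {
    if [ -e "$GEAR_DIR/$1" ]; then
        echo "node: gear user $1 already exists" >&2
        return 1
    fi
    : > "$GEAR_DIR/$1"
}

# Node side: destroying a gear removes the gear user.
node_app_destroy() {
    rm -f "$GEAR_DIR/$1"
}

# Broker side: sends app-create $2 times; any failed reply triggers a rollback.
broker_create_app() {
    gear="$1"; sends="$2"; i=0
    while [ "$i" -lt "$sends" ]; do
        if ! node_app_create "$gear"; then
            node_app_destroy "$gear"   # rollback also removes what the first message created
            echo "Node execution failure"
            return 1
        fi
        i=$((i + 1))
    done
    echo "created $gear"
}

broker_create_app php2s 2   # duplicate message: second create fails, app is rolled back
broker_create_app php2s 1   # single message: create succeeds
```

In this model the node behaves correctly both times; the failure comes entirely from the second, duplicate message.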

Comment 3 Rajat Chopra 2013-02-02 00:25:04 UTC
Found the issue. The upgrade steps need a 'service mcollective restart' on the nodes before the broker cache is cleared.
Since the cartridge model has changed, mcollective needs to reload the new models to gather data about the new fields.
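
A minimal sketch of the corrected order, based on the reproduction steps in this report with the mcollective restart placed before the cache clear. The run helper, the upgrade_steps function and the DRY_RUN variable are assumptions added for illustration, as is the use of 'service rhc-datastore restart' for the datastore. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set explicitly.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_steps() {
    run sed -i 's/stage/candidate/g' /etc/yum.repos.d/devenv.repo
    run yum -y update
    # Assumption: rhc-datastore is restarted through the service command.
    run service rhc-datastore restart
    # Restart mcollective on the nodes first so the new cartridge model is loaded.
    run service mcollective restart
    # Only then clear the broker cache.
    run sh -c 'cd /var/www/openshift/broker && rake tmp:clear'
}

upgrade_steps
```

The migration steps from the report (migrate-mongo-2.0.23, migrate-dynect-2.0.23) would follow the cache clear as before; they are left out here because their exact invocation is not shown.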

Comment 4 Jianwei Hou 2013-02-04 07:23:24 UTC
Problem solved!

mcollective has to be restarted before the broker cache is cleared. After doing that, I was able to create scalable applications.
Moving this bug to verified.

