Bug 973718 - Rest API returns partially created app objects without core attributes
Status: CLOSED UPSTREAM
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Master
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Lili Nader
QA Contact: libra bugs
Reported: 2013-06-12 14:41 UTC by Clayton Coleman
Modified: 2015-05-15 00:54 UTC (History)

Doc Type: Bug Fix
Last Closed: 2013-08-08 14:33:14 UTC


Attachments
broker log (10.63 KB, text/plain)
2013-06-27 07:03 UTC, Xiaoli Tian
full broker log (229.93 KB, text/plain)
2013-06-27 07:13 UTC, Xiaoli Tian


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 971136 0 medium CLOSED [web-console] Partially created/deleted app makes my applications page not accessible 2021-02-22 00:41:40 UTC


Description Clayton Coleman 2013-06-12 14:41:35 UTC
The REST API returns partially populated applications if a user queries it after the application record has been created but before creation completes.  The framework attribute should ALWAYS be returned by the REST API if the application exists: there should never be a point where the application record exists in mongo (and is returnable by the REST API) but attributes that are required and always present are not returned.

Doing so breaks clients in unpredictable ways and breaks our API contract.  It's acceptable for distributed resources to be inaccessible.

I added a repro scenario in https://github.com/openshift/origin-server/pull/2823.  Other attributes may be missing as well; they all need to be present.
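The contract described above can be expressed as a small check against a single application as returned by the REST API. This is only a sketch, not the repro scenario from the pull request: "framework" is the one attribute named in this report, and the "name" key and the sample records are assumptions made for illustration.

```python
# Hypothetical list of core attributes. Only "framework" is named in this
# report; "name" is an assumed example of another always-present attribute.
REQUIRED_APP_ATTRIBUTES = ("name", "framework")


def missing_core_attributes(app):
    """Return the required attribute names that are absent or None in `app`."""
    return [key for key in REQUIRED_APP_ATTRIBUTES if app.get(key) is None]


# Assumed sample records: one complete, one seen before creation finished.
complete = {"name": "phpapp1", "framework": "php-5.3"}
partial = {"name": "phpapp1"}

print(missing_core_attributes(complete))
print(missing_core_attributes(partial))
```

An empty list means the record honours the contract; any other result names the attributes a client would trip over.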

=== From Kenny in bug 971136

I researched this and found that he had a partially deleted application in his account.  I did a search in the mcollective logs and found that his application had been destroyed:

I, [2013-06-05T13:56:44.292725 #31226]  INFO -- : openshift.rb:84:in `execute_action' Executing action [app-destroy] using method oo_app_destroy with args [{"--with-app-uuid"=>"51af78125973caa01100031e", "--with-app-name"=>"rssoverwolf", "--with-container-uuid"=>"51af78125973caa01100031e", "--with-container-name"=>"rssoverwolf", "--with-namespace"=>"ss2982", "--with-uid"=>6164, "--with-request-id"=>"62fd9ccb8915917409865e7b079507fe", "--cart-name"=>"openshift-origin-node"}]

Therefore I knew that his application just needed to be removed from the database.  This was fixed and the user was then able to access the applications page.

Specifically these lines below:

2013-06-05 14:01:12.187 [ERROR] Unhandled exception reference #0773dea08400c55880617881477e2aa8: undefined method `framework' for #<Application:0x000000060024d0>

Here was the error/output from the console log: 

2013-06-05 14:01:09.500 [DEBUG] Using cached domain vn (pid:19750)
2013-06-05 14:01:09.601 [DEBUG] OpenShift API (98.5ms) get https://localhost:443/broker/rest/user/keys.json [ code: 200 ] (pid:19750)
2013-06-05 14:01:09.613 [DEBUG] OpenShift API (95.5ms) get https://localhost:443/broker/rest/user/authorizations.json [ code: 200 ] (pid:19750)
2013-06-05 14:01:09.616 [INFO ] Rendered /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-console-1.9.11/app/views/settings/_keys.html.haml (1.3ms) (pid:19750)
2013-06-05 14:01:09.617 [INFO ] Rendered /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-console-1.9.11/app/views/settings/_domains.html.haml (0.4ms) (pid:19750)
2013-06-05 14:01:09.619 [INFO ] Rendered /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-console-1.9.11/app/views/settings/_authorizations.html.haml (1.8ms) (pid:19750)
2013-06-05 14:01:09.619 [INFO ] Rendered /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-console-1.9.11/app/views/settings/show.html.haml within layouts/console (4.7ms) (pid:19750)
2013-06-05 14:01:09.621 [INFO ] Rendered /opt/rh/ruby193/root/usr/share/gems/gems


===== From Thomas

We're seeing this bug in PROD because of our app create loop checks. It seems that when the console gets the app list back, some of the apps are either half created or half deleted which causes this bug to show.

The problem goes away once the app has been fully created / deleted.

Comment 2 Xiaoli Tian 2013-06-14 08:10:24 UTC
Tried it without the fix, but could not reproduce the problem in comment 0.

If the group_instances or component_instances content is removed from the mongo record:
libra_rs:PRIMARY> db.applications.update ({name: "phpapp1"},{$set: {group_instances:""}} )
libra_rs:PRIMARY> db.applications.find ({name: "phpapp1"},{group_instances:1} )
{ "_id" : ObjectId("51bab5371415e5f8ef000038"), "group_instances" : "" }

rhc domain-show, rhc app-show and the web console will show only part of the app information:
  phpapp1 @ http://phpapp1-domx1.dev.rhcloud.com/ (uuid: efe1f106d4b911e2815412313d06d0fb)
  ----------------------------------------------------------------------------------------
    Created: 2:16 AM
    Gears:   0 (defaults to small)
    Git URL: ssh:///~/git/phpapp1.git/
    SSH:     ssh://


After the fix in comment 1, trying to show the app from rhc or the web console returns the following message:
[root@ip-10-112-223-9 node]# rhc app show -a phpapp1
Application 'phpapp1' not found for domain 'domx1'

Half-created or half-deleted apps in mongo are also no longer shown by the web console or rhc domain show.


If all the data in mongo is complete and only the data on the node is removed:
[root@ip-10-112-223-9 node]# oo-app-destroy --with-container-uuid  446007632237752334942208  --with-container-name phpapp2 --with-app-uuid 446007632237752334942208  --with-app-name phpapp2  --with-namespace domx1
[root@ip-10-112-223-9 node]# ls -lh /var/lib/openshift/446007632237752334942208
ls: cannot access /var/lib/openshift/446007632237752334942208: No such file or directory

The app is still shown correctly by rhc and the web console:
# rhc app show -a phpapp2
phpapp2 @ http://phpapp2-domx1.dev.rhcloud.com/ (uuid: 446007632237752334942208)
--------------------------------------------------------------------------------
  Created: 3:54 AM
  Gears:   1 (defaults to small)
  Git URL: ssh://446007632237752334942208.rhcloud.com/~/git/phpapp2.git/
  SSH:     446007632237752334942208.rhcloud.com

  php-5.3 (PHP 5.3)
  -----------------
    Gears: 1 small

Please help confirm whether the above is expected.

Thanks

Comment 3 Xiaoli Tian 2013-06-14 08:54:43 UTC
Alternatively, simulate a partially created app by commenting out the call (#self.run_jobs(result_io)) in the add_features function in
 /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.10.1/app/models/application.rb 

This creates an app without group_instances and component_instances, which gives the same result as in comment 2.

With the fix (latest devenv_3360) the app is not shown; without the fix it is shown partially.

Comment 4 Lili Nader 2013-06-14 17:55:45 UTC
This fix does not address all the partially created scenarios. The broker team will look into a more comprehensive fix later.  However, even that fix will not address cases where the mongo record has been changed manually.

The eventual fix will be that, while the application is being created, the REST API returns a 404 "not found" error.  A GET request should only return the app once it is completely created.

This quick fix checks that at least the first gear has been created before returning the app.

To verify this bug, create an app using rhc, the console or the REST API.  While it is being created, try to retrieve the same app using rhc or REST.  You should get a 404 "not found" message.
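The quick fix described in this comment might look roughly like the following. It is not the broker code: the record is modelled as a plain dictionary, the group_instances field name comes from this bug, and the nested "gears" list and the response bodies are assumptions.

```python
def first_gear_exists(record):
    """True when at least one gear is present in the record's group instances."""
    groups = record.get("group_instances") or []
    if not isinstance(groups, list):
        return False  # e.g. the field was blanked to a non-list value by hand
    # "gears" is an assumed field name for the gears of one group instance.
    return any(group.get("gears") for group in groups if isinstance(group, dict))


def show_application(record):
    """Return (http_status, body) for a GET on a single application."""
    if record is None or not first_gear_exists(record):
        # Hide the record until creation has progressed far enough.
        return 404, {"status": "not_found"}
    return 200, {"status": "ok", "name": record["name"]}


creating = {"name": "phpapp1", "group_instances": []}
created = {"name": "phpapp1", "group_instances": [{"gears": [{"uuid": "g1"}]}]}
print(show_application(creating)[0])
print(show_application(created)[0])
```

A record whose group_instances was set to an empty string, as in comment 2, is treated the same way as one that has no gears yet.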

Comment 5 Xiaoli Tian 2013-06-17 06:59:56 UTC
According to comments 2, 3 and 4, this bug is fixed.

Comment 6 openshift-github-bot 2013-06-26 01:07:28 UTC
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/5992b610359b66bea8b0f78f4a1eb8cfc59ab911
Reverting fix for bug 973718
 - instead of not returning the broken apps, we are relying on the CLI/console resilience to handle these broken apps for now

Comment 7 Abhishek Gupta 2013-06-26 20:41:05 UTC
The fix for this bug was reverted. Now, instead of not returning the broken applications, we are relying on the CLI/console resilience to handle these broken applications.

Please re-verify this bug based on this change.
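What "resilience" could mean on the client side, as a sketch only: the function below is not rhc or console code, and the dictionary shape and the placeholder wording are assumptions. The point is that a missing framework attribute produces a degraded row instead of an unhandled exception.

```python
def describe_app(app):
    """Build a one-line summary without assuming every attribute is present."""
    name = app.get("name", "(unnamed)")
    framework = app.get("framework")
    if framework is None:
        # A partially created or deleted app: report it rather than crash.
        return f"{name}: incomplete (creation or deletion in progress)"
    return f"{name}: {framework}"


def describe_apps(apps):
    """Summarize every app in the list, including incomplete ones."""
    return [describe_app(app) for app in apps]


for line in describe_apps([{"name": "phpapp2", "framework": "php-5.3"},
                           {"name": "phpapp1"}]):
    print(line)
```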

Comment 8 zhaozhanqi 2013-06-27 05:11:42 UTC
Tested this bug on devenv_3419; it has been fixed.

The steps are:

  Open two tabs: one to create an app, the other to retrieve the same app using the REST API.
1. Create one app:
  rhc app create zqjbossews6 python-3.3 -s
2. Immediately after starting step 1, run the following command:
 
[zqzhao@dhcp-13-222 myshell]$ curl -w %{http_code} -k -H 'Accept:application/xml' --user zzhao:redhat https://ec2-54-235-3-152.compute-1.amazonaws.com/broker/rest/domains/zqd/applications/zqjbossews6
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <status>not_found</status>
  <type nil="true"></type>
  <data>
    <datum nil="true"></datum>
  </data>
  <messages>
    <message>
      <severity>error</severity>
      <text>Application 'zqjbossews6' not found for domain 'zqd'</text>
      <exit-code>101</exit-code>
      <field nil="true"></field>
    </message>
  </messages>
  <version>1.5</version>
  <api-version>1.5</api-version>
  <supported-api-versions>
    <supported-api-version>1.0</supported-api-version>
    <supported-api-version>1.1</supported-api-version>
    <supported-api-version>1.2</supported-api-version>
    <supported-api-version>1.3</supported-api-version>
    <supported-api-version>1.4</supported-api-version>
    <supported-api-version>1.5</supported-api-version>
  </supported-api-versions>
</response>
404
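For a scripted version of this check, the response body can be parsed instead of read by eye. The sketch below uses a trimmed copy of the response above as sample input; the summarize helper is hypothetical and only relies on the element names visible in that response.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response shown above.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<response>
  <status>not_found</status>
  <messages>
    <message>
      <severity>error</severity>
      <text>Application 'zqjbossews6' not found for domain 'zqd'</text>
      <exit-code>101</exit-code>
    </message>
  </messages>
</response>"""


def summarize(xml_text):
    """Extract the status and the messages from a broker XML response."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    messages = [
        {
            "severity": message.findtext("severity"),
            "exit_code": message.findtext("exit-code"),
            "text": message.findtext("text"),
        }
        for message in root.iter("message")
    ]
    return {"status": root.findtext("status"), "messages": messages}


result = summarize(SAMPLE)
print(result["status"], result["messages"][0]["exit_code"])
```

Note that curl -w %{http_code} appends the status code after the body, so the trailing 404 has to be split off before the text is handed to the parser.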

Comment 9 Xiaoli Tian 2013-06-27 06:28:43 UTC
Tested on devenv_3419:

1) Create a broken app whose creation has not finished by commenting out the call (# self.run_jobs(result_io)) in /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.11.1/app/models/application.rb +488

After creating the app, it is only partially created. rhc returns the partially created app, and the My Applications page in the web console shows the partially created app as well:

# rhc app show phpapp1
phpapp1 @ http://phpapp1-domx1.dev.rhcloud.com/ (uuid: 577579901829902900396032)
--------------------------------------------------------------------------------
  Created: 2:02 AM
  Gears:   0 (defaults to small)
  Git URL: ssh:///~/git/phpapp1.git/
  SSH:     ssh://

2) Trying to add another cartridge to the partially created app returns the following error message from rhc and the web console:
# rhc cartridge-add -a phpapp1 mysql-5.1
Adding mysql-5.1 to application 'phpapp1' ... 
Unable to complete the requested operation due to: 
Problem:
  Document not found for class ComponentInstance with attributes {:cartridge_name=>"mysql-5.1", :component_name=>"mysql-5.1"}.
Summary:
  When calling ComponentInstance.find_by with a hash of attributes, all attributes provided must match a document in the database or this error will be raised.
Resolution:
  Search for attributes that are in the database or set the Mongoid.raise_not_found_error configuration option to false, which will cause a nil to be returned instead of raising this error..
Reference ID: b044e836b0d79eda90753ba0ce075f42

3) The app can be deleted via rhc.

4) If step 3 is skipped and oo-admin-clear-pending-ops -u  51cbd961fa61081840000002 -t 0.001 is run instead:

Executing op for app (51cbd961fa61081840000002) - #<PendingAppOpGroup _id: 51cbd962fa6108cc46000051, _type: nil, created_at: 2013-06-27 06:19:14 UTC, updated_at: 2013-06-27 06:19:14 UTC, op_type: :add_features, args: {"features"=>["php-5.3"], "group_overrides"=>[], "init_git_url"=>nil}, parent_op_id: nil, num_gears_added: 0, num_gears_removed: 0, num_gears_created: 0, num_gears_destroyed: 0, num_gears_rolled_back: 0, user_agent: "rhc/1.11.1 (ruby 1.8.7; x86_64-linux) (2.3.2, ruby 1.8.7 (2011-06-30) [x86_64-linux])"> 

The app will be deleted as well.

For step 2, should we return a more meaningful error message to the user, such as "your app is partially created/deleted; you can delete your app and re-create it"?

Not sure whether this should be fixed in the REST API or in rhc/web console, so moving this bug back for devel to confirm.
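One possible shape for the friendlier message suggested for step 2, wherever it ends up being implemented. This is a sketch under assumptions: the function and the wording of the replacement message are hypothetical, and the match relies only on the error text shown in step 2.

```python
# Marker taken from the error text in step 2.
PARTIAL_APP_MARKER = "Document not found for class ComponentInstance"


def friendly_error(raw_message, app_name):
    """Replace the low-level lookup error with advice the user can act on."""
    if PARTIAL_APP_MARKER in raw_message:
        return (f"Application '{app_name}' is only partially created or deleted. "
                "Delete it and create it again.")
    return raw_message  # leave every other error untouched


raw = "Problem:\n  Document not found for class ComponentInstance with attributes {...}."
print(friendly_error(raw, "phpapp1"))
```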

Comment 10 Xiaoli Tian 2013-06-27 07:01:47 UTC
Btw, in step 3, deleting the broken app triggers the remaining operations left over from step 2 (trying to add mysql to the app); not sure why that is needed.

# rhc app-delete -a phpapp4 
This is a non-reversible action! Your application code and data will be permanently deleted if you continue!

Are you sure you want to delete the application 'phpapp4'? (yes|no): yes

Deleting application 'phpapp4' ... deleted

Application phpapp4 is deleted.
The cartridge php deployed a template application

mysql-5.1: Connection URL: mysql://$OPENSHIFT_MYSQL_DB_HOST:$OPENSHIFT_MYSQL_DB_PORT

Starting Apache+mod_php HTTPD server

MySQL 5.1 database added.  Please make note of these credentials:
       Root User: adminyV3axgv
   Root Password: WtsncCVp8iJW
   Database Name: phpapp4
Connection URL: mysql://$OPENSHIFT_MYSQL_DB_HOST:$OPENSHIFT_MYSQL_DB_PORT/
You can manage your new MySQL database by also embedding phpmyadmin-3.4.
The phpmyadmin username and password will be the same as the MySQL credentials above.

Comment 11 Xiaoli Tian 2013-06-27 07:03:42 UTC
Created attachment 765964 [details]
broker log

Attached is the broker log for all of the steps.

Comment 12 Xiaoli Tian 2013-06-27 07:13:10 UTC
Created attachment 765965 [details]
full broker log

Sorry, the attachment in the previous comment was not complete.

Comment 13 Lili Nader 2013-07-26 17:59:41 UTC
- An app record will never be saved to Mongo unless it has at least one component instance.

- Also, application delete should never leave the application in a state where there is no component instance in the app record.
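The two invariants above can be illustrated with an in-memory stand-in for the applications collection. This is not the broker implementation: the dictionary store, the exception class and both function names are hypothetical; only the component_instances field name is taken from this bug.

```python
class IncompleteApplicationError(ValueError):
    """Raised when a record without any component instance would be saved."""


def save_application(store, record):
    """Persist `record` only if it has at least one component instance."""
    if not record.get("component_instances"):
        raise IncompleteApplicationError(
            f"refusing to save '{record.get('name')}' without a component instance")
    store[record["name"]] = record


def delete_application(store, name):
    """Remove the whole record in one step, never just its components."""
    store.pop(name, None)


store = {}  # stand-in for the mongo applications collection
save_application(store, {"name": "phpapp1",
                         "component_instances": [{"cartridge_name": "php-5.3"}]})
delete_application(store, "phpapp1")
print(store)
```

With both rules in place, a reader of the store sees either a record with at least one component instance or no record at all.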

