Bug 983034

Summary: oo-admin-chk fails with a script error when Component Instances is missing
Product: OpenShift Online
Reporter: zhaozhanqi <zzhao>
Component: Pod
Assignee: Abhishek Gupta <abhgupta>
Status: CLOSED CURRENTRELEASE
QA Contact: libra bugs <libra-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 2.x
CC: jhou, rpenta, xtian
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-22 15:23:41 UTC
Type: Bug

Description zhaozhanqi 2013-07-10 10:46:10 UTC
Description of problem:
Create an app of any type, then delete the app's "component_instances" attribute in the mongo db and run oo-admin-chk:
# oo-admin-chk -l 1
Started at: 2013-07-10 06:19:49 -0400
/usr/sbin/oo-admin-chk:320:in `block in <main>': undefined method `each' for nil:NilClass (NoMethodError)
        from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.11.6/lib/openshift/data_store.rb:24:in `block (2 levels) in find'

Running 'oo-admin-chk -l 1' should not produce a script error.

Version-Release number of selected component (if applicable):
devenv_3473

How reproducible:
always

Steps to Reproduce:
1. Create one app.
2. Open RockMongo at https://<devenv_ip>/datastore and delete the component_instances attribute from the application's document.
  For example, delete the following:
     "component_instances": [
       {
         "_id": ObjectId("514992c6860c5d8ec5000018"),
         "component_properties": [
         ],
         "cartridge_name": "php-5.3",
         "component_name": "php-5.3",
         "group_instance_id": ObjectId("514992c6860c5d8ec5000014")
       }
     ],
3. SSH into the instance and run 'oo-admin-chk -l 1'.


Actual results:
[root@ip-10-152-185-203 runtime]# oo-admin-chk -l 1
Started at: 2013-07-10 06:19:49 -0400
/usr/sbin/oo-admin-chk:320:in `block in <main>': undefined method `each' for nil:NilClass (NoMethodError)
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.11.6/lib/openshift/data_store.rb:24:in `block (2 levels) in find'
	from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/cursor.rb:286:in `each'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.11.6/lib/openshift/data_store.rb:23:in `block in find'
	from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/collection.rb:276:in `find'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.11.6/lib/openshift/data_store.rb:22:in `find'
	from /usr/sbin/oo-admin-chk:245:in `<main>'

Expected results:
The script should report:
Application 'zqd' with Id '514992c6860c5d8ec5000014' doesn't have any component instances
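
For illustration, a minimal Ruby sketch of a guard that would report the message above instead of raising. This is not the actual oo-admin-chk code: the helper name check_component_instances and the plain-hash document are assumptions made for the example.

```ruby
# Hypothetical helper, NOT the real oo-admin-chk implementation.
# It takes an application document as a plain Hash and returns a list of
# error messages instead of raising when "component_instances" is absent.
def check_component_instances(app)
  # app["component_instances"] is nil when the attribute has been removed
  # from the document; calling nil.each is what raises the NoMethodError
  # seen in the trace above. Falling back to [] avoids that.
  instances = app["component_instances"] || []

  if instances.empty?
    return ["Application '#{app["name"]}' with Id '#{app["_id"]}' " \
            "doesn't have any component instances"]
  end

  []
end

# A document shaped like the one in this report, with the attribute deleted.
app = { "_id" => "514992c6860c5d8ec5000014", "name" => "zqd" }
puts check_component_instances(app)
```

The fallback treats a missing attribute the same way as an empty one, which is only one possible design; a stricter check could report the missing attribute as a separate error.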

Additional info:

Comment 1 Abhishek Gupta 2013-07-10 16:59:04 UTC
This is an invalid test case. You can empty out the component instances but the "component_instances" attribute should still be present in the application document within mongo. A bug in the code can potentially delete all component instances but the attribute still is not expected to be removed.

Comment 2 zhaozhanqi 2013-07-11 06:08:22 UTC
I don't think it's an invalid test case. We have always used this method to check for Group Instances without Component Instances, and it gave the expected result, such as "Application 'zqd' with Id '514992c6860c5d8ec5000014' doesn't have any component instances". At the very least, the script should not fail with a script error.

Comment 3 zhaozhanqi 2013-07-11 06:43:29 UTC
If that case is invalid, could you suggest a way to check Group Instances without Component Instances, or vice versa? Thanks.

Comment 4 Xiaoli Tian 2013-07-11 06:53:16 UTC
(In reply to Abhishek Gupta from comment #1)
> This is an invalid test case. You can empty out the component instances but
> the "component_instances" attribute should still be present in the
> application document within mongo. A bug in the code can potentially delete
> all component instances but the attribute still is not expected to be
> removed.

This may happen when an app is partially created but not rolled back successfully, as in bug https://bugzilla.redhat.com/show_bug.cgi?id=973718

Since we have improved the rollback function, I'm not sure whether this can still happen.

Comment 5 Abhishek Gupta 2013-07-11 17:20:40 UTC
To test for scenarios where group instances or component instances are missing, you can remove the individual group instances or component instances. You can even remove all the component instances or group instances from an application. 

However, you cannot remove the component_instances or group_instances attribute. They can be set to an empty array ( [] ) or an empty hash ( {} ) but the attribute itself has to be present. No bug in the code, whether partial creates, failed rollbacks, or anything else, is expected to remove the attribute in the application document in mongo. That would be a much bigger issue.
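
The distinction drawn here (attribute present but empty versus attribute removed) can be sketched in Ruby as follows. The function attribute_state and the sample hashes are hypothetical and only illustrate the three states; they are not taken from the broker code.

```ruby
# Hypothetical classifier, for illustration only.
# Returns :attribute_missing when the key is absent from the document,
# :empty when it is present but holds nil, [] or {}, and :populated otherwise.
def attribute_state(app, key)
  return :attribute_missing unless app.key?(key)

  value = app[key]
  return :empty if value.nil? || value.empty?

  :populated
end

# A valid test document: both attributes are present but emptied out.
emptied = { "name" => "d1", "component_instances" => [], "group_instances" => [] }

# The state considered invalid in this comment: the attributes are gone.
removed = { "name" => "zqd" }

%w[component_instances group_instances].each do |key|
  puts "#{key}: emptied=#{attribute_state(emptied, key)} removed=#{attribute_state(removed, key)}"
end
```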

Comment 6 Jianwei Hou 2013-07-12 02:38:38 UTC
Verified on devenv_3489

Set component_instances and group_instances to empty arrays, then run 'oo-admin-chk -l 1'. The script identifies the problem and the check fails.

libra_rs:PRIMARY> db.applications.update({"name":"d1"},{$set:{component_instances:[],group_instances:[]}})

libra_rs:PRIMARY> db.applications.findOne({"name":"d1"})
{
	"_id" : ObjectId("51df6a90a4cf8f7b9e0000b0"),
	"analytics" : {
		"user_agent" : "rhc/1.11.4 (ruby 1.9.3; x86_64-linux) (2.3.2, ruby 1.9.3 (2013-06-27) [x86_64-linux])"
	},
	"canonical_name" : "d1",
	"component_configure_order" : [ ],
	"component_instances" : [ ],
	"component_start_order" : [ ],
	"component_stop_order" : [ ],
	"created_at" : ISODate("2013-07-12T02:31:44.354Z"),
	"default_gear_size" : "small",
	"domain_id" : ObjectId("51df6a76a4cf8f7b9e0000a5"),
	"domain_requires" : [ ],
	"downloaded_cart_map" : {
		
	},
	"group_instances" : [ ],
	"group_overrides" : [
		{
			"components" : [
				{
					"comp" : "diy-0.1",
					"cart" : "diy-0.1"
				}
			],
			"max_gears" : 1
		}
	],
	"init_git_url" : null,
	"name" : "d1",
	"pending_op_groups" : [ ],
	"scalable" : false,
	"updated_at" : ISODate("2013-07-12T02:31:44.711Z"),
	"user_ids" : [ ],
	"uuid" : "531520479388279129505792"
}


[root@ip-10-165-39-243 ~]# oo-admin-chk -l 1
Started at: 2013-07-11 22:36:05 -0400
Time to fetch mongo data: 0.327s
Total gears found in mongo: 7
Time to get all gears from nodes: 20.97s
Total gears found on the nodes: 8
Total nodes that responded : 1
Time to get all sshkeys for all gears from nodes: 20.08s
Total gears found on the nodes: 7
Total nodes that responded : 1
Check failed.
User jhou has a mismatch in consumed gears (1) and actual gears (0)
Gear 531520479388279129505792 exists on node ip-10-165-39-243 (uid: 503) but does not exist in mongo database
Found usage record for gear Id '51df6a90a4cf8f7b9e0000b0' but could not find corresponding gear in the application.
Found usage record for gear Id '51df6b36a4cf8f33060002d2' but could not find corresponding gear in the application.
Please refer to the oo-admin-repair tool to resolve some of these inconsistencies.
Total time: 43.072s
Finished at: 2013-07-11 22:36:48 -0400