Bug 1020152 - [origin_broker_108]Gear endpoints are not being migrated properly by datastore migration
Status: CLOSED CURRENTRELEASE
Product: OpenShift Online
Classification: Red Hat
Component: Pod
2.x
Unspecified Unspecified
medium Severity medium
Assigned To: Rajat Chopra
libra bugs
Reported: 2013-10-17 03:21 EDT by Jianwei Hou
Modified: 2015-05-14 20:22 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-23 22:24:44 EST
Type: Bug


Attachments
mongodump before migration on devenv-stage_486 (32.03 KB, application/gzip)
2013-10-17 03:21 EDT, Jianwei Hou
mongodump after compatible datastore migration (32.24 KB, application/gzip)
2013-10-17 03:22 EDT, Jianwei Hou
2013-10-24 mongodump after migration, 3 failed gears (38.91 KB, application/gzip)
2013-10-25 01:39 EDT, Jianwei Hou

Description Jianwei Hou 2013-10-17 03:21:59 EDT
Created attachment 813210 [details]
mongodump before migration on devenv-stage_486

Description of problem:
Ran rhc-admin-migrate-datastore to migrate application endpoints. After the migration, checked the port_interfaces of the scalable applications and found that the protocols for some cartridges were not updated correctly.


Version-Release number of selected component (if applicable):
On devenv_3907

How reproducible:
Always

Steps to Reproduce:
1. Create scalable applications with all possible cartridges on the devenv-stage_486 AMI (endpoints on this AMI do not have the 'protocols' and 'type' fields before migration)
2. Upgrade the server to the latest version (devenv_3907)
3. Clear the broker cache, then restart the broker and ruby193-mcollective
4. Run the datastore migration:
rhc-admin-migrate-datastore --compatible --version 2.0.35
5. Use the mongo shell to verify that the 'protocols' and 'type' fields of the endpoints are migrated properly
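
Step 5 can also be scripted once the application documents are loaded as Ruby hashes. The following is a minimal sketch, not part of the migration tool. It assumes each application document nests its endpoints as group_instances -> gears -> port_interfaces (a layout inferred from this report, not confirmed), and the helper name unmigrated_endpoints is hypothetical.

```ruby
# Hypothetical helper: list endpoints that still lack the migrated fields.
# Assumed layout: app["group_instances"][]["gears"][]["port_interfaces"][]
REQUIRED_FIELDS = %w[protocols type].freeze

def unmigrated_endpoints(app)
  missing = []
  (app["group_instances"] || []).each do |group|
    (group["gears"] || []).each do |gear|
      (gear["port_interfaces"] || []).each do |endpoint|
        # A field counts as missing when it is absent or empty.
        absent = REQUIRED_FIELDS.select { |field| Array(endpoint[field]).empty? }
        next if absent.empty?
        missing << { "gear" => gear["_id"].to_s,
                     "cartridge_name" => endpoint["cartridge_name"],
                     "missing" => absent }
      end
    end
  end
  missing
end
```

An empty result for every application would indicate that both fields were populated on all endpoints; it says nothing about whether the values themselves are correct.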

Actual results:
1. For the db cartridges (mysql, postgresql, mongodb), 'protocols' is updated to 'tcp'
2. On the devenv-stage_486 AMI, the group_instances of a scalable application do not include the 'haproxy-1.4' cartridge, so after migration the apps still have no haproxy endpoints in their group_instances

Expected results:
1. The 'protocols' value of each db cartridge should be its db name, as defined in its manifest
2. The haproxy-1.4 endpoints should be added to the app's group_instances
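
The first expectation can be written as a small check. This sketch assumes the expected protocol is the cartridge name with its version suffix removed (for example 'mysql' for a cartridge named 'mysql-5.1'). The report only says the value is defined by the cartridge manifest, so the name-derivation rule and the helper names db_name and db_protocol_ok? are hypothetical.

```ruby
# db cartridges named in this report
DB_CARTRIDGES = %w[mysql postgresql mongodb].freeze

# Assumption: strip a trailing "-<version>" to get the db name.
def db_name(cartridge_name)
  base = cartridge_name.to_s.sub(/-[\d.]+\z/, "")
  DB_CARTRIDGES.include?(base) ? base : nil
end

# True when the endpoint is not a db cartridge, or when its
# 'protocols' list contains the db name.
def db_protocol_ok?(endpoint)
  name = db_name(endpoint["cartridge_name"])
  return true if name.nil?
  Array(endpoint["protocols"]).include?(name)
end
```

Under this rule, an endpoint whose 'protocols' is only 'tcp' (the actual result above) would be flagged for a db cartridge.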

Additional info:
I will attach before and after versions of mongodump
Comment 1 Jianwei Hou 2013-10-17 03:22:32 EDT
Created attachment 813211 [details]
mongodump after compatible datastore migration
Comment 2 Rajat Chopra 2013-10-22 20:36:02 EDT
Fixed with https://github.com/openshift/li/pull/2026
The haproxy of older apps needs to get its ports exposed.

Regarding database gears: check inside the db gear. The mysql manifest in the gear should reflect the appropriate protocols. If it does not, the upgrade process needs to be run with the :ignore_cartridge_version flag, after which the manifest should be updated. The likely reason is that the cartridge version had not been bumped at the time of testing.
Comment 3 Jianwei Hou 2013-10-24 03:01:07 EDT
Verified on devenv_3937: the haproxy endpoints are migrated successfully. Sample:

{
    "_id" : ObjectId("5268b7c295b32b21cb000023"),
    "cartridge_name" : "haproxy-1.4",
    "external_port" : "38206",
    "internal_address" : "127.2.5.130",
    "internal_port" : 8080,
    "protocols" : [
        "http",
        "ws"
    ],
    "type" : [
        "load_balancer"
    ],
    "mappings" : [
        {
            "frontend" : "",
            "backend" : ""
        },
        {
            "frontend" : "/health",
            "backend" : "/configuration/health"
        }
    ]
},
Comment 4 Jianwei Hou 2013-10-24 06:45:30 EDT
Sorry, when testing this script again I found that if the exit_code is non-zero, a stack trace is printed and the program aborts.

[root@ip-10-151-19-45 ~]# rhc-admin-migrate-datastore --compatible --version 2.0.35
/usr/bin/rhc-admin-migrate-datastore:52:in `block in expose_haproxy_port': undefined method `_id' for "5268e8c719b7dea4bd0002a5":String (NoMethodError)
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.16.3/app/models/remote_job.rb:125:in `call'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.16.3/app/models/remote_job.rb:125:in `block (2 levels) in get_parallel_run_results'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.16.3/app/models/remote_job.rb:124:in `each'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.16.3/app/models/remote_job.rb:124:in `block in get_parallel_run_results'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.16.3/app/models/remote_job.rb:123:in `each'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.16.3/app/models/remote_job.rb:123:in `get_parallel_run_results'
	from /usr/bin/rhc-admin-migrate-datastore:48:in `expose_haproxy_port'
	from /usr/bin/rhc-admin-migrate-datastore:21:in `migrate'
	from /usr/bin/rhc-admin-migrate-datastore:208:in `<main>'


code block:
  RemoteJob.get_parallel_run_results(handle) { |tag, gear, stdout, exit_code|
    if exit_code!=0
      puts "Error in executing parallel job for gear #{gear._id.to_s}"
    else
      exposed += 1
    end
  }
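
The trace shows that the block can receive the gear as a plain String id rather than an object that responds to _id. One way to make the log line tolerate both is sketched below. This is an illustration only, not necessarily the change made in the actual fix; gear_id_for_log is a hypothetical helper, and the results array is a stand-in for what the parallel job yields.

```ruby
# Hypothetical helper: accept either a gear object or a bare id string.
def gear_id_for_log(gear)
  gear.respond_to?(:_id) ? gear._id.to_s : gear.to_s
end

# Stand-in data for demonstration: [tag, gear, stdout, exit_code]
GearStub = Struct.new(:_id)
results = [
  ["tag", "5268e8c719b7dea4bd0002a5", "", 1],  # gear passed as a String id
  ["tag", GearStub.new("abc123"), "", 0]       # gear passed as an object
]

exposed = 0
results.each do |_tag, gear, _stdout, exit_code|
  if exit_code != 0
    puts "Error in executing parallel job for gear #{gear_id_for_log(gear)}"
  else
    exposed += 1
  end
end
```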
Comment 5 Rajat Chopra 2013-10-24 19:03:59 EDT
Fixed with https://github.com/openshift/li/pull/2039
Comment 6 Jianwei Hou 2013-10-25 01:39:03 EDT
Tested on devenv_3942; the stack trace is fixed. However, 3 gears failed the mongo migration. After investigation, they turned out to be 1 jbossas gear and 2 jbosseap gears. I could not figure out why they failed, so I have attached the mongodump. Can you please confirm whether this result is acceptable? Thanks

[root@ip-10-139-13-20 ~]# rhc-admin-migrate-datastore --compatible --version 2.0.35
Starting migration: compatible
Error in executing parallel job for gear 5269e21b4b4e3f9e380002a5
Error in executing parallel job for gear 5269e3064b4e3f9e38000317
Error in executing parallel job for gear 5269e7ab4b4e3f9e380005e7
Done migrating 12/15 applications.
Time to get all gears from nodes: 4.926s
Total gears found on the nodes: 47
Comment 7 Jianwei Hou 2013-10-25 01:39:45 EDT
Created attachment 816007 [details]
2013-10-24 mongodump after migration, 3 failed gears
Comment 8 Rajat Chopra 2013-10-25 10:27:38 EDT
Could you attach the broker log and mcollective logs also?
Comment 9 Rajat Chopra 2013-10-25 18:38:21 EDT
Found the issue. JBoss gears throw this error because they already have 5 ports exposed, and 5 is the limit, so exposing the haproxy port fails there.

That should be considered expected and we will take care of these ones as a separate task.
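
As a rough illustration of why these gears fail, the check amounts to comparing the number of endpoints already exposed against the limit. The constant and method names below are hypothetical; only the limit of 5 comes from this comment.

```ruby
# Limit quoted in this comment; the name is an assumption for illustration.
MAX_EXPOSED_PORTS = 5

# True when a gear with the given port_interfaces has room for one more port.
def can_expose_another_port?(port_interfaces)
  port_interfaces.size < MAX_EXPOSED_PORTS
end
```

A gear that already has 5 entries in port_interfaces has no room left for the haproxy endpoint.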
Comment 10 Meng Bo 2013-10-28 03:47:39 EDT
Checked the upgrade from devenv-stage_528 to devenv_3953; the only affected gears are the scalable jbossas and jbosseap gears. As noted in comment 9, this is the expected result.

We will move the bug to verified.

@rchopra

Will the scalable jboss issue be fixed in sprint 35? And do we need to open a new bug to track it?
Comment 11 Rajat Chopra 2015-01-04 23:12:20 EST
Closing out on the 'needinfo' here. The jboss issue was fixed subsequently - the master branch already has the fixed code.
