Bug 1020152 - [origin_broker_108]Gear endpoints are not being migrated properly by datastore migration
Product: OpenShift Online
Classification: Red Hat
Component: Pod
Priority: medium, Severity: medium
Assigned To: Rajat Chopra
Reported: 2013-10-17 03:21 EDT by Jianwei Hou
Modified: 2015-05-14 20:22 EDT (History)
Doc Type: Bug Fix
Last Closed: 2014-01-23 22:24:44 EST
Type: Bug

Attachments
mongodump before migration on devenv-stage_486 (32.03 KB, application/gzip)
2013-10-17 03:21 EDT, Jianwei Hou
mongodump after compatible datastore migration (32.24 KB, application/gzip)
2013-10-17 03:22 EDT, Jianwei Hou
2013-10-24 mongodump after migration, 3 failed gears (38.91 KB, application/gzip)
2013-10-25 01:39 EDT, Jianwei Hou

Description Jianwei Hou 2013-10-17 03:21:59 EDT
Created attachment 813210 [details]
mongodump before migration on devenv-stage_486

Description of problem:
Run rhc-admin-migrate-datastore to migrate application endpoints. After migration, checking the port_interfaces of scalable applications shows that the protocols for some cartridges are not updated correctly.

Version-Release number of selected component (if applicable):
On devenv_3907

How reproducible:

Steps to Reproduce:
1. Create scalable applications with all possible cartridges on the devenv-stage_486 AMI (this AMI does not have the 'protocols' and 'type' fields before migration)
2. Upgrade the server to the latest version, devenv_3907
3. Clear the broker cache, then restart broker and ruby193-mcollective
4. Do datastore migration
rhc-admin-migrate-datastore --compatible --version 2.0.35
5. Use the mongo shell to verify that the 'protocols' and 'type' fields of the endpoints are migrated properly

Actual results:
1. For the db cartridges (mysql, postgresql, mongodb), 'protocols' is updated to 'tcp'
2. On the devenv-stage_486 AMI, the group_instances of scalable applications do not include the 'haproxy-1.4' cartridge, so after migration the apps still have no haproxy endpoints in their group_instances

Expected results:
1. The 'protocols' value of each db cartridge should be its db name, as defined by its manifest
2. The haproxy-1.4 endpoints should be added to the app's group_instances
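
The verification in step 5 can also be scripted. The following is only a minimal sketch, assuming each port_interfaces entry has been loaded into a Ruby Hash with the field names visible in the mongodump ("cartridge_name", "protocols", "type"). The helper names, the sample cartridge name "mysql-5.1", and the sample protocol and type values are hypothetical and are not part of rhc-admin-migrate-datastore.

```ruby
# Hypothetical helpers for checking migrated endpoints; not taken from the
# migration script. Each endpoint is a Hash shaped like one port_interfaces entry.
REQUIRED_FIELDS = %w[protocols type].freeze

# Returns the endpoints that still lack a non-empty 'protocols' or 'type' field.
def unmigrated_endpoints(port_interfaces)
  port_interfaces.select do |endpoint|
    REQUIRED_FIELDS.any? do |field|
      value = endpoint[field]
      value.nil? || (value.respond_to?(:empty?) && value.empty?)
    end
  end
end

# True if at least one endpoint belongs to an haproxy cartridge.
def haproxy_endpoint?(port_interfaces)
  port_interfaces.any? { |e| e["cartridge_name"].to_s.start_with?("haproxy-") }
end

# Illustrative data only: the protocol and type values are assumed.
sample = [
  { "cartridge_name" => "haproxy-1.4", "internal_port" => 8080,
    "protocols" => ["http"], "type" => ["web_framework"] },
  { "cartridge_name" => "mysql-5.1", "internal_port" => 3306 } # pre-migration shape
]

unmigrated_endpoints(sample).each do |endpoint|
  puts "Not migrated: #{endpoint['cartridge_name']}"
end
puts "haproxy endpoint present: #{haproxy_endpoint?(sample)}"
```

A check like this only confirms that the fields exist; whether the 'protocols' value matches the cartridge manifest still has to be compared against the manifest itself.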

Additional info:
I will attach mongodumps taken before and after the migration.
Comment 1 Jianwei Hou 2013-10-17 03:22:32 EDT
Created attachment 813211 [details]
mongodump after compatible datastore migration
Comment 2 Rajat Chopra 2013-10-22 20:36:02 EDT
Fixed with https://github.com/openshift/li/pull/2026
The haproxy of older apps needs to get its ports exposed.

Regarding database gears, check the mysql manifest inside the db gear: it should reflect the appropriate protocols. If it does not, the upgrade process needs to be run with the :ignore_cartridge_version flag, which should update the manifest. The likely reason is that the cartridge version had not been bumped at the time of testing.
Comment 3 Jianwei Hou 2013-10-24 03:01:07 EDT
Verified on devenv_3937: the haproxy endpoints are migrated successfully. Sample:

							"_id" : ObjectId("5268b7c295b32b21cb000023"),
							"cartridge_name" : "haproxy-1.4",
							"external_port" : "38206",
							"internal_address" : "",
							"internal_port" : 8080,
							"protocols" : [
							"type" : [
							"mappings" : [
									"frontend" : "",
									"backend" : ""
									"frontend" : "/health",
									"backend" : "/configuration/health"
Comment 4 Jianwei Hou 2013-10-24 06:45:30 EDT
Sorry, when testing this script again I found that if the exit_code is non-zero, a stack trace is printed and the program aborts.

[root@ip-10-151-19-45 ~]# rhc-admin-migrate-datastore --compatible --version 2.0.35
/usr/bin/rhc-admin-migrate-datastore:52:in `block in expose_haproxy_port': undefined method `_id' for "5268e8c719b7dea4bd0002a5":String (NoMethodError)
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.16.3/app/models/remote_job.rb:125:in `call'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.16.3/app/models/remote_job.rb:125:in `block (2 levels) in get_parallel_run_results'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.16.3/app/models/remote_job.rb:124:in `each'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.16.3/app/models/remote_job.rb:124:in `block in get_parallel_run_results'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.16.3/app/models/remote_job.rb:123:in `each'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.16.3/app/models/remote_job.rb:123:in `get_parallel_run_results'
	from /usr/bin/rhc-admin-migrate-datastore:48:in `expose_haproxy_port'
	from /usr/bin/rhc-admin-migrate-datastore:21:in `migrate'
	from /usr/bin/rhc-admin-migrate-datastore:208:in `<main>'

code block:
  RemoteJob.get_parallel_run_results(handle) { |tag, gear, stdout, exit_code|
    if exit_code!=0
      puts "Error in executing parallel job for gear #{gear._id.to_s}"
      exposed += 1
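
The trace shows that the block received the gear as a plain String id, so calling `_id` on it raised NoMethodError. One defensive way to build the error message is sketched below. This is only an illustration under that assumption: `gear_id_string` and `GearStub` are hypothetical names, and the actual fix may differ.

```ruby
# Hypothetical helper: accepts either an object that responds to _id
# or an id that is already a String, and always returns a String.
def gear_id_string(gear)
  gear.respond_to?(:_id) ? gear._id.to_s : gear.to_s
end

# Stand-in for a gear model object, used only for this illustration.
GearStub = Struct.new(:_id)

[GearStub.new("5268e8c719b7dea4bd0002a5"), "5268e8c719b7dea4bd0002a5"].each do |gear|
  puts "Error in executing parallel job for gear #{gear_id_string(gear)}"
end
```

Both calls print the same message, so the error branch no longer depends on which form the callback receives.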
Comment 5 Rajat Chopra 2013-10-24 19:03:59 EDT
Fixed with https://github.com/openshift/li/pull/2039
Comment 6 Jianwei Hou 2013-10-25 01:39:03 EDT
Tested on devenv_3942; this bug is fixed. However, 3 gears failed the mongo migration. After investigation I found they were 1 jbossas gear and 2 jbosseap gears. I could not figure out why they failed, so I've attached the mongodump. Can you please confirm whether this result is acceptable? Thanks

[root@ip-10-139-13-20 ~]# rhc-admin-migrate-datastore --compatible --version 2.0.35
Starting migration: compatible
Error in executing parallel job for gear 5269e21b4b4e3f9e380002a5
Error in executing parallel job for gear 5269e3064b4e3f9e38000317
Error in executing parallel job for gear 5269e7ab4b4e3f9e380005e7
Done migrating 12/15 applications.
Time to get all gears from nodes: 4.926s
Total gears found on the nodes: 47
Comment 7 Jianwei Hou 2013-10-25 01:39:45 EDT
Created attachment 816007 [details]
2013-10-24 mongodump after migration, 3 failed gears
Comment 8 Rajat Chopra 2013-10-25 10:27:38 EDT
Could you attach the broker log and mcollective logs also?
Comment 9 Rajat Chopra 2013-10-25 18:38:21 EDT
Found the issue. JBoss gears throw this error because they already expose 5 ports, and 5 is the limit, so exposing an haproxy port fails there.

That should be considered expected, and we will take care of these gears as a separate task.
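
The limit described above can be expressed as a simple pre-check. This is a sketch only: the limit of 5 comes from this comment, while the helper names, the gear ids, and the idea of counting exposed ports per gear in a Hash are assumptions made for illustration.

```ruby
# Limit on exposed ports per gear, as stated in this comment.
MAX_EXPOSED_PORTS = 5

# Hypothetical check: can one more port be exposed on a gear?
def can_expose_another_port?(exposed_count, limit = MAX_EXPOSED_PORTS)
  exposed_count < limit
end

# Splits gear ids into those that can take an haproxy port and those that cannot.
# port_counts maps a gear id to its number of already exposed ports.
def split_by_port_capacity(port_counts, limit = MAX_EXPOSED_PORTS)
  port_counts.partition { |_gear_id, count| can_expose_another_port?(count, limit) }
             .map { |pairs| pairs.map(&:first) }
end

# Illustrative data: gear ids and counts are made up.
counts = { "gear-a" => 2, "gear-b" => 5 }
exposable, skipped = split_by_port_capacity(counts)
puts "Can expose haproxy port: #{exposable.join(', ')}"
puts "Already at the limit: #{skipped.join(', ')}"
```

With a check like this, gears that are already at the limit could be reported as skipped instead of surfacing as migration errors.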
Comment 10 Meng Bo 2013-10-28 03:47:39 EDT
Checked an upgrade from devenv-stage_528 to devenv_3953; the only affected gears are scalable jbossas and jbosseap gears. Per comment 9, this is the expected result.

We will move the bug to verified.


Will the scalable jboss issue be fixed in sprint35? And do we need to open a new bug to track it?
Comment 11 Rajat Chopra 2015-01-04 23:12:20 EST
Closing out on the 'needinfo' here. The jboss issue was fixed subsequently - the master branch already has the fixed code.
