Description of problem:
Before the recent oo-admin-upgrade data structure refactor, oo-admin-move could be run while oo-admin-upgrade was running (as long as it ran after oo-admin-upgrade gathered its initial data). This is important because oo-admin-upgrade runs can take over a week (with re-runs, hotfixes, etc).

Now, however, that's not the case. Here's what Dan Mace said on IRC:

(09/05/2013 10:32:01 AM) danmace: twiest: the new upgrader caches all that information between runs now; so that could be an issue. we can clear it prior to a followup run though

I don't believe simply clearing it between runs is sufficient, though, because a single oo-admin-upgrade run can take over a day, which is a long period of time in which oo-admin-move could be run as well.

We really need both scripts to be compatible with each other.

Version-Release number of selected component (if applicable):
openshift-origin-broker-util-1.13.11-1.el6oso.noarch

How reproducible:
very

Steps to Reproduce:
1. This is a race condition, so it's hard to reproduce reliably.
2. Run both at the same time and see if some gears are either missed or errored out.

Actual results:
We're not able to run both oo-admin-upgrade and oo-admin-move at the same time.

Expected results:
Before the refactor we were able to run both at the same time, and we need that functionality back.
Moving it to the node team to drive this fix. If any changes are required on the broker side, just let me or Rajat know.
Upon further investigation, the refactored upgrade code retains the just-in-time Mongo lookup of a gear prior to upgrading to identify its node. So there is no regression: gears which have moved between upgrades will be reached at their new location during subsequent runs.
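
To illustrate why a cached gear list does not by itself break moves, here is a minimal sketch of the just-in-time lookup pattern. It is not the actual oo-admin-upgrade code (which is not shown in this report); the dictionary stands in for the Mongo gear records, and the names gear_locations, snapshot_gears, and upgrade_gears are hypothetical.

```python
# Hypothetical stand-in for the Mongo gear records: gear UUID -> node hostname.
gear_locations = {"gear-a": "node1", "gear-b": "node1"}


def snapshot_gears(locations):
    """Cache the gear list (and each gear's node) at the start of a run."""
    return dict(locations)


def upgrade_gears(cached, locations):
    """Visit each cached gear, re-reading its node just before upgrading."""
    visited = []
    for gear in sorted(cached):
        # Just-in-time lookup: ask the live store, not the cached snapshot.
        node = locations.get(gear)
        if node is None:
            continue  # gear no longer exists; nothing to upgrade
        visited.append((gear, node))
    return visited


cached = snapshot_gears(gear_locations)
gear_locations["gear-b"] = "node2"  # simulates a move during the upgrade run
result = upgrade_gears(cached, gear_locations)
print(result)
```

In this sketch the cached snapshot still says gear-b is on node1, but the upgrade reaches it on node2 because the node is resolved at upgrade time.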