Created attachment 1184602: Upgrade log

Description of problem:
During the upgrade of our Satellite instance to 6.2, the upgrade failed during the foreman migration. To work around this issue, we turned off the firewall and were then able to proceed.

The following is from an email exchange with Ohad. It seems that we might be able to do two things to improve the situation:

1. Add a note in section 6.1.2 Upgrading Satellite Server to review the firewall configuration or simply disable it.
   Answer: You had really "non-standard" iptables rules with DNAT; I don't think that many of our customers will do NAT on their satellites the way it's done currently.

2. The upgrade script could examine the state of the firewall and present an error before we proceed too deeply into the upgrade.
   Answer: yes, but that's really hard to know in advance; there are many edge cases. What imho we should do is improve the error messages, so it's easier to understand where the error is.

Based on this exchange we should, at a minimum, present better error messages.
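For illustration only, the pre-upgrade firewall check from point 2 could be sketched roughly as follows. This is a heuristic under stated assumptions, not something the Satellite installer does: it assumes the rules are available as text in `iptables-save` format, the function name `find_nat_rules` is hypothetical, and the sample rules are made up (they are not the reporter's actual configuration). Given the edge cases Ohad mentions, a check like this could only warn, not reliably predict a failure.

```python
import shlex

# Targets that rewrite addresses or ports; any of them may redirect
# traffic that the upgrade expects to reach a local service.
NAT_TARGETS = {"DNAT", "REDIRECT", "SNAT", "MASQUERADE"}


def find_nat_rules(rules_text):
    """Return rule lines from iptables-save style text that jump to a NAT target."""
    hits = []
    for line in rules_text.splitlines():
        line = line.strip()
        # Only "-A <chain> ..." lines are rules; skip table and chain headers.
        if not line.startswith("-A "):
            continue
        tokens = shlex.split(line)
        # Look for "-j TARGET" or "--jump TARGET" pairs.
        for flag, value in zip(tokens, tokens[1:]):
            if flag in ("-j", "--jump") and value in NAT_TARGETS:
                hits.append(line)
                break
    return hits


# Made-up sample input (assumption), using a documentation-range address.
SAMPLE = """\
*nat
:PREROUTING ACCEPT [0:0]
-A PREROUTING -p tcp -m tcp --dport 443 -j DNAT --to-destination 192.0.2.10:8443
COMMIT
*filter
:INPUT ACCEPT [0:0]
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
COMMIT
"""

if __name__ == "__main__":
    for rule in find_nat_rules(SAMPLE):
        print("warning: NAT rule may interfere with the upgrade:", rule)
```

On a real system the input text would come from the output of `iptables-save`, run as root before starting the upgrade.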
Just some additional information: the migration failed when katello tried to reach out to pulp:

Some backend services are not running: {:status=>"FAIL", :services=>{:pulp=>{:status=>"FAIL", :message=>"400 Bad Request"}, :pulp_auth=>{:status=>"FAIL", :message=>"Skipped pulp_auth check after failed pulp check"}, :candlepin=>{:status=>"ok", :duration_ms=>"89"}, :candlepin_auth=>{:status=>"ok", :duration_ms=>"273"}, :foreman_tasks=>{:status=>"ok", :duration_ms=>"12"}}}
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.0.0.68/db/migrate/20150930183738_migrate_content_hosts.rb:343:in `up'
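As a rough sketch of what a clearer error message might look like, the status structure above can be reduced to just the failing services. The dictionary below is a hand transcription of the logged hash into Python; the function names `failing_services` and `format_failure` and the wording of the hint are assumptions for illustration, not katello's actual code or output.

```python
# Hand-transcribed from the log output above.
STATUS = {
    "status": "FAIL",
    "services": {
        "pulp": {"status": "FAIL", "message": "400 Bad Request"},
        "pulp_auth": {
            "status": "FAIL",
            "message": "Skipped pulp_auth check after failed pulp check",
        },
        "candlepin": {"status": "ok", "duration_ms": "89"},
        "candlepin_auth": {"status": "ok", "duration_ms": "273"},
        "foreman_tasks": {"status": "ok", "duration_ms": "12"},
    },
}


def failing_services(status):
    """Return (name, message) pairs for services whose status is not 'ok'."""
    return [
        (name, info.get("message", "no message reported"))
        for name, info in status.get("services", {}).items()
        if info.get("status") != "ok"
    ]


def format_failure(status):
    """Build a short, readable error; return an empty string if nothing failed."""
    failures = failing_services(status)
    if not failures:
        return ""
    lines = ["Backend service check failed:"]
    lines += ["  - {}: {}".format(name, message) for name, message in failures]
    # Hypothetical hint, based on the workaround described in this report.
    lines.append(
        "Hint: check that firewall or NAT rules are not blocking or "
        "redirecting local connections to these services."
    )
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_failure(STATUS))
```

Listing only the failed services, with a hint about the firewall, would point the reader at pulp connectivity without having to scan the full hash.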
this relates to upstream issue http://projects.theforeman.org/issues/15200
I think this is a consequence of a somewhat larger issue in our current mechanism for doing data migrations, described in http://projects.theforeman.org/issues/15866

I don't see an easy way to fix this issue right now other than redesigning the way we do the data migrations, and even if we do that, I don't think applying the change to existing migrations is realistic.

What I would suggest is:

1. creating a new BZ representing the redesign to avoid this kind of issue http://projects.theforeman.org/issues/15866
2. closing this BZ as CANTFIX and linking it to the BZ from point 1

I can't think of any sufficient workaround right now, but I'm open to suggestions if they are realistic.
Based on discussion in triage, we are going to move this bugzilla to documentation to request the following from the initial description:

1. Add a note in section 6.1.2 Upgrading Satellite Server to review and update the firewall configuration or simply disable it.

In addition, we have cloned to bugzilla the upstream redmine issue referenced in comment 3. That is bug 1361676.
This content is live on the Customer Portal. Closing.