Hi Lucas, this kind of issue usually happens when the environment files are not added correctly, in this case in the upgrade prepare command. I just ran a minor update in my dev env (OSP13), 3 controllers, 1 compute, and it worked as it should. Can you share the content of your roles data file? Also, the environment files you are using and the steps you are following? My vote is that there is something duplicated in the roles data or some skipped env file in the upgrade prepare.
(In reply to Carlos Camacho from comment #17)

> Hi Lucas,

Hi Carlos,

> this kind of issue usually happens when the environment files are not added correctly, in this case in the upgrade prepare command.

Thanks for the prompt reply.

> I just ran a minor update in my dev env (OSP13), 3 controllers, 1 compute, and it worked as it should.

So apparently this problem does not happen in OSP 13 (or 12), because the way the container prepare works changed in OSP 14 (I found some context at [0], and patch [1] seems related).

> Can you share the content of your roles data file? Also, the environment files you are using and the steps you are following?
>
> My vote is that there is something duplicated in the roles data or some skipped env file in the upgrade prepare.

Sure. I'm debugging the issue by looking at the CI logs [2] because, at the moment, I don't have an environment to try it out.

The content of the roles data can be found here: http://pastebin.test.redhat.com/672537 (You can find it at [2], undercloud-0.tar.gz, undercloud-0/home/stack/composable_roles/roles/roles_data.yaml)

As you can see, the "OS::TripleO::Services::ContainerImagePrepare" service is duplicated in the roles data. I believe that's the root problem.

The environment files used in the step that is failing are here: http://pastebin.test.redhat.com/672538

It's important to note that this same error is also happening in the generic update CI job as well [3] (not related to networking-ovn). I inspected the logs there [3] and I can see the exact same errors, including the duplicated "ContainerImagePrepare" item in the roles data.
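For anyone else debugging a similar roles_data.yaml, here is a minimal sketch of how one could spot duplicated service entries within a single role's ServicesDefault block. The function name and the line-based scan are illustrative (not part of any TripleO tooling); it assumes the usual `- OS::TripleO::Services::*` list format and avoids a YAML parser on purpose.

```python
from collections import Counter

def find_duplicate_services(services_block_text):
    """Return service names listed more than once in one role's
    ServicesDefault block (simple line-based scan, no YAML parser).
    Hypothetical helper, for illustration only."""
    services = [
        line.strip().lstrip("- ").strip()
        for line in services_block_text.splitlines()
        if line.strip().startswith("- OS::TripleO::Services::")
    ]
    return [name for name, count in Counter(services).items() if count > 1]

# Sample fragment mimicking the reported duplication:
sample = """\
  ServicesDefault:
    - OS::TripleO::Services::ContainerImagePrepare
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::ContainerImagePrepare
"""
print(find_duplicate_services(sample))
# → ['OS::TripleO::Services::ContainerImagePrepare']
```

Note that the scan should be run per role: the same service legitimately appears once in several roles, so only repeats inside one ServicesDefault list indicate this bug.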
[0] https://bugzilla.redhat.com/show_bug.cgi?id=1648918
[1] https://review.openstack.org/618462
[2] https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/networking-ovn/job/DFG-network-networking-ovn-update-14_director-rhel-virthost-3cont_2comp_2net-ipv4-geneve-composable/13/artifact/
[3] https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/upgrades/view/update/job/DFG-upgrades-updates-14-from-2018-10-25.3-composable-ipv4/1/
Based on the IRC chat, there was indeed a duplication in the roles data (comment #17); the duplication apparently comes from the tripleo-upgrade role.
Hi Eran, would you mind re-running the job testing the minor update for OVN using https://review.openstack.org/#/c/620580/ ? Or you can just apply https://review.openstack.org/#/c/620580/1/tasks/update/overcloud_update_prepare.yml directly. That should fix the reported issue in your CI job.
Just to recap, this looks like an issue with tripleo-upgrade, which is consumed by infrared, so this should be opened in Jira then? I've been trying to run a minor upgrade for the last couple of days, hitting several issues (that are already tracked), but the last one is that we need a fix in tripleo-common [0]. So if you agree, please open a bug in Jira so that the tripleo-upgrade patch linked by Carlos in comment #22 is applied in infrared, and leave this bug open to track [0]. Does that sound reasonable, Carlos/Jose Luis?

[0] https://review.openstack.org/#/c/619759
Amendment to comment #23: s/minor upgrade/minor update. Sorry :)
Agree with Daniel. This issue is not specific to any OpenStack component; it's in the set of Ansible tasks used by infrared to perform the tests, more specifically in the upgrade/update jobs. So, as Daniel mentioned, I also think this should be tracked in its corresponding tool, not in Bugzilla.
Hi Daniel, we already have a BZ for tracking the patch you just pasted. This is the BZ in question: https://bugzilla.redhat.com/show_bug.cgi?id=1652924

This BZ is for tracking the issue reported as:

...
"Error: Evaluation Error: The title 'container_image_prepare' has already been used in this resource expression at /etc/puppet/modules/tripleo/manifests/firewall.pp:135:5 on node controller-2.localdomain"

And looking on one controller:

% grep -A5 -B5 container_image_ ...

Once we have the fix in place we will move it to POST.
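To illustrate how the puppet error above arises: rendering the same service twice produces two resources with the same title, which Puppet rejects. The sketch below reproduces the duplicate-title detection on an inline sample manifest; the resource type, titles, and the /tmp path are all illustrative, not the actual content of firewall.pp.

```shell
# Create a sample manifest with a duplicated resource title
# (hypothetical resource type and titles, for illustration only).
cat > /tmp/sample_firewall.pp <<'EOF'
tripleo::firewall::service_rules { 'container_image_prepare': }
tripleo::firewall::service_rules { 'ntp': }
tripleo::firewall::service_rules { 'container_image_prepare': }
EOF

# List any titles that appear more than once:
grep -o "'[a-z_]*'" /tmp/sample_firewall.pp | sort | uniq -d
# prints: 'container_image_prepare'
```

On a real deployment the duplication comes from the roles data rather than a hand-written manifest, but the effect on the generated puppet code is the same.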
Closing this BZ after conversation with Arie, it's CI only and the patch from tripleo-upgrade is going to be used automatically as it's already merged.
(In reply to Daniel Alvarez Sanchez from comment #27)

> Closing this BZ after conversation with Arie, it's CI only and the patch
> from tripleo-upgrade is going to be used automatically as it's already
> merged.

Sorry, it's not yet merged! I'll close it once it is, then.
*** Bug 1653622 has been marked as a duplicate of this bug. ***
@Arie was going to try this patch in a one-time job. Do you have any updates? Thanks a lot!!
@Nir, is this a release blocker in any way? I've set the blocker flag back to "?".
Hey Amit, this issue is a CI-only problem due to a duplication in the tripleo-upgrade repo. The fixes are in place, so it's just pending verification. Moving it to ON_QA until we have it verified by the networking folks.
According to this run, it looks like we have some tests that failed: https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/networking-ovn/job/DFG-network-networking-ovn-update-14_director-rhel-virthost-3cont_2comp_2net-ipv4-geneve-composable/37/testReport/ so I am sending it back for more investigation.
Looks like we are hitting: https://bugzilla.redhat.com/show_bug.cgi?id=1656368
The issues reported in this BZ are actually fixed. If you hit any other issue, please create a BZ describing the actual error and providing logs and steps to reproduce. @Eran, this bug was reported for the tripleo-upgrade duplicated resource registry error. Once a fix is available for a BZ, you should not move it back to ASSIGNED; if you hit another issue, please raise another bug with your findings and make it depend on the RHEL issue.