Description of problem:
Reboot by ansible to perform system update (leapp) fails in CI

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Run https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/upgrades/view/ffu/job/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/
2. Repeat until it fails
3.

Actual results:
https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/46/artifact/.sh/ir-tripleo-ffu-upgrade-run.log

non-zero return code
...ignoring

TASK [tripleo-upgrade : .] ***
task path: /home/rhos-ci/jenkins/workspace/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/infrared/plugins/tripleo-upgrade/infrared_plugin/roles/tripleo-upgrade/tasks/fast-forward-upgrade/overcloud_upgrade_hosts.yaml:84
Friday 22 January 2021 19:29:44 +0000 (0:05:05.694) 5:30:01.505 ********
fatal: [undercloud-0]: FAILED! => {
    "changed": false
}

MSG:

Overcloud upgrade composable step failed for compute-1... :(

2021-01-22 14:24:37 | Friday 22 January 2021 14:21:47 -0500 (0:09:32.060) 0:11:56.470 ********
2021-01-22 14:24:37 | changed: [compute-1] => {"changed": true, "elapsed": 2, "rebooted": true}
2021-01-22 14:24:37 |
2021-01-22 14:24:37 | TASK [Force facts refresh after OS upgrade to refresh cache.] ******************
2021-01-22 14:24:37 | Friday 22 January 2021 14:21:50 -0500 (0:00:03.302) 0:11:59.773 ********
2021-01-22 14:24:37 | fatal: [compute-1]: UNREACHABLE! => {"changed": false, "msg": "Data could not be sent to remote host \"192.168.24.17\". Make sure this host can be reached over ssh: ssh: connect to host 192.168.24.17 port 22: No route to host\r\n", "unreachable": true}

Expected results:
Reboot happened and facts are cleared and gathered.

Additional info:
This code needs tweaking: in some cases the network is lost after the reboot, while leapp finishes configuring networking through systemd after the RHEL 7 to RHEL 8 upgrade.

----%<----
- name: reboot to perform the upgrade
  reboot:
    reboot_timeout: "{{upgrade_leapp_reboot_timeout}}"
- name: Clear gathered facts from all currently targeted hosts
  meta: clear_facts
- name: Force facts refresh after OS upgrade to refresh cache.
  setup:
----%<----
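
One possible way to tolerate the short network drop between the reboot and the facts refresh is sketched below. This is only an illustration, not the change that shipped for this bug: it assumes the standard `wait_for_connection` module is acceptable at this point in the upgrade tasks, and the `delay`, `sleep` and `timeout` values are hypothetical placeholders that would need tuning for a real environment.

```yaml
# Sketch only, under the assumptions stated above.
- name: reboot to perform the upgrade
  reboot:
    reboot_timeout: "{{ upgrade_leapp_reboot_timeout }}"

# Assumption: SSH can disappear briefly after the reboot task has already
# returned, while networking is still being reconfigured on the host.
- name: Wait for SSH to be reachable again before touching facts
  wait_for_connection:
    delay: 30      # hypothetical: seconds to wait before the first check
    sleep: 5       # seconds between connection attempts
    timeout: 600   # hypothetical: give up after this many seconds

- name: Clear gathered facts from all currently targeted hosts
  meta: clear_facts

- name: Force facts refresh after OS upgrade to refresh cache.
  setup:
```

The extra task only delays the `setup` call until the host answers again; it does not check that the system has finished booting, which would need a readiness test on the `reboot` task itself.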

Verifying with CI job:
https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/upgrades/view/ffu/job/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/80/

Checked that the code is present in /var/lib/mistral:
http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/80/undercloud-0/var/lib/mistral/c4d34352-464c-462c-80c9-680caa64708b/Controller/upgrade_tasks_step4.yaml.gz

- name: reboot to perform the upgrade
  reboot:
    post_reboot_delay: '{{ upgrade_leapp_post_reboot_delay }}'
    reboot_timeout: '{{upgrade_leapp_reboot_timeout}}'
    test_command: systemctl is-system-running | grep -e running -e degraded

Task present in log:
http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/80/undercloud-0/home/stack/overcloud_system_upgrade-controller-0.log.gz

2021-05-04 22:51:39 | TASK [Copy /boot/grub2/grubenv to /boot/efi/EFI/redhat/grubenv] ****************
2021-05-04 22:51:39 | Tuesday 04 May 2021 22:36:08 +0000 (0:00:00.074) 0:13:45.017 ***********
2021-05-04 22:51:39 | skipping: [controller-0] => {"changed": false, "skip_reason": "Conditional result was False"}
2021-05-04 22:51:39 |
2021-05-04 22:51:39 | TASK [reboot to perform the upgrade] *******************************************
2021-05-04 22:51:39 | Tuesday 04 May 2021 22:36:08 +0000 (0:00:00.076) 0:13:45.093 ***********
2021-05-04 22:51:39 | changed: [controller-0] => {"changed": true, "elapsed": 927, "rebooted": true}
2021-05-04 22:51:39 |
2021-05-04 22:51:39 |
2021-05-04 22:51:39 | TASK [Set the python to python3] ***********************************************
2021-05-04 22:51:39 | Tuesday 04 May 2021 22:51:36 +0000 (0:15:28.358) 0:29:13.451 ***********
2021-05-04 22:51:39 | changed: [controller-0] => {"changed": true, "cmd": "alternatives --set python /usr/bin/python3", "delta": "0:00:00.007189", "end": "2021-05-04 22:51:37.230783", "rc": 0, "start": "2021-05-04 22:51:37.223594", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
2021-05-04 22:51:39 |

CI job successfully passed.

PACKAGES:
http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/80/undercloud-0/var/log/dnf.rpm.log.gz
2021-05-04T21:23:19Z SUBDEBUG Installed: openstack-tripleo-heat-templates-11.3.2-1.20210408163452.el8ost.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.6 bug fix and enhancement advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2097