Bug 1920293 - Reboot fails after leapp
Summary: Reboot fails after leapp
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.1 (Train)
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: z6
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Sergii Golovatiuk
QA Contact: Jose Luis Franco
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-01-26 00:58 UTC by Sergii Golovatiuk
Modified: 2022-08-02 14:01 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-11.3.2-1.20210310113344.29a02c1.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-26 13:50:38 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary                   Last Updated
OpenStack gerrit        774214          0        None      MERGED  Add post delay to reboot  2021-02-18 10:38:51 UTC
Red Hat Issue Tracker   OSP-694         0        None      None    None                      2022-08-02 14:01:26 UTC
Red Hat Product Errata  RHBA-2021:2097  0        None      None    None                      2021-05-26 13:51:10 UTC

Description Sergii Golovatiuk 2021-01-26 00:58:07 UTC
Description of problem:

The reboot that Ansible triggers to perform the system upgrade (leapp) intermittently fails in CI: the host is unreachable over SSH when the next task runs.



Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Run https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/upgrades/view/ffu/job/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/
2. Repeat until it fails.

Actual results:

https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/46/artifact/.sh/ir-tripleo-ffu-upgrade-run.log


non-zero return code
...ignoring

TASK [tripleo-upgrade : .] ***
task path: /home/rhos-ci/jenkins/workspace/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/infrared/plugins/tripleo-upgrade/infrared_plugin/roles/tripleo-upgrade/tasks/fast-forward-upgrade/overcloud_upgrade_hosts.yaml:84
Friday 22 January 2021  19:29:44 +0000 (0:05:05.694)       5:30:01.505 ******** 
fatal: [undercloud-0]: FAILED! => {
    "changed": false
}

MSG:

Overcloud upgrade composable step failed for compute-1... :(

2021-01-22 14:24:37 | Friday 22 January 2021  14:21:47 -0500 (0:09:32.060)       0:11:56.470 ******** 
2021-01-22 14:24:37 | changed: [compute-1] => {"changed": true, "elapsed": 2, "rebooted": true}
2021-01-22 14:24:37 | 
2021-01-22 14:24:37 | TASK [Force facts refresh after OS upgrade to refresh cache.] ******************
2021-01-22 14:24:37 | Friday 22 January 2021  14:21:50 -0500 (0:00:03.302)       0:11:59.773 ******** 
2021-01-22 14:24:37 | fatal: [compute-1]: UNREACHABLE! => {"changed": false, "msg": "Data could not be sent to remote host \"192.168.24.17\". Make sure this host can be reached over ssh: ssh: connect to host 192.168.24.17 port 22: No route to host\r\n", "unreachable": true}


Expected results:
The reboot completes, and facts are then cleared and gathered again.



Additional info:

This code needs tweaking: in some cases the network is lost while leapp finishes configuring networking through systemd after the RHEL 7 to RHEL 8 upgrade, so the host can be unreachable right after the reboot task returns.
----%<----
- name: reboot to perform the upgrade
  reboot:
    reboot_timeout: "{{upgrade_leapp_reboot_timeout}}"
- name: Clear gathered facts from all currently targeted hosts
  meta: clear_facts
- name: Force facts refresh after OS upgrade to refresh cache.
  setup:
----%<----
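
One possible tweak, sketched here under assumptions rather than as the final patch, is to wait a while after the reboot before Ansible probes the host again, and to treat the host as ready only once systemd reports a settled state. `post_reboot_delay` and `test_command` are existing parameters of the Ansible `reboot` module; the variable name `upgrade_leapp_post_reboot_delay` and the 120-second fallback are illustrative assumptions.

```yaml
- name: reboot to perform the upgrade
  reboot:
    # Assumption: seconds to wait after the reboot command succeeds
    # before Ansible starts checking the host again (120 is illustrative).
    post_reboot_delay: "{{ upgrade_leapp_post_reboot_delay | default(120) }}"
    reboot_timeout: "{{ upgrade_leapp_reboot_timeout }}"
    # Consider the host ready only when systemd has finished starting units;
    # "degraded" is accepted so a single failed unit does not block the upgrade.
    test_command: systemctl is-system-running | grep -e running -e degraded
- name: Clear gathered facts from all currently targeted hosts
  meta: clear_facts
- name: Force facts refresh after OS upgrade to refresh cache.
  setup:
```

The delay covers the window in which networking is still being reconfigured, while the test command keeps the task from returning on the first successful SSH connection.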

Comment 8 Jose Luis Franco 2021-05-05 14:40:29 UTC
Verifying with CI job: https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/upgrades/view/ffu/job/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/80/

Checked that the code is present in /var/lib/mistral:
http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/80/undercloud-0/var/lib/mistral/c4d34352-464c-462c-80c9-680caa64708b/Controller/upgrade_tasks_step4.yaml.gz
  - name: reboot to perform the upgrade
    reboot:
      post_reboot_delay: '{{ upgrade_leapp_post_reboot_delay }}'
      reboot_timeout: '{{upgrade_leapp_reboot_timeout}}'
      test_command: systemctl is-system-running | grep -e running -e degraded
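
The readiness check used in `test_command` can also be tried on its own when debugging a host. The following is a minimal sketch, not part of the shipped templates; the task name, the `system_state` variable, and the retry counts are hypothetical. Note that `until` only retries failed commands, so it does not help while the host is unreachable over SSH.

```yaml
- name: Wait until systemd reports the system as running or degraded
  # The pipeline's exit status is grep's, so "degraded" still counts as success.
  shell: systemctl is-system-running | grep -e running -e degraded
  register: system_state
  until: system_state.rc == 0
  retries: 30        # hypothetical: up to 30 attempts
  delay: 10          # hypothetical: 10 seconds between attempts
  changed_when: false
```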


Task present in log:
http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/80/undercloud-0/home/stack/overcloud_system_upgrade-controller-0.log.gz

2021-05-04 22:51:39 | TASK [Copy /boot/grub2/grubenv to /boot/efi/EFI/redhat/grubenv] ****************
2021-05-04 22:51:39 | Tuesday 04 May 2021  22:36:08 +0000 (0:00:00.074)       0:13:45.017 *********** 
2021-05-04 22:51:39 | skipping: [controller-0] => {"changed": false, "skip_reason": "Conditional result was False"}
2021-05-04 22:51:39 | 
2021-05-04 22:51:39 | TASK [reboot to perform the upgrade] *******************************************
2021-05-04 22:51:39 | Tuesday 04 May 2021  22:36:08 +0000 (0:00:00.076)       0:13:45.093 *********** 
2021-05-04 22:51:39 | changed: [controller-0] => {"changed": true, "elapsed": 927, "rebooted": true}
2021-05-04 22:51:39 | 
2021-05-04 22:51:39 | 
2021-05-04 22:51:39 | TASK [Set the python to python3] ***********************************************
2021-05-04 22:51:39 | Tuesday 04 May 2021  22:51:36 +0000 (0:15:28.358)       0:29:13.451 *********** 
2021-05-04 22:51:39 | changed: [controller-0] => {"changed": true, "cmd": "alternatives --set python /usr/bin/python3", "delta": "0:00:00.007189", "end": "2021-05-04 22:51:37.230783", "rc": 0, "start": "2021-05-04 22:51:37.223594", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
2021-05-04 22:51:39 | 

CI job successfully passed.

PACKAGES:
http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv4-vxlan-HA-no-ceph-from-passed_phase2/80/undercloud-0/var/log/dnf.rpm.log.gz

2021-05-04T21:23:19Z SUBDEBUG Installed: openstack-tripleo-heat-templates-11.3.2-1.20210408163452.el8ost.noarch

Comment 14 errata-xmlrpc 2021-05-26 13:50:38 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.6 bug fix and enhancement advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2097

